Dealing with Georgian capitalization in programming languages
Martin J. Dürst via Unicode
unicode at unicode.org
Thu Oct 4 04:37:25 CDT 2018
Ken, Markus,
Many thanks for your ideas, which I noted at
https://bugs.ruby-lang.org/issues/14839.
Regards, Martin.
On 2018/10/03 06:43, Ken Whistler wrote:
>
> On 10/2/2018 12:45 AM, Martin J. Dürst via Unicode wrote:
>> My questions here are:
>> - Has this been considered when Georgian Mtavruli was discussed in the
>> UTC?
>>
> Not explicitly, that I recall. The whole issue of titlecasing came up
> very late in the preparation of case mapping tables for Mtavruli and
> Mkhedruli for 11.0.
>
> But it seems to me that the problem you are citing can be avoided if you
> simply rethink what your "capitalize" means. It really should be
> conceived of as first lowercasing the *entire* string, and then
> titlecasing the *eligible* letters -- i.e., usually the first letter.
> (Note that this allows for the concept that titlecasing might then be
> localized on a per-writing-system basis -- the issue would devolve to
> determining what the rules are for "eligible" letters.) But the simple
> default would just be to titlecase the initial letter of each "word"
> segment of a string.
>
> Note that conceived this way, for the Georgian mappings, where the
> titlecase mapping for Mkhedruli is simply the letter itself, this
> approach ends up with:
>
> capitalize(mkhedrulistring) --> mkhedrulistring
>
> capitalize(MTAVRULISTRING) ==> titlecase(lowercase(MTAVRULISTRING)) -->
> mkhedrulistring
>
> Thus avoiding any mixed case.
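
To make the suggestion above concrete, here is a minimal Ruby sketch of "lowercase the entire string, then titlecase the eligible letter". All names in it (capitalize_sketch, lower_char, title_char, MTAVRULI_OFFSET) are hypothetical and chosen for illustration; this is not how Ruby's String#capitalize is implemented. It assumes that only the first character of the string is "eligible", that upcase is an acceptable stand-in for a real titlecase mapping outside Georgian, and it maps Mtavruli to Mkhedruli by hand so that the result does not depend on the Unicode version of the running Ruby.

```ruby
# Hypothetical helpers for illustration only; not Ruby's String#capitalize.

# Distance between Mtavruli (U+1C90..) and Mkhedruli (U+10D0..) code points.
MTAVRULI_OFFSET = 0x1C90 - 0x10D0

# Lowercase a single character. Mtavruli letters are mapped by hand so the
# sketch behaves the same regardless of the interpreter's Unicode tables.
def lower_char(ch)
  cp = ch.ord
  if (0x1C90..0x1CBA).cover?(cp) || (0x1CBD..0x1CBF).cover?(cp)
    [cp - MTAVRULI_OFFSET].pack("U")
  else
    ch.downcase
  end
end

# Titlecase a single character. Characters in the Mkhedruli block map to
# themselves; elsewhere upcase is used as an approximation (assumption).
def title_char(ch)
  (0x10D0..0x10FF).cover?(ch.ord) ? ch : ch.upcase
end

# capitalize = titlecase(first letter of lowercase(whole string))
def capitalize_sketch(str)
  lowered = str.each_char.map { |ch| lower_char(ch) }.join
  return lowered if lowered.empty?
  title_char(lowered[0]) + lowered[1..-1]
end
```

With this sketch, capitalize_sketch("\u{1C90 1C91 1C92}") and capitalize_sketch("\u{10D0 10D1 10D2}") both give the all-Mkhedruli string "\u{10D0 10D1 10D2}", so no mixed case appears, while capitalize_sketch("hELLO") gives "Hello". Titlecasing the first letter of every word segment, as the default described above suggests, would need a word segmenter on top of this.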