Dealing with Georgian capitalization in programming languages

Martin J. Dürst via Unicode unicode at unicode.org
Thu Oct 4 04:37:25 CDT 2018


Ken, Markus,

Many thanks for your ideas, which I noted at
https://bugs.ruby-lang.org/issues/14839.

Regards,   Martin.

On 2018/10/03 06:43, Ken Whistler wrote:
> 
> On 10/2/2018 12:45 AM, Martin J. Dürst via Unicode wrote:

>> My questions here are:
>> - Has this been considered when Georgian Mtavruli was discussed in the
>>   UTC?
>>
> Not explicitly, that I recall. The whole issue of titlecasing came up 
> very late in the preparation of case mapping tables for Mtavruli and 
> Mkhedruli for 11.0.
> 
> But it seems to me that the problem you are citing can be avoided if you 
> simply rethink what your "capitalize" means. It really should be 
> conceived of as first lowercasing the *entire* string, and then 
> titlecasing the *eligible* letters -- i.e., usually the first letter. 
> (Note that this allows for the concept that titlecasing might then be 
> localized on a per-writing-system basis -- the issue would devolve to 
> determining what the rules are for "eligible" letters.) But the simple 
> default would just be to titlecase the initial letter of each "word" 
> segment of a string.
> 
> Note that conceived this way, for the Georgian mappings, where the 
> titlecase mapping for Mkhedruli is simply the letter itself, this 
> approach ends up with:
> 
> capitalize(mkhedrulistring) --> mkhedrulistring
> 
> capitalize(MTAVRULISTRING) ==> titlecase(lowercase(MTAVRULISTRING)) --> mkhedrulistring
> 
> Thus avoiding any mixed case.
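
The approach Ken describes could be sketched roughly as follows. This is only an illustration in Python, not the Ruby implementation discussed in the issue. It assumes Python 3.7 or later (whose character database includes the Unicode 11.0 Georgian case mappings), and it approximates "word" segments with the regular expression \w+ rather than real Unicode word segmentation. The names capitalize, title_first and MTAVRULI_OFFSET are made up for this sketch.

```python
import re

# Assumption for this sketch: distance between the Mkhedruli letters
# (starting at U+10D0) and their Mtavruli counterparts (starting at U+1C90).
MTAVRULI_OFFSET = 0x1C90 - 0x10D0


def capitalize(text: str) -> str:
    """Lowercase the entire string, then titlecase the first letter of each word."""
    lowered = text.lower()

    def title_first(match: "re.Match[str]") -> str:
        word = match.group(0)
        # Only the first character is "eligible" for titlecasing here.
        return word[0].title() + word[1:]

    # \w+ is a rough stand-in for a "word" segment.
    return re.sub(r"\w+", title_first, lowered)


# The word "kartuli" in Mkhedruli, written with escapes to avoid encoding issues.
mkhedruli = "\u10E5\u10D0\u10E0\u10D7\u10E3\u10DA\u10D8"
# The same word in Mtavruli, built by shifting each letter by the block offset.
mtavruli = "".join(chr(ord(ch) + MTAVRULI_OFFSET) for ch in mkhedruli)

print(capitalize(mkhedruli) == mkhedruli)
print(capitalize(mtavruli) == mkhedruli)
print(capitalize("hELLO wORLD"))
```

Because the titlecase mapping of a Mkhedruli letter is the letter itself, the titlecasing step leaves Georgian text unchanged after lowercasing, while scripts such as Latin still get an initial capital.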
