Dealing with Georgian capitalization in programming languages

Martin J. Dürst via Unicode unicode at unicode.org
Tue Oct 2 02:45:31 CDT 2018


Since the last discussion on Georgian (Mtavruli) on this mailing list, I 
have been looking into how to implement it in the programming language Ruby.

Ruby has four case-conversion operations for its class String:

upcase:   convert all characters to upper case
downcase: convert all characters to lower case
swapcase: switch upper to lower and lower to upper case
capitalize:  uppercase (or title-case) the first character of the 
string, lowercase the rest
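
For readers less familiar with Ruby, here is a minimal illustration of 
the four operations. It uses plain ASCII input on purpose, so the 
results do not depend on which Unicode version a given Ruby release 
supports; the variable names are only for illustration.

```ruby
# Plain ASCII input, so the results do not depend on the Unicode
# version that a particular Ruby release ships with.
s = "hello World"

up   = s.upcase      # "HELLO WORLD"
down = s.downcase    # "hello world"
swap = s.swapcase    # "HELLO wORLD"
cap  = s.capitalize  # "Hello world"

puts up, down, swap, cap
```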

'upcase' and 'downcase' don't pose problems. 'swapcase' doesn't cause 
problems either, as long as the input is not already mixed-case. The 
only operation that can cause problems is 'capitalize'.

When I say "cause problems", I mean producing mixed-case output. I 
originally thought that 'capitalize' would be fine. It is fine for 
lowercase input: It stays lowercase because the Unicode data indicates 
that the titlecase of a lowercase Georgian letter is the letter itself. 
But for ALL UPPERCASE (Mtavruli) input, the first letter stays 
uppercase while the rest is lowercased, which produces the apparently 
undesirable Mixed Case.
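
To make the problem concrete, here is a rough sketch that models 
'capitalize' for Georgian letters only. It is not Ruby's actual 
implementation. It assumes that Mtavruli U+1C90..U+1CBA and 
U+1CBD..U+1CBF lowercase to Mkhedruli U+10D0..U+10FA and 
U+10FD..U+10FF at a constant offset, and that the titlecase of any 
Georgian letter is the letter itself. The names naive_capitalize, 
georgian_downcase and mixed_georgian_case? are hypothetical helpers, 
not part of Ruby.

```ruby
# Hypothetical model of 'capitalize', restricted to Georgian letters.
# Assumption: Mtavruli letters lowercase to Mkhedruli letters at a
# constant code point offset (U+1C90 -> U+10D0, and so on).
MTAVRULI_TO_MKHEDRULI_OFFSET = 0x1C90 - 0x10D0

def mtavruli?(cp)
  (0x1C90..0x1CBA).cover?(cp) || (0x1CBD..0x1CBF).cover?(cp)
end

def mkhedruli?(cp)
  (0x10D0..0x10FA).cover?(cp) || (0x10FD..0x10FF).cover?(cp)
end

# Lowercase one code point; anything that is not Mtavruli is unchanged.
def georgian_downcase(cp)
  mtavruli?(cp) ? cp - MTAVRULI_TO_MKHEDRULI_OFFSET : cp
end

# Titlecase the first character (identity in this model),
# lowercase the rest.
def naive_capitalize(str)
  first, *rest = str.codepoints
  return str.dup if first.nil?
  ([first] + rest.map { |cp| georgian_downcase(cp) }).pack("U*")
end

# True if the string contains both Mtavruli and Mkhedruli letters.
def mixed_georgian_case?(str)
  cps = str.codepoints
  cps.any? { |cp| mtavruli?(cp) } && cps.any? { |cp| mkhedruli?(cp) }
end

lower = [0x10D0, 0x10D1, 0x10D2].pack("U*")  # Mkhedruli an, ban, gan
upper = [0x1C90, 0x1C91, 0x1C92].pack("U*")  # Mtavruli an, ban, gan

cap_lower = naive_capitalize(lower)  # unchanged, all lowercase
cap_upper = naive_capitalize(upper)  # U+1C90 followed by U+10D1, U+10D2

puts mixed_georgian_case?(cap_lower)  # false
puts mixed_georgian_case?(cap_upper)  # true
```

One possible way around this in such a model would be to lowercase the 
first character as well when it is Mtavruli, but whether that is the 
desired behavior is exactly what the questions below are about.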

My questions here are:
- Has this been considered when Georgian Mtavruli was discussed in the
   UTC?
- How have any other implementers (ICU,...) addressed this, in
   particular the operation that's called 'capitalize' in Ruby?

Many thanks in advance for your input,

Regards,   Martin.

