Question about Normalization in Unicode 16.0.0

Martin J. Dürst duerst at it.aoyama.ac.jp
Sun Apr 20 01:58:14 CDT 2025


Dear Unicoders,

At the recent RubyKaigi (https://rubykaigi.org/2025/), I helped upgrade
Ruby from Unicode 15.1.0 to 16.0.0. The main issue was a set of new cases
that our implementation of normalization did not yet handle.

I just want to check my understanding of these new cases. The following
example (eleven horizontal bars on top of a character) is completely
hypothetical, but it is my understanding that the sequence
U+1611E U+16121 U+16121 U+16121 U+16121 U+16121 should be normalized
under NFC to
U+16121 U+16121 U+16121 U+16121 U+16121 U+1611E. In Ruby, this would be
expressed with a test such as the following:

def test_gurung_khema
  assert_equal "\u{16121 16121 16121 16121 16121 1611E}",
               "\u{1611E 16121 16121 16121 16121 16121}".unicode_normalize(:nfc)
end
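
The reasoning behind this expectation can be sketched with a small,
self-contained toy model. It assumes (inferred from the count of eleven
bars, not quoted from the data files) that U+16121 canonically decomposes
to U+1611E U+1611E, and that all code points involved have combining
class 0, so that no reordering takes place. The names DECOMP, COMP,
toy_decompose, toy_compose and toy_nfc are hypothetical; this is not how
unicode_normalize is actually implemented.

```ruby
# Toy model of decompose-then-recompose for this one example only.
# ASSUMPTION: U+16121 canonically decomposes to U+1611E U+1611E.
DECOMP = { 0x16121 => [0x1611E, 0x1611E] }.freeze
# Inverse table: a pair of code points maps back to its composite.
COMP = DECOMP.to_h { |composite, pair| [pair, composite] }.freeze

# Replace every code point that has a decomposition by its parts.
def toy_decompose(cps)
  cps.flat_map { |cp| DECOMP.fetch(cp, [cp]) }
end

# Greedy left-to-right recomposition. This ignores combining classes
# and blocking, which is only acceptable because every code point in
# this example is assumed to have combining class 0.
def toy_compose(cps)
  cps.each_with_object([]) do |cp, out|
    composite = out.empty? ? nil : COMP[[out.last, cp]]
    if composite
      out[-1] = composite   # merge with the previous code point
    else
      out << cp             # no composite for this pair; keep as is
    end
  end
end

def toy_nfc(cps)
  toy_compose(toy_decompose(cps))
end

input      = [0x1611E] + [0x16121] * 5
decomposed = toy_decompose(input)
result     = toy_nfc(input)

puts decomposed.length
# prints 11
puts result.map { |cp| format("U+%04X", cp) }.join(" ")
# prints U+16121 U+16121 U+16121 U+16121 U+16121 U+1611E
```

The sketch deliberately avoids calling unicode_normalize, so it gives the
same result regardless of which Unicode version the running Ruby supports.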

It would be good if a few examples like this were added to the
NormalizationTest.txt file in the future. I can help with this if needed.

Regards,   Martin.

