Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
prospero
prospero at cyber-wizard.com
Tue Feb 18 10:59:49 CST 2025
In: https://www.unicode.org/Public/16.0.0/ucd/UnicodeData.txt
the decomposition type names are camel-cased (and enclosed in angle brackets), like this:
00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
and:
00A8;DIAERESIS;Sk;0;ON;<compat> 0020 0308;;;;N;SPACING DIAERESIS;;;;
Whereas in: https://www.unicode.org/Public/16.0.0/ucd/extracted/DerivedDecompositionType.txt
the decomposition type names are capitalized on the first letter only, like this:
00A0 ; Nobreak # Zs NO-BREAK SPACE
and:
FB54 ; Initial # Lo ARABIC LETTER BEEH INITIAL FORM
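For anyone comparing the two files programmatically, here is a minimal sketch of how the sample lines above could be reconciled. It assumes that a case-insensitive comparison is acceptable for matching the two spellings; that is my assumption, not something either file states. The function names are hypothetical, and the parsing only covers the line shapes quoted in this message.

```python
import re


def unicode_data_type(line):
    """Return the bracketed decomposition tag from a UnicodeData.txt line.

    Returns None when the decomposition field (the sixth field) has no
    <tag> prefix.
    """
    fields = line.split(";")
    match = re.match(r"<([A-Za-z]+)>", fields[5])
    return match.group(1) if match else None


def derived_type(line):
    """Return the type name from a DerivedDecompositionType.txt data line."""
    data = line.split("#", 1)[0]  # drop the trailing comment
    return data.split(";")[1].strip()


def same_type(tag, value):
    """Compare the two spellings ignoring case (an assumption, see above)."""
    return tag.casefold() == value.casefold()


ud_line = "00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;"
dd_line = "00A0 ; Nobreak # Zs NO-BREAK SPACE"

tag = unicode_data_type(ud_line)   # spelling used in UnicodeData.txt
value = derived_type(dd_line)      # spelling used in the derived file
print(tag, value, same_type(tag, value))
```

With this, the two spellings compare equal even though the strings themselves differ, which is the discrepancy I am asking about.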
> Sent: Tuesday, February 18, 2025 at 11:04 AM
> From: "Phil Smith III via Unicode" <unicode at corp.unicode.org>
> To: "'prospero'" <prospero at cyber-wizard.com>, unicode at corp.unicode.org
> Subject: RE: Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
>
> This sounds interesting, but with no links or other references is a bit opaque. Can you add more information?
>
> -----Original Message-----
> From: Unicode <unicode-bounces at corp.unicode.org> On Behalf Of prospero via Unicode
> Sent: Monday, February 17, 2025 3:11 PM
> To: unicode at corp.unicode.org
> Subject: Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
>
> For example, "Nobreak" in DerivedDecompositionType.txt vs "noBreak" in UnicodeData.txt. If the former is derived from the latter, shouldn't the spelling be identical?
>