Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
Phil Smith III
lists at akphs.com
Tue Feb 18 11:33:35 CST 2025
Thanks. I tend to agree--things that refer to the same thing should be the same.
I then wonder, "In what context does this matter, beyond PoE*?" Not saying it can't/shouldn't -- consistency is good even if it only avoids someone wondering one day whether two things really are the same or not! -- but is there a specific place where this difference causes a problem? Having one should make the argument even stronger for fixing it.
...phsiii
*Purity of Essence--see "Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb", 1964
-----Original Message-----
From: prospero <prospero at cyber-wizard.com>
Sent: Tuesday, February 18, 2025 12:00 PM
To: lists at akphs.com
Cc: unicode at corp.unicode.org
Subject: Re: RE: Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
In: https://www.unicode.org/Public/16.0.0/ucd/UnicodeData.txt
the decomposition type names are camel-cased (and enclosed in angle brackets), like this:
00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
and:
00A8;DIAERESIS;Sk;0;ON;<compat> 0020 0308;;;;N;SPACING DIAERESIS;;;;
Whereas in: https://www.unicode.org/Public/16.0.0/ucd/extracted/DerivedDecompositionType.txt
the decomposition type names are capitalized on the first letter only, like this:
00A0 ; Nobreak # Zs NO-BREAK SPACE
and:
FB54 ; Initial # Lo ARABIC LETTER BEEH INITIAL FORM
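
For anyone who wants to see the difference programmatically, here is a minimal Python sketch, not a definitive parser. It uses only sample lines quoted above rather than the full files. The helper names (type_from_unicode_data, type_from_derived, loose_key) are made up for illustration, and loose_key is only an approximation of the loose matching that UAX #44 describes for property values (ignoring case, whitespace, hyphens, and underscores). It also assumes single code points, not ranges, in the first field of the derived file.

```python
import re

# Sample lines copied from the two files quoted above.
UNICODE_DATA_LINE = "00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;"
DERIVED_LINE = "00A0 ; Nobreak # Zs NO-BREAK SPACE"


def type_from_unicode_data(line):
    """Return (code point, tag) from a UnicodeData.txt line.

    The tag is the text inside <...> at the start of field 5,
    or None when that field carries no tag.
    """
    fields = line.split(";")
    match = re.match(r"<(\w+)>", fields[5])
    return int(fields[0], 16), (match.group(1) if match else None)


def type_from_derived(line):
    """Return (code point, value) from a DerivedDecompositionType.txt line.

    Assumes a single code point in the first field (no ranges).
    """
    data = line.split("#", 1)[0]  # drop the trailing comment
    code, value = (part.strip() for part in data.split(";"))
    return int(code, 16), value


def loose_key(name):
    """Approximate loose matching: drop whitespace, '-', '_' and fold case."""
    return re.sub(r"[\s_-]", "", name).lower()


cp1, tag = type_from_unicode_data(UNICODE_DATA_LINE)
cp2, value = type_from_derived(DERIVED_LINE)

print(f"U+{cp1:04X}: {tag!r} vs {value!r}")
print("exact match:", tag == value)
print("loose match:", loose_key(tag) == loose_key(value))
```

With these two sample lines, the exact comparison fails and the loose comparison succeeds, so the difference only bites code that compares the two spellings byte for byte.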
> Sent: Tuesday, February 18, 2025 at 11:04 AM
> From: "Phil Smith III via Unicode" <unicode at corp.unicode.org>
> To: "'prospero'" <prospero at cyber-wizard.com>, unicode at corp.unicode.org
> Subject: RE: Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
>
> This sounds interesting, but with no links or other references is a bit opaque. Can you add more information?
>
> -----Original Message-----
> From: Unicode <unicode-bounces at corp.unicode.org> On Behalf Of prospero via Unicode
> Sent: Monday, February 17, 2025 3:11 PM
> To: unicode at corp.unicode.org
> Subject: Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
>
> For example, "Nobreak" in DerivedDecompositionType.txt vs "noBreak" in UnicodeData.txt. If the former is derived from the latter, shouldn't the spelling be identical?