Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?

Asmus Freytag asmusf at ix.netcom.com
Tue Feb 18 12:44:22 CST 2025


The spellings are equivalent under the naming rules. That's all that 
formally matters. Fixing this now would break any literal-minded 
parsers for whichever file is changed, while making no formal difference.

There are enough other idiosyncrasies in the way these files are 
organized that this one is far from the worst.

The only rule that matters is that any of the values in 
PropertyValueAliases.txt, when matched without regard to case, hyphens, 
or underscores, matches all the other ones for the same property value.
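
To make that concrete, here is a minimal sketch of such a comparison in 
Python. The function name loose_key is made up for illustration, and the 
sketch covers only the three things mentioned above (case, hyphens, 
underscores); the full loose-matching rule in UAX #44 has a few more 
details that a real parser should follow.

```python
def loose_key(value: str) -> str:
    """Fold a property value spelling to a comparison key (illustrative helper)."""
    # Drop hyphens and underscores, then ignore case.
    return value.replace("-", "").replace("_", "").casefold()


# The two spellings from the question fold to the same key.
assert loose_key("Nobreak") == loose_key("noBreak")

# Hypothetical variant spellings, shown only to exercise the rule.
assert loose_key("No_Break") == loose_key("no-break") == loose_key("NOBREAK")
```

A parser that compares keys like this, rather than raw strings, is 
unaffected by which capitalization a given file happens to use.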

For character names, spaces also don't count (but there are 2-3 odd 
exceptional names that need to be handled specially).
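
A cautious sketch for names, again in Python with a made-up helper name: 
it ignores case, spaces, and underscores but deliberately keeps every 
hyphen. That is stricter than the actual rule, which also lets some 
hyphens be ignored, but it shows why hyphens cannot simply be dropped: 
at least one pair of names differs only by a hyphen.

```python
def name_key(name: str) -> str:
    """Fold a character name to a comparison key (illustrative helper)."""
    # Ignore case, spaces, and underscores; keep hyphens on purpose.
    return "".join(ch for ch in name if ch not in " _").upper()


# Case and spacing do not matter.
assert name_key("latin small letter a") == name_key("LATIN SMALL LETTER A")

# U+1180 and U+116C would collide if hyphens were dropped blindly.
assert name_key("HANGUL JUNGSEONG O-E") != name_key("HANGUL JUNGSEONG OE")
```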

A./

On 2/18/2025 8:04 AM, Phil Smith III via Unicode wrote:
> This sounds interesting, but with no links or other references is a bit opaque. Can you add more information?
>
> -----Original Message-----
> From: Unicode <unicode-bounces at corp.unicode.org> On Behalf Of prospero via Unicode
> Sent: Monday, February 17, 2025 3:11 PM
> To: unicode at corp.unicode.org
> Subject: Why does the spelling (capitalization) of decomposition types differ in DerivedDecompositionType.txt from UnicodeData.txt?
>
> For example, "Nobreak" in DerivedDecompositionType.txt vs "noBreak" in UnicodeData.txt. If the former is derived from the latter, shouldn't the spelling be identical?
>
>


