Additional decompositions in decomps.txt

Eli Zaretskii eliz at gnu.org
Mon Feb 22 13:10:54 CST 2016


> Cc: unicode at unicode.org
> From: Ken Whistler <kenwhistler at att.net>
> Date: Mon, 22 Feb 2016 10:10:35 -0800
> 
> You're not missing anything. This is a bug in the documentation of
> decomps.txt. Initially, added decompositions for the DUCET default
> weights were all tagged as <sort>. This results in a distinct *tertiary*
> weight in the initial collation weight values in DUCET. Later on,
> there turned up cases where an added decomposition for the DUCET
> input worked better *without* a distinct tertiary weight. In
> particular, this applies to the large collection of combining marks
> whose secondary weights are now collapsed into a smaller set of
> distinct values. It also applies to the o with stroke character you
> cite below. The documentation for decomps.txt just needs to be
> updated to reflect that new pattern.

OK, thanks.  So conceptually, those additional decompositions are all
in the same class as those tagged "<sort>", in that they don't
originate from the UCD but were added for collation purposes.  Is that
correct?

