UCA question / Produce Collation Element Arrays

Mon Dec 4 05:59:21 CST 2017

On Sun, 3 Dec 2017 19:23:51 +0000
Richard Wordingham via CLDR-Users <cldr-users at unicode.org> wrote:

> But adding the fix does not preserve the order of all strings in
> the Tibetan script, only the order of linguistically plausible
> strings.
> To create a well-formed collation equivalent to DUCET, one has to add
> many more contractions - about 650 by my reckoning.

I've checked my calculations, and it's actually about 970 NFD entries.
They are:

CE(0FB2 x)           = CE(0FB2) CE(x)
CE(0FB2 x 0F80)      = CE(0FB2 0F80) CE(x)
CE(0FB2 x 0F71 0F80) = CE(0FB2 0F71 0F80) CE(x)

CE(0FB3 x)           = CE(0FB3) CE(x)
CE(0FB3 x 0F80)      = CE(0FB3 0F80) CE(x)
CE(0FB3 x 0F71 0F80) = CE(0FB3 0F71 0F80) CE(x)

for every x with ccc(x) < ccc(0F71), i.e. ccc(x) < 129.

The first set (the two-character entries) undoes the changes wrought by
adding the contraction CE(0FB2 0F71) for the sake of WF5.  The second and
third sets (the entries ending in 0F80 and in 0F71 0F80) undo the changes
wrought by the first set.
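
For anyone wanting to reproduce the list, here is a rough Python sketch of
the enumeration. It is not the script used for the figure above. The
helper names (candidate_marks, entry_patterns, format_entry) are made up
for illustration, and restricting x to non-starters (ccc(x) > 0) is an
assumption on the grounds that only non-starters can be skipped in
discontiguous matching. The total will also depend on the Unicode version
behind Python's unicodedata module, so it need not come to exactly the
same count.

```python
import unicodedata

# ccc(U+0F71 TIBETAN VOWEL SIGN AA); the condition above is ccc(x) < this.
CCC_0F71 = unicodedata.combining("\u0f71")


def candidate_marks(limit=CCC_0F71):
    """Code points x with 0 < ccc(x) < limit.

    Excluding ccc(x) == 0 is an assumption of this sketch, not something
    stated in the condition above.
    """
    return [cp for cp in range(0x110000)
            if 0 < unicodedata.combining(chr(cp)) < limit]


def entry_patterns(x):
    """Return the six (lhs, rhs) entries for one mark x.

    lhs is a tuple of code points; rhs is a list of such tuples, one per
    CE(...) term on the right-hand side.
    """
    rows = []
    for head in (0x0FB2, 0x0FB3):
        rows.append(((head, x), [(head,), (x,)]))
        rows.append(((head, x, 0x0F80), [(head, 0x0F80), (x,)]))
        rows.append(((head, x, 0x0F71, 0x0F80),
                     [(head, 0x0F71, 0x0F80), (x,)]))
    return rows


def _hex(seq):
    return " ".join(f"{cp:04X}" for cp in seq)


def format_entry(lhs, rhs):
    """Render one entry in the CE(...) = CE(...) CE(...) notation."""
    return f"CE({_hex(lhs)}) = " + " ".join(f"CE({_hex(p)})" for p in rhs)


if __name__ == "__main__":
    marks = candidate_marks()
    for x in marks:
        for lhs, rhs in entry_patterns(x):
            print(format_entry(lhs, rhs))
    print(f"# {len(marks)} marks, {6 * len(marks)} entries")
```

This only generates the patterns; the actual collation elements on the
right-hand side would still have to be looked up in DUCET (allkeys.txt).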

Richard.