UCA question / Produce Collation Element Arrays
Kip Cole via CLDR-Users
cldr-users at unicode.org
Sat Dec 2 05:52:43 CST 2017
Markus, probably another dumb question, but I’m making progress. In section 7.2 of TR10, the algorithm for producing a CE array says:
S2.1 Find the longest initial substring S at each point that has a match in the collation element table.
S2.1.1 If there are any non-starters following S, process each non-starter C.
S2.1.2 If C is an unblocked non-starter with respect to S, find if S + C has a match in the collation element table.
Note: This condition is specific to non-starters, and is not precisely the same as the concept of blocking in normalization, since it is dealing with look ahead for a discontiguous match, rather than with normalization forms. Hangul jamos and other starters are only supported with contiguous matches.
S2.1.3 If there is a match, replace S by S + C, and remove C.
For S2.1.1, I’m trying to confirm what “process each non-starter C” means. As best I understand so far, it means “ignore” or “skip” every C that is a non-starter. Is that the correct interpretation?
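To make the question concrete, here is a minimal Python sketch of the other possible reading, in which "process" means "apply S2.1.2 and S2.1.3 to each non-starter in turn" rather than "skip". It is only an illustration under stated assumptions: the input is assumed to be already in NFD, and the names TABLE, longest_match and match_discontiguous, along with the placeholder table values, are hypothetical and not taken from TR10 or any library.

```python
import unicodedata

# Hypothetical collation element table; the values are placeholders, not real weights.
TABLE = {
    "a": ["CE(a)"],
    "a\u030A": ["CE(a-ring)"],  # a + COMBINING RING ABOVE (ccc 230)
}


def longest_match(s, i, table):
    """S2.1: longest contiguous substring starting at s[i] that is a table key."""
    max_len = max(map(len, table), default=1)
    for end in range(min(len(s), i + max_len), i, -1):
        if s[i:end] in table:
            return s[i:end], end
    return s[i], i + 1  # no entry at all: fall back to the single character


def match_discontiguous(s, i, table):
    """S2.1.1 to S2.1.3 under the 'examine each non-starter' reading.

    Returns (S, rest): the matched key and the text still to be processed.
    """
    S, j = longest_match(s, i, table)
    skipped = []         # non-starters examined but not absorbed into S
    max_skipped_ccc = 0  # highest ccc among the characters left in between
    k = j
    while k < len(s):
        c = s[k]
        ccc = unicodedata.combining(c)
        if ccc == 0:
            break  # a starter ends the run of non-starters (S2.1.1)
        # S2.1.2: C is unblocked if no character left between S and C
        # has ccc == 0 or ccc >= ccc(C).
        unblocked = ccc > max_skipped_ccc
        if unblocked and S + c in table:
            S += c  # S2.1.3: replace S by S + C and remove C
        else:
            skipped.append(c)
            max_skipped_ccc = max(max_skipped_ccc, ccc)
        k += 1
    return S, "".join(skipped) + s[k:]
```

With this sketch, match_discontiguous("a\u0323\u030A", 0, TABLE) yields ("a\u030A", "\u0323"): the dot below (ccc 220) is examined and left in place, and the ring above (ccc 230) is then absorbed into S. For "a\u0301\u030A" the acute accent (ccc 230) blocks the ring, so S stays "a". Under the "skip" reading, neither case would ever extend S, which is the difference I am asking about.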