Non-primary Weights of U+FFFE

Sun Mar 30 11:17:44 CDT 2014

On Sun, Mar 30, 2014 at 5:24 AM, Richard Wordingham <richard.wordingham at ntlworld.com> wrote:

> Is there any reason that a CLDR-compliant collation algorithm should
> particularly care about the non-primary weights of U+FFFE?  So long as
> they satisfy the well-formedness conditions, all I can see is that
> having unique values *may* simplify sort key formation for reversed
> levels.
>

The non-primary weights of U+FFFE need to be greater than the level
separator(s) and less than the weights of CEs that are ignorable on the
preceding levels. It is also important to generate the special weights on
the primary through tertiary levels for shifted CEs, so that
alternate=shifted works properly.

In ICU, we have test code that expects the sort key generated from two
strings concatenated with U+FFFE to be identical to the result of calling
ucol_mergeSortkeys() on the two separate sort keys. The latter merges sort
keys by copying each level (separated by byte 01) from each sort key and
inserting a byte 02 between the bytes from different sort keys (see ucol.h:
<http://www.icu-project.org/apiref/icu4c/ucol_8h.html>).
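
As a rough illustration of the level-by-level merge described above, here
is a minimal Python sketch. It is not ICU's implementation: the names
split_levels and merge_sort_keys are hypothetical, and the sketch assumes
that each sort key ends in a 00 terminator byte and that both keys have the
same number of levels.

```python
LEVEL_SEPARATOR = 0x01  # separates levels within one sort key
MERGE_SEPARATOR = 0x02  # inserted between bytes from different sort keys
TERMINATOR = 0x00       # assumed trailing byte of each sort key


def split_levels(sort_key):
    """Split a sort key into its per-level byte strings (hypothetical helper)."""
    # Drop the assumed 00 terminator before splitting on the level separator.
    body = sort_key[:-1] if sort_key.endswith(bytes([TERMINATOR])) else sort_key
    return body.split(bytes([LEVEL_SEPARATOR]))


def merge_sort_keys(key1, key2):
    """Merge two sort keys level by level, with 02 between their bytes."""
    levels1 = split_levels(key1)
    levels2 = split_levels(key2)
    # Simplifying assumption of this sketch: equal level counts only.
    if len(levels1) != len(levels2):
        raise ValueError("sketch only handles keys with the same level count")
    merged_levels = [
        a + bytes([MERGE_SEPARATOR]) + b for a, b in zip(levels1, levels2)
    ]
    return bytes([LEVEL_SEPARATOR]).join(merged_levels) + bytes([TERMINATOR])


# Made-up three-level keys, for illustration only (not real ICU output).
key_a = bytes([0x2A, 0x01, 0x05, 0x01, 0x05, 0x00])
key_b = bytes([0x2C, 0x01, 0x05, 0x01, 0x05, 0x00])
merged = merge_sort_keys(key_a, key_b)
```

With these made-up keys, each level of the merged key holds the bytes from
key_a, then a 02, then the bytes from key_b. That is the shape the test
expects from a sort key for the two strings joined with U+FFFE.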

markus
-- 
Google Internationalization Engineering