Sorting notation

Richard Wordingham richard.wordingham at
Mon Feb 24 13:38:21 CST 2014

On Sun, 23 Feb 2014 23:13:53 +0100
Philippe Verdy <verdy_p at> wrote:

> 2014-02-23 22:32 GMT+01:00 Richard Wordingham <
> richard.wordingham at>:
> > On Sun, 23 Feb 2014 20:49:24 +0100
> > Philippe Verdy <verdy_p at> wrote:
> >
> > At least, referring to Version 24 of the LDML specification, I assume
> > Part 5 Section 3.5, which defines "&..<<", also applies to Section
> > 3.9, which purports to define the meaning of "&[before 2]..<<".
> > It's conceivable that I am wrong, and the meaning of "&[before 2]á
> > << ạ" is undefined.

> This looks like a cryptic notation anyway. If we assume that there's
> an implicit reset at start of a collation rule, and that collation
> does not define any relative order for the empty string, you could
> simply write this reset at level 2 as:
>   << á << ạ
> instead of the mysterious notation (and in fact verbose and probably
> inconsistent in the way the same level 2 is further used with "<<"):
>   &[before 2]á << ạ

My understanding of the meaning of the notation is that:

1) ạ is to have the same number and type of collation elements as á
currently has;
2) The last collation element of ạ that has a positive weight at level
2 is to be immediately before the corresponding collation element of
á at the secondary level;
3) No collation element is to be ordered between these two collation
elements; and
4) Their other collation elements are to be the same.

Thus, before the operation we have a̓ << á << à << ạ.  After it, we
have a̓ << ạ << á << à.  Is this really what your notation "<< á << ạ"
is intended to mean?  If we are looking for a brief notation, I think
"&á >> ạ" would be better.
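
A minimal sketch of this reading of "&[before 2]á << ạ" may make the four
points concrete. It is only a toy model: the (primary, secondary, tertiary)
weights below are invented for illustration and are not DUCET or CLDR values,
and the helper names (tailor_before_secondary, sort_key, ordering) are
hypothetical, not part of any collation library. Fractions are used so that a
new weight can always be placed strictly between two existing ones.

```python
from fractions import Fraction

# Hypothetical collation elements: (primary, secondary, tertiary).
# All weights are invented for this sketch; they are not real DUCET values.
A = (0x29, 0x20, 0x02)                  # base letter 'a'
table = {
    "a\u0313": [A, (0, 0x22, 0x02)],    # a + comma above
    "\u00e1":  [A, (0, 0x24, 0x02)],    # a + acute
    "\u00e0":  [A, (0, 0x25, 0x02)],    # a + grave
    "\u1ea1":  [A, (0, 0x2A, 0x02)],    # a + dot below
}

def tailor_before_secondary(table, anchor, moved):
    """Return a copy of `table` in which `moved` sorts just before `anchor` at level 2."""
    # Points 1 and 4: same number and type of elements, otherwise identical.
    ces = list(table[anchor])
    # Secondary weights used by every string other than the one being moved.
    used = {s for key, elems in table.items() if key != moved for (_, s, _) in elems}
    # Point 2: find the last element with a positive secondary weight.
    for i in range(len(ces) - 1, -1, -1):
        p, s, t = ces[i]
        if s > 0:
            # Point 3: pick a weight strictly between the next lower weight
            # in use and the anchor's weight, so nothing falls between them.
            below = max((w for w in used if w < s), default=0)
            ces[i] = (p, Fraction(below + s) / 2, t)
            break
    else:
        raise ValueError("anchor has no element with a positive secondary weight")
    new_table = dict(table)
    new_table[moved] = ces
    return new_table

def sort_key(elems):
    # Two-level key: non-zero primaries first, then non-zero secondaries.
    return ([p for p, _, _ in elems if p], [s for _, s, _ in elems if s])

def ordering(table):
    return sorted(table, key=lambda k: sort_key(table[k]))

before = ordering(table)
after = ordering(tailor_before_secondary(table, "\u00e1", "\u1ea1"))
print(" << ".join(before))
print(" << ".join(after))
```

Under these assumed weights the first line printed corresponds to the order
before the operation and the second to the order after it.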

