ISO 14651/14652 vs Unicode sorting
Ilya Zakharevich
nospam-abuse at ilyaz.org
Thu May 28 03:42:25 CDT 2020
I have been informed that according to the tables distributed with ISO
14651/14652, the following strings should be sorted in this order:
> foobar
> foo baz
Moreover, this is how glibc (and, as a consequence, all utilities
built on it) sorts them in European locales on contemporary Linuxes.
I checked COBUILD, American Heritage, and Le Petit Robert II, and it
seems that they do indeed use this (brain damaged?) order. (Although
not, apparently, Le Petit Robert I, which SEEMS TO HAVE compound
words tacked on at the end of the main record.)
However, this definitely contradicts what
https://icu4c-demos-7hxm2n5zgq-uc.a.run.app/icu-bin/collation.html
does with the default locale, and with 'en'.
So which is the intended behavior: that of ICU, or that of ISO?!
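To make the two orders concrete, here is a minimal Python sketch. It
calls neither glibc nor ICU; the two key functions (names are mine,
purely illustrative) only emulate the two conventions, under the
assumption that the difference comes down to whether the space counts
at the first comparison level.

```python
# Illustration only: neither function calls glibc or ICU. They emulate
# two conventions for handling a space when comparing strings.

def letter_by_letter_key(s):
    """Space is ignored at the first level; the raw string breaks ties."""
    return (s.replace(" ", "").casefold(), s)

def word_by_word_key(s):
    """Space is significant and sorts before every letter."""
    return s.casefold()

words = ["foo baz", "foobar"]

by_letter = sorted(words, key=letter_by_letter_key)
by_word = sorted(words, key=word_by_word_key)

print(by_letter)  # ['foobar', 'foo baz']  (the order quoted above)
print(by_word)    # ['foo baz', 'foobar']  (the opposite order)
```

The first result is the order the dictionaries and glibc use; the
second is the opposite one, which is what I see in the ICU demo.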
Thanks,
Ilya