Compatibility decomposition for Hebrew and Greek final letters
eliz at gnu.org
Fri Feb 20 09:28:36 CST 2015
> Date: Fri, 20 Feb 2015 15:01:34 +0000
> From: Richard Wordingham <richard.wordingham at ntlworld.com>
> > Sorry, I don't think I follow: what is "processing for search orders"
> > to which you allude here?
> The examples in the CLDR root locale and in DUCET are the massive sets
> of 'contractions' of consonants with vowels written before the
> associated consonant in the scripts where spacing characters are stored
> in the order written, namely Thai, Lao, Tai Viet and, soon, New Tai
> Lue. When customised collations are applied, there are enormous sets
> for Burmese (in CLDR) and New Tai Lue (not published in CLDR). The
> latter two have 'logical order exception' final consonants. (The
> exception here is that the logical order of characters in a word is not
> the order one wants for sorting.)
OK, thanks for explaining that. Still, the DUCET data is not
> > I'm not talking about localized features, like for "å" to match "aa"
> > in Danish locales. I'm talking about matching strings that are
> > equivalent under canonical and compatibility decompositions.
> Nor was I. I was talking about the user interface - commands, menus
> and messages.
Ah, that's easy (for now): Emacs doesn't have a localized UI.
Everything in the UI is in US English. So this would be Someone
> > As for user sophistication, AFAIR, Microsoft Word finds "²" when you
> > search for "2" by default, so it sounds like Word considers all users
> > sophisticated enough for that. I think that's a solid enough
> > precedent to follow.
> But what switches the match off?
I'm not sure there _is_ a switch in Word. But my point is different:
the above example means an editor should have the capability of
matching such strings; whether it can or cannot be switched off is a
separate issue (in Emacs, I don't imagine users will settle for not
being able to switch it off and on as they see fit).
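
To make the capability concrete, here is a minimal sketch in Python (not Emacs Lisp, and not how Emacs or Word actually implement it). It assumes that "matching" means comparing both strings after decomposition, and that the on/off switch is a plain boolean. The names `fold`, `find` and `compat` are made up for illustration; only `unicodedata.normalize` is a real library call.

```python
import unicodedata


def fold(text, compat=True):
    """Decompose text: NFKD when compat is True, otherwise NFD only."""
    form = "NFKD" if compat else "NFD"
    return unicodedata.normalize(form, text)


def find(needle, haystack, compat=True):
    """Return True if needle occurs in haystack after both are decomposed.

    compat is the hypothetical user switch: when False, only canonical
    equivalence is honoured, so "2" no longer matches SUPERSCRIPT TWO.
    """
    return fold(needle, compat) in fold(haystack, compat)


if __name__ == "__main__":
    print(find("2", "x\u00b2"))                # compatibility match switched on
    print(find("2", "x\u00b2", compat=False))  # compatibility match switched off
    print(find("\u00e5", "a\u030a", compat=False))  # canonical equivalence only
```

This only answers yes or no. A real search would also have to map the match back to positions in the original text, and a plain substring test on decomposed text lets "a" match inside "å", which may or may not be what the user wants.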