Character folding in text editors
doug at ewellic.org
Sat Feb 20 15:43:15 CST 2016
Eli Zaretskii wrote:
> What about language-independent character-folding: where in the
> Unicode database is the data for that?
The OP kind of alluded to that: there is really no such thing as
language-independent character folding.
About the closest approximation you can get using Unicode data alone
(not CLDR) is to normalize to NFD, then ignore the combining diacritics.
But that still doesn't work for a character like ø, which doesn't
decompose to o + anything, and more importantly, it still won't meet
expectations because of the n/ñ and o/ö/ø language-dependency problems.
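To make the NFD approach concrete, here is a minimal sketch in Python using only the standard unicodedata module. The function name strip_diacritics is made up for illustration, and treating "combining diacritics" as "general category Mn" is my assumption, not something the Unicode data mandates for folding.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop nonspacing marks.

    Assumption: "combining diacritics" is taken to mean characters with
    general category Mn. This is an approximation, not a folding standard.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if unicodedata.category(ch) != "Mn")

# ñ and ö have canonical decompositions, so the marks can be dropped.
print(strip_diacritics("ñ ö"))   # n o
# ø (U+00F8) has no decomposition, so it passes through unchanged.
print(strip_diacritics("ø"))     # ø
```

Note that this folds ñ to n and ö to o for every user, which is exactly the behavior that is wrong for languages where those are distinct letters.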
As Mark and Philippe said, the real solution is to use CLDR, because
that is where language-dependent information like this lives.
Doug Ewell | http://ewellic.org | Thornton, CO