Character folding in text editors

Eli Zaretskii eliz at gnu.org
Sun Feb 21 10:23:04 CST 2016


> From: Philippe Verdy <verdy_p at wanadoo.fr>
> Date: Sun, 21 Feb 2016 00:19:19 +0100
> Cc: unicode Unicode Discussion <unicode at unicode.org>
> 
> It should also be noted that some kind of "folding" described/desired by
> Elias will likely fail his expectations, even when using collation data in
> CLDR tailored per language.

I don't think the issue at hand is how to implement the "ultimate"
character-folding feature.  As I wrote elsewhere in this thread, Emacs
has only made its first step on this long road; if the way to reach
the final goal is still foggy even for the experts, then Emacs is in
good company ;-)

What matters for us at this stage is whether what has been
implemented, however partial and incomplete, will be useful, and
whether it is deemed to be useful enough to be turned on by default.

Please keep in mind that Emacs currently doesn't even have
language-dependent case tables, and its sorting commands use
comparison by Unicode codepoints.  (A function that compares text by
locale-dependent collation rules was added only recently, and, since
it relies on the underlying libc for collation order, you must change
the locale to use the rules for another language.)  So these
capabilities are really only starting to emerge, and until there's a
reliable way of determining the language of a given chunk of text, the
solutions will continue to be clunky at best.  We are not looking for
the ultimate solutions; we are looking for useful evolutionary initial
steps.
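
To illustrate the distinction drawn above between comparison by Unicode
codepoints and comparison by locale-dependent collation rules, here is
a minimal sketch.  It is written in Python rather than Emacs Lisp and
does not show how Emacs implements either behavior.  The word list, the
helper name sorted_by_locale, and the locale name "en_US.UTF-8" are
assumptions made for illustration; the locale may not be installed on a
given system, in which case the helper returns None.

```python
import locale

words = ["Zebra", "apple", "Émile", "eagle"]

# Plain string comparison orders text by Unicode code point, so every
# uppercase ASCII letter sorts before every lowercase one, and "É"
# (U+00C9) sorts after all of ASCII.
by_codepoint = sorted(words)
print(by_codepoint)


def sorted_by_locale(items, locale_name):
    """Sort items using the C library's collation rules for locale_name.

    Returns None if that locale is not available on this system.
    (Hypothetical helper, for illustration only.)
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        # Switching the process locale is what selects the collation
        # rules; there is no per-call language argument here.
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore the old locale


print(sorted_by_locale(words, "en_US.UTF-8"))
```

The codepoint sort always puts "Zebra" first and "Émile" last.  The
result of the locale-based sort depends on the collation data shipped
with the system's libc, and, as with the libc-based approach described
above, choosing the rules of another language means changing the locale
first.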

