Character folding in text editors

Elias Mårtenson lokedhs at gmail.com
Sun Feb 21 00:35:29 CST 2016


On 21 February 2016 at 06:10, Asmus Freytag (t) <asmus-inc at ix.netcom.com>
wrote:

> Unicode, even CLDR, doesn't have nearly enough data for the purpose.
> (and as a corollary of what Elias points out, it's likely to annoy users
> of every language, in that it would fold essential and non-essential
> distinctions indiscriminately).
>
> I've been working on this problem in the context of international
> top-level domain names, where the aim of the project is to identify labels
> that are seen as "the same" by users of a given script (but, in cases of
> identical appearance, we also include those seen as identical by users
> across scripts).
>
> None of the working groups in this project has felt like turning to CLDR
> for this purpose, and so far, each has approached the issue in a way that
> is not linked to sorting.
>
> Finally, none has seen folding of diacritics as useful; in the case of
> Arabic, however, optional combining marks are simply not supported (so
> as to avoid having to define a folding).
>
> (see
> https://www.icann.org/sites/default/files/lgr/lgr-1-arabic-script-01dec15-en.html
> )
>

Thank you, and everybody else who contributed information. This was very
useful to me.

I have never actually looked at the CLDR in detail, and I now realise that
I have some reading to do. We will see where this goes on the Emacs-devel
list.
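
For illustration, the concern quoted above about folding essential and
non-essential distinctions indiscriminately can be seen in a minimal sketch.
It assumes a naive approach that decomposes text to NFD and drops every
combining mark; the name naive_fold is hypothetical and is not anything
taken from Emacs, Unicode or CLDR:

```python
import unicodedata


def naive_fold(text: str) -> str:
    """Hypothetical naive folding: decompose to NFD, drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# An arguably non-essential distinction for an English reader is folded...
print(naive_fold("résumé"))     # resume
# ...but so is a distinction that is essential in Swedish (å is its own letter).
print(naive_fold("Mårtenson"))  # Martenson
# Letters without a canonical decomposition are left alone, so the result
# is not even consistent across similar-looking letters.
print(naive_fold("ø"))          # ø
```

The same rule is applied to every language at once, which is why such a
folding is likely to annoy users regardless of which language they write.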

Regards,
Elias