Annoyances from Implementation of Canonical Equivalence (was: Pure Regular Expression Engines and Literal Clusters)

Richard Wordingham via Unicode unicode at
Mon Oct 14 18:23:59 CDT 2019

On Mon, 14 Oct 2019 21:41:19 +0300
Eli Zaretskii via Unicode <unicode at> wrote:

> > Date: Mon, 14 Oct 2019 19:29:39 +0100
> > From: Richard Wordingham via Unicode <unicode at>

> > The official position is that text that is canonically
> > equivalent is the same.  There are problem areas where traditional
> > modes of expression require that canonically equivalent text be
> > treated differently.  For these, it is useful to have tools that
> > treat them differently.  However, the normal presumption should be
> > that canonically equivalent text is the same.  

> I'm well aware of the official position.  However, when we attempted
> to implement it unconditionally in Emacs, some people objected, and
> brought up good reasons.  You can, of course, elect to disregard this
> experience, and instead learn it from your own.

Is there a good record of these complaints anywhere?  It is annoying
when a text entry function does not keep the text as one enters it, but
it would be interesting to know what the other complaints were.  (It
would occasionally be useful to have an easily issued command like
'delete preceding NFD codepoint'.)  I did mention above that
occasionally one needs to know what codepoints were used and in what
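
As an illustration of the 'delete preceding NFD codepoint' idea mentioned above, here is a minimal Python sketch. It is not taken from Emacs or any other editor. The function name and the cursor convention (an index into the string, counted in code points) are assumptions made for this example. It decomposes only the single code point immediately before the cursor and leaves the rest of the text exactly as entered. It does not attempt canonical reordering across a run of preceding combining marks, so it would not always agree with deleting the last code point of the fully normalised text.

```python
import unicodedata
from typing import Tuple


def delete_preceding_nfd_codepoint(text: str, cursor: int) -> Tuple[str, int]:
    """Delete one NFD code point before `cursor`.

    Hypothetical helper: only the code point just before the cursor is
    decomposed; all other text is kept byte-for-byte as the user typed it.
    Returns the new text and the new cursor position.
    """
    if cursor <= 0:
        return text, 0

    before = text[:cursor - 1]
    last = text[cursor - 1]
    after = text[cursor:]

    # Canonical decomposition of the one code point, e.g. U+00E9 -> U+0065 U+0301.
    decomposed = unicodedata.normalize("NFD", last)

    # Drop the final code point of that decomposition and keep the remainder.
    kept = decomposed[:-1]
    return before + kept + after, len(before) + len(kept)


if __name__ == "__main__":
    # U+00E9 and U+0065 U+0301 are canonically equivalent spellings.
    print(unicodedata.normalize("NFD", "\u00e9") == "e\u0301")
    print(delete_preceding_nfd_codepoint("caf\u00e9", 4))
```

Under these assumptions, invoking the command after a precomposed U+00E9 removes only the acute accent and leaves the base letter, whereas an ordinary backspace would remove both.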

