Annoyances from Implementation of Canonical Equivalence (was: Pure Regular Expression Engines and Literal Clusters)

Richard Wordingham via Unicode unicode at
Tue Oct 15 14:52:15 CDT 2019

On Tue, 15 Oct 2019 09:43:23 +0300
Eli Zaretskii via Unicode <unicode at> wrote:

> > Date: Tue, 15 Oct 2019 00:23:59 +0100
> > From: Richard Wordingham via Unicode <unicode at>
> >   
> > > I'm well aware of the official position.  However, when we
> > > attempted to implement it unconditionally in Emacs, some people
> > > objected, and brought up good reasons.  You can, of course, elect
> > > to disregard this experience, and instead learn it from your
> > > own.  
> > 
> > Is there a good record of these complaints anywhere?  
> You could look up these discussions:

These are complaints about primary-level searches, not canonical

> > (It would occasionally be useful to have an easily issued command
> > like 'delete preceding NFD codepoint'.)  
> I agree.  Emacs commands that delete characters backward (usually
> invoked by the Backspace key) do that automatically, if the text
> before cursor was produced by composing several codepoints.

That's pretty standard, though it looks as though GTK has chosen to
reject the principle that backwards deletion deletes the last character
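
For concreteness, here is a minimal Python sketch of what a 'delete
preceding NFD codepoint' command might do. It is an illustration under
stated assumptions, not how Emacs or GTK implement backward deletion.
The function name is hypothetical. It assumes the command should act on
the last starter plus its trailing combining marks, and that the
shortened tail should be recomposed to NFC afterwards.

```python
import unicodedata


def delete_preceding_nfd_codepoint(text: str) -> str:
    """Remove the last code point of the NFD form of the text's final
    combining sequence, leaving the earlier text untouched.

    Hypothetical helper: the name and the choice to recompose the tail
    to NFC are assumptions made for this sketch.
    """
    # Walk back over trailing combining marks (non-zero combining class).
    i = len(text)
    while i > 0 and unicodedata.combining(text[i - 1]) != 0:
        i -= 1
    # Include the starter that those marks attach to, if there is one.
    if i > 0:
        i -= 1

    head, tail = text[:i], text[i:]
    # Decompose only the tail, drop its last code point, then recompose.
    decomposed = unicodedata.normalize("NFD", tail)
    return head + unicodedata.normalize("NFC", decomposed[:-1])
```

With this sketch, applying the function to U+1EBF (e with circumflex
and acute) leaves U+00EA (e with circumflex), and applying it to
<a, U+0301, U+0323> removes the acute rather than the dot below,
because canonical ordering places the acute last in NFD.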

> Sure.  There's an Emacs command (C-u C-x =) which shows that
> information for the text at a given position.

Or the commands what-cursor-position and describe-char, if an emulator
gets in the way.  Having forward-char-intrusive would make it perfect.

