Regular Expressions and Canonical Equivalence

Philippe Verdy verdy_p at
Sat May 16 11:29:18 CDT 2015

2015-05-16 17:02 GMT+02:00 Richard Wordingham <
richard.wordingham at>:

> There is an annoying error.  You appear to assume that U+0302 COMBINING
> CIRCUMFLEX ACCENT and U+0303 COMBINING TILDE commute, but they don't;
> they have the same combining class, namely 230.  I'm going to assume
> that 0303 is a typo for 0323.

Not a typo, and I did not make the assumption you suppose: I chose them
precisely because they have the **same** combining class, so
that they do not commute.
That is the key fact of my argument, and it defeats yours.
Reread carefully, use the example string I gave, and don't assume I
wanted to write U+0323 instead of U+0303.
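
The combining-class point can be checked quickly. The following is only a
minimal sketch, assuming Python's standard unicodedata module; the base letter
"a" and the helper canonically_equivalent are chosen purely for illustration
and are not the example string from the earlier message.

```python
import unicodedata

CIRCUMFLEX = "\u0302"  # COMBINING CIRCUMFLEX ACCENT
TILDE = "\u0303"       # COMBINING TILDE
DOT_BELOW = "\u0323"   # COMBINING DOT BELOW

def canonically_equivalent(a, b):
    """Illustrative helper: two strings are canonically equivalent
    when their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Canonical combining classes as reported by the Unicode Character Database.
ccc = {mark: unicodedata.combining(mark) for mark in (CIRCUMFLEX, TILDE, DOT_BELOW)}

# Same class: canonical reordering never swaps these two marks,
# so the two orders are distinct under canonical equivalence.
swap_0302_0303 = canonically_equivalent("a" + CIRCUMFLEX + TILDE,
                                        "a" + TILDE + CIRCUMFLEX)

# Different classes: canonical reordering sorts the lower class first,
# so both orders normalize to the same sequence.
swap_0302_0323 = canonically_equivalent("a" + CIRCUMFLEX + DOT_BELOW,
                                        "a" + DOT_BELOW + CIRCUMFLEX)

print(ccc, swap_0302_0303, swap_0302_0323)
```

Under these assumptions, swapping U+0302 and U+0303 changes the string, while
swapping U+0302 and U+0323 does not.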

And you'll see that backtracking is necessary for this case (EVEN if you
don't care about capture groups and are only interested in the global
capture $0).