UAX44: loose matching of symbolic values and the `is` prefix

Doug Ewell doug at
Mon Jun 6 10:32:19 CDT 2016

Mathias Bynens wrote:

> Looking at implementations in the wild, Steven Levithan found
> that some regex flavors use `Is` for scripts, some for blocks, some
> for scripts and blocks, some for neither. Since some script and block
> names collide, this causes problems, especially when porting regexes
> across flavors. 

Are script names and block names expected to share a common namespace?
If they don't, then there is no collision.

LM3 says to ignore initial (and non-final) "is" for all property aliases
and property value aliases, not just Script and Block values. There will
be a lot of "collisions" if you take all of those into consideration.
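
To make the point concrete, here is a minimal Python sketch of loose matching along the lines of LM3 as described above. It is an illustration under stated assumptions, not a reference implementation: the helper names `loose_key` and `lookup` are hypothetical, the two value sets are tiny hand-picked subsets rather than real Unicode data files, and a bare "is" is assumed to be left alone (my reading of "non-final").

```python
import re


def loose_key(name: str) -> str:
    """Reduce a property or value alias to a loose-matching key (sketch)."""
    # Ignore case, whitespace, underscores, and hyphens.
    key = re.sub(r"[\s_\-]+", "", name).lower()
    # Ignore an initial "is", but only when something follows it
    # (assumption: a string that is just "is" is kept as-is).
    if key.startswith("is") and len(key) > 2:
        key = key[2:]
    return key


# Illustrative subsets only; each set stands for a separate namespace.
SCRIPTS = {"Thai", "Greek"}
BLOCKS = {"Thai", "Basic_Latin"}


def lookup(name: str, namespace: set) -> list:
    """Return the values in one namespace that loosely match `name`."""
    wanted = loose_key(name)
    return [value for value in sorted(namespace) if loose_key(value) == wanted]


if __name__ == "__main__":
    print(loose_key("Is_Greek"))      # same key as "Greek"
    print(lookup("IsThai", SCRIPTS))  # resolved within Script values
    print(lookup("IsThai", BLOCKS))   # resolved within Block values
```

In this sketch "IsThai" resolves cleanly when the caller says which namespace to search. It becomes ambiguous only if Script and Block values are pooled into a single lookup table.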

> The `is` prefix doesn’t provide any functionality that would otherwise
> be unavailable. It doesn’t add any value, yet causes incompatibility,
> author confusion, and it increases implementation complexity.

I don't see any evidence that it adds no value. Compatibility with
existing implementations is itself a value.

Doug Ewell | Thornton, CO
