String Ranges in Unicode Sets

Mark Davis ☕️ mark at
Tue Sep 8 06:46:48 CDT 2015

On Tue, Sep 8, 2015 at 9:53 AM, Asmus Freytag (t) <asmus-inc at> wrote:

> it is implied the String Range formulation is a compact form.
> Can you prove that it doesn't create any set of strings that can't be
> specified in other ways (other than full enumeration of the strings?).

It is simply a compact string representation, and is defined semantically
by what it expands to, just like character ranges such as [a-z]. Of course,
the underlying implementation *could* differ, but that doesn't affect the
semantics.

> What about set operations on sets with string ranges?

Again, the range notation is just a formatting issue. Anything you can do
with [{ax}-{bz}] you can also do with [{ax}{ay}{az}{bx}{by}{bz}], and
vice versa, since the former is defined to be equivalent to the latter.
These are just string representations of the same *logical* underlying set.
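
As a rough illustration of "defined by what it expands to", here is a
minimal Python sketch. It assumes both endpoints have the same number of
code points and that each position ranges independently, which is what the
[{ax}-{bz}] example implies; the name expand_string_range is made up for
this sketch and is not part of any library.

```python
from itertools import product


def expand_string_range(start: str, end: str) -> list[str]:
    """Expand a string range such as {ax}-{bz} into its member strings.

    Assumption for this sketch: both endpoints have the same length, and
    position i runs over the code points from start[i] to end[i] inclusive.
    """
    if len(start) != len(end):
        raise ValueError("this sketch only handles endpoints of equal length")

    per_position = []
    for lo, hi in zip(start, end):
        if ord(lo) > ord(hi):
            raise ValueError(f"descending range at {lo!r}-{hi!r}")
        # All code points allowed at this position.
        per_position.append([chr(cp) for cp in range(ord(lo), ord(hi) + 1)])

    # Cartesian product of the per-position choices, joined back into strings.
    return ["".join(chars) for chars in product(*per_position)]


range_form = set(expand_string_range("ax", "bz"))
enumerated = {"ax", "ay", "az", "bx", "by", "bz"}

# Both notations denote the same logical set.
assert range_form == enumerated

# Set operations work on the expanded set; the notation plays no role.
assert range_form - {"ay", "by"} == {"ax", "az", "bx", "bz"}
```

Once expanded, the result is an ordinary set of strings, so union,
intersection, and difference need no special handling for ranges.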

> Can they be expressed (other than working them out and writing down the
> full enumeration of the resulting set)?

I'm not quite sure what you mean. That's like asking, "Can [a-z] be
expressed, other than by writing out the full enumeration [a b c d e ...
z]?". Well, yes. You could represent [a-z] in many ways:
[\p{ASCII}&\p{Ll}], for example. Or [\u0061 \u0062 ...]. Or....
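
The equivalence between [a-z] and "ASCII intersected with lowercase
letters" can be checked mechanically. The sketch below uses Python's
standard unicodedata module and treats "lowercase letter" as
General_Category Ll; the variable names are only illustrative.

```python
import unicodedata

# [a-z] written as an explicit code point range.
a_to_z = {chr(cp) for cp in range(ord("a"), ord("z") + 1)}

# The ASCII block (U+0000..U+007F) intersected with General_Category Ll.
ascii_lowercase = {
    chr(cp) for cp in range(0x80) if unicodedata.category(chr(cp)) == "Ll"
}

# Two different descriptions, one logical set.
assert a_to_z == ascii_lowercase
```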

But I'm probably misunderstanding what you are trying to say.

Mark

*— Il meglio è l’inimico del bene —*