Unicode Sets in 'Unicode Regular Expressions'
Phillips, Addison
addison at lab126.com
Tue May 27 17:36:04 CDT 2014
A "Unicode set" in this context means "a set of code points". This is discussed in section 1.2:
--
This is done by providing syntax for sets of characters based on the Unicode character properties, and allowing them to be mixed with lists and ranges of individual code points.
--
More generally, there is no term "Unicode set" defined, although it is referred to in places such as RL1.3 as a shorthand. It merely means "the set of all code points selected" (by whatever selection, subtraction, intersection, or differencing has been applied, beginning from the Universal Character Set as a whole). Or at least this is how I have always read it.
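As a minimal sketch of that reading (an illustration only, not text from UTS #18), the Python below models a "Unicode set" as a plain set of integer code points and applies intersection and subtraction to it. The helper name code_points_with_category is hypothetical, and the standard unicodedata module reflects whichever Unicode version ships with the interpreter, so the exact membership may differ from a given version of the standard.

```python
import sys
import unicodedata


def code_points_with_category(prefix):
    """Return the set of code points whose General_Category starts with prefix.

    Hypothetical helper for illustration; it scans the whole code space.
    """
    return {
        cp
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)).startswith(prefix)
    }


# A property-based set, roughly what \p{L} selects.
letters = code_points_with_category("L")

# A range-based set, roughly what [\u0000-\u007F] selects.
ascii_range = set(range(0x00, 0x80))

# Intersection: letters that are also in the ASCII range.
ascii_letters = letters & ascii_range

# Subtraction: letters that are not in the ASCII range.
non_ascii_letters = letters - ascii_range
```

Under this model every member is a single code point, never a string, so the result of any combination of these operations is again just a set of code points.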
Addison
> -----Original Message-----
> From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Richard
> Wordingham
> Sent: Tuesday, May 27, 2014 3:18 PM
> To: unicode at unicode.org
> Subject: Unicode Sets in 'Unicode Regular Expressions'
>
> UTS#18 'Unicode Regular Expressions' Version 17 Requirement RL1.3
> 'Subtraction and Intersection' talks of Unicode sets. What is the relevant
> definition of a 'Unicode set'? Is it a finite set of non-empty strings? Other
> possibilities that occur to me, depending on context, include sets of codepoints
> and sets of indecomposable codepoints.
>
> Richard.