Unicode Regular Expressions, Surrogate Points and UTF-8
markus.icu at gmail.com
Sat May 31 21:28:27 CDT 2014
On Sat, May 31, 2014 at 1:59 AM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:
> Bear in mind that a pattern \uD808 shall not match anything in a
> well-formed Unicode string.
Depends. See the definitions of Unicode strings vs. UTF strings.
\uD808\uDF45 specifies a sequence of two
Implementations that use Unicode 16-bit strings will usually treat this as
one supplementary code point.
In Java, there is no other way to escape one.
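
To illustrate the point about 16-bit strings, here is a minimal sketch of how a Java string literal treats the two escapes. The class and method names (SurrogatePairDemo, codeUnits, codePoints) are made up for this example; only the java.lang.String and java.lang.Character calls are standard API. It covers string literals only, not regex pattern syntax.

```java
public class SurrogatePairDemo {
    // Hypothetical helper: number of 16-bit code units (Java chars).
    static int codeUnits(String s) {
        return s.length();
    }

    // Hypothetical helper: number of Unicode code points.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Two \u escapes: a lead surrogate followed by a trail surrogate.
        String s = "\uD808\uDF45";

        System.out.println(codeUnits(s));   // 2 code units
        System.out.println(codePoints(s));  // 1 code point
        // The pair decodes to the supplementary code point U+12345.
        System.out.println(Integer.toHexString(s.codePointAt(0))); // 12345

        // Building the same string from the code point gives the same two chars.
        String fromCodePoint = new String(Character.toChars(0x12345));
        System.out.println(s.equals(fromCodePoint)); // true
    }
}
```

So the two escapes are stored as two code units, but the code-point-aware methods report a single supplementary code point.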