Unicode Regular Expressions, Surrogate Points and UTF-8

Markus Scherer markus.icu at gmail.com
Sat May 31 21:24:09 CDT 2014


On Sat, May 31, 2014 at 6:41 AM, Mark Davis ☕️ <mark at macchiato.com> wrote:

> I think you have a point here. We should probably change to:
>
> To meet this requirement, an implementation shall supply a mechanism for
> specifying any Unicode scalar value (from U+0000 to U+D7FF and U+E000 to
> U+10FFFF), using the hexadecimal code point representation.
>
> and then in the notes say that the same notation can be used for
> code points that are not scalar values, for implementations that handle
> them in Unicode strings.
>

This combination sounds good.
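
To make the distinction concrete, here is a rough sketch in Python of how
an implementation might check the two ranges. The \u{...} escape syntax and
the names is_scalar_value, is_code_point and parse_hex_escape are
assumptions made for illustration only; they are not taken from the
proposed text or from any particular regex engine.

```python
import re

# Hypothetical hex notation of the form \u{H...H} (1 to 6 hex digits).
# The syntax is an assumption for this sketch, not a quoted requirement.
_HEX_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")


def is_code_point(cp: int) -> bool:
    """True for any Unicode code point, surrogates included."""
    return 0x0000 <= cp <= 0x10FFFF


def is_scalar_value(cp: int) -> bool:
    """True for U+0000..U+D7FF and U+E000..U+10FFFF (no surrogates)."""
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


def parse_hex_escape(text: str, allow_surrogates: bool = False) -> int:
    """Parse one \\u{...} escape and return the code point it names.

    Scalar values are always accepted. Surrogate code points are accepted
    only when allow_surrogates is True, modelling an implementation that
    handles them in Unicode strings.
    """
    match = _HEX_ESCAPE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a hex escape: {text!r}")
    cp = int(match.group(1), 16)
    if not is_code_point(cp):
        raise ValueError(f"beyond U+10FFFF: {text!r}")
    if not allow_surrogates and not is_scalar_value(cp):
        raise ValueError(f"surrogate code point not allowed: {text!r}")
    return cp
```

With the default arguments only scalar values get through; passing
allow_surrogates=True corresponds to the case described in the notes.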
markus