Question about Perl5 extended UTF-8 design

Markus Scherer markus.icu at gmail.com
Thu Nov 5 12:15:28 CST 2015


On Thu, Nov 5, 2015 at 9:25 AM, Philippe Verdy <verdy_p at wanadoo.fr> wrote:

> (0xFF was reserved only in the old RFC version of UTF-8 when it allowed
> code points up to 31 bits, but even this RFC is obsolete and should no
> longer be used and it has never been approved by Unicode).
>

No, even in the original UTF-8 definition, "The octet values FE and FF
never appear." https://tools.ietf.org/html/rfc2279
The highest lead byte was 0xFD.
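
As a rough sketch of why 0xFD is the ceiling (my own illustration, not text
from the RFC; the helper name rfc2279_lead_byte is made up), the old
1-to-6-byte scheme gives this lead byte for a 31-bit code point:

```python
def rfc2279_lead_byte(cp):
    """Lead byte for cp under the old 1-to-6-byte scheme (illustrative helper)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("outside the 31-bit range")
    if cp < 0x80:
        return cp  # single-byte form, the byte is the code point itself
    # (exclusive upper limit, lead-byte marker bits, number of trail bytes)
    for limit, marker, tail in (
        (0x800, 0xC0, 1),
        (0x10000, 0xE0, 2),
        (0x200000, 0xF0, 3),
        (0x4000000, 0xF8, 4),
        (0x80000000, 0xFC, 5),
    ):
        if cp < limit:
            # Each trail byte carries 6 bits; the rest goes into the lead byte.
            return marker | (cp >> (6 * tail))


print(hex(rfc2279_lead_byte(0x7FFFFFFF)))  # 0xfd
```

Even the largest 31-bit value only needs one payload bit in the lead byte,
so FE and FF are never reached.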

(For the "really original" version see
http://www.unicode.org/L2/Historical/wg20-n193-fss-utf.pdf)

In the current definition, "The octet values C0, C1, F5 to FF never
appear." https://tools.ietf.org/html/rfc3629 (also published as STD 63,
https://tools.ietf.org/html/std63)
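
One quick way to check that list, assuming Python 3's built-in "utf-8" codec
follows the current definition: encode every Unicode scalar value and see
which byte values never turn up. This is a brute-force sketch, not an
official test.

```python
seen = set()
for cp in range(0x110000):
    if 0xD800 <= cp <= 0xDFFF:
        continue  # surrogates are not scalar values and cannot be encoded
    seen.update(chr(cp).encode("utf-8"))

# Byte values that no well-formed sequence ever contains.
never = sorted(set(range(256)) - seen)
print(" ".join(f"{b:02X}" for b in never))
# C0 C1 F5 F6 F7 F8 F9 FA FB FC FD FE FF
```

C0 and C1 drop out because they could only start overlong two-byte forms,
and F5 and above because U+10FFFF already encodes with lead byte F4.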

markus