Split a UTF-8 multi-octet sequence such that it cannot be unambiguously restored?
Doug Ewell via Unicode
unicode at unicode.org
Mon Jul 24 17:35:43 CDT 2017
J Decker wrote:
> I generally accepted any utf-8 encoding up to 31 bits though ( since
> I was going from the original spec, and not what was effective limit
> based on unicode codepoint space)
Hey, everybody: Don't do that.
UTF-8 has been constrained to the Unicode code space (maximum U+10FFFF,
four bytes) for almost fourteen years now.
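
For anyone writing their own decoder, here is a minimal sketch of what that constraint looks like in practice. It is not taken from any particular library; the function names (utf8_sequence_length, decode_strict) are made up for illustration. It assumes the rules as stated in RFC 3629: lead bytes 0xF5..0xFF (which would start code points above U+10FFFF or the old 5- and 6-byte forms) are rejected outright, as are overlong forms, surrogates, and any decoded value above U+10FFFF.

```python
from typing import List


def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length implied by a lead byte, or 0 if invalid.

    Hypothetical helper: 0xC0/0xC1 (always overlong) and 0xF5..0xFF
    (beyond U+10FFFF, or the obsolete 5/6-byte forms) are treated as invalid.
    """
    if lead <= 0x7F:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0  # continuation byte, 0xC0/0xC1, or 0xF5..0xFF


def decode_strict(data: bytes) -> List[int]:
    """Decode UTF-8 into a list of code points, raising ValueError on
    anything outside the Unicode code space (U+0000..U+10FFFF)."""
    # Smallest code point that legitimately needs 1, 2, 3, 4 bytes.
    minimum = (0x0, 0x80, 0x800, 0x10000)
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        n = utf8_sequence_length(lead)
        if n == 0:
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
        if i + n > len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        if n == 1:
            cp = lead
        else:
            # Keep only the payload bits of the lead byte.
            cp = lead & (0xFF >> (n + 1))
            for b in data[i + 1:i + n]:
                if b & 0xC0 != 0x80:
                    raise ValueError(f"bad continuation byte at offset {i}")
                cp = (cp << 6) | (b & 0x3F)
        if cp < minimum[n - 1]:
            raise ValueError(f"overlong encoding at offset {i}")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"surrogate code point at offset {i}")
        if cp > 0x10FFFF:
            raise ValueError(f"code point above U+10FFFF at offset {i}")
        code_points.append(cp)
        i += n
    return code_points
```

With this sketch, decode_strict(b"\xf4\x8f\xbf\xbf") yields the single code point U+10FFFF, while b"\xf4\x90\x80\x80" (which would be 0x110000) and any 5- or 6-byte sequence raise ValueError.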
Doug Ewell | Thornton, CO, US | ewellic.org