Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8
Shawn Steele via Unicode
unicode at unicode.org
Tue May 16 16:15:53 CDT 2017
> Faster ok, provided this does not break other uses, notably for random access within strings…
Either way, this is a “recommendation”. I don’t see how a recommendation can guarantee not “breaking other uses.” If it’s internal, you can do what you will, so if you need the apparent 1:1 parity, you can do that internally. But if you’re depending on other APIs, libraries, data sources, or whatever, it would seem that you couldn’t count on that (and probably shouldn’t, even if it were a requirement rather than a recommendation).
I’m wary of the idea of attempting random access on a stream while that stream is also being manipulated (decoded, apparently) at the same time.
The U+FFFD emitted by this decoding could also require a different number of bytes to re-encode, which might disrupt the presumed parity, depending on how the data access was being handled.
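To make the re-encoding point concrete, here is a minimal Python sketch. The input bytes are a made-up example, and Python's built-in "replace" error handler is only one possible decoder policy, not necessarily the one under discussion. A single ill-formed byte becomes U+FFFD, which takes three bytes (EF BF BD) when written back out as UTF-8, so byte offsets after it shift.

```python
# Hypothetical input: "a", one stray byte, "b".
# 0xFF can never appear in well-formed UTF-8.
raw = b"a\xffb"

# Decode, substituting U+FFFD for the ill-formed byte.
decoded = raw.decode("utf-8", errors="replace")

# Re-encode the decoded text as UTF-8.
reencoded = decoded.encode("utf-8")

print(len(raw), len(reencoded))                # 3 5
print(raw.index(b"b"), reencoded.index(b"b"))  # 2 4
```

Any index computed against the original bytes no longer lines up with the re-encoded bytes, so random access that assumes the two match would land in the wrong place.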
-Shawn