Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8
Hans Åberg via Unicode
unicode at unicode.org
Wed May 17 16:21:54 CDT 2017
> On 17 May 2017, at 23:18, Doug Ewell <doug at ewellic.org> wrote:
>
> Hans Åberg wrote:
>
>>> Far from solving the stated problem, it would introduce a new one:
>>> conversion from the "bad data" Unicode code points, currently
>>> well-defined, would become ambiguous.
>>
>> Actually not: just translate the invalid UTF-8 sequences into invalid
>> UTF-32.
>
> Far from solving the stated problem, it would introduce TWO new ones...
There is no good solution to the problem of ill-formed UTF-8 sequences, because the intent behind them cannot be known.
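
To make the trade-off concrete, here is a minimal sketch in Python. It is an illustration only, not the scheme proposed earlier in the thread. It contrasts the standard library's "replace" error handler, which substitutes U+FFFD and loses the original byte, with the "surrogateescape" handler from PEP 383, which maps each undecodable byte to a lone surrogate (itself not a valid scalar value) so the bytes can be restored later. The sample input and variable names are assumptions chosen for the example.

```python
# Ill-formed input: "a", a lone continuation byte 0x80, then "b".
raw = b"a\x80b"

# Lossy decoding: the offending byte becomes U+FFFD, so the
# original byte value cannot be recovered from the result.
replaced = raw.decode("utf-8", errors="replace")

# Lossless decoding (PEP 383): byte 0x80 is mapped to the lone
# surrogate U+DC80, which is not a valid Unicode scalar value.
escaped = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes.
round_tripped = escaped.encode("utf-8", errors="surrogateescape")

print(ascii(replaced))       # 'a\ufffdb'
print(ascii(escaped))        # 'a\udc80b'
print(round_tripped == raw)  # True
```

Neither handler recovers what the sender meant. The first discards the bad byte, and the second only preserves it for a later round trip.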