Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

Alastair Houghton via Unicode unicode at
Fri Jun 2 03:02:25 CDT 2017

On 1 Jun 2017, at 19:44, Asmus Freytag via Unicode <unicode at> wrote:
> What's not OK is to take an existing recommendation and change it to something else, just to make bug reports go away for one implementation. That's like two sleepers fighting over a blanket that's too short. Whenever one is covered, the other is exposed.

That’s *not* what’s happening, however many times you and Henri make that claim.

> (If that language is not in the standard already, a strong "an implementation MUST not depend on the use of a particular strategy for replacement of invalid code sequences", clearly ought to be added).

The standard already says (p. 127, section 3.9):

  Although a UTF-8 conversion process is required to never consume well-formed
  subsequences as part of its error handling for ill-formed subsequences, such
  a process is not otherwise constrained in how it deals with any ill-formed
  subsequence itself.

which probably covers that, no?
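
To make the quoted constraint concrete, here is a minimal Python sketch, not taken from the standard or from any implementation under discussion. The function name decode_per_byte is hypothetical, and "one U+FFFD per offending byte" is just one of the strategies the text permits. The sketch compares it with Python's built-in errors="replace" handling: the two may emit different numbers of U+FFFD for the same ill-formed input, but neither may swallow the well-formed "A" that follows it.

```python
REPLACEMENT = "\ufffd"


def decode_per_byte(data: bytes) -> str:
    """Hypothetical decoder: one U+FFFD for every byte that cannot start
    a well-formed UTF-8 sequence at its position."""
    out = []
    i = 0
    while i < len(data):
        # Try the shortest well-formed sequence starting at position i,
        # using Python's strict decoder as the well-formedness check.
        for length in (1, 2, 3, 4):
            chunk = data[i:i + length]
            try:
                out.append(chunk.decode("utf-8"))
                i += len(chunk)
                break
            except UnicodeDecodeError:
                continue
        else:
            # No well-formed sequence starts here: replace this one byte
            # only, so that any following well-formed bytes are untouched.
            out.append(REPLACEMENT)
            i += 1
    return "".join(out)


# F0 90 80 is a truncated four-byte sequence; 41 is a well-formed "A".
sample = b"\xf0\x90\x80A"

per_byte = decode_per_byte(sample)
builtin = sample.decode("utf-8", errors="replace")

# ascii() shows the escapes, so the U+FFFD count is easy to read.
print(ascii(per_byte))
print(ascii(builtin))
```

Code that relies on a particular count of U+FFFD in the output is depending on exactly the behaviour that the quoted passage leaves unconstrained; the only thing both results are guaranteed to share is the surviving "A".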

Kind regards,


