Best practices for replacing UTF-8 overlongs

Martin J. Dürst duerst at
Mon Dec 19 21:19:43 CST 2016

On 2016/12/20 11:35, Tex Texin wrote:
> Shawn,
> Ok, but that begs the questions of what to do...
> "All bets are off" is not instructive.

Well, it may be instructive in that it's difficult to get software to 
decide what happened. A human may be in a better position to analyze the 
error and the cause(s) of the error, and to fix these.

> How software behaves in the face of invalid bytes, what it does with them, what it does about them, and how it continues (or not) still needs to be determined.

Yes, but that will depend on circumstances. In a safety-critical 
application, you'll want to do something different than if you are 
sending the text to a printer just to have a look at it.
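
As a minimal sketch of what "depends on circumstances" could look like in
code, here is one policy switch in Python. The helper name decode_utf8 and
its safety_critical flag are hypothetical, made up for illustration and not
taken from this thread; only the built-in bytes.decode error handlers
"strict" and "replace" are assumed.

```python
def decode_utf8(data: bytes, safety_critical: bool) -> str:
    """Decode UTF-8 under a caller-chosen error policy (hypothetical helper)."""
    if safety_critical:
        # Strict policy: raise UnicodeDecodeError and stop, so that a human
        # can analyze where the invalid bytes came from.
        return data.decode("utf-8", errors="strict")
    # Lenient policy: substitute U+FFFD for invalid bytes and carry on,
    # which is good enough for just having a look at the text.
    return data.decode("utf-8", errors="replace")


# 0xC0 0xAF is an overlong (and therefore invalid) encoding of "/".
overlong_slash = b"a\xc0\xafb"

print(repr(decode_utf8(overlong_slash, safety_critical=False)))

try:
    decode_utf8(overlong_slash, safety_critical=True)
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```

Either way, the overlong sequence is never decoded to "/"; the two
policies differ only in whether processing continues.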

Regards,   Martin.
