What is the Unicode guidance regarding the use of a BOM as a UTF-8 encoding signature?

Eli Zaretskii eliz at gnu.org
Sat Jun 6 10:47:00 CDT 2020


> Date: Sat, 6 Jun 2020 09:43:34 -0600
> From: Doug Ewell via Unicode <unicode at unicode.org>
> 
> Eli Zaretskii wrote:
> 
> If you need to deal with an arbitrary set of encodings, such as CP1255 and CP1256 and 7-bit ISO 2022-based encodings, instead of just CP1252 versus UTF-8 as Karl stated, then auto-detection won't work without a fair amount of natural language context. Otherwise, the text really has to be tagged.

Yes, that's my experience as well.
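
To illustrate the distinction, here is a minimal Python sketch. It is not from the thread: the function name guess_utf8_or_cp1252 and the sample bytes are hypothetical, and it assumes the input is known in advance to be either UTF-8 or CP1252. With only those two candidates, a BOM check plus a UTF-8 validity check is enough. The second half shows why the same trick fails for CP1255 versus CP1256: the same bytes decode without error under both.

```python
import codecs


def guess_utf8_or_cp1252(data: bytes) -> str:
    """Return a codec name, ASSUMING data is either UTF-8 or CP1252."""
    # A leading EF BB BF is the UTF-8 signature (BOM).
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        # Non-ASCII CP1252 text is very rarely well-formed UTF-8,
        # so a clean decode is strong evidence for UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "cp1252"


# Hypothetical sample: four bytes that are valid in both CP1255 and CP1256.
sample = bytes([0xF9, 0xEC, 0xE5, 0xED])

hebrew = sample.decode("cp1255")  # a plausible Hebrew word
arabic = sample.decode("cp1256")  # different characters, but no decode error

# Neither decode fails, so byte-level validity cannot choose between them;
# only knowledge of the language (or an explicit tag) can.
print(guess_utf8_or_cp1252(sample), repr(hebrew), repr(arabic))
```

Note that the function reports "cp1252" for the Hebrew sample as well, because it can only pick between the two encodings it was told to expect.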

