What is the Unicode guidance regarding the use of a BOM as a UTF-8 encoding signature?
Eli Zaretskii
eliz at gnu.org
Sat Jun 6 10:47:00 CDT 2020
> Date: Sat, 6 Jun 2020 09:43:34 -0600
> From: Doug Ewell via Unicode <unicode at unicode.org>
>
> Eli Zaretskii wrote:
>
> If you need to deal with an arbitrary set of encodings, such as CP1255 and CP1256 and 7-bit ISO 2022-based encodings, instead of just CP1252 versus UTF-8 as Karl stated, then auto-detection won't work without a fair amount of natural language context. Otherwise, the text really has to be tagged.
Yes, that's my experience as well.
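
To illustrate the distinction drawn above, here is a minimal Python sketch, not taken from the thread. The helper name decode_utf8_or_cp1252 and the sample bytes are hypothetical, chosen only for illustration. It assumes the input is known in advance to be either UTF-8 (with or without a BOM) or CP1252; under that assumption a strict UTF-8 decode is a usable test, because non-ASCII CP1252 text is rarely valid UTF-8. The second half shows why the same trick does not extend to CP1255 versus CP1256: one byte sequence decodes without error under both, so only language context or an external tag can choose between them.

```python
def decode_utf8_or_cp1252(data):
    """Hypothetical helper: assumes data is either UTF-8 or CP1252, nothing else."""
    try:
        # "utf-8-sig" decodes strictly and drops a leading BOM (EF BB BF) if present.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so under the stated assumption it must be CP1252.
        # A few byte values are unassigned in CP1252, hence errors="replace".
        return data.decode("cp1252", errors="replace"), "cp1252"


# Four bytes that are valid in both CP1255 (Hebrew) and CP1256 (Arabic).
ambiguous = bytes([0xF9, 0xEC, 0xE5, 0xED])
as_hebrew = ambiguous.decode("cp1255")
as_arabic = ambiguous.decode("cp1256")

# Both decodes succeed, yet they produce different text.
print(as_hebrew != as_arabic)
```

Neither decode raises an error, so byte-level validity alone cannot pick the right code page; the text has to be tagged, or the detector needs enough natural language context to guess.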