What is the Unicode guidance regarding the use of a BOM as a UTF-8 encoding signature?
eliz at gnu.org
Sat Jun 6 01:39:44 CDT 2020
> CC: Alisdair Meredith <alisdairm at me.com>,
> Unicode Mail List
> <unicode at unicode.org>
> Date: Fri, 5 Jun 2020 22:33:23 +0000
> From: Shawn Steele via Unicode <unicode at unicode.org>
> I’ve been recommending that people assume documents are UTF-8. If the UTF-8 decoding fails, then
> consider falling back to some other codepage.
That strategy would fail with 7-bit ISO 2022-based encodings, no?
They look like plain 7-bit ASCII (which always decodes successfully as
UTF-8), but actually represent non-ASCII text.
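
To make the concern concrete, here is a minimal Python sketch of the
"try UTF-8, then fall back" approach applied to ISO-2022-JP input. The
helper name decode_utf8_first and the latin-1 fallback are assumptions
chosen for illustration only; they are not taken from the quoted message.

```python
def decode_utf8_first(data: bytes, fallback: str = "latin-1") -> str:
    """Hypothetical helper: assume UTF-8, fall back only if decoding fails."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # The fallback codepage is an assumption made for this sketch.
        return data.decode(fallback)


text = "日本語"
encoded = text.encode("iso2022_jp")

# ISO-2022-JP uses escape sequences and 7-bit bytes only.
seven_bit = all(b < 0x80 for b in encoded)

# UTF-8 decoding succeeds, so the fallback is never reached ...
guessed = decode_utf8_first(encoded)

# ... but the result is not the original text; only the right codec recovers it.
correct = encoded.decode("iso2022_jp")

print(seven_bit, guessed == text, correct == text)
```

Since the UTF-8 decode raises no error, a detector built on decode failure
alone has no signal here. One possible heuristic is to look for the ESC
byte (0x1B) in otherwise pure 7-bit input before settling on UTF-8.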