<div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Oct 14, 2020, 1:52 AM Andrew West via Unicode &lt;<a href="mailto:unicode@unicode.org">unicode@unicode.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It is just as good a way to identify UTF-8 data as a BOM in UTF-16<br>

data is for identifying UTF-16BE and UTF-16LE data.<br></blockquote></div></div><div dir="auto"><br></div><div dir="auto">No, it's not. UTF-16 and UTF-32 are basically the only encodings that use more than 8 bits to encode every character. It's expected that a general-purpose signature reader will be used to identify UTF-16. UTF-8, on the other hand, was designed for and is used in a world of ASCII extensions, where it's often expected that the encoding can be named near the start of the file, with no need for non-ASCII characters before the encoding declaration. A UTF-8 BOM breaks that assumption.</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

</blockquote></div></div></div>
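<div dir="auto"><br></div><div dir="auto">To make the point about ASCII extensions concrete, here is a minimal sketch of how an ASCII-oriented tool can trip over a UTF-8 BOM. The helpers declared_encoding and starts_with_shebang are hypothetical and deliberately naive; they stand in for any tool that expects the first bytes of a file to be plain ASCII, and they do not describe how any particular real parser behaves.</div>

```python
import codecs
import re

# Hypothetical, deliberately naive check: look for an ASCII encoding
# declaration at the very first byte of the data.
_XML_DECL = re.compile(rb'\A<\?xml[^>]*encoding="([A-Za-z0-9._-]+)"')


def declared_encoding(data: bytes):
    """Return the encoding named in a leading XML declaration, or None."""
    match = _XML_DECL.match(data)
    return match.group(1).decode("ascii") if match else None


def starts_with_shebang(data: bytes) -> bool:
    """Return True if the data begins with the two ASCII bytes '#!'."""
    return data.startswith(b"#!")


xml_doc = b'<?xml version="1.0" encoding="UTF-8"?><r/>'
script = b"#!/bin/sh\necho hi\n"

# Without a BOM, the ASCII-only checks find what they are looking for.
plain_xml = declared_encoding(xml_doc)
plain_script = starts_with_shebang(script)

# With the three-byte UTF-8 BOM (EF BB BF) in front, the same checks fail,
# because the data no longer starts with ASCII bytes.
bom_xml = declared_encoding(codecs.BOM_UTF8 + xml_doc)
bom_script = starts_with_shebang(codecs.BOM_UTF8 + script)
```

<div dir="auto">A tool that wants to accept both forms has to strip the BOM explicitly before looking for the declaration, which is exactly the extra signature handling that ASCII-based formats were not designed to require.</div>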