<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body>
<div class="moz-cite-prefix">Great, here is the change I'm making to
address this:</div>
<blockquote>
<div class="moz-cite-prefix">Protocol designers:</div>
<div class="moz-cite-prefix">
<ul>
<li>If possible, mandate use of UTF-8 without a BOM; diagnose
the presence of a BOM in consumed text as an error, and
produce text without a BOM.</li>
<li>Otherwise, if possible, mandate use of UTF-8 with or
without a BOM; accept and discard a BOM in consumed text,
and produce text without a BOM.</li>
<li>Otherwise, if possible, use UTF-8 as the default encoding
with use of other encodings negotiated using information
other than a BOM; accept and discard a BOM in consumed text,
and produce text without a BOM.</li>
<li>Otherwise, require the presence of a BOM to differentiate
UTF-8 encoded text in both consumed and produced text<b><font
color="#009900"> unless the absence of a BOM would
result in the text being interpreted as an ASCII-based
encoding and the UTF-8 text contains no non-ASCII
characters (this exception avoids adding a BOM to ASCII
text, which would render such text non-ASCII)</font></b>.
This approach should be reserved for scenarios in which UTF-8
cannot be adopted as a default due to backward compatibility
concerns.<br>
</li>
</ul>
</div>
</blockquote>
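<div class="moz-cite-prefix">As a rough illustration of the
guidelines above, here is a minimal Python sketch. The function
names and the <code>bom_differentiates_utf8</code> parameter are
hypothetical and do not come from the proposal. The producer side
assumes that the protocol's fallback encoding is ASCII-based, which
is the case the exception in the last guideline addresses.</div>
<pre>
```python
import codecs

UTF8_BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def consume_strict(data: bytes) -> str:
    """First guideline: UTF-8 without a BOM is mandated, so a BOM is an error."""
    if data.startswith(UTF8_BOM):
        raise ValueError("UTF-8 BOM is not permitted by this protocol")
    return data.decode("utf-8")


def consume_lenient(data: bytes) -> str:
    """Second and third guidelines: accept and discard a leading BOM."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data.decode("utf-8")


def produce(text: str, bom_differentiates_utf8: bool = False) -> bytes:
    """Produce UTF-8 bytes.

    By default no BOM is written (first three guidelines). When the
    hypothetical flag is set (last guideline), a BOM is written only if
    the text contains non-ASCII characters, so ASCII text stays ASCII.
    Assumption: the protocol's fallback encoding is ASCII-based.
    """
    encoded = text.encode("utf-8")
    if bom_differentiates_utf8 and not text.isascii():
        return UTF8_BOM + encoded
    return encoded
```
</pre>
<div class="moz-cite-prefix">For example, under these assumptions
<code>produce("hello", bom_differentiates_utf8=True)</code> yields
plain ASCII bytes with no BOM, while text containing a non-ASCII
character is prefixed with the BOM.</div>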
<div class="moz-cite-prefix">Tom.<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 10/12/20 8:40 AM, Alisdair Meredith
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:A708823E-26F0-4C4C-85F7-F24EB32215C4@me.com">
That addresses my main concern. Essentially, best practice (for
UTF-8) would be no BOM unless the document contains code points
that require multiple code units to express.
<div class=""><br class="">
</div>
<div class="">AlisdairM<br class="">
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Oct 11, 2020, at 23:22, Tom Honermann &lt;<a
href="mailto:tom@honermann.net" class=""
moz-do-not-send="true">tom@honermann.net</a>&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">
<div class="moz-cite-prefix">On 10/10/20 7:58 PM,
Alisdair Meredith via SG16 wrote:<br class="">
</div>
<blockquote type="cite"
cite="mid:263C91E2-8EB6-4102-981D-80A1CC44F45D@me.com"
class="">
One concern I have, that might lead into rationale for
the current discouragement,
<div class="">is that I would hate to see a best
practice that pushes a BOM into ASCII files.</div>
<div class="">One of the nice properties of UTF-8 is
that a valid ASCII file (still very common) is</div>
<div class="">also a valid UTF-8 file. Changing best
practice would encourage updating those</div>
<div class="">files to be no longer ASCII.</div>
</blockquote>
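<p class="">The property described above can be checked with a short
Python sketch. The sample text and the <code>is_ascii</code> helper
are made up for illustration; only standard-library calls are used.</p>
<pre class="">
```python
import codecs

# Hypothetical sample content standing in for an existing ASCII file.
ascii_bytes = "plain ASCII text\n".encode("ascii")

# A valid ASCII file decodes to the same text when read as UTF-8.
assert ascii_bytes.decode("utf-8") == ascii_bytes.decode("ascii")

# Prepending the UTF-8 BOM adds three bytes outside the ASCII range.
with_bom = codecs.BOM_UTF8 + ascii_bytes


def is_ascii(data: bytes) -> bool:
    """Return True if every byte of data decodes as ASCII."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True
```
</pre>
<p class="">Here <code>is_ascii(ascii_bytes)</code> holds, while
<code>is_ascii(with_bom)</code> does not, even though the visible
text is unchanged.</p>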
<p class="">Thanks, Alisdair. I think that concern is
implicitly addressed by the suggested resolutions, but
perhaps that can be made clearer. One possibility
would be to modify the "protocol designer" guidelines
to address the case where a protocol's default
encoding is ASCII-based and to specify that a BOM is
only required for UTF-8 text that contains non-ASCII
characters. Would that be helpful?<br class="">
</p>
<p class="">Tom.<br class="">
</p>
<blockquote type="cite"
cite="mid:263C91E2-8EB6-4102-981D-80A1CC44F45D@me.com"
class="">
<div class=""><br class="">
</div>
<div class="">AlisdairM<br class="">
<div class=""><br class="">
<blockquote type="cite" class="">
<div class="">On Oct 10, 2020, at 14:54, Tom
Honermann via SG16 &lt;<a
href="mailto:sg16@lists.isocpp.org" class=""
moz-do-not-send="true">sg16@lists.isocpp.org</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">
<p class="">Attached is a draft proposal for
the Unicode standard that is intended to
clarify the current recommendation
regarding use of a BOM in UTF-8 text.
This is a follow-up to the <a
moz-do-not-send="true"
href="https://corp.unicode.org/pipermail/unicode/2020-June/008713.html"
class="">discussion on the Unicode
mailing list</a> back in June.</p>
<p class="">Feedback is welcome. I plan to
<a moz-do-not-send="true"
href="https://www.unicode.org/pending/docsubmit.html"
class="">submit</a> this to the UTC in a
week or so pending review feedback.<br
class="">
</p>
<p class="">Tom.<br class="">
</p>
</div>
<span
id="cid:958C9297-66AC-4D88-8F0B-577B8BA2589E@nyc.rr.com"
class="">&lt;Unicode-BOM-guidance.pdf&gt;</span>--
<br class="">
SG16 mailing list<br class="">
<a href="mailto:SG16@lists.isocpp.org"
class="" moz-do-not-send="true">SG16@lists.isocpp.org</a><br
class="">
<a class="moz-txt-link-freetext"
href="https://lists.isocpp.org/mailman/listinfo.cgi/sg16"
moz-do-not-send="true">https://lists.isocpp.org/mailman/listinfo.cgi/sg16</a><br
class="">
</div>
</blockquote>
</div>
<br class="">
</div>
<br class="">
</blockquote>
<p class=""><br class="">
</p>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</blockquote>
<p><br>
</p>
</body>
</html>