<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">On 10/16/20 2:33 PM, Shawn Steele

      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:DM6PR00MB0667C2AA54194ECB4B1006ED82031@DM6PR00MB0667.namprd00.prod.outlook.com">


      <div class="WordSection1">

        <p class="MsoNormal">Nobody’s going to consider #1 regardless of
          what wordsmithing is done in Unicode; people have had too much
          difficulty with BOMs for it to be considered a serious
          standards-based solution.</p>

      </div>

    </blockquote>

    It isn't clear to me that everyone will agree with that
    perspective.  In this thread, I've heard from people who continue
    to use BOMs in UTF-8 text.  We have strong consensus within SG16
    that we don't want #1; additional support from outside WG21 will
    help to make the case for a different approach.<br>

    <blockquote type="cite"

cite="mid:DM6PR00MB0667C2AA54194ECB4B1006ED82031@DM6PR00MB0667.namprd00.prod.outlook.com">

      <div class="WordSection1">

        <p class="MsoNormal">  #4 isn’t portable. 

        </p>

      </div>

    </blockquote>

    Correct, but WG21 may find it sufficient for all implementations to

    provide some implementation-defined means to identify UTF-8 source

    code without that means being a portable solution.<br>

    <blockquote type="cite"

cite="mid:DM6PR00MB0667C2AA54194ECB4B1006ED82031@DM6PR00MB0667.namprd00.prod.outlook.com">

      <div class="WordSection1">

        <p class="MsoNormal"><o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">The “right” approach would be to ensure

          that the languages have ways of declaring a codepage (like a

          pragma or other magic semantic, options 2 and 3).</p>

      </div>

    </blockquote>

    That matches my preference.<br>

    <blockquote type="cite"

cite="mid:DM6PR00MB0667C2AA54194ECB4B1006ED82031@DM6PR00MB0667.namprd00.prod.outlook.com">

      <div class="WordSection1">

        <p class="MsoNormal"><o:p></o:p></p>

        <p class="MsoNormal"><br>

          The time invested in this problem should be spent on getting
          agreement with WG21 about what the declaration should be and
          seeing if there are any “gotchas” to something like #pragma
          UTF8.  IMO, it’s not worth the effort to try to tweak
          Unicode’s guidance in order to support the common view that
          BOMs are bad, which WG21 won’t be considering anyway. 

          <br>

        </p>

      </div>

    </blockquote>

    My motivation is not solely to support the eventual WG21 proposal. 

    The responses I've seen to the paper (some of which have been

    private) have made it clear to me that there is not a common

    understanding of what the Unicode standard states about use of a BOM

    as an encoding signature in UTF-8.  I think it is worth clarifying.<br>

    <blockquote type="cite"

cite="mid:DM6PR00MB0667C2AA54194ECB4B1006ED82031@DM6PR00MB0667.namprd00.prod.outlook.com">

      <div class="WordSection1">

        <p class="MsoNormal">

          <br>

          The biggest thing I can think of is that very few codepages

          would lend themselves to being declared in a portable manner. 

          Different OSes, software packages, and vendors have different
          implementations of various codepages.  Even ones that are
          nominally similar are often mistagged or have subtle differences.  <br>

          <br>

          In other words, “UTF8” is about the only “safe” encoding that

          won’t have edge cases. Something like “shift-jis” has multiple

          legacy variations that mean everything won’t always be the

          same.

        </p>

      </div>

    </blockquote>

    <p>I agree.  If WG21 opts to standardize an encoding declaration, I
      suspect we would only mandate support for UTF-8, and maybe ASCII,
      with any other supported encodings being implementation-defined.</p>

    <p>Tom.<br>

    </p>

    <blockquote type="cite"

cite="mid:DM6PR00MB0667C2AA54194ECB4B1006ED82031@DM6PR00MB0667.namprd00.prod.outlook.com">

      <div class="WordSection1">

        <p class="MsoNormal"><o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">-Shawn<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <div>

          <div style="border:none;border-top:solid #E1E1E1

            1.0pt;padding:3.0pt 0in 0in 0in">

            <p class="MsoNormal"><b>From:</b> Tom Honermann

              <a class="moz-txt-link-rfc2396E" href="mailto:tom@honermann.net">&lt;tom@honermann.net&gt;</a> <br>

              <b>Sent:</b> Friday, October 16, 2020 6:23 AM<br>

              <b>To:</b> Shawn Steele

              <a class="moz-txt-link-rfc2396E" href="mailto:Shawn.Steele@microsoft.com">&lt;Shawn.Steele@microsoft.com&gt;</a>; J Decker

              <a class="moz-txt-link-rfc2396E" href="mailto:d3ck0r@gmail.com">&lt;d3ck0r@gmail.com&gt;</a><br>

              <b>Cc:</b> <a class="moz-txt-link-abbreviated" href="mailto:sg16@lists.isocpp.org">sg16@lists.isocpp.org</a><br>

              <b>Subject:</b> Re: [SG16] Draft proposal: Clarify

              guidance for use of a BOM as a UTF-8 encoding signature<o:p></o:p></p>

          </div>

        </div>

        <p class="MsoNormal"><o:p> </o:p></p>

        <div>

          <p class="MsoNormal">On 10/14/20 3:21 PM, Shawn Steele wrote:<o:p></o:p></p>

        </div>

        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

          <p class="MsoNormal">How are you going to #include differently

            encoded source files?  I don’t see anything in this document

            that would make it possible to #include a file in a

            different encoding.  It’s unclear to me how your proposed

            document could be utilized to enable the scenario you’re

            interested in.<o:p></o:p></p>

        </blockquote>

        <p>My intention is to present various options for WG21 to

          consider along with a recommendation.  The options that have

          been identified so far are listed below.  A combination of some
          of these options is a possibility.<o:p></o:p></p>

        <ol type="1" start="1">

          <li class="MsoNormal"

            style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0

            level1 lfo1">

            Use of a BOM to indicate UTF-8 encoded source files.  This

            matches existing practice for the Microsoft compiler.<o:p></o:p></li>

          <li class="MsoNormal"

            style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0

            level1 lfo1">

            Use of a #pragma.  This matches <a

href="https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.cbclx01/zos_pragma_filetag.htm"

              moz-do-not-send="true">

              existing practice</a> for the IBM compiler.<o:p></o:p></li>

          <li class="MsoNormal"

            style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0

            level1 lfo1">

            Use of a "magic" or "semantic" comment.  This matches <a

href="https://docs.python.org/3/reference/lexical_analysis.html#encoding-declarations"

              moz-do-not-send="true">

              existing practice</a> in Python.<o:p></o:p></li>

          <li class="MsoNormal"

            style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0

            level1 lfo1">

            Use of filesystem metadata.  This is an option for some

            compilers and is being considered for Clang on z/OS.<o:p></o:p></li>

        </ol>
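        <p>As a rough illustration of option 3, here is a minimal
          sketch in Python of how a tool might look for a Python-style
          encoding declaration near the top of a source file.  The
          pattern is the one given in the Python language reference;
          the function name <code>declared_encoding</code>, the use of
          a <code>//</code> comment as the carrier, and the UTF-8
          default are assumptions made for this example and are not
          taken from any compiler or proposal.</p>
<pre>
```python
import re

# Pattern given in the Python language reference for encoding declarations.
DECLARATION = re.compile(rb"coding[=:]\s*([-\w.]+)")


def declared_encoding(source: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the first two lines, or the default."""
    # Like Python, only the first two lines are searched. Accepting the
    # declaration inside a "//" comment is an assumption for illustration.
    for line in source.splitlines()[:2]:
        if not line.lstrip().startswith(b"//"):
            continue
        match = DECLARATION.search(line)
        if match:
            # A bytes pattern restricts \w to ASCII, so this decode is safe.
            return match.group(1).decode("ascii")
    return default
```
</pre>
        <p>With this sketch, a file whose first line is
          <code>// -*- coding: shift-jis -*-</code> reports
          <code>shift-jis</code>, and a file with no declaration falls
          back to the default.</p>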

        <p>The goal of this paper is to clarify guidance in the Unicode

          standard in order to better inform and justify a

          recommendation.  If the UTC were to provide a strong

          recommendation either for or against use of a BOM in UTF-8

          files, that would be a point either in favor of or in opposition

          to option 1 above.  As is, based on my reading and a number of

          the responses I've seen, the guidance is murky.<o:p></o:p></p>

        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

          <p class="MsoNormal"> <o:p></o:p></p>

          <p class="MsoNormal">For mixed-encoding behavior the only

            thing I could imagine is adding some sort of preprocessor

            #codepage or something to the standard.  (Which would again

            take a while to reach critical mass.)<o:p></o:p></p>

        </blockquote>

        <p>Yes, deployment will take time in any case.  A goal would be

          to choose an option that can be used as an extension for

          previous C++ standards.  This may rule out option 2 above

          since some compilers diagnose use of an unrecognized pragma.<o:p></o:p></p>

        <p>Tom.<o:p></o:p></p>

        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

          <p class="MsoNormal"> <o:p></o:p></p>

          <p class="MsoNormal">-Shawn<o:p></o:p></p>

          <p class="MsoNormal"> <o:p></o:p></p>

          <div>

            <div style="border:none;border-top:solid #E1E1E1

              1.0pt;padding:3.0pt 0in 0in 0in">

              <p class="MsoNormal"><b>From:</b> Tom Honermann <a

                  href="mailto:tom@honermann.net" moz-do-not-send="true">

                  &lt;tom@honermann.net&gt;</a> <br>

                <b>Sent:</b> Tuesday, October 13, 2020 9:47 PM<br>

                <b>To:</b> Shawn Steele <a

                  href="mailto:Shawn.Steele@microsoft.com"

                  moz-do-not-send="true">&lt;Shawn.Steele@microsoft.com&gt;</a>;

                J Decker

                <a href="mailto:d3ck0r@gmail.com" moz-do-not-send="true">&lt;d3ck0r@gmail.com&gt;</a><br>

                <b>Cc:</b> <a href="mailto:sg16@lists.isocpp.org"

                  moz-do-not-send="true">sg16@lists.isocpp.org</a><br>

                <b>Subject:</b> Re: [SG16] Draft proposal: Clarify

                guidance for use of a BOM as a UTF-8 encoding signature<o:p></o:p></p>

            </div>

          </div>

          <p class="MsoNormal"> <o:p></o:p></p>

          <div>

            <p class="MsoNormal">On 10/13/20 5:19 PM, Shawn Steele

              wrote:<o:p></o:p></p>

          </div>

          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

            <p class="MsoNormal">IMO this document doesn’t solve your
              problem.  Encouraging use of UTF-8 in C++ source code is
              a goal that most compilers, source code authors, etc. are
              totally on board with.<o:p></o:p></p>

            <p class="MsoNormal"> <o:p></o:p></p>

            <p class="MsoNormal">The source is already in an

              indeterminate state.  The desired end state is to have

              UTF-8 source code (without BOM), which is typically

              supported.  The difficulty is therefore getting from point

              A to point B.  As far as “Use Unicode” goes, there’s no

              issue, but trying to specify BOM as a protocol doesn’t

              really solve the problem, particularly in complex

              environments.<o:p></o:p></p>

          </blockquote>

          <p class="MsoNormal">I think there is a misunderstanding.  The

            intent of the paper is to provide rationale for the existing

            discouragement for use of a BOM in UTF-8 while acknowledging

            that, in some cases, it may remain useful.  My intent is to

            discourage use of a BOM for UTF-8 encoded source files -

            thereby arguing against standardizing the behavior exhibited

            by Microsoft Visual C++ today.<br>

            <br>

            <br>

            <o:p></o:p></p>

          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

            <p class="MsoNormal"> <o:p></o:p></p>

            <p class="MsoNormal">If the compiler doesn’t handle BOM as

              expected, then you’ll get errors.  This can be further

              complicated by preprocessors, #include, resources, etc. 

              If “specifying BOM behavior in Unicode” could help solve

              the problem, then all of the tooling used by everyone

              would have to be updated to handle that (new)

              requirement.  If you could get everyone on the same page,

              they’d all use UTF-8, so you wouldn’t need to update the

              tooling.  If you don’t need to update the tooling, you

              wouldn’t need to update the best practices for BOMs.<o:p></o:p></p>

          </blockquote>

          <p>This paper does not propose "specifying BOM behavior in

            Unicode".  If you feel that it does, please read it again

            and let me know what leads you to believe that it does.<o:p></o:p></p>

          <p>The tooling isn't the problem.  The problem is the existing

            source code that is not UTF-8 encoded or that is UTF-8

            encoded with a BOM.  The deployment challenge is with those

            existing source files.  Microsoft Visual C++ is going to

            continue consuming source files using the Active Code Page

            (ACP) and IBM compilers on EBCDIC platforms are going to

            continue consuming source files using EBCDIC code pages. 

            The goal is to provide a mechanism where a UTF-8 encoded

            source file can #include a source file in another encoding

            or vice versa.  Any solution for that will require tooling

            updates (and that is ok).<o:p></o:p></p>

          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

            <p class="MsoNormal"> <o:p></o:p></p>

            <p class="MsoNormal">Personally, I’d prefer that cases like
              this ignore BOMs (or use them to switch to UTF-8), e.g.,
              treat BOMs like whitespace.  But this isn’t a problem
              solvable by any recommendation by Unicode.<o:p></o:p></p>

          </blockquote>

          <p class="MsoNormal">When consuming text as UTF-8, I agree

            that ignoring a BOM is usually the right thing to do and

            would be the right thing to do when consuming source code.<br>

            <br>

            <br>

            <o:p></o:p></p>
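          <p>To make "ignoring a BOM" concrete, here is a minimal
            sketch in Python of a reader that treats input as UTF-8 and
            drops a single leading BOM when one is present.  The
            function name <code>decode_utf8_source</code> is
            hypothetical; it only illustrates the behavior described
            above and does not reflect how any particular compiler
            reads source files.</p>
<pre>
```python
import codecs


def decode_utf8_source(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping one leading BOM if present."""
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")
```
</pre>
          <p>Python's built-in <code>utf-8-sig</code> codec behaves the
            same way on decode, so
            <code>data.decode("utf-8-sig")</code> is an equivalent
            shortcut.</p>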

          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

            <p class="MsoNormal"> <o:p></o:p></p>

            <p class="MsoNormal">As you noted, many systems provide

              mechanisms for indicating that code is UTF-8 or compiling

              with UTF-8, regardless of BOM.<o:p></o:p></p>

          </blockquote>

          <p class="MsoNormal">Yes, but there is no standard solution,

            not even a de facto one, for consuming differently encoded

            source files in the same translation unit.<br>

            <br>

            <br>

            <o:p></o:p></p>

          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

            <p class="MsoNormal"> <o:p></o:p></p>

            <p class="MsoNormal">A rather large codebase I’ve been

              working with has been working to remove encoding

              confusion, and it’s a big task

              <span style="font-family:&quot;Segoe UI Emoji&quot;,sans-serif">😁</span><o:p></o:p></p>

          </blockquote>

          <p>Yes, yes it is.<o:p></o:p></p>

          <p>Tom.<o:p></o:p></p>

          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

            <p class="MsoNormal"> <o:p></o:p></p>

            <p class="MsoNormal">-Shawn<o:p></o:p></p>

            <p class="MsoNormal"> <o:p></o:p></p>

            <div>

              <div style="border:none;border-top:solid #E1E1E1

                1.0pt;padding:3.0pt 0in 0in 0in">

                <p class="MsoNormal"><b>From:</b> Unicode <a

                    href="mailto:unicode-bounces@unicode.org"

                    moz-do-not-send="true">

                    &lt;unicode-bounces@unicode.org&gt;</a> <b>On

                    Behalf Of </b>Tom Honermann via Unicode<br>

                  <b>Sent:</b> Tuesday, October 13, 2020 1:47 PM<br>

                  <b>To:</b> J Decker <a href="mailto:d3ck0r@gmail.com"

                    moz-do-not-send="true">&lt;d3ck0r@gmail.com&gt;</a>;

                  Unicode List

                  <a href="mailto:unicode@unicode.org"

                    moz-do-not-send="true">&lt;unicode@unicode.org&gt;</a><br>

                  <b>Cc:</b> <a href="mailto:sg16@lists.isocpp.org"

                    moz-do-not-send="true">sg16@lists.isocpp.org</a><br>

                  <b>Subject:</b> Re: [SG16] Draft proposal: Clarify

                  guidance for use of a BOM as a UTF-8 encoding

                  signature<o:p></o:p></p>

              </div>

            </div>

            <p class="MsoNormal"> <o:p></o:p></p>

            <div>

              <p class="MsoNormal">On 10/12/20 8:09 PM, J Decker via

                Unicode wrote:<o:p></o:p></p>

            </div>

            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

              <div>

                <div>

                  <p class="MsoNormal"> <o:p></o:p></p>

                </div>

                <p class="MsoNormal"> <o:p></o:p></p>

                <div>

                  <div>

                    <p class="MsoNormal">On Sun, Oct 11, 2020 at 8:24 PM

                      Tom Honermann via Unicode &lt;<a
                        href="mailto:unicode@unicode.org"
                        moz-do-not-send="true">unicode@unicode.org</a>&gt;

                      wrote:<o:p></o:p></p>

                  </div>

                  <blockquote style="border:none;border-left:solid

                    #CCCCCC 1.0pt;padding:0in 0in 0in

6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">

                    <div>

                      <div>

                        <p class="MsoNormal">On 10/10/20 7:58 PM,

                          Alisdair Meredith via SG16 wrote:<o:p></o:p></p>

                      </div>

                      <blockquote

                        style="margin-top:5.0pt;margin-bottom:5.0pt">

                        <p class="MsoNormal">One concern I have, that

                          might lead into rationale for the current

                          discouragement,

                          <o:p></o:p></p>

                        <div>

                          <p class="MsoNormal">is that I would hate to

                            see a best practice that pushes a BOM into

                            ASCII files.<o:p></o:p></p>

                        </div>

                        <div>

                          <p class="MsoNormal">One of the nice

                            properties of UTF-8 is that a valid ASCII

                            file (still very common) is<o:p></o:p></p>

                        </div>

                        <div>

                          <p class="MsoNormal">also a valid UTF-8 file. 

                            Changing best practice would encourage

                            updating those<o:p></o:p></p>

                        </div>

                        <div>

                          <p class="MsoNormal">files to be no longer

                            ASCII.<o:p></o:p></p>

                        </div>

                      </blockquote>

                      <p>Thanks, Alisdair.  I think that concern is

                        implicitly addressed by the suggested

                        resolutions, but perhaps that can be made
                        clearer.  One possibility would be to modify the
                        "protocol designer" guidelines to address the
                        case where a protocol's default encoding is
                        ASCII-based and to specify that a BOM is only

                        required for UTF-8 text that contains non-ASCII

                        characters.  Would that be helpful?<o:p></o:p></p>
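                      <p>The rule floated above can be sketched as
                        follows.  This is only an illustration in
                        Python of "add a signature only when the text
                        is not pure ASCII"; the function name
                        <code>with_signature_if_needed</code> is
                        hypothetical and the sketch is not taken from
                        the draft paper.</p>
<pre>
```python
import codecs


def with_signature_if_needed(data: bytes) -> bytes:
    """Prefix a UTF-8 BOM only when the text contains non-ASCII bytes."""
    # Pure ASCII input is returned untouched, so it stays a valid ASCII file.
    # Input that already starts with a BOM is not given a second one.
    if data.isascii() or data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data
```
</pre>
                      <p>Under this rule an ASCII-only file is never
                        modified, which is the property Alisdair's
                        concern is about.</p>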

                    </div>

                  </blockquote>

                  <div>

                    <p class="MsoNormal"> <o:p></o:p></p>

                  </div>

                  <div>

                    <p class="MsoNormal">'and to specify that a BOM is

                      only required for UTF-8 '  this should NEVER be

                      'required' or 'must', it shouldn't even be

                      'suggested'; fortunately BOM is just a ZWNBSP, so

                      it's certainly a 'may' start with a such and such.<o:p></o:p></p>

                  </div>

                  <div>

                    <p class="MsoNormal">These days the standard

                      'everything IS utf-8' works really well, except in

                      firefox where the charset is required to be

                      specified for JS scripts (but that's a bug in

                      that)<o:p></o:p></p>

                  </div>

                  <div>

                    <p class="MsoNormal">EBCDIC should be converted on

                      the edge to internal ascii, since, thankfully,

                      this is a niche application and everything thinks

                      in ASCII or some derivative thereof.<o:p></o:p></p>

                  </div>

                  <div>

                    <p class="MsoNormal">Byte Order Mark is irrelevant

                      to utf-8 since bytes are ordered in the correct

                      order.<o:p></o:p></p>

                  </div>

                  <div>

                    <p class="MsoNormal">I have run into several editors
                      that have insisted on emitting a BOM for UTF-8 when
                      a file is initially promoted from ASCII, but subsequently
                      deleting it doesn't bother anything.<o:p></o:p></p>

                  </div>

                </div>

              </div>

            </blockquote>

            <p class="MsoNormal">I mostly agree.  Please note that the

              paper suggests use of a BOM only as a last resort.  The

              goal is to further discourage its use with rationale.<br>

              <br>

              <br>

              <br>

              <o:p></o:p></p>

            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

              <div>

                <div>

                  <div>

                    <p class="MsoNormal"> <o:p></o:p></p>

                  </div>

                  <div>

                    <p class="MsoNormal">I am curious though, what was

                      the actual problem you ran into that makes you

                      even consider this modification? 

                      <o:p></o:p></p>

                  </div>

                </div>

              </div>

            </blockquote>

            <p>I'm working on improving support for portable C++ source

              code.  Today, there is no character encoding that is

              supported by all C++ implementations (not even ASCII). 

              I'd like to make UTF-8 that commonly supported character

              encoding.  For backward compatibility reasons, compilers

              cannot change their default source code character encoding

              to UTF-8.<o:p></o:p></p>

            <p>Most C++ applications are created from components that

              have different release schedules and that are maintained

              by different organizations.  Synchronizing a conversion to

              UTF-8 across dependent projects isn't feasible, nor is

              converting all of the source files used by an application

              to UTF-8 as simple as just running them through 'iconv'. 

              Migration to UTF-8 will therefore require an incremental

              approach for at least some applications, though many are

              likely to find success by simply invoking their compiler

              with the appropriate -everything-is-utf8 option since most

              source files are ASCII.<o:p></o:p></p>

            <p>Microsoft Visual C++ recognizes a UTF-8 BOM as an

              encoding signature and allows differently encoded source

              files to be used in the same translation unit.  Support

              for differently encoded source files in the same

              translation unit is the feature that will be needed to

              enable incremental migration.  Normative discouragement

              (with rationale) for use of a BOM by the Unicode standard

              would be helpful to explain why a solution other than a

              BOM (perhaps something like

              <a

href="https://docs.python.org/3/reference/lexical_analysis.html#encoding-declarations"

                moz-do-not-send="true">

                Python's encoding declaration</a>) should be

              standardized instead of the existing practice

              demonstrated by Microsoft's solution.<o:p></o:p></p>

            <p>Tom.<o:p></o:p></p>

            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

              <div>

                <div>

                  <div>

                    <p class="MsoNormal"> <o:p></o:p></p>

                  </div>

                  <div>

                    <p class="MsoNormal">J<o:p></o:p></p>

                  </div>

                  <blockquote style="border:none;border-left:solid

                    #CCCCCC 1.0pt;padding:0in 0in 0in

6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">

                    <div>

                      <p>Tom.<o:p></o:p></p>

                      <blockquote

                        style="margin-top:5.0pt;margin-bottom:5.0pt">

                        <div>

                          <p class="MsoNormal"> <o:p></o:p></p>

                        </div>

                        <div>

                          <p class="MsoNormal">AlisdairM<o:p></o:p></p>

                          <div>

                            <p class="MsoNormal"><br>

                              <br>

                              <br>

                              <br>

                              <o:p></o:p></p>

                            <blockquote

                              style="margin-top:5.0pt;margin-bottom:5.0pt">

                              <div>

                                <p class="MsoNormal">On Oct 10, 2020, at

                                  14:54, Tom Honermann via SG16 &lt;<a
                                    href="mailto:sg16@lists.isocpp.org"
                                    target="_blank"
                                    moz-do-not-send="true">sg16@lists.isocpp.org</a>&gt;

                                  wrote:<o:p></o:p></p>

                              </div>

                              <p class="MsoNormal"> <o:p></o:p></p>

                              <div>

                                <div>

                                  <p>Attached is a draft proposal for

                                    the Unicode standard that intends to

                                    clarify the current recommendation

                                    regarding use of a BOM in UTF-8

                                    text.  This is a follow-up to

                                    <a

                                      href="https://corp.unicode.org/pipermail/unicode/2020-June/008713.html"

                                      target="_blank"

                                      moz-do-not-send="true">

                                      discussion on the Unicode mailing

                                      list</a> back in June.<o:p></o:p></p>

                                  <p>Feedback is welcome.  I plan to <a

href="https://www.unicode.org/pending/docsubmit.html" target="_blank"

                                      moz-do-not-send="true">

                                      submit</a> this to the UTC in a

                                    week or so pending review feedback.<o:p></o:p></p>

                                  <p>Tom.<o:p></o:p></p>

                                </div>

                                <p class="MsoNormal">&lt;Unicode-BOM-guidance.pdf&gt;--

                                  <br>

                                  SG16 mailing list<br>

                                  <a href="mailto:SG16@lists.isocpp.org"

                                    target="_blank"

                                    moz-do-not-send="true">SG16@lists.isocpp.org</a><br>

                                  <a

                                    href="https://lists.isocpp.org/mailman/listinfo.cgi/sg16"

                                    target="_blank"

                                    moz-do-not-send="true">https://lists.isocpp.org/mailman/listinfo.cgi/sg16</a><o:p></o:p></p>

                              </div>

                            </blockquote>

                          </div>

                          <p class="MsoNormal"> <o:p></o:p></p>

                        </div>

                        <p class="MsoNormal"><br>

                          <br>

                          <br>

                          <br>

                          <o:p></o:p></p>

                      </blockquote>

                      <p> <o:p></o:p></p>

                    </div>

                  </blockquote>

                </div>

              </div>

            </blockquote>

            <p> <o:p></o:p></p>

          </blockquote>

          <p> <o:p></o:p></p>

        </blockquote>

        <p><o:p> </o:p></p>

      </div>

    </blockquote>

    <p><br>

    </p>

  </body>

</html>