2nd draft proposal: Clarify guidance for use of a BOM as a UTF-8 encoding signature

Tom Honermann tom at honermann.net
Sat Jan 2 22:15:02 CST 2021


Happy New Year!  And what better way to start off a new year than by 
discussing the utility (or lack thereof) of BOMs in UTF-8 text!

Attached is a 2nd draft of a paper intended to clarify guidance in the 
Unicode standard for when a BOM should or should not be used in UTF-8 
text.  Discussion of the prior draft can be found in the Unicode.org 
mail archives 
<https://corp.unicode.org/pipermail/unicode/2020-October/009070.html>.  
This draft contains the following changes:

 1. An abstract was added.
 2. The Introduction section was modified as follows:
     1. A link to the email thread with initial draft feedback was added.
     2. The text was modified to highlight inconsistent interpretation
        of the existing guidance as opposed to the intent.
     3. A quote from section 2.13, "Special Characters", regarding
        Unicode signatures was added.
 3. The Proposed Resolution section was modified as follows:
     1. The section was renamed from "Possible Resolutions".
     2. The previously discussed possible changes are now presented as
        two distinct options.
     3. Proposed wording was added for the first option.
     4. The proposed wording for the second option was directed to
        section 23.8.
     5. Option 2 was modified as follows:
         1. The guidance for protocol designers was updated to avoid
            adding a BOM to ASCII text, which would render such text
            non-ASCII.
         2. The guidance for text authors regarding when to use a BOM
            was expanded to cover files that may be opened by
            applications with different encoding expectations.
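
To illustrate the point about ASCII text in the Option 2 changes, here is
a minimal Python sketch. It is not taken from the paper; the sample text
and the helper name is_ascii are assumptions chosen for illustration. It
shows that prepending the UTF-8 BOM (bytes EF BB BF) to pure ASCII content
produces data that is no longer valid ASCII, even though a BOM-aware UTF-8
decoder recovers the same characters.

```python
import codecs

# Hypothetical sample payload: pure ASCII bytes.
ascii_text = b"Hello, world\n"

# The same payload with a UTF-8 BOM (EF BB BF) prepended.
with_bom = codecs.BOM_UTF8 + ascii_text


def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)


print(is_ascii(ascii_text))  # the original payload is ASCII
print(is_ascii(with_bom))    # the BOM bytes are all >= 0x80

# A BOM-aware decoder ("utf-8-sig") strips the signature and yields
# the same characters as decoding the original bytes as ASCII.
print(with_bom.decode("utf-8-sig") == ascii_text.decode("ascii"))
```

A consumer that accepts only ASCII would reject the second form, which is
why the distinction matters for protocols that are otherwise ASCII-clean.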

Thank you to everyone who shared their thoughts on the prior draft.

Assuming no substantially new feedback, I plan to submit this paper in a 
week or so.

Tom.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Unicode-BOM-guidance.pdf
Type: application/pdf
Size: 84875 bytes
Desc: not available
URL: <https://corp.unicode.org/pipermail/unicode/attachments/20210102/d43faf41/attachment.pdf>