Conflicts between UnicodeData.txt and EastAsianWidth.txt?

Laurentiu Iancu liancu at microsoft.com
Thu Nov 6 15:00:30 CST 2014


Hello,



It is not a contradiction.  The East_Asian_Width property values assigned to combining marks are described in Section 6.2 of UAX #11, at http://www.unicode.org/reports/tr11/#Combining:



“In particular, nonspacing marks do not possess actual advance width. Therefore, even when displaying combining marks, the East_Asian_Width property cannot be related to the advance width of these characters. However, it can be useful in determining the encoding length in a legacy encoding, or the choice of font for the range of characters including that nonspacing mark. The width of the glyph image of a nonspacing mark should always be chosen as the appropriate one for the width of the base character.”
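
One way to act on that recommendation when counting terminal columns is sketched below. It is a simplified convention rather than anything UAX #11 itself defines: the function name display_width is made up for illustration, and control and format characters are not handled specially. Marks (gc=Mn or Me) contribute no columns of their own, regardless of their East_Asian_Width value.

```python
import unicodedata


def display_width(text: str) -> int:
    """Rough column count for a string (hypothetical helper, simplified)."""
    width = 0
    for ch in text:
        # Nonspacing and enclosing marks take their width from the base
        # character, so they add nothing, even when ea=W.
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue
        # Wide and Fullwidth characters are assumed to occupy two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width
```

With this sketch, a kana followed by U+3099 occupies the same number of columns as the kana alone.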



The nonspacing kana voicing marks, U+3099 and U+309A, have the same classification: gc=Mn and ea=W.
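
These classifications can be checked with Python's standard unicodedata module, as in the sketch below. The helper name properties is hypothetical, and the values reported come from whichever UCD version the interpreter bundles, which may be newer than 7.0.0.

```python
import unicodedata


def properties(code_point: int) -> tuple:
    """Return (General_Category, East_Asian_Width, Bidi_Class) for a code point."""
    ch = chr(code_point)
    return (
        unicodedata.category(ch),
        unicodedata.east_asian_width(ch),
        unicodedata.bidirectional(ch),
    )


# The four ideographic tone marks and the two combining kana voicing marks.
for cp in (0x302A, 0x302B, 0x302C, 0x302D, 0x3099, 0x309A):
    gc, ea, bc = properties(cp)
    print(f"U+{cp:04X} gc={gc} ea={ea} bc={bc} {unicodedata.name(chr(cp))}")
```

If the bundled data agrees with the files quoted in this thread, every line should show gc=Mn together with ea=W.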



Regards,

L.



-----Original Message-----
From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Mike FABIAN
Sent: Thursday, November 6, 2014 4:13 AM
To: unicode at unicode.org
Subject: Conflicts between UnicodeData.txt and EastAsianWidth.txt?





http://www.unicode.org/Public/7.0.0/ucd/EastAsianWidth.txt

contains:



    302A..302D;W     # Mn     [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK



which gives us a width of 2 for these 4 characters (because of “W”).



But

http://www.unicode.org/Public/7.0.0/ucd/UnicodeData.txt

contains:



    302A;IDEOGRAPHIC LEVEL TONE MARK;Mn;218;NSM;;;;;N;;;;;
    302B;IDEOGRAPHIC RISING TONE MARK;Mn;228;NSM;;;;;N;;;;;
    302C;IDEOGRAPHIC DEPARTING TONE MARK;Mn;232;NSM;;;;;N;;;;;
    302D;IDEOGRAPHIC ENTERING TONE MARK;Mn;222;NSM;;;;;N;;;;;
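
For readers who want to compare the two files programmatically, here is a minimal parsing sketch. The function names are hypothetical and only the fields relevant to this thread are extracted. It relies on the documented UnicodeData.txt layout, in which the third field is General_Category and the fifth field is Bidi_Class, so "NSM" in these lines is a Bidi_Class value and not a width.

```python
def parse_unicode_data_line(line: str) -> dict:
    """Split one UnicodeData.txt record into a few named fields."""
    fields = line.strip().split(";")
    return {
        "code_point": int(fields[0], 16),
        "name": fields[1],
        "general_category": fields[2],
        "canonical_combining_class": int(fields[3]),
        "bidi_class": fields[4],
    }


def parse_east_asian_width_line(line: str) -> tuple:
    """Return (first, last, ea_value) from one EastAsianWidth.txt record."""
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    code_range, ea_value = (part.strip() for part in data.split(";"))
    first, _, last = code_range.partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start  # single code point or a range
    return start, end, ea_value


record = parse_unicode_data_line(
    "302A;IDEOGRAPHIC LEVEL TONE MARK;Mn;218;NSM;;;;;N;;;;;"
)
span = parse_east_asian_width_line(
    "302A..302D;W     # Mn     [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK"
)
print(record["general_category"], record["bidi_class"], span)
```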



Doesn’t “NSM” (nonspacing mark) imply a width of 0?



Is that a contradiction, or is it on purpose?



--

Mike FABIAN <mfabian at redhat.com>

睡眠不足はいい仕事の敵だ。 (Lack of sleep is the enemy of good work.)
