Incompleteness of Suzhou Numeral/FaMa encoding in Unicode

Harriet Riddle harjitmoe at outlook.com
Sun Jul 12 05:52:58 CDT 2020


> From: Unicode <unicode-bounces at unicode.org> on behalf of Phake Nick via Unicode <unicode at unicode.org>
> Sent: 12 July 2020 08:45
> To: Unicode Mailing List <unicode at unicode.org>
> Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode 
> […]
> […] The most important part is that, in most situation Suzhou numeral are supposed to be combined together. Most of the time there will be two lines, with the top line representing a string of numbers using Suzhou numerals, while the bottom lines represent their place value and unit. […]

Interesting.

It is probably worth noting that Unicode's current coverage of Suzhou numerals is essentially limited to what it inherited from Big5 and CSIC / CNS 11643.  Big5 and CNS 11643 also happen to be the main legacy charsets responsible for the weird and wonderful stylised underline characters in the CJK Compatibility Forms block, at U+FE34, and at U+FE49 through U+FE4F.  These, of course, are not of much use without specialised layout support either.  Unlike the Suzhou numerals, though, the stylised underlines are basically pure legacy by this point.

For reference, the Suzhou numeral ranges in Big5, in CNS 11643 (as EUC-TW) and in Unicode:

– Big5 0xA2C3 through 0xA2CE
– EUC-TW 0xA4B5 through 0xA4C0 (with or without a prefixed 0x8EA1)
– Unicode U+3021 through U+3029, followed by U+3038 through U+303A.
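
As a rough illustration of how the three ranges line up, here is a minimal Python sketch that maps a two-byte Big5 or EUC-TW code to the Unicode Suzhou numeral at the same position. It assumes the twelve codes correspond in order, as the ranges above suggest, and it does not handle the optional 0x8EA1 prefix. The function and constant names are made up for illustration and are not taken from any library.

```python
# Unicode targets, in order: U+3021..U+3029 (one to nine),
# then U+3038..U+303A (ten, twenty, thirty).
SUZHOU_UNICODE = (
    [chr(cp) for cp in range(0x3021, 0x302A)]
    + [chr(cp) for cp in range(0x3038, 0x303B)]
)

BIG5_FIRST = 0xA2C3    # first Suzhou numeral in Big5
EUC_TW_FIRST = 0xA4B5  # first Suzhou numeral in EUC-TW (no 0x8EA1 prefix)


def _by_offset(code, first):
    """Return the Suzhou numeral at the same offset from `first` (assumed order-preserving)."""
    index = code - first
    if not 0 <= index < len(SUZHOU_UNICODE):
        raise ValueError(f"0x{code:04X} is outside the Suzhou numeral range")
    return SUZHOU_UNICODE[index]


def big5_to_suzhou(code):
    """Map a Big5 code in 0xA2C3..0xA2CE to its Unicode Suzhou numeral."""
    return _by_offset(code, BIG5_FIRST)


def euc_tw_to_suzhou(code):
    """Map an EUC-TW code in 0xA4B5..0xA4C0 to its Unicode Suzhou numeral."""
    return _by_offset(code, EUC_TW_FIRST)
```

Both encodings keep the twelve codes within a single lead byte, so plain subtraction on the 16-bit value is enough here.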

(Self-pedantic note: the last three (ten, twenty and thirty) were, in Unicode prior to version 3.0, unified with their corresponding and homoglyphic hanzi.  Since Big5 mappings are nowadays mostly used for legacy compatibility, they are still more often than not implemented with their older mappings to U+5341, U+5344 and U+5345, rather than to U+3038 through U+303A.  However, U+5341 and U+5345 also have Big5 representations in the hanzi section, so those two legacy mappings do not round-trip.)
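
To make the two mapping choices concrete, here is a small Python sketch of a decoder for just the last three Big5 codes. It assumes they are 0xA2CC through 0xA2CE, the tail of the range listed above. The `legacy` flag and the names are hypothetical, invented for this example rather than taken from any real codec.

```python
# Older mapping: unified with the homoglyphic hanzi.
LEGACY_TENS = [0x5341, 0x5344, 0x5345]
# Newer mapping: the dedicated Suzhou numerals ten, twenty and thirty.
MODERN_TENS = [0x3038, 0x3039, 0x303A]

BIG5_TEN = 0xA2CC  # assumed: ten, twenty, thirty sit at 0xA2CC..0xA2CE


def decode_big5_tens(code, legacy=True):
    """Decode Big5 0xA2CC..0xA2CE using either the legacy or the modern mapping."""
    index = code - BIG5_TEN
    if not 0 <= index < 3:
        raise ValueError(f"0x{code:04X} is not one of the three codes")
    table = LEGACY_TENS if legacy else MODERN_TENS
    return chr(table[index])
```

With `legacy=True`, the results for 0xA2CC and 0xA2CE are the same code points that the hanzi section of Big5 also decodes to, which is why an encoder cannot tell which Big5 code the text originally came from.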

-- Har.



