Unicode is universal, so how come that universality doesn’t apply to digits?

Richard Wordingham richard.wordingham at ntlworld.com
Wed Dec 16 13:57:46 CST 2020


On Wed, 16 Dec 2020 18:34:55 +0100
Frédéric Grosshans via Unicode <unicode at unicode.org> wrote:

> It’s quite easy to make a library which parses UnicodeData.txt
> (version 13.0 here), extracts the digit ranges of the various
> scripts, and converts the various strings into numbers for the 50
> scripts listed in Table 22-3 of the standard plus the Western digits
> (Unicode 13.0 PDF here). It should be reasonably future-proof, in the
> sense that parsing future Unicode data files should add scripts as they
> are encoded. However, do not forget to check the exceptions in the
> text around this table and in the relevant script pages: in Unicode
> 13.0, this concerns Arabic, which has two sets of digits, Myanmar (3
> sets), and Tai Tham (2 sets).

Or just scan UnicodeData.txt for decimal digits with the value 0.
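
A minimal sketch of that scan, in Python. It assumes the standard
semicolon-separated layout of UnicodeData.txt, where field 6 (counting
from 0) holds the decimal digit value, and it relies on each set of
decimal digits occupying ten consecutive code points in ascending
order. The function names find_digit_zeros and parse_decimal are
illustrative only, not part of any existing library.

```python
from bisect import bisect_right


def find_digit_zeros(lines):
    """Return the sorted code points whose decimal digit value is 0.

    `lines` is any iterable of UnicodeData.txt records. Only field 6
    (the decimal digit value) is checked, so characters that are
    merely numeric (field 8), such as U+3007, are skipped.
    """
    zeros = []
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) > 6 and fields[6] == "0":
            zeros.append(int(fields[0], 16))
    return sorted(zeros)


def parse_decimal(text, zeros):
    """Convert a string of decimal digits from any listed set to an int.

    Assumption: each zero in `zeros` starts a run of ten digits 0-9.
    This sketch does not reject strings that mix digit sets.
    """
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        cp = ord(ch)
        i = bisect_right(zeros, cp) - 1  # nearest zero at or below cp
        if i < 0 or cp - zeros[i] > 9:
            raise ValueError(f"not a decimal digit: U+{cp:04X}")
        value = value * 10 + (cp - zeros[i])
    return value
```

With a local copy of the data file, this could be used as
zeros = find_digit_zeros(open("UnicodeData.txt", encoding="utf-8")),
followed by parse_decimal(some_string, zeros). Because every set of
decimal digits has its own zero, scripts with more than one set need no
special handling in this approach.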

Richard.


