Re: Unicode is universal, so how come that universality doesn’t apply to digits?

Kent Karlsson kent.b.karlsson at bahnhof.se
Tue Jan 19 13:28:28 CST 2021


>> On 29 Dec 2020, at 22:08, Asmus Freytag via Unicode <unicode at unicode.org> wrote:

> It's also Unicode's (separate) business to provide the information needed to parse numbers, at least for decimal place-value systems,
> 
That applies to digits used in decimal place-value systems where the most significant digit comes first; those digits have general category Nd.
> and included in that, to collect data on locale preference for number formatting.
> 
Well, that is CLDR’s business (CLDR is a Unicode Consortium project, but it is not part of the Unicode Standard).
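
As a rough illustration of how the general category Nd supports number parsing, here is a minimal Python sketch using the standard `unicodedata` module. The function name `parse_decimal` is hypothetical, and the sketch assumes a bare run of digits with no sign, grouping separators, or decimal point (those belong to locale data, not to Nd).

```python
import unicodedata

def parse_decimal(text: str) -> int:
    """Parse a run of Nd digits, most significant digit first.

    Assumption of this sketch: digits from different scripts may be
    mixed; a stricter parser would probably reject that.
    """
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        # Only characters with general category Nd are decimal digits.
        if unicodedata.category(ch) != "Nd":
            raise ValueError(f"not a decimal digit: {ch!r}")
        value = value * 10 + unicodedata.decimal(ch)
    return value

print(parse_decimal("2021"))                      # ASCII digits
print(parse_decimal("\u0662\u0660\u0662\u0661"))  # ARABIC-INDIC digits
print(parse_decimal("\u0968\u0966\u0968\u0967"))  # DEVANAGARI digits
```

Characters such as U+216B ROMAN NUMERAL TWELVE have general category Nl rather than Nd, so this sketch rejects them.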

Apart from place-value systems, CLDR also covers some other numbering systems that are still in modern use in certain contexts: armenian-lower, armenian-upper, cyrillic-lower, ethiopic, georgian, greek-lower, greek-upper, hebrew, hebrew-item, roman-lower, roman-upper, and tamil. Several of these have extensions for zero and negative numbers, and some have been updated since my initial submission to CLDR. See https://github.com/unicode-org/cldr/blob/master/common/rbnf/root.xml or https://github.com/unicode-cldr/cldr-rbnf/blob/master/rbnf/root.json. Truly historical (i.e. no longer used) numbering systems are not (yet) covered. However, if you know the Suzhou and counting rod numbering systems, you can help me confirm or improve the rules in CLDR ticket https://unicode-org.atlassian.net/browse/CLDR-4473 (look towards the end of the ticket; there are several versions of the rules there).
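
To give a feel for what a non-place-value system such as roman-upper involves, here is a small hand-written Python sketch. It is not the CLDR RBNF ruleset (see root.xml for that); the function name `to_roman_upper` is hypothetical, and the 1..3999 range limit is an assumption of this sketch only.

```python
# Value/symbol pairs in descending order, including subtractive forms.
_ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman_upper(n: int) -> str:
    """Format n as an upper-case Roman numeral (sketch, 1..3999 only)."""
    if not 1 <= n <= 3999:
        raise ValueError("this sketch only handles 1..3999")
    parts = []
    for value, symbol in _ROMAN:
        # Take as many copies of this symbol as fit, then continue
        # with the remainder.
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)

print(to_roman_upper(2021))
print(to_roman_upper(1994))
```

Unlike the Nd digits, no single character here has a fixed positional weight, which is why such systems are described with rule sets rather than digit tables.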

CJK numbering systems (not considering Suzhou and counting rod) are not in CLDR's "root" but are covered in the respective locales as spell-out rules (I found that to be a more appropriate analysis, and the CLDR committee apparently agreed, at least technically). For instance, in https://github.com/unicode-org/cldr/blob/master/common/rbnf/zh.xml you will find spellout-cardinal-financial. I will not list them all here; check them out in CLDR’s source (one problem is the different formats for the rules…).

You can check out the rules, and sample results, by using https://st.unicode.org/cldr-apps/numbers.jsp. It was not written by me, but it is a very handy tool for checking RBNF rules in CLDR as well as for testing new RBNF rules (or rulesets, actually). Caveat: making new rules, or updating/fixing existing ones, requires some programming skills.

/Kent Karlsson

> And that's where the story, and this discussion, effectively ends.
> 
> A./
