Unicode is universal, so how come that universality doesn’t apply to digits?

Richard Wordingham richard.wordingham at ntlworld.com
Mon Dec 21 05:27:44 CST 2020

On Sun, 20 Dec 2020 15:13:01 -0800
Asmus Freytag via Unicode <unicode at unicode.org> wrote:

> Those data may not support parsing or formatting arbitrary
> mixed-script digit combinations. That is also OK, because the data is
> geared towards getting the ordinary use of numbers correct for as
> many locales and languages, not to deal with fanciful stuff that
> doesn't have a real-life user community using it in daily life.

I can imagine a few situations where mixed sequences may occur.
Firstly, early non-Indian use of Tamil script place-value notation in
Unicode would have required that the 'digit zero' come from another
script, as Unicode initially supported only Indian Tamil script usage,
which lacks a zero.

Secondly, though not strictly an example, it seems that the Lao style
of the Tai Tham script mixes the use of its two digit sets.

I wouldn't be surprised at the use of eclectic mixes of Arabic digits
at the eastern end of the Arabic script domain.  The glyph shapes of
the EXTENDED ARABIC-INDIC digits are language-dependent, and
language-dependence has only recently hit mainstream rendering for the

I wouldn't be surprised to find mixed selections in use in the Union of
Burma.  That could be a big nuisance, because the three series of
digits provide some opportunity for digits to spoof digits!
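
To make the spoofing concern concrete, here is a minimal Python sketch
of how one might flag a string that draws decimal digits from more than
one set. It relies on decimal digits being encoded in contiguous runs
of ten, so a digit's code point minus its decimal value identifies the
zero of its set. The function names digit_set_zeros and has_mixed_digits
are made up for illustration, and the sample strings assume the three
Myanmar-block series are the Myanmar, Shan and Tai Laing digits
(zeros at U+1040, U+1090 and U+A9F0). This is only a sketch, not a full
confusable check.

```python
import unicodedata

def digit_set_zeros(text):
    """Return the code points of the zero digit of each decimal-digit set used in text.

    Hypothetical helper: only characters with a decimal value (General_Category Nd)
    are considered; superscripts, fractions and the like are ignored.
    """
    zeros = set()
    for ch in text:
        value = unicodedata.decimal(ch, -1)  # -1 marks "not a decimal digit"
        if value >= 0:
            # Digits 0-9 of one set are contiguous, so this lands on the set's zero.
            zeros.add(ord(ch) - value)
    return zeros

def has_mixed_digits(text):
    """True if text uses decimal digits from more than one digit set."""
    return len(digit_set_zeros(text)) > 1

# MYANMAR DIGIT ONE followed by MYANMAR SHAN DIGIT ZERO: two different sets.
print(has_mixed_digits("\u1041\u1090"))  # True
# MYANMAR DIGIT ONE followed by MYANMAR DIGIT ZERO: a single set.
print(has_mixed_digits("\u1041\u1040"))  # False
```

A check of this kind says nothing about which mixtures are legitimate in
real use; it only reports that a mixture is present, leaving the policy
decision to the caller.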

