Re: Unicode is universal, so how come that universality doesn’t apply to digits?
Karl Williamson
public at khwilliamson.com
Thu Mar 18 21:12:59 CDT 2021
On 12/16/20 2:32 PM, Bill Poser via Unicode wrote:
> It seems to me that, in spite of the superficial similarity of the way
> numbers are written in many languages, this is NOT, in general, a matter
> of encoding conversion or even transliteration but rather one of
> translation and therefore not part of Unicode for the same reason that
> Unicode does not handle the translation of text from, say, Japanese to
> English.
>
> There is, actually, a library, which I have written, that handles
> conversions between Unicode strings and integers for most systems of
> writing numbers. (I have yet to update it to handle some of the more
> recently encoded systems.) It is a C library which also has a TCL binding:
>
> http://billposer.org/Software/libuninum.html
>
> It handles a number of systems that require algorithms rather different
> from that of atoi/strtol.
>
> Bill
>
Another option is Perl: recent versions come with the function num()
in the Unicode::UCD module. If its input is a string consisting of a
single character, and that character has a defined numeric value, it
returns that value, converted to floating point if necessary; it
returns undef for characters without a numeric value.

If called with a string consisting entirely of characters with category
Nd, all from the same block of 10 consecutive code points, it returns
the value they represent, assuming left-to-right positional notation,
so that the right-most digit is the ones position, the next is the
tens, and so on. It returns undef for any other string longer than one
character.
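
For readers who don't use Perl, the rules above can be roughly sketched
in Python. This is only an illustration of the behavior as described in
this message, not the actual Unicode::UCD code: the name num_sketch is
made up, it relies on Python's unicodedata module, and it treats "same
block of 10 consecutive code points" as "every digit maps back to the
same zero code point", which is my assumption about how to check that.

```python
import unicodedata
from typing import Optional, Union


def num_sketch(s: str) -> Optional[Union[int, float]]:
    """Hypothetical helper mimicking the behavior described above."""
    if not s:
        return None

    if len(s) == 1:
        # Single character: return its numeric value, if it has one.
        value = unicodedata.numeric(s, None)
        if value is None:
            return None
        # Keep integral values as integers; fractions stay floating point.
        return int(value) if value == int(value) else value

    # Longer strings: every character must be a decimal digit (category Nd).
    if any(unicodedata.category(ch) != "Nd" for ch in s):
        return None

    # Assumption: digits belong to the same run of ten code points exactly
    # when subtracting each digit's value from its code point gives the
    # same "zero" code point for all of them.
    zeros = {ord(ch) - unicodedata.decimal(ch) for ch in s}
    if len(zeros) != 1:
        return None

    # Left-to-right positional notation: the right-most digit is the ones.
    total = 0
    for ch in s:
        total = total * 10 + unicodedata.decimal(ch)
    return total


print(num_sketch("\u00bd"))        # VULGAR FRACTION ONE HALF -> 0.5
print(num_sketch("\u0663\u0664"))  # ARABIC-INDIC DIGITS THREE, FOUR -> 34
print(num_sketch("1\u0663"))       # ASCII and Arabic-Indic mixed -> None
```

The last call shows the point of the same-block check: a string mixing
digits from two different digit sets is rejected rather than guessed at.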