<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">On 16/12/2020 at 18:34, Frédéric
      Grosshans wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:a356bb22-4cc4-e769-57b6-d6a33a947625@gmail.com">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <div class="moz-cite-prefix">On 16/12/2020 at 14:47, Roger L
        Costello via Unicode wrote:<br>

      </div>

      <blockquote type="cite"

cite="mid:SA0PR09MB6907DE4090F17A22E382CF4CC8C50@SA0PR09MB6907.namprd09.prod.outlook.com">

        <meta http-equiv="Content-Type" content="text/html;

          charset=UTF-8">

        <meta name="Generator" content="Microsoft Word 15 (filtered

          medium)">

        Setting aside the Bengali/Oriya

        problem I stressed above, your criticisms should be addressed

        somewhere else, since the Unicode standard is specifically

        organized to make this possible and easy, down to variants of

        this “hack”</blockquote>

    </blockquote>

    <p>Just a small addition: a Python function that reads base-10
      numbers written with Unicode decimal digits (and fails if given
      anything else, including a mixture of digits from several
      scripts) is as simple as the following:</p>

    <p><font face="monospace">Python 3.8.6 (default, Sep 25 2020,

        09:36:53) </font><br>

    </p>


    <p><font face="monospace">[GCC 10.2.0] on linux<br>

        Type "help", "copyright", "credits" or "license" for more

        information.<br>

        >>> def unicodeint(s):<br>

        ...    

sdigitszero="0٠۰߀०০੦૦୦௦౦೦൦෦๐໐༠၀႐០᠐᥆᧐᪀᪐᭐᮰᱀᱐꘠꣐꤀꧐꧰꩐꯰０𐒠𐴰𑁦𑃰𑄶𑇐𑋰𑑐𑓐𑙐𑛀𑜰𑣠𑥐𑱐𑵐𑶠𖩠𖭐𝟎𝟘𝟢𝟬𝟶𞅀𞋰𞥐🯰"<br>

        ...     #/!\ contains RTL characters, notably 𞥐 U+1E950 ADLAM

        DIGIT ZERO towards the end<br>

        ...     #Extracted from UnicodeData.txt for Unicode 13.0.0<br>

        ...     ofirst=ord(s[0])<br>

        ...     if ofirst > ord(sdigitszero[-1])+9 : raise ValueError

        #1st char not a digit<br>

        ...     for zx in sdigitszero:<br>

        ...         z=ord(zx)<br>

        ...         if ofirst &lt; z: raise ValueError  # 1st char not a
        digit<br>
        ...         if z &lt;= ofirst &lt;= z+9: break  # z is the zero<br>

        ...     z-=ord('0')<br>

        ...     return int(''.join(chr(ord(c)-z) for c in s))<br>

        ... <br>

        >>> unicodeint('৪২')<br>

        42<br>

      </font><br>

      Of course, it’s a quick hack, which should probably be optimized
      and should also take special cases into account, notably ᧚ U+19DA
      NEW TAI LUE THAM DIGIT ONE, Braille numbers, etc. <font
        face="monospace"><br>
      </font></p>
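    <p>As a rough illustration of how the mixed-digit case could be
      handled more strictly, here is a minimal sketch built on the
      standard <font face="monospace">unicodedata</font> module instead
      of a hardcoded string. The name <font
        face="monospace">unicodeint_strict</font> is hypothetical, and
      the sketch assumes that every set of <font
        face="monospace">Nd</font> digits is encoded as a contiguous run
      from zero to nine, so that subtracting a digit's value from its
      code point gives the zero of its set. Characters outside <font
        face="monospace">Nd</font> are rejected, which, as far as I can
      tell, includes U+19DA.</p>

```python
import unicodedata


def unicodeint_strict(s):
    """Parse a base-10 integer written with digits of a single Nd digit set.

    Hypothetical helper, not the function from the message above.
    Raises ValueError on empty input, on any character that is not of
    category Nd, and on digits taken from more than one digit set.
    """
    if not s:
        raise ValueError("empty string")
    zeros = set()   # code points of the zero of each digit set seen
    value = 0
    for c in s:
        if unicodedata.category(c) != "Nd":
            raise ValueError("not a decimal digit: %r" % c)
        d = unicodedata.decimal(c)
        # Assumption: digits of one set are contiguous, so ord(c) - d
        # is the code point of that set's zero.
        zeros.add(ord(c) - d)
        value = value * 10 + d
    if len(zeros) != 1:
        raise ValueError("digits from several digit sets")
    return value
```

    <p>With this sketch, <font
        face="monospace">unicodeint_strict('৪২')</font> gives 42, while
      a string mixing ASCII and Bengali digits raises <font
        face="monospace">ValueError</font>. For comparison, Python's
      built-in <font face="monospace">int()</font> also accepts such
      digits, but it does not reject mixtures.</p>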

    <p>But the point is that Unicode does make it easy to parse the
      base-10 numerals of many scripts: the small code snippet above
      works for almost 50 scripts, because each script's decimal digits
      are encoded as a contiguous run from zero to nine. Note that the
      hardcoded string <font face="monospace">sdigitszero</font> is
      simply the Unicode characters of category <font
        face="monospace">Nd</font>
      with <font face="monospace">Decimal_Digit_Value==0</font> and
      nothing else.<font face="monospace"><br>
      </font></p>
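    <p>For what it's worth, a string like <font
        face="monospace">sdigitszero</font> can also be rebuilt at run
      time rather than copied from UnicodeData.txt. The following is
      only a sketch, with the hypothetical name <font
        face="monospace">decimal_zeros</font>; it relies on the Unicode
      tables shipped with the interpreter, so its result depends on the
      Python version.</p>

```python
import sys
import unicodedata


def decimal_zeros():
    """Return every character of category Nd whose decimal value is 0,
    in code point order (hypothetical helper for illustration)."""
    return "".join(
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Nd"
        and unicodedata.decimal(chr(cp), None) == 0
    )


if __name__ == "__main__":
    zeros = decimal_zeros()
    # The count depends on the Unicode version of this interpreter.
    print(unicodedata.unidata_version, len(zeros))
```

    <p>On an interpreter whose <font
        face="monospace">unicodedata.unidata_version</font> is 13.0.0,
      the result should match the hardcoded string; other versions may
      list fewer or more zeros.</p>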

    <p><font face="monospace"><br>

      </font></p>

    <p><font face="monospace">  Frédéric<br>

      </font></p>


  </body>

</html>