<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">In looking at this question,

      identifiers are useful to consider, albeit for a different reason

      than the one given below for variable names.</div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">For domain names, for example, many

      script communities support both ASCII digits and native ones, while

      some support only ASCII digits, even if a native system exists. (See

      <a class="moz-txt-link-freetext" href="https://icann.org/idn">https://icann.org/idn</a> and look for Second Level Reference LGRs)<br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">That seems to reflect an idea that

      there are usage domains (note: "domain" is used here in a sense

      separate from "domain name") where native digits would seem to be

      out of place. In terms of programming languages, it's not a leap to

      treat numerical expressions as a form of mathematical notation,

      which then would inherit the bias towards ASCII digits.</div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">Database fields (or their serialization

      into XML) are a different matter. You would expect users to be

      able to enter strings based on the definition of the field. If the

      field type is "text", "string" etc. you would expect to be able to

      enter any number string, up to and including mixed-script

      numerics.<br>

      <br>

      If the field type is "domain name", then certain strings might not

      be valid in certain zones, just as mixed-script labels are

      discouraged.<br>

      <br>

      Now, if the field type is "decimal number", it would depend on the

      consumer whether the field is intended for "mathematical" use, or

      whether any number system should be supported. <br>

    </div>
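    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">As a rough sketch of how a consumer
      might tell these cases apart, the Python below classifies a field
      value as ASCII-only, single-system native, or mixed-script. The
      names <code>digit_zero</code>, <code>classify_numeric</code> and
      <code>accepts</code>, and the two policy labels, are made up for
      illustration. Rejecting mixed-script values for a "decimal number"
      field is an assumption of this sketch, not something Unicode
      requires. Note that "native" here simply means "any decimal system
      other than ASCII".</div>

    <pre>
```python
import unicodedata

ASCII_ZERO = ord("0")

def digit_zero(ch):
    """Return the code point of the zero of ch's decimal system, or None."""
    if unicodedata.category(ch) != "Nd":
        return None
    # Decimal digits of one system are encoded contiguously, zero first.
    return ord(ch) - unicodedata.decimal(ch)

def classify_numeric(text):
    """Classify text as 'ascii', 'native', 'mixed' or 'not-decimal'."""
    zeros = {digit_zero(ch) for ch in text}
    if not text or None in zeros:
        return "not-decimal"
    if len(zeros) != 1:
        return "mixed"
    return "ascii" if zeros == {ASCII_ZERO} else "native"

def accepts(policy, text):
    """Hypothetical consumer policy for a 'decimal number' field."""
    kind = classify_numeric(text)
    if policy == "mathematical":
        return kind == "ascii"
    if policy == "any-number-system":
        return kind in ("ascii", "native")
    raise ValueError(f"unknown policy: {policy!r}")

print(classify_numeric("42"))            # ascii
print(classify_numeric("\u09ea\u09e8"))  # native (Bengali four, Bengali two)
print(classify_numeric("\u09ea\u0b68"))  # mixed (Bengali four, Oriya two)
```
</pre>

    <div class="moz-cite-prefix">A plain "text" or "string" field would
      skip this check entirely and accept all three kinds.</div>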

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">Unicode makes the latter possible by

      guaranteeing that all decimal number systems (that are so marked)

      can be parsed using the same "subtract zero" method for each

      digit. That's where Unicode's responsibility ends (from the point

      of view of the Unicode Standard). The CLDR locale data repository (also

      maintained by the Unicode Consortium) may have additional data for

      number parsing and formatting.</div>
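    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">To make the "subtract zero" method
      concrete, here is a minimal Python sketch. It assumes the caller
      already knows which decimal system is in use and passes that
      system's zero character. The function name
      <code>parse_decimal</code> is hypothetical.</div>

    <pre>
```python
import unicodedata

def parse_decimal(text, zero):
    """Parse a digit string from one decimal system, given that system's zero."""
    base = ord(zero)
    value = 0
    for ch in text:
        digit = ord(ch) - base  # the "subtract zero" step
        if digit not in range(10):
            raise ValueError(f"{ch!r} is not a digit of the system starting at {zero!r}")
        value = value * 10 + digit
    return value

# Cross-check one digit against the Unicode Character Database value.
assert ord("\u09ea") - ord("\u09e6") == unicodedata.decimal("\u09ea")

print(parse_decimal("42", "0"))                  # 42
print(parse_decimal("\u09ea\u09e8", "\u09e6"))   # 42 (Bengali digits, Bengali zero)
```
</pre>

    <div class="moz-cite-prefix">With a single fixed zero, a mixed
      Bengali/Oriya string raises <code>ValueError</code>, because the
      Oriya digit does not fall in the ten code points starting at the
      Bengali zero.</div>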

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">Those data may not support parsing or

      formatting arbitrary mixed-script digit combinations. That is also

      OK, because the data are geared towards getting the ordinary use of

      numbers correct for as many locales and languages as possible, not

      towards dealing with fanciful stuff that doesn't have a real-life

      user community using it in daily life.</div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">If you like such playful stuff, you are

      welcome to it, but you are on your own. And that is as it should be.<br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">A./<br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">On 12/20/2020 1:40 PM, Doug Ewell via

      Unicode wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:20201220144014.665a7a7059d7ee80bb4d670165c8327d.10caafc26a.wbe@email15.godaddy.com">

      <pre class="moz-quote-pre" wrap="">Zach Lym wrote:

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">I don't think it's fair to dismiss this as "not a unicode problem."

As the OP pointed out, support for non-latin variable names is largely

due to Unicode's identity standard and extensive implementation

advice.

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap=""> 

I don't recall Roger saying anything about non-Latin variable names. He

wrote:

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">Why, for example, can’t a Bengali-speaking person create XML such as

this:

&lt;সংখ্যা_ছাত্র&gt;৪୨&lt;/সংখ্যা_ছাত্র&gt;

or write a program assignment statement like this:

            সংখ্যা_ছাত্র = ৪୨

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap=""> 

This doesn't claim that the Bengali variable name

সংখ্যা_ছাত্র is not supported, but rather the

mixed Bengali/Oriya constant ৪୨. In fact, a few lines earlier Roger

wrote:

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">a Bengali-speaking person can write this:

             সংখ্যা_ছাত্র = 42

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap=""> 

so variable names aren't the issue.

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">The section on numbering (5.5) is only a page long and essentially

recommends handling decimal based numbering systems.  There isn't

nearly as much care given to this topic.

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap=""> 

Bengali and Oriya are decimal-based. (Whether they should be used

together in a single number is another matter.) The first paragraph of

Section 5.5 specifically discusses interpreting Devanagari digits as one

would interpret Basic Latin digits. I don't know what needs to be added

here.

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">There is a standard annex on mathematics, but that is in PDF form and

is largely concerned with parsing and display of mathematical

formulas.

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap=""> 

UTR #25 (a Technical Report, not a Standard Annex) does focus on Basic

Latin digits, at one point (2.2) claiming that Basic Latin digits are

essentially the only digits used in math, but it's true that the UTR is

about math notation and that isn't really in scope here. The fact that

the UTR is a PDF document doesn't seem pertinent.

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">However, as is the answer to most questions, it is a matter of time

and money. If someone is willing to spend the time expanding 5.5

writing a new annex, I am sure the Unicode committee would be happy to

review it.  Would you be interested in doing that legwork?

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap=""> 

Again, I don't see what is lacking in Section 5.5, especially

considering its Devanagari example. The legwork that needs to be done is

to make implementations more internationalized and more Unicode-aware.

--

Doug Ewell, CC, ALB | Thornton, CO, US | ewellic.org

</pre>

    </blockquote>

    <p><br>

    </p>

  </body>

</html>