<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">On 11/6/2024 3:41 AM, Christoph Päper

      via Unicode wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">


      Thanks to Asmus Freytag for a very good synopsis of the current

      state of affairs. <br>

      <blockquote type="cite">

        <div dir="ltr">

          <p>In mathematical notation, any character can be a super or

            subscript, …</p>

          <p>There is generic use of (mostly) superscript numbers in

            text, …<br>

          </p>

          <p>There are other notations, mainly phonetic, that have

            super/subscript forms but do not need recursive

            subscripting (…), the super or subscript form often acts

            pretty much like any other letter in the notation, except

            for its shape. Common to these notations is that there's a

            fixed set of such shapes; they don't even cover a full basic

            alphabet; (…).</p>

        </div>

      </blockquote>

      In other words, linguists need to provide proof of prior use for

      superscript and subscript (and also small capital) letters (mostly

      Latin, but also several Greek and some Cyrillic) for them to be

      encoded individually. <br>

    </blockquote>

    Correct. It must be shown, with supporting evidence, that the
    forms are unique elements of the notation, because each has a unique
    purpose and meaning.<br>

    <blockquote type="cite"

      cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">

      <blockquote type="cite">

        <div dir="ltr">

          <p>(…) In text, the plain text does not carry font information

            and it is fully acceptable to render the result in any font

            that supports the letters in question. (…)</p>

        </div>

      </blockquote>

      <blockquote type="cite">

        <div dir="ltr">

          <p>In math notation, you have the situation that

            mathematicians have used the contrast between different font

            shapes to carry meaning. (…)</p>

        </div>

      </blockquote>

      <blockquote type="cite">

        <div dir="ltr">

          <p>Having the character for all shape variants used for

            variables encoded directly makes this near plaintext form

            very powerful. (…)</p>

          <p> (…): the additions for phonetic notations will never cover

            the generic use of math, while the few styled alphabets for

            math do nothing for general text use. (…)</p>

        </div>

      </blockquote>


      <div>Mathematicians, on the other hand, did not need to prove that

        each and every Latin letter, in upper and lower cases, had

        already been used in all of the typographic styles. They simply

        got encoded as complete sets (i.e. “math alphabets”) under the

        mere <i>assumption</i> that there was existing usage. </div>

    </blockquote>

    <p>Correct. It was an established fact that a full (Basic Latin)
      set in each of these styles is part of conventional mathematical
      notation. It has nothing to do with "assumption"; the documented
      nature of the use is as a set.<br>

    </p>

    <blockquote type="cite"

      cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">

      <div>However, Unicode still implausibly claims that it won’t

        encode something – the “missing” Latin superscript, subscript

        and smallcaps letters in particular – just for “completeness”. <br>

      </div>

    </blockquote>

    There is no phonetic use that is a "set". Most of the desire for

    "completeness" comes from users who have an interest in using these

    to spell out words, rather than to have a more complete rendition of

    existing phonetic text.<br>

    <blockquote type="cite"

      cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">

      <div><br>

      </div>

      <div>That’s a bit frustrating and inefficient. So much discussion

        and confusion could have been avoided if Unicode had just

        pragmatically added full basic (i.e. 26-letter) Latin alphabets

        in superscript, subscript and smallcaps early on. One practical

        disadvantage, with the missing ones being added gradually and

        only after sufficient proof of existing usage has been provided,

        is that fonts need to be updated over time and fallbacks to

        other fonts need to be employed in the meantime, which leads to

        unaesthetic results. </div>

    </blockquote>

    <p>That is as may be.</p>

    <p>One advantage of encoding only characters in actual use is that

      they can be given the correct and specific properties at time of

      encoding. For phonetic and other "alphabetic" use, there is no

      inherent guarantee that shapes that are derived from the basic

      alphabetic form are all mutually consistent in their use. This is
      another way in which the mathematical sets are distinct.</p>

    <p>A./<br>

    </p>


  </body>

</html>