<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">On 11/6/2024 3:41 AM, Christoph Päper
via Unicode wrote:<br>
</div>
<blockquote type="cite"
cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">
Thanks to Asmus Freytag for a very good synopsis of the current
state of affairs. <br>
<blockquote type="cite">
<div dir="ltr">
<p>In mathematical notation, any character can be a super or
subscript, …</p>
<p>There is generic use of (mostly) superscript numbers in
text, …<br>
</p>
<p>There are other notations, mainly phonetic, that have
super/subscript forms but do not need recursive
subscripting (…), the super or subscript form often acts
pretty much like any other letter in the notation, except
for its shape. Common to these notations is that there's a
fixed set of such shapes; they don't even cover a full basic
alphabet; (…).</p>
</div>
</blockquote>
In other words, linguists need to provide proof of prior use for
superscript and subscript (and also small capital) letters (mostly
Latin, but also several Greek and some Cyrillic) for them to be
encoded individually. <br>
</blockquote>
Correct: it needs to be shown, with supporting evidence, that the
forms are distinct elements of the notation, each with its own
purpose and meaning.<br>
<blockquote type="cite"
cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">
<blockquote type="cite">
<div dir="ltr">
<p>(…) In text, the plain text does not carry font information
and it is fully acceptable to render the result in any font
that supports the letters in question. (…)</p>
</div>
</blockquote>
<blockquote type="cite">
<div dir="ltr">
<p>In math notation, you have the situation that
mathematicians have used the contrast between different font
shapes to carry meaning. (…)</p>
</div>
</blockquote>
<blockquote type="cite">
<div dir="ltr">
<p>Having the character for all shape variants used for
variables encoded directly makes this near plaintext form
very powerful. (…)</p>
<p> (…): the additions for phonetic notations will never cover
the generic use of math, while the few styled alphabets for
math do nothing for general text use. (…)</p>
</div>
</blockquote>
<div>Mathematicians, on the other hand, did not need to prove that
each and every Latin letter, in upper and lower cases, had
already been used in all of the typographic styles. They simply
got encoded as complete sets (i.e. “math alphabets”) under the
mere <i>assumption</i> that there was existing usage. </div>
</blockquote>
<p>Correct. It was an established fact that a full (Basic Latin) set
of these forms is part of conventional mathematical notation. This
has nothing to do with "assumption"; the documented use is as a
set.<br>
</p>
<blockquote type="cite"
cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">
<div>However, Unicode still implausibly claims that it won’t
encode something – the “missing” Latin superscript, subscript
and smallcaps letters in particular – just for “completeness”. <br>
</div>
</blockquote>
There is no phonetic use that is a "set". Most of the desire for
"completeness" comes from users who have an interest in using these
to spell out words, rather than to have a more complete rendition of
existing phonetic text.<br>
<blockquote type="cite"
cite="mid:B811892F-AF60-47EE-A7B8-8A031AD56245@crissov.de">
<div><br>
</div>
<div>That’s a bit frustrating and inefficient. So much discussion
and confusion could have been avoided if Unicode had just
pragmatically added full basic (i.e. 26-letter) Latin alphabets
in superscript, subscript and smallcaps early on. One practical
disadvantage, with the missing ones being added gradually and
only after sufficient proof of existing usage has been provided,
is that fonts need to be updated over time and fallbacks to
other fonts need to be employed in the meantime, which leads to
unaesthetic results. </div>
</blockquote>
<p>That is as may be.</p>
<p>One advantage of encoding only characters in actual use is that
they can be given the correct and specific properties at the time of
encoding. For phonetic and other "alphabetic" use, there is no
inherent guarantee that shapes derived from the basic alphabetic
forms are all mutually consistent in their use, which is another way
in which the mathematical sets are distinct.</p>
<p>A./<br>
</p>
</body>
</html>