Missing Latin superscript lowercase letters
Asmus Freytag
asmusf at ix.netcom.com
Sat Mar 25 16:34:42 CDT 2023
Kent,
I'm not able to match your beautifully color-coded reply chain, but here
goes.
On 3/25/2023 10:29 AM, Kent Karlsson via Unicode wrote:
> >>Further, if some symbol/letter for some reason only ever occurred in superscript
> >>position in math expressions, such examples would still be supporting evidence for
> >>that symbol/letter. The closest practical example I can think of is the
> degree sign, which
> >>in origin is a superscript 0.
> >The degree sign is either the exception that proves the rule, or
> something else: a symbol
> >that occurs frequently in contexts that are not full mathematical
> expressions, as it is typical
> True, but I was arguing against Peter Constable's postulation that
> something that (for whatever
> reason) occurs only in a superscript position in a math expression
> /could not/ have its encoding
> supported by an example where it occurred in a superscript position in
> a math expression.
> THAT postulation is false. (And the closest example I could think of
> was the degree sign; there
> MAY be examples of yet unencoded characters that only occur in
> superscript position in math
> expressions.)
This is an argument best explored when there's an actual test case.
In essence, modifier letters in phonetics fall into this category,
because ordinarily you don't expect to style phonetic notation other
than globally (e.g. font choice). They therefore can be argued to have
an identity that is different from simply superscripting the same letter
form. The latter looks the same, but we assert (via encoding) that they
are not the same thing. That fits the conception of phonetic notation
in which every character individually stands for something specific.
Whereas in a mathematical expression, the identity of a letter doesn't
change, whether it's superscripted or not. It's clearly just a different
use of the same letter, which is underlined by the fact that
superscripting can be nested.
So we would need a test case, not yet encoded, where the superscripted
shape has a different identity from the same shape rendered normally.
The degree sign is a bad example, in a way, as it's clearly not a
superscript 'o' or '0' (letter/digit) but is correctly implemented as a
pure circle. That puts it in the category of symbols for which the size,
spacing and placement of the "ink" matters more than the resemblance of
that "ink" to other symbols. It is also not considered a "superscript
circle" (no compat decomp).
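The decomposition claims above can be checked against the character
database. What follows is only a minimal sketch using Python's standard
unicodedata module; the helper name decomposition_tag is made up for
illustration, and the results reflect whatever Unicode version the
Python build ships with.

```python
import unicodedata


def decomposition_tag(ch: str) -> str:
    """Classify a character's decomposition mapping.

    Returns "none" if there is no mapping, "canonical" if the mapping
    has no tag, or the compatibility tag itself (e.g. "<super>").
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    first = mapping.split()[0]
    return first if first.startswith("<") else "canonical"


# Degree sign, superscript two, modifier letter small h,
# masculine ordinal indicator.
for ch in ["\u00B0", "\u00B2", "\u02B0", "\u00BA"]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {decomposition_tag(ch)}")
```

The degree sign reports no mapping at all, while the superscript digit,
the phonetic modifier letter and the ordinal indicator all carry a
<super> compatibility mapping.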
It is a good example in a different way, since it's clearly a character
for which the "ink" is always in a position and size as would be
appropriate for a superscript. I'm sure that if we encounter some other
character for which a compat decomp would be inappropriate, we would
consider whether it should be encoded.
At that juncture, we would look at the context in which it is to be used.
> >for unit symbols. When used with temperature, it's interesting to note
> that not all temperature
> >scales use it consistently. You don't see it with Fahrenheit very often, for
> example, reflecting
> >differences in traditional keyboard layouts.
> Ok, let’s digress a bit… I do see that too, in news articles (in web
> apps) from USA and British news
> companies and see also “C” when degrees Celsius is meant. But writing
> farad (F) or coulomb (C)
> when referring to temperature is just horrible, and only embarrassing
> for the journalist who wrote
> that. (Another related horror is “kph”, and there you cannot even
> blame keyboard layouts.)
I think it goes a bit too far to assume that any and all unit
abbreviations have to be in the SI notation always. I'm sure there are
places where there are regulations that define the use of specific
abbreviations and in any contexts where they apply to SI, you would be
free to read "k" as kilo and "kph" as kilo-ph (and then reject that as
undefined). The same is not true for ordinary everyday usage in places
where SI units aren't customary.
Likewise, the "ph" suffix to mean "per hour" is well established in
places, while "/h" is not. That said, given that usage, I'd personally
prefer kmph over kph.
For example, in the weather forecast, 80F never refers to capacitance, is
understood by the audience, and therefore there's no objection to that
usage on grounds of confusion with SI units. However, usage is not
consistent: you see it both with and without the degree sign, and
without naming names, websites by academic institutions are just as
likely to leave it off as popular websites are likely to add it.
As you can see, actual usage is all over the place and as Unicode is not
prescriptive, we simply deal with what's out there.
> >Note that many unit symbols have one-off encodings that Unicode had to
> support via compatibility
> >characters or even canonical duplicates (think micro and Ohm vs. their Greek
> letter counterparts).
> >Without the need to support a transition from pre-existing character sets,
> these duplicates would
> >not exist. But they do
> Yes. (But not relevant to this discussion.)
> >and so does the degree sign.
> The degree sign is not a compatibility character. It “divorced” from
> superscript 0 looong before
> computers…
> >Neither of them, however, form precedents for non-compatibility characters.
> Not sure what that sentence means, since the premise is skewed.
The argument is that the existence of some characters that are used in
ways that justify direct encoding (whether for compatibility or for some
other reason) does not serve as a blanket justification to extend that
treatment to others.
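For the micro and Ohm duplicates mentioned in the quoted text, the
difference between a compatibility character and a canonical duplicate
shows up under normalization. Again only a sketch with Python's standard
unicodedata module; folds_to is a hypothetical helper name, not anything
defined by Unicode.

```python
import unicodedata

MICRO_SIGN, GREEK_MU = "\u00B5", "\u03BC"
OHM_SIGN, GREEK_OMEGA = "\u2126", "\u03A9"
DEGREE_SIGN = "\u00B0"


def folds_to(ch: str, target: str, form: str) -> bool:
    """True if normalizing ch under the given form yields target."""
    return unicodedata.normalize(form, ch) == target


# Canonical duplicate: the Ohm sign already becomes the Greek letter under NFC.
print(folds_to(OHM_SIGN, GREEK_OMEGA, "NFC"))
# Compatibility character: the micro sign survives NFC but not NFKC.
print(folds_to(MICRO_SIGN, GREEK_MU, "NFC"))
print(folds_to(MICRO_SIGN, GREEK_MU, "NFKC"))
# The degree sign is unchanged even under NFKC.
print(folds_to(DEGREE_SIGN, DEGREE_SIGN, "NFKC"))
```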
A./