Missing Latin superscript lowercase letters

Asmus Freytag asmusf at ix.netcom.com
Sat Mar 25 16:34:42 CDT 2023


Kent,

I'm not able to match your beautifully color-coded reply chain, but here 
goes.

On 3/25/2023 10:29 AM, Kent Karlsson via Unicode wrote:
> >>Further, if some symbol/letter for some reason only ever occurred in 
> >>superscript position in math expressions, such examples would still be 
> >>supporting evidence for that symbol/letter. The closest practical example 
> >>I can think of is the degree sign, which in origin is a superscript 0.
> >The degree sign is either the exception that proves the rule, or something 
> >else: a symbol that occurs frequently in contexts that are not full 
> >mathematical expressions, as it is typical
> True, but I was arguing against Peter Constable's postulation that 
> something that (for whatever reason) occurs only in a superscript position 
> in a math expression /could not/ have its encoding supported by an example 
> where it occurred in a superscript position in a math expression. THAT 
> postulation is false. (And the closest example I could think of was the 
> degree sign; there MAY be examples of yet unencoded characters that only 
> occur in superscript position in math expressions.)

This is an argument best explored when there's an actual test case.

In essence, modifier letters in phonetics fall into this category, 
because ordinarily you don't expect to style phonetic notation other 
than globally (e.g. font choice). They can therefore be argued to have 
an identity that is different from simply superscripting the same letter 
form. The latter looks the same, but we assert (via encoding) that they 
are not the same thing. That fits the conception of phonetic notation in 
which every character individually stands for something specific.
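
That distinction also shows up in the character properties. As a minimal 
sketch, assuming Python's standard unicodedata module (U+02B0 MODIFIER 
LETTER SMALL H is just a convenient example, not a character under 
discussion here):

    # Compare a plain letter with a separately encoded modifier letter.
    import unicodedata

    for ch in ("h", "\u02B0"):  # LATIN SMALL LETTER H vs. MODIFIER LETTER SMALL H
        print(
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),                   # Ll vs. Lm: distinct identity
            unicodedata.decomposition(ch) or "(none)",  # U+02B0 still maps back via '<super> 0068'
        )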

In a mathematical expression, by contrast, the identity of a letter 
doesn't change whether it's superscripted or not. It's clearly just a 
different use of the same letter, which is underscored by the fact that 
superscripting can be nested.

So, we would need an actual test case, not yet encoded, where the 
superscripted shape has a different identity than the same shape 
rendered normally.

The degree sign is a bad example, in a way, as it's clearly not a 
superscript 'o' or '0' (letter/digit) but is correctly implemented as a 
pure circle. That puts it in the category of symbols for which the size, 
spacing, and placement of the "ink" matter more than the resemblance of 
that "ink" to other symbols. It is also not considered a "superscript 
circle" (it has no compatibility decomposition).

It is a good example in a different way, since it's clearly a character 
for which the "ink" is always in a position and size that would be 
appropriate for a superscript. I'm sure that if we encounter some other 
character for which it would be inappropriate to give a compatibility 
decomposition, we would consider whether it should be encoded.

At that juncture, we would look at the context in which it is to be used.

> >for unit symbols. When used with temperature, it's interesting to note 
> >that not all temperature scales use it consistently. You don't see it with 
> >Fahrenheit very often, for example, reflecting differences in traditional 
> >keyboard layouts.
> Ok, let’s digress a bit… I do see that too, in news articles (in web apps) 
> from USA and British news companies and see also “C” when degrees Celsius 
> is meant. But writing farad (F) or coulomb (C) when referring to 
> temperature is just horrible, and only embarrassing for the journalist who 
> wrote that. (Another related horror is “kph”, and there you cannot even 
> blame keyboard layouts.)

I think it goes a bit too far to assume that any and all unit 
abbreviations always have to follow SI notation. I'm sure there are 
places where regulations define the use of specific abbreviations, and 
in any context where those apply to SI, you would be free to read "k" as 
kilo and "kph" as kilo-ph (and then reject that as undefined). The same 
is not true for ordinary everyday usage in places where SI units aren't 
customary.

Likewise, the "ph" suffix to mean "per hour" is well established in 
places, while "/h" is not. That said, given that usage, I'd personally 
prefer kmph  over kph.

For example, in the weather forecast, 80F never refers to capacitance, 
is understood by the audience, and therefore there's no objection to 
that usage on grounds of confusion with SI units. However, usage is not 
consistent: you see it both with and without the degree sign, and, 
without naming names, websites of academic institutions are just as 
likely to leave it off as popular websites are to add it.

As you can see, actual usage is all over the place, and as Unicode is 
not prescriptive, we simply deal with what's out there.

> >Note that many unit symbols have one-off encodings that Unicode had to 
> >support via compatibility characters or even canonical duplicates (think 
> >micro and Ohm vs. their Greek letter counterparts). Without the need to 
> >support a transition from pre-existing character sets, these duplicates 
> >would not exist. But they do
> Yes. (But not relevant to this discussion.)
> >and so does the degree sign.
> The degree sign is not a compatibility character. It “divorced” from 
> superscript 0 looong before computers…
> >Neither of them, however, form precedents for non-compatibility characters.
> Not sure what that sentence means, since the premise is skewed.

The argument is that the existence of some characters that are used in 
ways that justify direct encoding (whether for compatibility or 
whatever) does not serve as a blanket justification to extend that 
treatment to others.
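
(For reference, since micro and Ohm came up in the quoted passage above: 
the same kind of check, again assuming Python's standard unicodedata 
module, shows the two flavors of duplicate side by side.)

    # MICRO SIGN carries a compatibility mapping; OHM SIGN is a canonical duplicate.
    import unicodedata

    for ch in ("\u00B5", "\u2126"):  # MICRO SIGN, OHM SIGN
        print(
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.decomposition(ch),  # '<compat> 03BC' vs. canonical '03A9'
        )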

A./