Superscript and Subscript Characters in General Use

Marcel Schneider charupdate at
Thu Jan 5 12:43:32 CST 2017

On Thu, 5 Jan 2017 03:56:15 -0800, Asmus Freytag wrote:
> On 1/5/2017 3:33 AM, Marcel Schneider wrote:
> >
> > If Arial Unicode MS is used (though it is no longer 
> > a part of new Windows versions), it really looks exactly like preformatted 
> > fractions in the same font. But I can understand that denominators are meant 
> > to align on the baseline, while subscripts are often set slightly below. 
> That's just the kind of issue that you will run into with undisciplined hacks.
> Just... don't.

So that cannot be recommended for general use, even outside of publishing software. 
The remaining question is about the readability of drafts and the like. From now on, when 
I have to choose between writing a fraction as '2/7' or as '²⁄₇', should I 
always use ASCII only? Iʼm thinking of an e-mail like this one. I still fail 
to understand why the unformatted fraction should be better than the preformatted 
presentation (even when the latter is suboptimal). 

I still believe that keyboard layout developers owe it to users to provide every 
character of a given script, along with the related sets of numerals, generic 
punctuation and symbols, so that the end-user can choose whatever effect 
he intends to produce. Since keyboards shape the practice, people are probably 
best served when the layout allows everybody to adapt to all use cases. 

Earlier on Thu, 5 Jan 2017 03:55:06 -0800, Asmus Freytag wrote:
> On 1/4/2017 4:33 PM, Doug Ewell wrote:
> > 
> > > What I complain of as not mentioned in the Standard, is that U+2044
> > > can be used with superscript and subscript digits, rather than ASCII
> > > digits.
> > 
> > Almost any character(s) in Unicode "can be" used with almost any other.
> > You can surround U+2044 with emoji if you like. That doesn't mean you
> > should.
> This is a key point.
> You can use many code points to get some "effect", but that doesn't mean 
> it represents good practice or should be recommended.

This is particularly true for the French use of DEGREE SIGN for superscript o, 
which 99 % of users are said to type to get the 'n°' abbreviation, or 'r°', 
'v°', 'f°'. It doesnʼt look really bad, is stable, and is easy to input. The downside 
shows up at the latest when a plural s has to be appended. And even before that, itʼs 
poor typography because, depending on the font, the degree sign may look very different 
from a real superscript o. In this respect, the modifier letter o is way better.
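
To illustrate the difference between the two characters, here is a minimal Python sketch that looks them up in the Unicode Character Database via the standard `unicodedata` module. The helper names `describe` and `is_letter` are hypothetical, and the choice of U+1D52 as "the modifier letter o" is an assumption about which character is meant above.

```python
import unicodedata

DEGREE_SIGN = "\u00B0"   # what most users reportedly type in 'n°'
MODIFIER_O = "\u1D52"    # assumed to be the "modifier letter o" meant above


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def is_letter(ch):
    """True if the general category is one of the letter categories (L*)."""
    return unicodedata.category(ch).startswith("L")


for ch in (DEGREE_SIGN, MODIFIER_O):
    print(describe(ch), "letter" if is_letter(ch) else "not a letter")
```

The degree sign is classified as a symbol, whereas the modifier letter is a letter, so software that treats 'n°s' as a word may handle the two quite differently.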

> There are no "traffic cops" out there that will flag you down for having made 
> a poor decision, but that's not a reason enough to endorse random suggestions.
> This goes particularly for practices that need support in systems and/or fonts to work
> correctly. If some implementer supports the recommended normal size digits for 2044
> why should they do the additional work of making sure it works for super/sub script.

If implementers really do support the fraction slash U+2044 as triggering 
authentic fraction formatting, then they may spare themselves the extra work. But this 
feature is uncommon enough that the fallback options deserve serious thought. And if, 
despite being discouraged by all recommendations (including mine), the use of super/sub 
scripts keeps thriving, it would be a good idea to support them along with normal 
size digits, all the more as this does not require a lot of supplemental code (just 
twenty equivalence classes, I guess).
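
As a rough idea of what those twenty equivalences could look like, here is a minimal Python sketch that folds the ten superscript and ten subscript digits onto ASCII digits before any fraction handling takes place. The names `FOLD` and `fold_digits` are hypothetical, and this is only one way an implementer might do it, not something specified by the Standard.

```python
# Superscript digits are scattered: 1, 2, 3 live in Latin-1, the rest in U+2070..U+2079.
SUPERSCRIPTS = "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079"
# Subscript digits are contiguous: U+2080..U+2089.
SUBSCRIPTS = "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"
ASCII_DIGITS = "0123456789"

# Twenty one-to-one mappings: each super/sub digit onto its normal-size digit.
FOLD = str.maketrans(SUPERSCRIPTS + SUBSCRIPTS, ASCII_DIGITS * 2)


def fold_digits(text):
    """Replace superscript/subscript digits with ASCII digits; leave the rest alone."""
    return text.translate(FOLD)


# '²⁄₇' becomes '2⁄7', still with U+2044 FRACTION SLASH in between.
print(fold_digits("\u00B2\u2044\u2087"))
```

After this folding step, the sequence is the one the Standard describes (normal-size digits around U+2044), so whatever fraction formatting exists can take over unchanged.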

In the meantime, what options are available as fallback? The recommendation [1] 
is unrealistic: a system (OS + program + font) that is unable to map digits to 
numerators/denominators cannot be expected to map U+2044 to U+002F either, as 
specified. Therefore, the fraction slash is left between the digits. Since in most 
proportional fonts it is kerned so tightly that it overlaps baseline digits displayed 
on either side, this can hardly be recommended as a fallback: it looks good in 
some fonts only. Obviously, this use case is not intended. 
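
For reference, the linear fallback quoted in [1] amounts to very little code when done at the text level. The following Python sketch is only an illustration under that assumption; `linear_fallback` is a hypothetical helper, and a real renderer would make this substitution at display time rather than alter the stored text.

```python
import re

FRACTION_SLASH = "\u2044"
SOLIDUS = "\u002F"

# Digits, then FRACTION SLASH, then digits: the sequence the Standard describes.
FRACTION = re.compile(r"([0-9]+)" + FRACTION_SLASH + r"([0-9]+)")


def linear_fallback(text):
    """Show digit/fraction-slash/digit sequences as a plain linear sequence like 3/4."""
    return FRACTION.sub(r"\1" + SOLIDUS + r"\2", text)


print(linear_fallback("3\u20444"))  # 3/4
```

The sketch only touches U+2044 when it sits between ASCII digits; a fraction slash in any other context is left as it is.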

So perhaps all users might be given the unrecommended possibility to choose an 
unrecommended second-best solution. This would require making sure that everybody 
understands the risk of running into issues. In any case, U+2044 ought 
to be on the keyboard, according to the Standard (in order to input the specified 
sequences). As for super/sub scripts, I think it would be a pity to keep them away. 

The rest could probably be considered as being up to the user.

In any case, fashion is unforeseeable.


[1] TUS 9.0, §6.2, p. 277:
| If the displaying software is incapable of mapping the fraction to a unit, then it can also be
| displayed as a simple linear sequence as a fallback (for example, 3/4). […]
