Superscript and Subscript Characters in General Use

Marcel Schneider charupdate at orange.fr
Wed Jan 4 17:36:49 CST 2017


On Wed, 04 Jan 2017 12:20:14 -0700, Doug Ewell wrote:
> 
> Marcel Schneider wrote:
> 
> > This is because even complemented with UAXes and TRs, the Core
> > Specifications cannot cover the whole practice. It seems that to stay
> > inside reasonable limits, a significant number of usage cases have
> > been left out, e.g. the mentioned use of plain text for styled custom
> > vulgar fractions is a recognized practice, but stays persistently
> > excluded from TUS.
> 
> I don't understand the relevance to vulgar fractions.

Vulgar fractions written with superscript and subscript digits around 
U+2044 FRACTION SLASH, which kerns, are one example of superscript and 
subscript characters in general use. I cite them because they are the subject 
of a Microsoft Community wiki that is well referenced on the web:

https://answers.microsoft.com/en-us/msoffice/wiki/msoffice_word-mso_other/styled-fractions-in-windows/4a07d5fa-2484-4e39-b1f3-70bb3eb0c332

As I have noted before, when I started the related 2015 thread I was unaware 
of this page; I found and shared the link only near the end of the thread. 
I say vulgar fractions rather than mathematical fractions because of the slant 
of the fraction slash (though the so-called VULGAR FRACTIONs can be displayed 
with a horizontal bar, as TUS and Doug state below).

> 
> Much of this thread has dealt with Basic Latin characters that have no
> superscript or subscript clones, and how their absence prevents certain
> passages from being representable in plain text. This is your basic
> debate over what constitutes plain text.

There was indeed a concern about what capabilities to attribute to plain text. 
That was settled to this extent: Unicode does not support attempts to fully 
represent styled mathematical expressions, but a set of preformatted alphabets 
should be completed: superscript lowercase (q) and uppercase, subscript 
lowercase, and small caps (which take the place of subscript capitals).

Now Iʼm advocating recognition of the re-use of existing modifier letters 
instead of new or newly modified superscripts, as well as recognition of the 
demand for ordinal indicators in French.

> 
> As explained in the July 2015 thread about vulgar fractions, TUS
> sections 6.2 and 22.3 thoroughly explain the use of U+2044 FRACTION
> SLASH with normal "Nd" digits. If I want to write "ninety-nine and
> forty-four one-hundredths," with the non-precomposed vulgar fraction, I
> can write "99 44⁄100" and be fully compliant with the Standard. This
> has nothing to do with what is and isn't plain text.

This and the spelling with SOLIDUS are referred to as fallbacks. My complaint 
is that the Standard does not mention that U+2044 can be used with superscript 
and subscript digits rather than ASCII digits. The kerning of the FRACTION 
SLASH makes it fit for this use case, and in certain high-end fonts, especially 
Arial Unicode MS, the result is fully identical to precomposed fractions. All 
of this is plain text. What is not plain text is the use of U+2044 as a format 
control, as specified in that part of the Standard: high-end software is meant 
to automatically apply fraction styling when U+2044 is detected between digits.
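
A minimal sketch in Python of the two plain-text spellings discussed here: 
baseline digits around U+2044, and superscript/subscript digits around U+2044. 
The helper names (styled_fraction, fallback_fraction) are hypothetical and 
only serve the illustration; the code points are those of the Unicode 
superscript and subscript digits. Non-negative integers are assumed, and how 
the result looks depends entirely on the font.

```python
# Illustrative sketch only: helper names are hypothetical, not from any standard.
FRACTION_SLASH = "\u2044"  # U+2044 FRACTION SLASH

# Superscript digits 0-9. Note that 1, 2 and 3 live in Latin-1 (U+00B9, U+00B2, U+00B3).
SUPERSCRIPT_DIGITS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
# Subscript digits 0-9 are contiguous: U+2080..U+2089.
SUBSCRIPT_DIGITS = "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"

TO_SUPERSCRIPT = str.maketrans("0123456789", SUPERSCRIPT_DIGITS)
TO_SUBSCRIPT = str.maketrans("0123456789", SUBSCRIPT_DIGITS)


def styled_fraction(numerator: int, denominator: int) -> str:
    """Superscript numerator + FRACTION SLASH + subscript denominator."""
    return (
        str(numerator).translate(TO_SUPERSCRIPT)
        + FRACTION_SLASH
        + str(denominator).translate(TO_SUBSCRIPT)
    )


def fallback_fraction(numerator: int, denominator: int) -> str:
    """Baseline ("Nd") digits around FRACTION SLASH, left to the renderer to style."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"


print(styled_fraction(44, 100))
print(fallback_fraction(44, 100))
```

The first spelling carries the styling in the characters themselves; the 
second relies on the rendering system recognizing U+2044 between digits.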

> 
> The fact that many current rendering systems can't render this correctly
> is an implementation matter, though a hard-to-fix one. (Note that the
> fallback display is perfectly readable and correct, unless you see a box
> for U+2009.)

Agreed. Here the use of superscript and subscript digits is not indispensable 
to readability. In this case, their availability is a means of better 
representation, even in plain text.

> 
> The fact that TUS doesn't sanction the use of U+2044 with superscript
> and subscript digits, which I imagine Marcel was alluding to, is
> irrelevant. TUS is a character encoding standard, not a glyph encoding
> standard.

The distinction between baseline digits and superscript/subscript digits 
is in my opinion not a glyphic issue, since in Unicode they all are available 
as distinct characters.
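
As a small check of this point, the following Python sketch queries the 
Unicode Character Database through the standard unicodedata module. It shows 
that the superscript and subscript digits are separate code points with their 
own names and General_Category, and also that they carry compatibility 
decompositions, so NFKC normalization folds them back to baseline digits. The 
describe helper is a hypothetical name used only for this illustration.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (name, General_Category, decomposition mapping) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),  # empty string when there is no mapping
    )


# Baseline four, superscript four, subscript four, and the fraction slash.
for ch in ("4", "\u2074", "\u2084", "\u2044"):
    name, category, decomposition = describe(ch)
    print(f"U+{ord(ch):04X} {name}: {category}, decomposition={decomposition or 'none'}")

# A fraction spelled with superscript/subscript digits around U+2044 ...
styled = "\u2074\u2074\u2044\u2081\u2080\u2080"
# ... becomes the baseline-digit spelling under compatibility normalization.
folded = unicodedata.normalize("NFKC", styled)
print(folded == "44\u2044100")
```

In other words, the distinction exists at the character level, but any 
process that applies NFKC or NFKD will not preserve it.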

> 
> If Marcel is talking about distinguishing between horizontal and
> diagonal slashes in vulgar fractions, this is still not a question of
> plain text. However, in the emoji era, this type of presentation
> variation has become something that Unicode cares about, and so it might
> be handled in some way in the future, such as with a variation selector.
> I suspect this mechanism has been "excluded from TUS" because it doesn't
> yet exist.

Iʼm not talking about this, and I donʼt miss it in Unicode. Some fonts might 
have horizontal fraction bars. However, such a variation selector could be 
handy.

The plain text custom fractions are IMO a good example of the re-use of 
superscript and subscript characters. Moreover, I thought that the fraction 
slash had been encoded to work with them, until I learned from TUS that this 
was not intended. The 2015 thread established that the observed synergy is 
due to an initiative of the font designer(s). That this happened in a font 
that claims conformity to the Standard seems to me non-trivial.

Marcel


