Superscript and Subscript Characters in General Use
charupdate at orange.fr
Wed Jan 4 17:36:49 CST 2017
On Wed, 04 Jan 2017 12:20:14 -0700, Doug Ewell wrote:
> Marcel Schneider wrote:
> > This is because even complemented with UAXes and TRs, the Core
> > Specifications cannot cover the whole practice. It seems that to stay
> > inside reasonable limits, a significant number of usage cases have
> > been left out, e.g. the mentioned use of plain text for styled custom
> > vulgar fractions is a recognized practice, but stays persistently
> > excluded from TUS.
> I don't understand the relevance to vulgar fractions.
Vulgar fractions represented using super- and subscript digits around the
FRACTION SLASH U+2044, that kerns, are one example illustrating superscript
and subscript characters in general use. It is cited because it is the subject
of a Microsoft Community wiki that is well referenced on the web:
I recall again that when I launched the related 2015 thread, I was unaware of
this page until close to the end of the thread, when I found and shared the
link. I call these vulgar fractions rather than mathematical fractions because
of the slant of the fraction slash. (Though the so-called VULGAR FRACTIONs can
be displayed with a horizontal bar, as TUS and Doug state below.)
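To make the practice concrete, here is a minimal Python sketch of how such a
custom fraction can be composed in plain text. The helper name
plain_text_fraction and the constant names are hypothetical, chosen for
illustration only; the sketch assumes non-negative integers and says nothing
about how a given font will render the result.

```python
# Hypothetical helper, for illustration only; not part of any library.
# Superscript digits 0-9: U+2070, U+00B9, U+00B2, U+00B3, U+2074..U+2079.
SUPERSCRIPT_DIGITS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
# Subscript digits 0-9: U+2080..U+2089.
SUBSCRIPT_DIGITS = "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"
FRACTION_SLASH = "\u2044"

TO_SUPERSCRIPT = str.maketrans("0123456789", SUPERSCRIPT_DIGITS)
TO_SUBSCRIPT = str.maketrans("0123456789", SUBSCRIPT_DIGITS)


def plain_text_fraction(numerator: int, denominator: int) -> str:
    """Return superscript numerator + U+2044 + subscript denominator."""
    return (
        str(numerator).translate(TO_SUPERSCRIPT)
        + FRACTION_SLASH
        + str(denominator).translate(TO_SUBSCRIPT)
    )


# "Ninety-nine and forty-four one-hundredths" with a styled custom fraction.
print("99 " + plain_text_fraction(44, 100))
```

Whether the slash kerns against the neighbouring digits depends entirely on
the font.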
> Much of this thread has dealt with Basic Latin characters that have no
> superscript or subscript clones, and how their absence prevents certain
> passages from being representable in plain text. This is your basic
> debate over what constitutes plain text.
There was indeed a concern about what capabilities to grant plain text.
That had been settled, however, to the extent that Unicode does not support
attempts to fully represent styled mathematical expressions, while a set of
preformatted alphabets should be completed: lowercase superscripts (q) and
uppercase superscripts, lowercase subscripts, and small caps (which take the
place of subscript capitals).
Now Iʼm advocating recognition of the re-use of existing modifier letters
instead of new or newly modified superscripts, as well as of the demand for
ordinal indicators in French.
> As explained in the July 2015 thread about vulgar fractions, TUS
> sections 6.2 and 22.3 thoroughly explain the use of U+2044 FRACTION
> SLASH with normal "Nd" digits. If I want to write "ninety-nine and
> forty-four one-hundredths," with the non-precomposed vulgar fraction, I
> can write "99 44⁄100" and be fully compliant with the Standard. This
> has nothing to do with what is and isn't plain text.
This and the spelling with SOLIDUS are referred to as fallbacks. What I
complain about is that the Standard does not mention that U+2044 can be used
with superscript and subscript digits rather than ASCII digits. The kerning
of the FRACTION SLASH makes it fit for this use case, and in certain high-end
fonts, especially Arial Unicode MS, the result is fully identical to precomposed
fractions. All of this is plain text. What isnʼt is the use of U+2044 as a format
control, as specified in that part of the Standard. High-end software is meant
to automatically apply fraction styling when U+2044 is detected between digits.
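As a rough illustration of that detection step, the following Python sketch
looks for runs of decimal digits on both sides of U+2044. It is only an
assumption about what such a check could look like, not a description of any
actual renderer; the names FRACTION_RUN and find_fraction_runs are
hypothetical.

```python
import re

# One or more decimal digits, U+2044 FRACTION SLASH, one or more decimal
# digits. In Python str patterns, \d matches General Category Nd only.
FRACTION_RUN = re.compile("(\\d+)\u2044(\\d+)")


def find_fraction_runs(text: str) -> list:
    """Return (numerator, denominator) pairs a renderer might style."""
    return FRACTION_RUN.findall(text)


# Baseline digits around the fraction slash are picked up.
print(find_fraction_runs("99 44\u2044100"))

# Superscript and subscript digits are not Nd, so nothing is matched and
# the sequence stays as typed.
print(find_fraction_runs("99 \u2074\u2074\u2044\u2081\u2080\u2080"))
```

The second call returns an empty list, which shows that the two spellings are
handled by different mechanisms: one relies on the rendering software, the
other on the characters and the font alone.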
> The fact that many current rendering systems can't render this correctly
> is an implementation matter, though a hard-to-fix one. (Note that the
> fallback display is perfectly readable and correct, unless you see a box
> for U+2009.)
Agreed. Here the use of superscript and subscript digits is not indispensable
for readability. In this case, their availability provides a means of better
representation, even in plain text.
> The fact that TUS doesn't sanction the use of U+2044 with superscript
> and subscript digits, which I imagine Marcel was alluding to, is
> irrelevant. TUS is a character encoding standard, not a glyph encoding
> standard.
The distinction between baseline digits and superscript/subscript digits
is, in my opinion, not a glyph issue, since in Unicode they are all available
as distinct characters.
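This can be checked against the character database, for instance with
Pythonʼs standard unicodedata module. The sketch below is illustrative only,
and the variable names are hypothetical. It shows that each digit form has its
own code point, name and General Category, and that the styled sequence is
folded to baseline digits only by compatibility normalization (NFKC), not by
canonical normalization (NFC).

```python
import unicodedata

# Baseline, superscript and subscript forms of the digit four.
for ch in "4\u2074\u2084":
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

# Superscript 44, FRACTION SLASH, subscript 100.
styled = "\u2074\u2074\u2044\u2081\u2080\u2080"

# Canonical normalization leaves the styled digits untouched.
unchanged = unicodedata.normalize("NFC", styled)

# Compatibility normalization maps them to baseline digits around U+2044.
folded = unicodedata.normalize("NFKC", styled)

print(unchanged == styled, folded)
```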
> If Marcel is talking about distinguishing between horizontal and
> diagonal slashes in vulgar fractions, this is still not a question of
> plain text. However, in the emoji era, this type of presentation
> variation has become something that Unicode cares about, and so it might
> be handled in some way in the future, such as with a variation selector.
> I suspect this mechanism has been "excluded from TUS" because it doesn't
> yet exist.
Iʼm not talking about this, and I donʼt find it missing from Unicode. Some fonts might
have horizontal fraction bars. However, such a variation selector could be
The plain text custom fractions are IMO a good example of the re-use of
superscript and subscript characters. Moreover, I thought that the fraction
slash had been encoded to work with them, until I learned from TUS that this
was not intended. The 2015 thread revealed that the observed synergy is
due to an initiative of the font designer(s). The fact that this happened
in a font that claims conformance to the Standard seems to me non-trivial.