Superscript and Subscript Characters in General Use
charupdate at orange.fr
Wed Jan 4 15:48:29 CST 2017
On Wed, 4 Jan 2017 15:13:36 +0000, Alastair Houghton wrote:
> > Given that the WG of the French standard keyboard is actually interested in getting
> > encoded a new ordinal indicator (kind of 'ᵉ'), I feel the more urged to stay tuned,
> > and to comment on subsequent e-mails, too.
> I can understand the desire to encode the new ordinal indicator.
> Perhaps another option worth contemplating might be to standardise some control
> code points, to provide a mechanism for “plain text” to include the necessary
> minimum of formatting information without additional markup. The advantage of
> this approach is that it would make it explicitly obvious that Unicode wasn’t
> going to include further super or subscript forms, while providing everyone that
> wants them with access to a full set of super or subscripts subject to system
> (or font) support.
> A simple form of this might be to encode the new zero-width modifier code points
> SUBSCRIPT and SUPERSCRIPT that work somewhat like the variation selectors, so e.g.
> U+0032 DIGIT TWO
> U+???? SUPERSCRIPT
> U+0033 DIGIT THREE
> U+???? SUBSCRIPT
> would display as ²₃ on fonts that supported the new modifiers. The advantage of
> taking this very simplistic approach is that it can be dealt with in the OpenType
> (or AAT) tables in modern fonts, rather than necessitating changes to rendering
> code. It is also obviously not an attempt to replace markup, but will cope with
> most common “plain text” uses.
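
To make the quoted sequence concrete, here is a minimal Python sketch that emulates
in software what the proposal would leave to font tables. The code points are not
assigned in the proposal (U+????), so SUPERSCRIPT and SUBSCRIPT below are hypothetical
stand-ins taken from the Private Use Area, and render() is an illustrative helper,
not part of any library. It only covers digits, using the precomposed forms that
Unicode already encodes.

```python
# Hypothetical stand-ins: the proposal leaves both code points unassigned (U+????),
# so two Private Use Area characters are used here purely for illustration.
SUPERSCRIPT = "\uE000"
SUBSCRIPT = "\uE001"

# Precomposed digit forms that already exist in Unicode.
SUPER_FORMS = dict(zip("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹"))
SUB_FORMS = dict(zip("0123456789", "₀₁₂₃₄₅₆₇₈₉"))


def render(text):
    """Apply each modifier to the character that precedes it."""
    out = []
    for ch in text:
        if ch == SUPERSCRIPT and out:
            # Replace the preceding character if a superscript form is known.
            out[-1] = SUPER_FORMS.get(out[-1], out[-1])
        elif ch == SUBSCRIPT and out:
            # Replace the preceding character if a subscript form is known.
            out[-1] = SUB_FORMS.get(out[-1], out[-1])
        else:
            # Ordinary characters (and a stray leading modifier) pass through.
            out.append(ch)
    return "".join(out)


# The sequence from the quoted message:
# DIGIT TWO, SUPERSCRIPT, DIGIT THREE, SUBSCRIPT
print(render("2" + SUPERSCRIPT + "3" + SUBSCRIPT))
```

In the proposal itself the substitution would happen in the font (OpenType or AAT
tables) rather than in application code; the lookup tables here merely stand in for
that step.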
This would indeed make for stable plain text representations that convey the
necessary vertical alignment. However, encoding it would imply that the design
principle of “not attempt[ing] to describe the positioning of a character
above or below the baseline in typographical layout” is superseded in this
particular case, which provides a universal mechanism for a basic formatting
parameter. For consistency, this would call for some extensions catering for
other formatting parameters. The expense in code points would be very low, the
scheme would meet user expectations, and the Standard would become even more
capable, and thus even more attractive, by enhancing the plain text environment.
Eventually, the display in text editors, which is currently driven internally
(for syntax highlighting), would become text-guided. This is not far from
It all points to the conclusion that the French request rests on this premise:
modifier letters that are superscript forms are not real superscripts, and they
don’t fit people’s expectations regarding superscripts and abbreviations.
I have already expressed my point of view in this discussion. But the real
concern could be to emulate the Spanish ordinal indicators, arguing that their
being a part of Unicode justifies similar facilities for other languages. Here
the Unicode position is that the Spanish ordinal indicators are backward
compatibility code points for round-trip compatibility with ISO/IEC 8859-1.
This is clear from the Code Charts at U+00AA and U+00BA. There was a deadline,
and diligence got them in ahead of it. Moreover, a complete set of ordinal
indicators for French requires four letters, which probably exceeds what 8-bit
charsets shared by several countries could accommodate.
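
The round-trip point can be checked with a short Python sketch. It assumes
Python’s "latin-1" codec as the implementation of ISO/IEC 8859-1, and it picks
U+1D49 as an example of a modifier letter; latin1_roundtrip() is an illustrative
helper name, not an existing API.

```python
import unicodedata


def latin1_roundtrip(ch):
    """Return True if ch survives encoding to ISO/IEC 8859-1 and back."""
    try:
        return ch.encode("latin-1").decode("latin-1") == ch
    except UnicodeEncodeError:
        # The character has no position in the 8-bit charset.
        return False


# The two ordinal indicators from the Code Charts, plus one modifier letter.
for ch in ("\u00AA", "\u00BA", "\u1D49"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {latin1_roundtrip(ch)}")
```

The two ordinal indicators fit in a single byte each, while the modifier letter
has no position in the 8-bit charset.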
Given how the discussion has developed so far, I feel that French must live
with the existing infrastructure. Hence the idea of re-using four modifier
letters for that purpose.
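
As a rough illustration of that idea, the following Python sketch builds French
ordinal abbreviations from existing modifier letters. The message does not list
the four letters; the choice of e, r, s and d below (U+1D49, U+02B3, U+02E2,
U+1D48) is an assumption made for this example, and french_ordinal() is a
hypothetical helper.

```python
# Assumed set of four letters (not stated in the message): e, r, s, d.
MODIFIER_LETTERS = {
    "e": "\u1D49",  # MODIFIER LETTER SMALL E
    "r": "\u02B3",  # MODIFIER LETTER SMALL R
    "s": "\u02E2",  # MODIFIER LETTER SMALL S
    "d": "\u1D48",  # MODIFIER LETTER SMALL D
}


def french_ordinal(number, suffix):
    """Append the suffix to the number using modifier letters.

    Raises KeyError if the suffix contains a letter outside the assumed set.
    """
    return number + "".join(MODIFIER_LETTERS[c] for c in suffix)


print(french_ordinal("1", "er"))  # premier
print(french_ordinal("1", "re"))  # première
print(french_ordinal("2", "e"))   # deuxième
print(french_ordinal("2", "de"))  # seconde
```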
If Iʼm wrong about this idea, that could be good or bad news. Good news if the
generic SUPERSCRIPT and SUBSCRIPT modifiers (or alternatively, new ordinal
indicators) are actually encoded. Bad news if both that and the re-use of
modifier letters are discarded. In between, I see the out-of-the-box modifier
letter solution as a kind of second-best choice. Better than nothing at all.
In certain circumstances, better than markup and formatting.