Superscript and Subscript Characters in General Use
Marcel Schneider
charupdate at orange.fr
Fri Jan 6 08:30:19 CST 2017
On Fri, 6 Jan 2017 00:21:29 -0800, Asmus Freytag wrote:
>
> On 1/5/2017 9:42 PM, Marcel Schneider wrote:
> >
> > Nevertheless,
> > the user might prioritize the stability of the document when it comes to plain text,
> > and he could be interested in a better-looking display of letters that elsewhere
> > should be superscripted. Here, the modifier letters could be a ready-to-use fallback
>
> The use of such hacks is destabilizing to any efforts to systematically format superscripts
> across a document.
That supposes a rich text environment. The orthographic correctness of some
languages, French among them, traditionally requires either a rich text environment
or some in-line markup like TeX (at the expense of direct usability, i.e. the text
is not displayed as intended without a LaTeX converter). That borders on
non-conformance with the design principles of Unicode.
As I understand them, Unicode provides all characters that are needed to correctly
spell any language. This goal remains unreached as long as the orthography of some
languages cannot be entirely achieved without relying on formatting markup. (Iʼm
aware that complex scripts require hinted fonts for glyph reordering and glyph
substitution, but this still is plain text.)
The superscripting of abbreviation endings belongs to a different level of
correctness than arbitrary emphasis as expressed with italics, bold, underline
(obsolete in this use), extra letter spacing (German, rather old-style),
capitalization, or extra acute accents as in Dutch.
This is why Karl Pentzlin [1] cited ‘Biblio^{que}’ vs “Biblioque”, where the latter
is “no valid French word.”
From this it now becomes clear that Alastair Houghtonʼs suggestion [2] of encoding
a superscript variant selector would meet this requirement and is therefore not
to be confused with a first step towards making Unicode support rich text.
To say it out loud: the fact that French and a few other languages cannot be written
in a correct orthography when the environment is plain text seems to me hard to
accept.
> Text fonts may not support them, because for "ordinary" text, by Unicode's
> recommendation, one would use ordinary letters / digits with superscript markup.
A text font that does not support all modifier letters is less a text font than
a title font. Ornamental fonts are produced in such a variety that completing
them is/was economically unfeasible. Iʼm considering this statement rather in the
past tense, because diacriticized letters are already (on request) automatically
generated and added to the font at creation. If automatic superscripting is not
already implemented, it will be soon, I suppose. So more and more (new and
updated) fonts will support them. But wherever they arenʼt, a _Convert modifier
letters to superscript_ feature (or an equivalent macro command) ought to be able
to make the text conformant to legacy handling.
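A minimal sketch of what such a _Convert modifier letters to superscript_
feature could look like, in Python. It assumes that the target is HTML-like
markup with <sup> elements, and that every character whose compatibility
decomposition in the Unicode Character Database is tagged <super> should be
converted (so superscript digits are caught too, and modifier letters used
phonetically would need to be excluded by the caller). The function name is
hypothetical.

```python
import html
import unicodedata


def modifiers_to_sup(text: str) -> str:
    """Replace runs of <super>-decomposable characters by <sup>...</sup>."""
    out = []
    run = []  # base letters of the superscript run being collected

    def flush():
        if run:
            out.append("<sup>" + html.escape("".join(run)) + "</sup>")
            run.clear()

    for ch in text:
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith("<super> "):
            # e.g. "<super> 0065" for U+1D49 MODIFIER LETTER SMALL E
            run.append("".join(chr(int(cp, 16)) for cp in decomp.split()[1:]))
        else:
            flush()
            out.append(html.escape(ch))
    flush()
    return "".join(out)


# "M" + MODIFIER LETTER SMALL M + MODIFIER LETTER SMALL E
print(modifiers_to_sup("M\u1d50\u1d49"))  # M<sup>me</sup>
```

Characters without such a decomposition, among them U+02BC MODIFIER LETTER
APOSTROPHE, are passed through unchanged.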
> So, by using these hacks, anytime a document is re-formatted with a different font style,
> you are in danger of either losing these to boxes, or to be faced with random font styles.
Yes, people should always be aware that the use of modifier letters has its downside,
as has the use of superscripted baseline letters. I currently write e-mails (like
this one) in a text editor (Notepad++). Several features I use here are IMO missing
in all e-mail clients, such as column editing, line reordering, and so on. So I
appreciate being able to spell correctly in plain text, without sloppy fallbacks
(i.e. baseline fallbacks for superscript). Itʼs a matter of making the most of the
existing charset.
I believe that modifier letter fallbacks are very functional. When I paste them into
an HTML mail form, the display is always correct, and I donʼt need to add superscript
by hand throughout the whole mail. Furthermore, I can even use superscript in the subject.
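The opposite direction, going from in-line ^{...} markup (as in the
‘Biblio^{que}’ example cited above) to modifier letters, can be sketched the
same way. This is only an illustration under assumptions: the function name is
hypothetical, and the table deliberately covers just a few letters common in
French abbreviation endings. A group containing any letter outside the table
is left as markup rather than being converted halfway.

```python
import re

# Small hand-picked table; not an exhaustive list of modifier letters.
MODIFIER = {
    "d": "\u1d48",  # MODIFIER LETTER SMALL D
    "e": "\u1d49",  # MODIFIER LETTER SMALL E
    "l": "\u02e1",  # MODIFIER LETTER SMALL L
    "m": "\u1d50",  # MODIFIER LETTER SMALL M
    "o": "\u1d52",  # MODIFIER LETTER SMALL O
    "r": "\u02b3",  # MODIFIER LETTER SMALL R
    "s": "\u02e2",  # MODIFIER LETTER SMALL S
    "t": "\u1d57",  # MODIFIER LETTER SMALL T
}

_SUP_GROUP = re.compile(r"\^\{([^{}]*)\}")


def sup_markup_to_modifiers(text: str) -> str:
    """Replace ^{...} groups by modifier letters when every letter is mapped."""

    def repl(match: re.Match) -> str:
        letters = match.group(1)
        if letters and all(c in MODIFIER for c in letters):
            return "".join(MODIFIER[c] for c in letters)
        return match.group(0)  # keep the markup untouched

    return _SUP_GROUP.sub(repl, text)


print(sup_markup_to_modifiers("M^{me} Dupont, 1^{er} janvier"))
print(sup_markup_to_modifiers("Biblio^{que}"))  # unchanged: q is not in the table
```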
> If you don't think that is a real problem: some (many) character pickers will insert font+code point into
> an application. These font bindings often survive and suddenly your text, when read on a different
> computer looks like a ransom note, just because the new machine has a new "default" font, and
> that is applied to all letters that don't have a specific font binding.
Basically this is a good scheme, because character pickers typically are used for
symbols. There are also two kinds: local, and online. I sometimes pick in the
full-size PDF of the Code Charts. Theyʼre the best character picker IMO.
> Some font pickers are "stupid" enough to do this for simple accented code points that would have
> been in the currently selected font anyway.
Thatʼs really bad. I know that some people are writing documents by picking accented
letters in the special characters dialog. I can imagine that some other people
may use an online picker instead, partly because the word processor theyʼre using
may be a web-app. Anyhow, this is very inefficient. The reason may be that one
often thinks either that a keyboard cannot be completed, or that completing a
keyboard would make it unusable, or hard to use, or full of stickers. This is one
of the main challenges of keyboard layout development.
> Your suggestions will just add to these problems.
> If editing in a rich text environment, work in rich text. And then lean on implementers to get
> export correct to other rich text formats....
I used to work nearly all the time in a rich text environment, and I added plenty
of autocorrections to speed up writing. Today, I work most of the time in plain
text. I donʼt use LaTeX, but I know that this is easily exported to many other
formats. PDF is a main target format. Most of the drawbacks start when the reader
wishes to copy-paste some lines of a (basically searchable) PDF either to rich text
or to plain text… but that is not the issue here.
I hope that my future recommendations will solve more problems than theyʼll create!
Marcel
[1] Karl Pentzlinʼs MODIFIER LETTER SMALL Q proposal:
http://www.unicode.org/L2/L2010/10230-modifier-q.pdf
[2] Alastair Houghtonʼs SUPERSCRIPT/SUBSCRIPT variant selectors suggestion:
http://www.unicode.org/mail-arch/unicode-ml/y2017-m01/0016.html