Why incomplete subscript/superscript alphabet?

Marcel Schneider charupdate at orange.fr
Wed Oct 5 17:10:32 CDT 2016

On Wed, 5 Oct 2016 19:02:51 +0200, Frédéric Grosshans wrote:
On 05/10/2016 at 15:57, Marcel Schneider wrote:
> On Wed, 5 Oct 2016 14:27:44 +0900, Martin J. Dürst wrote:
>>> From a certain viewpoint (the chemist's in the example above), the
>>> result may look arbitrary, but from another viewpoint (the
>>> phoneticist's), it looks perfectly fine. At first, it looks like it
>>> would be easy to fix such problems, but each fix risks introducing new
>>> arbitrariness when seen from somebody else's viewpoint. Getting upset
>>> won't help.
>> Iʼve got the point, thanks. Phoneticians need to write running text that is
>> immediately legible, while a chemistry database may use particular notational
>> conventions that work with baseline letters to be parsed on semantics or light
>> markup for proper display in the UI. The UTC decision thus questioned the design
>> principle of using plain text for chemical formulae. No doubt it was understood
>> that validating this choice would have opened the door to encoding more special
>> characters for upgrading or similar purposes.
> I think there is a big difference between adding a few characters for a
> new use (chemistry formulae) and completing an obvious almost complete
> set. People are used to seeing the 26 basic Latin alphabetic characters
> (abcdefghijklmnopqrstuvwxyz) being treated preferentially by computers,
> but are always surprised when only one of them is treated differently.
> Initially, superscript letters were restricted to a few letters, and it
> made sense to resist the temptation to complete the set. But now that
> all modifier small Latin letters except q are encoded, it makes little
> sense. Many people use these characters (arguably wrongly) for many uses
> beyond IPA, and they are invariably surprised if they need q. The
> special status of the basic Latin alphabet means that almost no one
> would be surprised not to find a superscripted α, è, or ∞, and adding the
> last missing basic Latin letter q would not open the door to any more
> characters.
That is however exactly what I believed, that this would open that door.
It seems to me as if the missing superscript q were the last key to keep 
that door locked (how nice an image, as the small q is somewhat key-shaped).
It is as if completing that series would trigger an avalanche of requests to 
encode superscript alphabets and symbols, with no grounds left to refuse them.
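
The claim quoted above, that every basic Latin small letter except q has a 
superscript or modifier form, can be checked roughly with a minimal Python 
sketch. It assumes that such forms are exactly the code points whose 
compatibility decomposition is <super> followed by a single letter; the 
function name superscript_forms is only illustrative. The outcome depends on 
the Unicode version bundled with the interpreter, so q may or may not be 
reported as missing.

```python
import string
import sys
import unicodedata


def superscript_forms():
    """Map each of a-z to the code points that decompose to '<super>' + that letter."""
    found = {letter: [] for letter in string.ascii_lowercase}
    for cp in range(sys.maxunicode + 1):
        decomp = unicodedata.decomposition(chr(cp))
        if not decomp.startswith("<super> "):
            continue
        parts = decomp.split()[1:]   # hex code points after the <super> tag
        if len(parts) != 1:
            continue                 # ignore forms that decompose to several characters
        base = chr(int(parts[0], 16))
        if base in found:
            found[base].append(cp)
    return found


forms = superscript_forms()
missing = [letter for letter, cps in forms.items() if not cps]
print("Unicode data version:", unicodedata.unidata_version)
print("letters without a <super> form:", missing)
```

Some letters show up more than once, for instance a, which is also the target 
of U+00AA FEMININE ORDINAL INDICATOR.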

And, troublingly enough, this is exactly how the proposal to encode *MODIFIER 
LETTER SMALL Q was perceived, despite the rationale, which must have been 
completely misunderstood, although it seems to me to be written in good English.
Thanks to Denis Jacqueryeʼs detailed answer to the question “Why is there no 
character for "superscript q" in Unicode?” [1], I got all links quickly [2][3][4].

>> At this point Iʼd like to mention what I thought about since this thread
>> was launched. The French language makes extensive use of superscripts
>> to note abbreviations. [...] Therefore I suggest granting
>> the French language full support by enabling superscript lowercase letters,
>> so that the SUPERSCRIPT deadkey that the French Standards body recommends
>> will work for all abbreviations. There is no need for superscript letters beyond
>> the basic alphabet, as no French abbreviation exceeds this range (despite
>> what I believed in 2014, like many other people).
> Whether è (and í) are needed or not is another question. Even if it were
> useful (as argued by others in this thread), it brings nontrivial
> technical difficulties in terms of NFC/NFD. But since people are used to
> seeing these characters being treated differently, I think the “problem” of
> the lack of superscript composed characters is less obvious than the lack
> of *MODIFIER LETTER SMALL Q, in the sense that the first absence is
> perceived (by the Unicode-naive user) as more normal than the second.
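
The NFC/NFD point can be illustrated with a small Python sketch. It makes no 
claim about what a hypothetical precomposed superscript è would look like; it 
only shows how the existing characters behave, under the assumption that a 
superscript è is spelled today as U+1D49 MODIFIER LETTER SMALL E followed by 
U+0300 COMBINING GRAVE ACCENT.

```python
import unicodedata

# The baseline letter has a precomposed form that NFD splits in two.
precomposed = "\u00E8"                                   # è
decomposed = unicodedata.normalize("NFD", precomposed)   # "e" + U+0300

# Assumed spelling of a superscript è: modifier e + combining grave.
spelled = "\u1D49\u0300"
nfc = unicodedata.normalize("NFC", spelled)    # nothing to compose into
nfkc = unicodedata.normalize("NFKC", spelled)  # compatibility folding drops the superscript

print([hex(ord(c)) for c in decomposed])
print([hex(ord(c)) for c in nfc])
print([hex(ord(c)) for c in nfkc])
```

A new precomposed character would have to fit into these normalization forms 
without changing the results for existing text.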

I really love your point of view; I understand that it is already shared by 
most people, and I strongly hope that it will be adopted by the UTC. 
Perhaps it is, as there is no notice of non-approval to be found in the archive. 

However, Iʼd like to know what answer was given to the proposer at or after the 
UTC meeting of August 9-13, 2010 in Redmond [5]. Such requests have to be sent 
to this List, which is monitored by meeting participants.


[1] Denis Jacqueryeʼs post: 

[2] Karl Pentzlinʼs proposal: 

[3] A comment on behalf of Adobe Systems, written on the first day of 
the UTC meeting where the proposal was rejected: 

[4] Karl Pentzlinʼs reply, two days later, i.e. three days before 
the end of the meeting: 

[5] The anchor in the UTC minutes at the related Action Item:
