Why incomplete subscript/superscript alphabet ?

Marcel Schneider charupdate at orange.fr
Wed Oct 5 10:44:38 CDT 2016

On Wed, 5 Oct 2016 06:35:52 +0000, Martin Mueller wrote:

> There is always a lot more history than reason in the world. 
> That said, given that alphabets have fixed numbers, it’s weird 
> that bits of super and subscripted letters appear in this or 
> that limited range but that you can’t cobble a whole alphabet 
> together in a consistent manner. If any, why not all, especially 
> if there are only two or three dozen. 

They would end up in the SMP, which threatens their usability on Windows
keyboard layouts: unlike Appleʼs, these layouts are not defined in XML, and
their dead keys cannot output two UTF-16 code units (a surrogate pair).
For IMEs this is no problem.
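
To make the UTF-16 point concrete, here is a minimal Python sketch that
counts the 16-bit code units a single character needs. The helper name
utf16_units is hypothetical, and MATHEMATICAL BOLD CAPITAL A is only a
stand-in for any SMP character; neither comes from the thread.

```python
def utf16_units(ch: str) -> int:
    """Return the number of 16-bit code units UTF-16 needs for one character."""
    # utf-16-le writes no byte order mark, so every 2 bytes is one code unit.
    return len(ch.encode("utf-16-le")) // 2

bmp_char = "\u1D49"      # MODIFIER LETTER SMALL E, in the BMP
smp_char = "\U0001D400"  # MATHEMATICAL BOLD CAPITAL A, in the SMP (stand-in)

print(f"U+{ord(bmp_char):04X}: {utf16_units(bmp_char)} code unit(s)")
print(f"U+{ord(smp_char):04X}: {utf16_units(smp_char)} code unit(s)")
```

A BMP letter fits in one code unit, while anything in the SMP needs a
surrogate pair, which is the part a dead key cannot emit.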

From a more theoretical viewpoint, encoding superscripted letters as such 
is opposed to Unicodeʼs design principles, as has already been pointed 
out. This is why only legacy superscripts have SUPERSCRIPT in their name.

As for the scattered code point allocations, they result from this pragmatic, 
case-by-case encoding. A letter isnʼt encoded as a preformatted superscript 
unless there are one or more precise usages documented in the proposal.
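
The naming and the scattering can be checked with Pythonʼs unicodedata
module. This is only an illustrative sketch: the sample list is an arbitrary
selection, and describe is a hypothetical helper.

```python
import unicodedata

# A few superscript-looking Latin letters that are already encoded.
samples = ["\u2071", "\u207F", "\u02B0", "\u1D43", "\u1D49", "\u1D58"]

def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' using the Unicode character name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in samples:
    print(describe(ch))
```

Only the first two samples carry SUPERSCRIPT in their name; the others are
MODIFIER LETTERs, and the code points sit in several different blocks.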

To come back to my new point in this thread: I believe that in French,
superscript lowercase letters have a particular function as abbreviation 
indicators, in the absence of any other visible sign. This viewpoint is 
now gaining ground, as it comes from French authorities (DGLFLF, Afnor) 
who are asking for a /superscript/ dead key to write abbreviations.
In French, there is a need and a demand to move this from a higher-level 
protocol to plain text. 

Hence the need for MODIFIER LETTER SMALL Q, for a proper solution. 
E.g., when trying to abbreviate ‘Bibliothèque’ to ‘Bibque’ in 
plain text, one will actually end up with ‘Bib ↑q_n’existe_pasᵘᵉ’ 
(‘n’existe pas’ meaning ‘does not exist’). 
Such a message is needed; otherwise users may think there is a bug 
in the keyboard. 
Once the encoding of MODIFIER LETTER SMALL Q has reached the point where the 
new scalar value is known, that character will take the place of the sequence 
and will at first display as a .notdef box. Explaining this is then a matter 
of documentation. I wasnʼt upset about the missing superscript q, but end 
users could be.
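
The fallback behaviour described above could be sketched as follows. This is
a hypothetical model in Python, not an actual keyboard layout: the table
SUPERSCRIPT and the function superscript_dead_key are made-up names, and only
two letters are mapped.

```python
# Hypothetical dead-key table: base letter -> preformatted superscript.
# Only two entries are shown; a real layout would list every encoded letter.
SUPERSCRIPT = {
    "e": "\u1D49",  # MODIFIER LETTER SMALL E
    "u": "\u1D58",  # MODIFIER LETTER SMALL U
}

def superscript_dead_key(base: str) -> str:
    """Return the superscript for base, or a visible notice if none is mapped."""
    # Fallback modeled on the message quoted in the text: arrow, letter, notice.
    fallback = f"\u2191{base}_n\u2019existe_pas"
    return SUPERSCRIPT.get(base, fallback)

# Typing "Bib", then the dead key before each of q, u, e.
typed = "Bib" + "".join(superscript_dead_key(c) for c in "que")
print(typed)
```

Once a code point for the missing letter is known, adding one entry to the
table would replace the notice with the character itself.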


