Why incomplete subscript/superscript alphabet ?

Denis Jacquerye moyogo at gmail.com
Wed Oct 5 09:17:30 CDT 2016


> There is no point about other letters than the basic alphabet
> superscripted, as no French abbreviation exceeds this range (despite
> what I believed in 2014, like many other people).

What does that mean? How would that help for the French vernacular
3<super>ème</super>, or the Spanish C.<super>ía</super>? You might find
there are many more uses than you think. Higher-level protocols can already
support these.
Maybe what we need is better and more general higher-level protocol support.
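
To make the coverage question concrete, here is a minimal Python sketch (not
part of the original exchange) that asks the Unicode Character Database which
letters of "ème" and "ía" have a single-code-point superscript form. The
helper names build_superscript_index and coverage are made up for this
illustration, and the answers depend on the Unicode version bundled with the
interpreter (see unicodedata.unidata_version). It only looks at characters
whose compatibility decomposition is "<super>" plus one base character, so
sequences built with combining marks are not considered.

```python
import sys
import unicodedata

def build_superscript_index():
    """Map each base character to the code points that decompose to '<super> base'."""
    index = {}
    for cp in range(sys.maxunicode + 1):
        decomp = unicodedata.decomposition(chr(cp))
        if decomp.startswith("<super> "):
            targets = decomp.split()[1:]
            if len(targets) == 1:  # skip multi-character forms such as the trade mark sign
                base = chr(int(targets[0], 16))
                index.setdefault(base, []).append(chr(cp))
    return index

def coverage(word, index):
    """Return (letter, [superscript forms]) pairs; an empty list means none was found."""
    return [(ch, index.get(ch, [])) for ch in word]

INDEX = build_superscript_index()

for word in ("ème", "ía", "th"):
    for ch, forms in coverage(word, INDEX):
        print(f"{ch!r}: {forms or 'no single superscript code point'}")
```

The accented letters come back empty while the unaccented ones do not, which
is the gap a higher-level protocol (markup such as <super>) covers without
any new characters.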


On Wed, 5 Oct 2016 at 15:01 Marcel Schneider <charupdate at orange.fr> wrote:

> On Wed, 5 Oct 2016 14:27:44 +0900, Martin J. Dürst wrote:
> > On 2016/10/04 19:35, Marcel Schneider wrote:
> >> On Mon, 3 Oct 2016 13:47:09 -0700, Asmus Freytag (c) wrote:
> >>
> >>> Later, the beta and gamma were encoded for phonetic notation, but not
> >>> the alpha.
> >>>
> >>> As a result, you can write basic formulas for select compounds, but
> >>> not all. Given that these basic formulae don't need full 2-D layout,
> >>> this still seems like an arbitrary restriction.
> >>
> >> When itʼs about informatics, arbitrary restrictions are precisely what
> >> gets me upset. Those limitations are, as I wrote the other day, a
> >> useless worsening of the usability and usefulness of a product.
> >
> > This kind of "let's avoid arbitrary limitations" argument works very
> > well for subjects that are theoretical, straightforward, and rigid in
> > nature. Many (but not all) subjects in computer science (informatics)
> > are indeed of such a nature.
> >
> > The Unicode Consortium (or more specifically, the UTC) does a lot of
> > hard work to create theories where appropriate, and to explain them
> > where possible. But they recognize (and we should do so, too) that in
> > the end, writing is a *cultural* phenomenon, where straightforward,
> > rigid theories have severe limitations.
> >
> > From a certain viewpoint (the chemist's in the example above), the
> > result may look arbitrary, but from another viewpoint (the
> > phoneticist's), it looks perfectly fine. At first, it looks like it
> > would be easy to fix such problems, but each fix risks introducing new
> > arbitrariness when seen from somebody else's viewpoint. Getting upset
> > won't help.
>
> Iʼve got the point, thanks. Phoneticians need to write running text that
> is immediately legible, while a chemistry database may use particular
> notational conventions that work with baseline letters to be parsed on
> semantics or light markup for proper display in the UI. The UTC decision
> thus questioned the design principle of using plain text for chemical
> formulae. No doubt it was understood that validating this choice would
> have opened the door to encoding more special characters for upgrading
> or similar purposes.
>
> At this point Iʼd like to mention what I have been thinking about since
> this thread was launched. The French language makes extensive use of
> superscripts to note abbreviations. This is not a mere styling issue, as
> it is in English. E.g. without superscripts, the abbreviation ‘nos’
> [numbers] is ambiguous with the pronoun ‘nos’ [our]. The most that can be
> easily disambiguated is ‘n°’ [number] with the degree sign available on
> the common French keyboard layout.
> As an anecdote: when a technician led me to discover the field
> ‘no centre mess’ in the UI of my cellphone, it took me several seconds to
> understand ‘number of SMS center/centre’, which is the actual meaning; but
> here, some additional confusion resulted from the interlanguage homograph
> ‘no’.
>
> Written words being ambiguous with one another is a common phenomenon in
> natural languages. Disambiguation is widely achieved by adding
> vowel signs (Hebrew) or diacritics (languages using the Latin script).
> French was disfavored in computer practice (applied informatics) during a
> certain time when diacritics were unavailable, on uppercase letters longer
> than on lowercase.
> AFAIK, Latin letters like ‘ij’ and ‘œ’ first gained binary existence thanks
> to the ISO 6937 charset, while a Dutch standards author asked his
> compatriots to always write ‘ij’ with two ASCII letters, and two Frenchmen
> prevented the ‘œ’ from being encoded in Latin-1 at the intended code
> points because of its non-existence in computer printers.
>
> But today, thanks to Unicode, thatʼs all over. Therefore I suggest granting
> the French language full support by enabling superscript lowercase letters,
> so that the SUPERSCRIPT dead key that the French Standards body recommends
> will work for all abbreviations. There is no point about other letters than
> the basic alphabet superscripted, as no French abbreviation exceeds this
> range (despite what I believed in 2014, like many other people).
> Additionally Iʼm proposing a modifier key combination (using a new
> modifier key on the 105th key on ISO keyboards) to access the lowercase
> superscripts on live keys:
> Shift + Num + [letter key] ➔ [superscript lowercase].
> I can easily type ‘on the 105ᵗʰ key’, and so will all users in France, at
> least with the dead key.
>
> The missing letter is superscript q == MODIFIER LETTER SMALL Q.
> Actually, when Shift + Num + Q is pressed on the projects,
> ‘ ↑q_n’existe_pas’ [ superscript ‘q’ does not exist] is inserted.
>
> Karl Pentzlin had the merit of proposing the missing letter superscript q
> for use in French abbreviations, but the UTC must have refused by arguing
> from English usage and from French recommendations. These are now changing.
> Moreover, as I tried to demonstrate above, one cannot always rely on such
> low-profile recommendations, which express the humility and
> undemandingness of their author more than the real practical needs and
> linguistic requirements.
>
> As for searchability, Google even has the mathematical alphabets in its
> equivalence classes, so that any request written e.g. in doublestruck
> letters is read as if it were entered in plain ASCII.
>
> Best regards,
>
> Marcel
>
>
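
On the searchability remark in the quoted message: the kind of equivalence
described there can be approximated with Unicode compatibility normalization.
The sketch below is only an illustration under that assumption; it says
nothing about how any particular search engine works, and fold_for_search is
a hypothetical helper name. It folds double-struck and superscript letters to
their plain counterparts with NFKC and then case-folds the result.

```python
import unicodedata

def fold_for_search(text):
    """Fold compatibility variants to base characters (NFKC), then case-fold."""
    return unicodedata.normalize("NFKC", text).casefold()

# Spell "unicode" in MATHEMATICAL DOUBLE-STRUCK SMALL letters (U+1D552 is 'a').
doublestruck = "".join(chr(0x1D552 + ord(c) - ord("a")) for c in "unicode")

print(fold_for_search(doublestruck))
# "105" followed by MODIFIER LETTER SMALL T and MODIFIER LETTER SMALL H.
print(fold_for_search("105\u1d57\u02b0"))
```

The same folding is what makes superscript modifier letters lose their
distinction in a search index, so text that relies on them for meaning still
needs the unfolded form stored somewhere.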