A sign/abbreviation for "magister"

Philippe Verdy via Unicode unicode at unicode.org
Sat Nov 3 14:41:54 CDT 2018


On Fri, 2 Nov 2018 at 20:01, Marcel Schneider via Unicode <
unicode at unicode.org> wrote:

> On 02/11/2018 17:45, Philippe Verdy via Unicode wrote:
> [quoted mail]
> >
> > Using variation selectors is only appropriate for these existing
> > (preencoded) superscript letters ª and º so that they display the
> > appropriate (underlined or not underlined) glyph.
>
> And it is for forcing the display of DIGIT ZERO with a short stroke:
> 0030 FE00; short diagonal stroke form; # DIGIT ZERO
> https://unicode.org/Public/UCD/latest/ucd/StandardizedVariants.txt
>
>  From that it becomes unclear why that isn’t applied to 4, 7, z and Z
> mentioned in this thread, to be displayed open or with a short bar.
>
> > It is not a solution for creating superscripts on arbitrary letters
> > and marking that they should be rendered as superscript (notably, the
> > base letter to transform into superscript may also have its own
> > combining diacritics, which must be encoded explicitly, and if you
> > use the variation selector, it should allow variation on the presence
> > or absence of the underline, which must then be encoded explicitly as
> > a combining character).
>
> I totally agree that abbreviation indicating superscript should not be
> encoded using variation selectors, as already stated I don’t prefer it.
> >
> > So finally what we get with variation selectors is: <baseline letter,
> > variation selector, combining diacritic> and <baseline letter
> > precombined with the diacritic, variation selector> which are NOT
> > canonically equivalent.
>
> That seems to me like a flaw in canonical equivalence. Variations must
> be canonically equivalent, and the variation selector position should
> be handled or parsed accordingly. Personally I’m unaware of this rule.
> >
> > Using a combining character avoids this caveat: <baseline letter,
> > combining diacritic, combining abbreviation mark> and <baseline letter
> > precombined with the diacritic, combining abbreviation mark>, which
> > ARE canonically equivalent. And this explicitly states the semantics
> > (something that is lost if we are forced to use presentational
> > superscripts in a higher-level protocol like HTML/CSS for rich text
> > format, and one just extracts the plain text; using collation will
> > not help at all, except if collators are built with preprocessing
> > that will first infer the presence of a <combining abbreviation mark>
> > to insert after each combining sequence of the plain text enclosed in
> > an italic style).
>
> That exactly outlines my concern with calls for relegating superscript
> as an abbreviation indicator to higher level protocols like HTML/CSS.
>

That's exactly my concern: this relegation to HTML/CSS should NOT occur
at all! It's really not the solution; HTML/CSS styles have NO semantics at
all (I demonstrated it in the message you are quoting).
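
The canonical-equivalence point quoted above can be checked with a small
Python sketch using the standard unicodedata module. Note the assumption:
the <combining abbreviation mark> is not encoded, so U+034F COMBINING
GRAPHEME JOINER is used here purely as a stand-in, because it is an
existing combining mark with combining class 0. The variable and function
names are illustrative only.

```python
import unicodedata

E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE (precomposed)
ACUTE = "\u0301"    # COMBINING ACUTE ACCENT, combining class 230
VS1 = "\ufe00"      # VARIATION SELECTOR-1, combining class 0
# ASSUMPTION: stand-in for the proposed (unencoded) abbreviation mark.
# COMBINING GRAPHEME JOINER is only used because it also has class 0.
MARK = "\u034f"


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A variation selector has to follow the base letter directly, so the
# decomposed and precomposed spellings put it in different positions.
vs_split = "e" + VS1 + ACUTE
vs_precomposed = E_ACUTE + VS1

# A trailing combining mark sits after the whole cluster in both spellings.
mark_split = "e" + ACUTE + MARK
mark_precomposed = E_ACUTE + MARK

print(canonically_equivalent(vs_split, vs_precomposed))      # False
print(canonically_equivalent(mark_split, mark_precomposed))  # True
```

The first pair differs because the class-0 variation selector blocks any
reordering with the accent; the second pair decomposes to the same
sequence.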


> > There's little risk: if the <combining abbreviation mark> is not
> > mapped in fonts (or not recognized by text renderers to create
> > synthetic superscript scripts from existing recognized clusters), it
> > will render as a visible .notdef (tofu). But normally text renderers
> > recognize the basic properties of characters in the UCD and can see
> > that <combining abbreviation mark> has a combining mark general
> > property (it also knows that it has a 0 combining class, so
> > canonical equivalences are not broken) to render a better symbol
> > than the .notdef "tofu": it should preferably render a dotted circle.
> > Even if this tofu or dotted circle is rendered, it still explicitly
> > marks the presence of the abbreviation mark, so there's less
> > confusion about what is preceding it (the combining sequence that was
> > supposed to be superscripted).
>
> The problem with the <combining abbreviation mark> you are proposing
> is that it contradicts streamlined implementation as well as easy
> input of current abbreviations like ordinal indicators in French and,
> optionally, in English. Preformatted superscripts are already widely
> implemented, and coding of "4ᵉ" only needs two characters, input
> using only three fingers in two times (thumb on AltGr, press key
> E04 then E12) with an appropriately programmed layout driver. I’m
> afraid that the solution with <combining abbreviation mark> would be
> much less straightforward.
>

This is not a real concern: these are old legacy practices that should no
longer be recommended, as they are ambiguous (nothing says that "4ᵉ" is an
abbreviated ordinal; it can just as well be 4 raised to the power e, or
various other things).
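
As a rough illustration of that ambiguity (a sketch only, using Python's
standard unicodedata module), the character data for "4ᵉ" records nothing
beyond a <super> compatibility mapping, and compatibility normalization
discards even that:

```python
import unicodedata

ordinal = "4\u1d49"  # DIGIT FOUR + MODIFIER LETTER SMALL E

# Show what the character database actually records for each character.
for ch in ordinal:
    print(f"U+{ord(ch):04X}",
          unicodedata.name(ch),
          unicodedata.decomposition(ch) or "(no decomposition)")

# Canonical normalization keeps the superscript letter as-is...
kept = unicodedata.normalize("NFC", ordinal)
# ...but compatibility normalization folds it to a plain "e".
folded = unicodedata.normalize("NFKC", ordinal)

print(kept == ordinal)  # True
print(folded)           # 4e
```

Nothing in this data distinguishes an abbreviated ordinal from an exponent
or any other use of a raised letter.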

Also, the keys to press on a keyboard are absolutely not a concern: the same
key presses you propose can just as well generate the letter followed by the
combining abbreviation mark. In fact, what you propose is even less
practical, because it uses complex input for all characters and requires
mapping keys for the whole alphabet (so it uses precious space on the key
layout). It's simply easier for everyone to press "4", "e", followed by a
combination (like AltGr+".") to produce the <combining abbreviation mark>!

And these legacy superscript characters are still not guaranteed to have
no underline (the variation may well be significant), and there will never
be enough superscript characters for the many superscript notations (not
just abbreviations) that should still be encoded with the normal letters
(including in clusters, with diacritics, ligatures and so on): Unicode will
never agree to reencode all existing letters (plus the infinite set of
clusters that can be formed with them) just to turn them into
superscript/subscript variants. These encodings, which found their way in
through the need for roundtrip compatibility with legacy charsets (before
the UCS), should never have occurred at all: they should not even have been
tolerated for IPA symbols or for mathematical symbols (monospace, bold,
italic...).

The variation selector solution is also not suitable when the intent is
only to add semantics to the encoded text and not to drive the choice
between glyph variants (when the default glyph without the variation
selector can FREELY vary into forms that are UNACCEPTABLE in some contexts,
the variation does not really encode the semantics but encodes the visual
rendering intent: it is too easily abused to do something else).
But a single *semantic* combining mark does not encode any visual rendering
intent the way variation selectors do. It still allows glyphic variations
as long as the semantics are kept, and it has the correct fallbacks
(there's no obscuring of the encoding of the clusters to which the semantic
combining mark applies: they are still part of the same general encoding as
normal letters, and rendering the abbreviation mark does not necessarily
mean that the base cluster MUST be rendered differently from normal
letters: it is also permitted to render the combining mark, for example, as
a dot, or as a true diacritic on top of the letters). And if needed, the
following can control the visual appearance:

> >
> > The <combining abbreviation mark> can also have its own <variation
> > selector> to select other styles when they are optional, such as
> > adding underlines to the superscripted letter, or rendering the
> > letter instead as a subscript, or as a small baseline letter with a
> > dot after it: this is still an explicit abbreviation mark, and the
> > meaning of the plain text is still preserved. The variation selector
> > is only suitable for altering the rendering of a cluster when it
> > effectively has several variants and the default rendering is not
> > universal, notably across font styles initially designed for specific
> > markets with their own local preferences (the variation selector
> > still allows the same fonts to map all known variants distinctly,
> > independently of the initial arbitrary choice of the default glyph
> > used when the variation selector is missing).
>
>