A sign/abbreviation for "magister"
Marcel Schneider via Unicode
unicode at unicode.org
Fri Nov 2 10:10:21 CDT 2018
On 01/11/2018 16:43, Asmus Freytag via Unicode wrote:
> I don't think it's a joke to recognize that there is a continuum here and that
> there is no line that can be drawn which is based on straightforward principles.
> In this case, there is no such framework that could help establish pragmatic
> boundaries dividing the truly useful from the merely fanciful.
I think the red line was always between the positive and the negative answer to
the question whether a given graphic is relevant for legibility/readability of
the plain text backbone. But humans can be trained to mentally disambiguate
a mass of confusables, so the line vanishes and the continuum remains intact.
On 02/11/2018 06:22, Asmus Freytag via Unicode wrote:
> On 11/1/2018 7:59 PM, James Kass via Unicode wrote:
>> Alphabetic script users write things the way they are spelled and spell things
>> the way they are written. The abbreviation in question as written consists of
>> three recognizable symbols. An "M", a superscript "r", and an equal sign
>> (= two lines). It can be printed, handwritten, or in fraktur; it will still
>> consist of those same three recognizable symbols.
>> We're supposed to be preserving the past, not editing it or revising it.
> Alphabetic script users' handwriting does not match print in all features.
> Traditional German handwriting used a line like a macron over the letter 'u'
> to distinguish it from 'n'. Rendering this with a u-macron in print would be
> the height of absurdity.
> I feel similarly about the assertion that the "two lines" are something that
> needs to be encoded, but only an expert would know for sure.
Indeed it would be relevant to know whether it is mandatory in Polish, and I’m
not an expert. But looking at several scripts using abbreviation indicators as
superscript, i.e. Latin and Cyrillic (when using the Latin-script-written
abbreviation of "Numero", given Cyrillic for "N" is "Н", so it’s strictly
speaking one single script, and two scripts using it), then we can easily see
how single and double underlines are added or not depending on font design
and on customary writing and display. E.g. the Romance feminine and masculine
ordinal indicators have one or zero underlines, to such an extent that French
typography specifies that the masculine ordinal indicator, despite being a
superscript small o, is unfit to compose the French "numéro" abbreviation,
that must not have an underline. Hence DEGREE SIGN is less bad than U+00BA.
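The characters named above can be inspected with a small Python sketch using the standard unicodedata module. It only prints code points, names and compatibility decompositions; the helper name describe is an assumption made for this illustration.

```python
import unicodedata

def describe(ch):
    """Return (code point, character name, decomposition mapping) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decomposition(ch))

# Feminine and masculine ordinal indicators, then DEGREE SIGN.
for ch in "\u00AA\u00BA\u00B0":
    print(*describe(ch))
```

The two ordinal indicators carry a <super> compatibility decomposition to "a" and "o", while DEGREE SIGN has none, so it is not a superscript letter as far as the character data is concerned.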
If applying the same to Polish, "Magister" is "Mʳ" and is straightforward
to input when using a new French keyboard layout or an enhanced variant of
any national Latin one having small superscripts on the Shift+Num level, or
via a ‹superscript› dead key, mapped e.g. on Shift + AltGr/Option + E or
any of the 26 letter keys as mnemonically convenient ("superscript"
translates to French "exposant"); or ‹Compose› ‹^› [e] (where the ASCII
circumflex or caret is repurposed for superscript compose sequences, while
‹circumflex accent› is active *after* LESS-THAN SIGN, consistently with the
*new* convention for ‹inverted breve› using LEFT PARENTHESIS rather than "g)").
These details are posted in this thread on this List rather than CLDR-USERS
in order to make clear that typing superscript letters directly via the
keyboard is easy, and therefore to propose it is not to harass the end-user.
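The ‹superscript› dead key idea can be sketched in a few lines of Python. The table and the function name below are hypothetical and do not reproduce any actual keyboard layout; only the code points are real Unicode characters.

```python
# Hypothetical dead-key table: base letter -> preformatted superscript letter.
SUPERSCRIPT = {
    "r": "\u02B3",  # MODIFIER LETTER SMALL R
    "o": "\u1D52",  # MODIFIER LETTER SMALL O
    "e": "\u1D49",  # MODIFIER LETTER SMALL E
    "n": "\u207F",  # SUPERSCRIPT LATIN SMALL LETTER N
}

def dead_key_superscript(base):
    """Return the superscript form of `base`, or `base` unchanged if unmapped."""
    return SUPERSCRIPT.get(base, base)

# Typing "M", then the dead key followed by "r":
print("M" + dead_key_superscript("r"))
```

A real layout would of course cover the whole alphabet; the four entries here are only enough to show the principle.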
On 02/11/2018 13:09, Asmus Freytag via Unicode wrote:
> To transcribe the postcard would mean selecting the characters appropriate
> for the printed equivalent of the text.
As already suggested, selecting the variants can be done using variation
selectors, provided the Standard has defined the intended use case.
> If the printed form had a standard way of superscripting letters with a
> decoration below when used for abbreviations,
As already pointed out, Latin script does not benefit from a consensus
to use underline for superscript. E.g. Italian, Portuguese and Spanish
do use underline for superscript, English and French do not.
> then, and only then would we start discussing whether this decoration
> needs to be encoded, or whether it is something a font can supply as part
> of rendering the (sequence of) superscripted letters.
I think the problem is not completely outlined, as long as the use of
variation sequences is not mentioned. There is no "all" or "nothing"
dilemma, given Unicode has the means of providing a standard way of
representing calligraphic variations using variation selectors. E.g.
the letter ENG is preferred in big lowercase form when writing
Bambara, while other locales may like it in hooked uppercase.
The Bambara Arial font makes it possible to ensure the right glyph is used,
and Arial in general follows the Bambara preference, but other fonts
do not, while some of them contain the Bambara-fit glyph but
don’t display it unless prompted by an OpenType-supporting renderer
with appropriate settings turned on, e.g. on a locale identifier basis.
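At the code point level, a variation sequence is simply a base character immediately followed by a variation selector. The Python sketch below pairs MODIFIER LETTER SMALL R with VARIATION SELECTOR-1 purely for illustration; this pairing is hypothetical and is assumed not to be a standardized variation sequence, and the function name is likewise an assumption.

```python
import unicodedata

VS1 = "\uFE00"  # VARIATION SELECTOR-1

def variation_sequence(base, selector=VS1):
    """Build a variation sequence: the base character followed by the selector."""
    return base + selector

# Hypothetical sequence; not claimed to be defined by the Standard.
seq = variation_sequence("\u02B3")
print([f"U+{ord(c):04X} {unicodedata.name(c)}" for c in seq])
print(unicodedata.category(VS1))  # general category of the selector
```

The plain text backbone is unchanged by such a sequence: removing the selector leaves the base character as it was.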
> (Perhaps with the aid of markup identifying the sequence as abbreviation).
That seems to me a regression, after the front has moved in favor of
recognizing Latin script needs preformatted superscript. The use case is
clear, as we have ª, º, and n° with degree sign, and so on as already
detailed in long e-mails in this thread and elsewhere. There is no point
in setting up or maintaining a Unicode policy stating otherwise, as such
a policy would be inconsistent with long-lasting and extremely widespread
practice.
The main thing to fix is the font stack of user agents, that is, ultimately,
everyone’s computer. Alternatively, web sites may wish to use web fonts,
in order to have superscripts displayed in a professional and civilized
way, with no ransom-note effect.
In a Unicode-conformant way, to put it briefly.
> All else is just applying visual hacks to simulate a specific appearance,
> at the possible cost of obscuring the contents.
As already pointed out, the hack here is to use a higher level protocol
to simulate the effect of abbreviation indicator superscript. Using the
latter is not “obscuring”, but _clarifying_ “the contents.” But I agree
that adding combining diacritics to get the related underlines may
obscure the content if unsupported (displaying as .notdef box).
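The combining-mark approach can be sketched in Python as follows. The abbreviation is spelled as "M", MODIFIER LETTER SMALL R, and a combining mark below; the choice of U+0347 COMBINING EQUALS SIGN BELOW for the "two lines" is an assumption made for illustration, not an established spelling, and the function name is hypothetical.

```python
import unicodedata

# "M" + MODIFIER LETTER SMALL R + COMBINING EQUALS SIGN BELOW (assumed spelling)
MAGISTER = "M\u02B3\u0347"

def plain_backbone(text):
    """Drop combining marks present as separate code points, then apply NFKC."""
    stripped = "".join(c for c in text if not unicodedata.combining(c))
    return unicodedata.normalize("NFKC", stripped)

print(plain_backbone(MAGISTER))  # Mr
```

If the combining mark is unsupported by the font, only the display suffers; the underlying letters remain recoverable as shown.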
The concern about machine readability of the content is addressed by
setting up equivalence classes and using DUCET discussed in the