Encoding <combining abbreviation mark>

Philippe Verdy via Unicode unicode at unicode.org
Sun Nov 4 13:19:55 CST 2018

On Sun, 4 Nov 2018 at 18:34, Marcel Schneider <charupdate at orange.fr>
wrote:

> On 04/11/2018 17:45, Philippe Verdy wrote:
> Marcel
> * As already repeatedly stated, I’m taking the one bit where TUS states
> that all natural languages shall be given a semantically unambiguous (i.e.
> not introducing new ambiguity) and interoperable digital representation.

I also support the semantically unambiguous digital representation of all
natural languages.
Interoperability is always limited, even for existing scripts (including
Latin); that's why text renderers (and fonts) constantly need new
developments (but that does not mean that these developments will be
That's why we have to document reasonable fallbacks for rendering on
limited platforms, each time this is possible (and in this case this is
clearly possible with extremely low effort).

Even the mere fallback of rendering the <combining abbreviation mark> as a
dotted circle (total absence of support) will not completely block reading
the abbreviation:
* you'll see "2e◌" (which is still better than only "2e", with minimal
impact) instead of
* "2◌" (which is worse! This is still what already happens when you use
the legacy encoded <superscript e>, which is also semantically ambiguous for
text processing), or
* "2e." (which is acceptable for rendering but semantically ambiguous for
text processing)
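
The fallbacks listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <combining abbreviation mark> has no assigned code point, so a Private Use Area character (U+E000) stands in for it here, and the function name `render_fallback` and its mode names are hypothetical.

```python
# Hypothetical stand-in: the proposed <combining abbreviation mark> has no
# assigned code point, so a Private Use Area character is used for illustration.
ABBREVIATION_MARK = "\uE000"
DOTTED_CIRCLE = "\u25CC"  # the dotted circle character


def render_fallback(text: str, mode: str = "dotted_circle") -> str:
    """Return a plain-text fallback for a renderer that lacks support for the mark."""
    replacements = {
        "dotted_circle": DOTTED_CIRCLE,  # no support at all: mark shown as a dotted circle
        "dot": ".",                      # readable, but ambiguous for text processing
        "drop": "",                      # the mark is lost and only the letters remain
    }
    return text.replace(ABBREVIATION_MARK, replacements[mode])


abbreviation = "2e" + ABBREVIATION_MARK
for mode in ("dotted_circle", "dot", "drop"):
    print(mode, "->", render_fallback(abbreviation, mode))
```

In every mode the letter "e" stays in the text stream, which is the point of the comparison: the unsupported case degrades to "2e" plus a visible placeholder rather than to a missing glyph.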

So compare things fairly: the solution I propose is EVEN MORE INTEROPERABLE
than using <superscript Latin letters> (which also cannot note all
abbreviations, as it is limited to just a few letters, and most of
the time limited to only the few lowercase IPA symbols). It puts an end to
the pressure to encode superscript letters.
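
The coverage of the existing superscript letters can be checked against the Unicode data bundled with a given Python build. The sketch below is an assumption about how one might test this, not part of the proposal: it collects every character whose compatibility decomposition is tagged `<super>` and reports which basic Latin letters have no such form. The result depends on the Unicode version of the Python in use, so no specific output is claimed.

```python
import string
import unicodedata


def build_superscript_map() -> dict[str, list[str]]:
    """Map each base character to the characters that decompose to <super> + base."""
    mapping: dict[str, list[str]] = {}
    for cp in range(0x110000):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-character superscript decompositions, e.g. "<super> 0065".
        if len(parts) == 2 and parts[0] == "<super>":
            base = chr(int(parts[1], 16))
            mapping.setdefault(base, []).append(chr(cp))
    return mapping


superscripts = build_superscript_map()
# Basic Latin letters with no superscript form in this Unicode version.
missing = [c for c in string.ascii_letters if c not in superscripts]

print("Unicode version:", unicodedata.unidata_version)
print("Superscript forms of 'e':", superscripts.get("e", []))
print("Letters without a superscript form:", "".join(missing))
```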

If you want to support other notations (e.g. in chemical or
mathematical notations, where both superscript and subscript must be present
and stack together, and where the allowed variation using a dot or similar)
you need another encoding, and the existing legacy <superscript Latin
letters> are not suitable either.