Character Sequences of Uncertain Rendering (was: Version linking?)

Richard Wordingham via Unicode unicode at
Sun Aug 27 21:40:54 CDT 2017

On Sun, 27 Aug 2017 19:55:31 +0200
Philippe Verdy via Unicode <unicode at> wrote:

> 2017-08-27 6:06 GMT+02:00 Richard Wordingham via Unicode <
> unicode at>:  
> Canonical reordering is unambiguously referring to the canonical
> equivalences in TUS. These are automated and can occur at any time,
> and the only way to avoid them is to insert joiners. But they should
> never be needed for normal texts, except to split clusters or
> introduce semantic differences where they are relevant (and in that
> case the renderers will also try to distinguish them, otherwise they
> can freely reorder every sequence of diacritics with distinct
> non-zero combining classes and will represent all canonically
> equivalent sequences exactly the same way without distinguishing them).

This wasn't the sort of problem I was talking about.  The Indic
example with undefined rendering has two left matras with ccc=0.  The
question was whether they should be displayed from left to right (as in
MS Edge) or right to left (as in Firefox).
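
As a rough illustration of why canonical reordering does not settle
this, here is a minimal Python sketch.  The Bengali characters are an
assumption chosen only because both vowel signs are left-side matras
with ccc=0; they are not necessarily the sequence under discussion.

```python
import unicodedata

# Hypothetical sequence, chosen only for illustration: BENGALI LETTER KA
# followed by two left-side vowel signs, both with ccc=0.
KA, SIGN_I, SIGN_E = "\u0995", "\u09BF", "\u09C7"

seq_a = KA + SIGN_I + SIGN_E
seq_b = KA + SIGN_E + SIGN_I

# Canonical combining classes of the two matras.
classes = [unicodedata.combining(c) for c in (SIGN_I, SIGN_E)]

# Marks with ccc=0 are never moved by canonical reordering, so each
# sequence survives normalisation unchanged and the two stay distinct.
nfd_a = unicodedata.normalize("NFD", seq_a)
nfd_b = unicodedata.normalize("NFD", seq_b)
distinct = nfd_a != nfd_b

print(classes, nfd_a == seq_a, nfd_b == seq_b, distinct)
```

Normalisation therefore says nothing about which matra is drawn
leftmost; that is left entirely to the renderer.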

The problem of diacritics below having different combining classes has
been raised for minority languages in Thai.  There seems a definite
prospect that the rendering order has to depend on the writing system -
and the other order would simply be wrong.  Standardisation occurs
outside the purview of the UTC.  The order may be forced by CGJ,
which is a joiner in name only when it occurs before combining marks.
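
A minimal Python sketch of the CGJ effect follows.  The choice of
SARA U (ccc=103) and PHINTHU (ccc=9) is an assumption made for
illustration; the variable names are likewise hypothetical.

```python
import unicodedata

# Illustrative characters (an assumption, not taken from the thread):
# two Thai marks below with different non-zero combining classes.
KO_KAI, SARA_U, PHINTHU = "\u0E01", "\u0E38", "\u0E3A"
CGJ = "\u034F"  # COMBINING GRAPHEME JOINER, ccc=0

# Without CGJ, normalisation sorts the marks by combining class,
# so PHINTHU (9) is moved in front of SARA U (103).
typed = KO_KAI + SARA_U + PHINTHU
normalized = unicodedata.normalize("NFC", typed)

# With CGJ between them, the ccc=0 character blocks reordering and
# the typed order survives normalisation.
guarded = KO_KAI + SARA_U + CGJ + PHINTHU
guarded_normalized = unicodedata.normalize("NFC", guarded)

print([f"U+{ord(c):04X}" for c in normalized])
print([f"U+{ord(c):04X}" for c in guarded_normalized])
```

Whether a given font or shaping engine then renders the guarded
sequence as intended is a separate matter.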

