Character Sequences of Uncertain Rendering (was: Version linking?)

Philippe Verdy via Unicode unicode at
Sun Aug 27 12:55:31 CDT 2017

2017-08-27 6:06 GMT+02:00 Richard Wordingham via Unicode <
unicode at>:

> On Sat, 26 Aug 2017 21:52:19 +0200
> Philippe Verdy via Unicode <unicode at> wrote:
> > 2017-08-26 21:28 GMT+02:00 Richard Wordingham via Unicode <
> > unicode at>:
> > Of course SHY in this use is not suitable, but who knows if one will
> > not need this to split in two parts what would be otherwise a single
> > cluster (possibly reordered by canonical reordering if one needs to
> > split between two Indic matras: this would suggest there's a need for
> > a new "empty base consonant" for that Indic script, but SHY (U+00AD)
> > should probably not have the correct effect if it also inserts an
> > undesired line break opportunity, independently of which glyph
> > would be rendered and of the position (first or second line) where
> > it would be rendered if the linebreak is honored).
> I am confused as to what conceivable case you have in mind.  An example
> would help.  I wonder if I'm misunderstanding what you mean by
> 'canonical reordering'.

Canonical reordering refers unambiguously to the canonical equivalences
defined in TUS. Such reordering is automatic and can occur at any time, and
the only way to avoid it is to insert joiners. But joiners should never be
needed for normal texts, except to split clusters or to introduce semantic
differences where these are relevant. In that case renderers will also try
to distinguish the sequences; otherwise they can freely reorder every
sequence of diacritics with distinct non-zero combining classes and will
represent all canonically equivalent sequences in exactly the same way,
without distinguishing them.
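
To make the reordering concrete, here is a minimal sketch in Python using
the standard unicodedata module. The Latin marks (dot below, ccc 220;
diaeresis, ccc 230), the base letter "q" and the choice of CGJ (U+034F,
ccc 0) as the joiner are assumptions chosen for illustration and are not
taken from this thread; the same mechanism should apply to any marks with
distinct non-zero combining classes.

```python
import unicodedata

# Illustrative characters (assumptions, not from the thread).
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, combining class 220
DIAERESIS = "\u0308"  # COMBINING DIAERESIS, combining class 230
CGJ = "\u034F"        # COMBINING GRAPHEME JOINER, combining class 0


def nfd(text):
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)


# The same two marks on the same base, typed in opposite orders.
# "q" is used because it has no precomposed forms with these marks.
a = "q" + DIAERESIS + DOT_BELOW
b = "q" + DOT_BELOW + DIAERESIS

# Same as a, but with a class-0 joiner placed between the two marks.
c = "q" + DIAERESIS + CGJ + DOT_BELOW

for mark in (DOT_BELOW, DIAERESIS, CGJ):
    print(f"U+{ord(mark):04X} ccc={unicodedata.combining(mark)}")

# Canonical ordering sorts the marks by combining class, so both
# orders normalize to the same sequence.
print(nfd(a) == nfd(b))  # True

# A character with combining class 0 is never moved and marks are not
# moved across it, so the typed order survives normalization.
print(nfd(c) == c)       # True
```

Without the joiner the two orders are canonically equivalent, so a
process is free to replace one with the other; with it, the sequence is
no longer canonically equivalent to either of them.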