Choosing the Set of Renderable Strings

James Kass via Unicode
Tue May 15 07:19:42 CDT 2018

On Mon, May 14, 2018 at 11:31 AM, Richard Wordingham via Unicode
wrote:

> ...  One could argue that the three positions require
> different glyphs for SIGN U.  Each font would need its own PUA.

Or a consensus.

> ... There are several
> places in Tai Tham layout where I want to swap glyphs round, but for
> the layout engine to do so for me would cause grief for other Tai Tham
> fonts. This rearrangement cannot be delegated to the rendering
> engine.  There are Tai Tham fonts which handle Indic rearrangement in
> the ccmp feature, but they are then totally defeated by either ccmp not
> being enabled or by the USE doing basic Indic shaping.

Suppose the OpenType specs were revised to include a bit which could
be set for disabling basic Indic shaping by the USE?  I wouldn't set
it if I were just starting out to make a font for a complex script
requiring basic Indic shaping, and cannot imagine why anyone else just
starting out would.

> ...
> I think it would also help to make SIGN AA and SIGN TALL AA into
> letters as far as the USE is concerned. The default grapheme
> segmentation rules already treat them as consonants. The possible
> downside is that so doing might mess up some fonts.

The possibility of messing up some fonts has seldom (if ever) stopped
needed revisions to shaping engines before.  I should know.
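
For reference, the character properties those two marks currently carry
can be inspected with Python's standard unicodedata module. This is only
a sketch: it assumes "SIGN AA" and "SIGN TALL AA" mean TAI THAM VOWEL
SIGN AA and TAI THAM VOWEL SIGN TALL AA, the helper name describe() is
made up for illustration, and it reports Unicode's General_Category, not
the USE's own classification.

```python
import unicodedata

# Assumption: these are the characters meant by "SIGN AA" and "SIGN TALL AA".
NAMES = ["TAI THAM VOWEL SIGN AA", "TAI THAM VOWEL SIGN TALL AA"]

def describe(name):
    """Return (code point, General_Category, combining class) for a character name."""
    ch = unicodedata.lookup(name)
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.combining(ch))

for name in NAMES:
    # A category starting with "M" means the character is a mark, not a letter.
    print(name, describe(name))
```

The values printed depend on the Unicode version bundled with the
Python build in use.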

>> A good keyboard driver ...
> It won't work.  The text input delivered by X still needs to be
> supported, and without modifying the application, X can only input one
> character at a time.  Not everyone uses an 'input method'.

Every keyboard uses a driver, though.  I can't speak for "X", but my
understanding is that the keyboard driver acts as a sort of buffer
between the user's keystrokes and the application.

> Apparently, Hangul input should not be canonically normalised in South
> Korea. I've seen an implementation of the USE render canonically
> equivalent strings differently.  ...

Was that because the USE failed, or because the font provided lookups
mapping each of those strings to different glyphs?
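
To make "canonically equivalent" concrete for Hangul, here is a small
sketch using Python's standard unicodedata module. The function name
canonically_equivalent() and the sample syllable are chosen for
illustration only. It shows that a precomposed syllable and its
conjoining-jamo sequence are different code point sequences that
normalise to the same thing, so a font or shaper that sees them
unnormalised could, in principle, map them to different glyphs.

```python
import unicodedata

def canonically_equivalent(a, b):
    """True if the two strings have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Illustrative sample: one precomposed syllable and its conjoining jamo.
precomposed = "\uD55C"        # HANGUL SYLLABLE HAN
jamo = "\u1112\u1161\u11AB"   # CHOSEONG HIEUH + JUNGSEONG A + JONGSEONG NIEUN

print(precomposed == jamo)                         # raw code points differ
print(canonically_equivalent(precomposed, jamo))   # but they are canonically equivalent
print(unicodedata.normalize("NFC", jamo) == precomposed)
```

This says nothing about what any particular USE implementation does
with the two forms; it only shows that both spellings can reach the
renderer unless something normalises them first.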

Best regards,

James Kass
