Choosing the Set of Renderable Strings

Mon May 14 07:12:56 CDT 2018

In response to William Overington's post: it is easier to transcode
data from a PUA scheme into Unicode than to enter the data from
scratch.  (The same could be said for a customized ASCII font.)  Some
users may not wish to wait even the handful of years it took for
mainstream Indic complex scripts to be rendered properly.

At this phase of Unicode's progress, however, we shouldn't encourage
the interchange of such PUA data.  Since it's simple to transcode, any
such data should be transcoded prior to interchange or permanent
storage.  Recipients whose systems lack proper Unicode rendering for
complex scripts such as Tai Tham could then transcode it back to the
PUA scheme for display or printing.
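As a rough sketch of that round trip, assuming a table-driven scheme
in which each PUA code point stands for one fixed Unicode sequence:
the table below is invented for illustration (U+E000 and U+E001 are
arbitrary PUA assignments, and the Tai Tham sequences were chosen
only as sample values), and the function names are hypothetical.

```python
# Hypothetical PUA scheme: each PUA code point stands for one Unicode
# sequence. These assignments are invented for illustration only.
PUA_TO_UNICODE = {
    "\uE000": "\u1A20",              # one letter from the Tai Tham block
    "\uE001": "\u1A20\u1A60\u1A20",  # a stack drawn as one glyph in the PUA font
}

# Reverse table, used when a recipient needs the PUA form for display.
UNICODE_TO_PUA = {seq: pua for pua, seq in PUA_TO_UNICODE.items()}
_MAX_LEN = max(len(seq) for seq in UNICODE_TO_PUA)


def pua_to_unicode(text: str) -> str:
    """Transcode before interchange or storage; unmapped characters pass through."""
    return "".join(PUA_TO_UNICODE.get(ch, ch) for ch in text)


def unicode_to_pua(text: str) -> str:
    """Transcode for display, preferring the longest matching sequence."""
    out = []
    i = 0
    while i < len(text):
        for n in range(min(_MAX_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in UNICODE_TO_PUA:
                out.append(UNICODE_TO_PUA[chunk])
                i += n
                break
        else:
            # No mapped sequence starts here; copy the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)
```

The longest-match rule matters only in the display direction, where
one mapped sequence may be a prefix of another.  A real scheme would
also need to guarantee that no two PUA code points map to the same
sequence, or the round trip is not lossless.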

An OpenType font, a keyboard driver, and a text conversion utility
might go a long way towards supporting complex scripts for users whose
systems cannot otherwise currently support them.

A good keyboard driver should be able to take some of the burden off
the OpenType tables by offering some degree of control over the
actual character strings which are being stored (and presented to the
font for rendering).  That would allow multiple fonts covering the
same script to be used without each carrying bloated and redundant
OpenType tables.

(Many font developers might consider that any kind of normalization
should be handled at input rather than left up to the font.  Keyboard
developers might have a different idea, though.)
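A minimal sketch of the input-side idea, assuming the keyboard layer
simply passes each typed cluster through a Unicode normalization form
before committing it: the function name is hypothetical, Latin
combining marks are used only because their canonical ordering is
easy to check, and whether canonical normalization alone is adequate
for a given complex script is a separate question.

```python
import unicodedata


def commit_keystrokes(typed: str) -> str:
    """Hypothetical keyboard-layer hook: choose the string that gets stored.

    NFD puts combining marks into canonical order, so canonically
    equivalent typing orders produce the same stored string.
    """
    return unicodedata.normalize("NFD", typed)


# Two typing orders for the same cluster: acute then dot below, and the reverse.
order_a = "a\u0301\u0323"
order_b = "a\u0323\u0301"
stored_a = commit_keystrokes(order_a)
stored_b = commit_keystrokes(order_b)
```

Both calls yield the same stored string, so a font only has to
handle one mark order for that cluster rather than every order a
user might type.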

A hundred years from now, properly encoded Tai Tham text should still
be legible.  But the ability to display data in temporary PUA
schemes, set up in lieu of proper rendering support, tends to fade
away over time.