PUA (BMP) planned characters HTML tables

Richard Wordingham via Unicode unicode at unicode.org
Mon Aug 12 02:26:07 CDT 2019

On Mon, 12 Aug 2019 01:21:42 +0000
James Kass via Unicode <unicode at unicode.org> wrote:

> There was a time when populating the PUA with precomposed glyphs was 
> necessary for printing or display, but that time has passed.

There is still the issue that in pure X one can't put sequences of
characters on a key; if the application doesn't invoke an input method
one is stuck.  Useful 20-year-old proprietary code may be totally unable
to use modern font capabilities.  Don't forget the Cobol Y10k joke.

On Ubuntu at least, there was a period when Emacs couldn't access
X-based input methods from an English locale.  The work-around was
either to use a Japanese locale and rely on the vanilla lack of
internationalisation in the interface, or to use Emacs's very convenient
alternative keyboard capability for text input as opposed to commands.
The bug turned out to be in the
definition of the locales, i.e. in privileged data beyond the purview
of Emacs.

As to the need for the PUA, writing fonts to cope with Tai Tham
rendering engines is not easy, and it's no surprise that the PUA is used
online for a newspaper that uses the Tai Tham script.  The USE
(Universal Shaping Engine) is too user-hostile for it to have helped if
it had been available earlier.  (It just ignored the regular expression
published in 2007, which is in L2/07-007R in the UTC document register,
ISO/IEC JTC1/SC2/WG2/N3207R in ISO land.)  Indeed, perhaps I should be
researching the PUA encoding for Tai Tham. (My Tai Tham font Da Lekh
started as a proof of principle, for there is already an unpleasant
amount of glyph sequence changing, some style-dependent. I couldn't see
how to get rendering engine support even when it might be added.  I was
pleasantly surprised at how far from impossible Tai Tham layout was
until the USE came along and made everything harder.  I now have to work
out which glyph instances have already been Indicly rearranged when I
repair the clustering.)

Oh, and I seem to need some PUA codepoints for vowels that get stranded
when line-breaks occur between the columns of an akshara.  The
proposals show this phenomenon in old(?) Pali text.  Or is there any
chance of getting them encoded?

