Choosing the Set of Renderable Strings

Richard Wordingham via Unicode unicode at unicode.org
Tue May 15 17:51:35 CDT 2018


On Tue, 15 May 2018 06:04:45 -0800
James Kass via Unicode <unicode at unicode.org> wrote:

> Display behaviour which is script-specific should be handled by the
> rendering/shaping engine.  Only that which is font-specific should be
> handled by the font.

That makes a lot of sense.  Unfortunately, script-specific behaviour in
the shaping engine often needs to be fixed or is completely absent.  It
annoys me that my font has to redo the bits of basic Indic shaping that
are left undone because the Universal Shaping Engine (USE) chops the
aksharas up.

> The font's OpenType tables will include pointers to presentation forms
> which aren't directly encoded, the location and repertoire of which
> would naturally differ from font to font.  Likewise, the font's GPOS
> tables will handle things such as mark positioning, because each
> font's metrics are going to be different.
> 
> Because the USE apparently accesses current on-line Unicode data, the
> USE will re-order anything which needs to be moved around.

In Thai, the sequence <consonant, tone, SARA AM> is converted to
<consonant, NIKHAHIT, tone, SARA AA>.  Please tell me where in the
on-line Unicode data it says that:

1) Tai Tham <consonant, tone, SIGN AA, MAI KANG> is reordered to
<consonant, MAI KANG-am**, tone, SIGN AA>:
(a) when the base consonant is NA; and also
(b) in a typical Northern Thai font, but not a Lao*, Tai Lue or Tai
Khuen font.

*Some claim that Lao Tham doesn't use tone marks, but at least some
version does, or Gregory Kourilsky wouldn't have included them in his
encoding of the Tham script.

**The placement may be different to that of MAI KANG in /bɔː waː/
ᨷᩴ᩠᩵ᩅᩣ <BA, MAI KANG, TONE-1, SAKOT, WA, SIGN AA> or ᨷᩴ᩠ᩅ᩵ᩣ <BA, MAI
KANG, SAKOT, WA, TONE-1, SIGN AA> - I don't know whether the first or
the second tone mark is dropped.

(Getting the tone and MAI KANG to interact after <NA, tone, SIGN AA,
MAI KANG> has formed the NAA ligature from <NA, SIGN AA> seems
impossible.  I assume this is because such interaction is undesirable
for Arabic.) 

2) <tone, (subscript consonant or sign, or stand-in)+, top matra> needs
to be rearranged to <top matra, tone, (subscript ...)+> (or equivalent).
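
To make the rearrangement in (2) concrete, here is a rough sketch in
Python.  It works on (character, class) pairs rather than on real Tai
Tham code points, because which characters count as tone, subscript or
top matra is exactly what is not documented.  The class labels "tone",
"sub" and "top" and the function name reorder_tone_cluster are
hypothetical; this is an illustration of the rule as stated, not what
the USE or any font actually does.

```python
def reorder_tone_cluster(items):
    """Move <tone, sub+, top> to <top, tone, sub+>.

    items is a list of (char, cls) pairs; cls is a hypothetical label:
    "tone", "sub" (subscript consonant, sign or stand-in), "top"
    (top matra), or anything else for characters that are left alone.
    """
    out = []
    i = 0
    n = len(items)
    while i < n:
        if items[i][1] == "tone":
            # Scan the run of subscript items that follows the tone mark.
            j = i + 1
            while j < n and items[j][1] == "sub":
                j += 1
            # The rule needs at least one subscript item and then a top matra.
            if j > i + 1 and j < n and items[j][1] == "top":
                out.append(items[j])          # top matra first
                out.append(items[i])          # then the tone mark
                out.extend(items[i + 1:j])    # then the subscript run
                i = j + 1
                continue
        out.append(items[i])
        i += 1
    return out
```

A tone mark followed directly by a top matra, with no subscript item in
between, is left untouched by this sketch.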

And how am I supposed to position MAI SAM to the right of the rightmost
of the level 1 marks above?  Is this a standard positioning as opposed
to a stylistic decision?

Incidentally, how does Unicode document the handling of a tone mark
before U+0E33 THAI CHARACTER SARA AM?
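
For reference, the Thai conversion mentioned earlier, <consonant, tone,
SARA AM> to <consonant, NIKHAHIT, tone, SARA AA>, can be sketched as a
plain string transformation.  This is only an illustration in Python,
not a description of any particular shaping engine.  The function name
decompose_sara_am is made up, and limiting "consonant" to
U+0E01..U+0E2E and "tone" to U+0E48..U+0E4B is an assumption.

```python
import re

SARA_AA = "\u0E32"   # THAI CHARACTER SARA AA
SARA_AM = "\u0E33"   # THAI CHARACTER SARA AM
NIKHAHIT = "\u0E4D"  # THAI CHARACTER NIKHAHIT

# Assumed classes: consonants KO KAI..HO NOKHUK, then an optional
# tone mark MAI EK..MAI CHATTAWA, then SARA AM.
_SARA_AM_CLUSTER = re.compile("([\u0E01-\u0E2E])([\u0E48-\u0E4B]?)" + SARA_AM)


def decompose_sara_am(text: str) -> str:
    """Rewrite <consonant, tone?, SARA AM> as <consonant, NIKHAHIT, tone?, SARA AA>."""
    return _SARA_AM_CLUSTER.sub(
        lambda m: m.group(1) + NIKHAHIT + m.group(2) + SARA_AA,
        text,
    )


# <NO NU, MAI THO, SARA AM> becomes <NO NU, NIKHAHIT, MAI THO, SARA AA>.
result = decompose_sara_am("\u0E19\u0E49\u0E33")
print([f"U+{ord(c):04X}" for c in result])
```

A SARA AM that does not follow one of the assumed consonants is left
unchanged by this sketch.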

Richard. 



