Android 5.1 ships with support for several minority scripts

Richard Wordingham richard.wordingham at ntlworld.com
Sat Mar 14 19:36:26 CDT 2015


On Sun, 15 Mar 2015 08:18:26 +1100
Andrew Cunningham <lang.support at gmail.com> wrote:

> Testing on Tai Tham will occur ... I was curious as to what the
> original design parameters for the font were. It is easier to evaluate
> a font's language support knowing what was originally intended.

The target language (dubious concept) for Noto Sans Tai Tham appears to
be the Unicode code chart language!  The strongest evidence of any
affiliation seems to be for Pali!  I may just be jealous, but I find it
an appalling font. I'm reporting on the 2013 version, Version 1.01,
which I downloaded this week.

The font's main problem is that it has not solved vertical spacing.
Subscript and superscript marks (including non-spacing sequences) are
not naturally significantly smaller than base letters, and eye strain
results from their being artificially shrunk.

Another problem is that it seems to have been written without any clear
model for Indic rearrangement, but merely a hope that it happens.
These two problems combine for consonant stacks.  To reduce the
vertical space problem, base consonants are shrunk when they are
followed by SAKOT + consonant.  However, this does not
occur if a vowel intervenes in the logical order, whether preposed (in
which case it will not intervene in the final glyph order) or
superscript (in which case it will intervene in the final glyph order).
Of course, these combinations don't occur in Pali, so perhaps the font's
target language is Pali!
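
As a rough aid for finding text that would hit this case, here is a
minimal Python 3 sketch using only the standard unicodedata module.  It
flags <letter, vowel sign(s), SAKOT, letter>, i.e. a stack whose base is
separated from SAKOT by a vowel in the logical order.  The function
names are hypothetical, and classifying characters by their Unicode
names is a simplifying assumption (independent vowels, for instance,
also count as letters here).

```python
import unicodedata

SAKOT = unicodedata.lookup("TAI THAM SIGN SAKOT")

def _kind(ch):
    """Crude classification of a character by its Unicode name."""
    name = unicodedata.name(ch, "")
    if not name.startswith("TAI THAM"):
        return "other"
    if ch == SAKOT:
        return "sakot"
    if "VOWEL SIGN" in name:
        return "vowel"
    if name.startswith("TAI THAM LETTER"):
        return "letter"
    return "other"

def stacks_with_intervening_vowel(text):
    """Return the indices of base letters that are followed by one or
    more vowel signs and then SAKOT + letter."""
    kinds = [_kind(c) for c in text]
    hits = []
    for i, kind in enumerate(kinds):
        if kind != "letter":
            continue
        j = i + 1
        while j < len(kinds) and kinds[j] == "vowel":
            j += 1
        # Require at least one vowel sign between the base and SAKOT.
        if j > i + 1 and kinds[j:j + 2] == ["sakot", "letter"]:
            hits.append(i)
    return hits
```

Running this over a word list would give candidate words for checking
whether a font shrinks the base consonant consistently.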

There's an egregious bug in the handling of the sequences MEDIAL RA plus
preposed vowel (E, AE, OO, AI, THAM AI).  They are formed into a
ligature with MEDIAL RA on the left, whereas it should appear on the
right!
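
For reproducing this, the five sequences can be generated rather than
typed.  The following is only a sketch in Python 3: the helper names are
made up for illustration, and the base consonant (HIGH KA) is an
arbitrary choice, not one taken from the font or from any test suite.

```python
import unicodedata

def tham(name):
    """Look up a character by the part of its name after 'TAI THAM '."""
    return unicodedata.lookup("TAI THAM " + name)

MEDIAL_RA = tham("CONSONANT SIGN MEDIAL RA")

# The preposed vowel signs listed above.
PREPOSED = ["E", "AE", "OO", "AI", "THAM AI"]

def medial_ra_cases(base=None):
    """Map each vowel name to <base, MEDIAL RA, vowel> in logical order."""
    base = base or tham("LETTER HIGH KA")  # arbitrary base consonant
    return {v: base + MEDIAL_RA + tham("VOWEL SIGN " + v) for v in PREPOSED}

if __name__ == "__main__":
    for vowel, text in medial_ra_cases().items():
        print(vowel, text, sep="\t")
```

Rendering each line in the font under test shows on which side of the
vowel the MEDIAL RA glyph ends up.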

The font also exhibits a problem relating to <SAKOT, BA>, <SAKOT, PA>
and <SIGN BA>.  There are two subscript forms relating to BA and PA.
The common form is a spacing subscript form, generally used for final
consonants and as the second element of the Pali clusters -pp- and
-mp-.  (It's also often used for some other clusters.)  Note that the
normal way of writing the phonetically final consonant as a base
consonant generally uses BA, though PA may occasionally occur as a
result of influence from Standard Thai.  The rarer, non-spacing form
represents /b/ at the start of a phonetic cluster.

The final encoding proposal for the Tai Tham script (then designated
Lanna) assigned <SAKOT, BA> to the non-spacing form and <SAKOT, PA> to
the spacing form - this corresponded to their uses as phonetically
initial consonants.  There was thus an unwelcome difference between the
two default ways of writing a final consonant - BA or <SAKOT, PA>,
depending on what preceded.  (There was already another pair like this
- RA or <SAKOT, NA> for final /n/.)  However, during the ISO
standardisation process, U+1A5D TAI THAM CONSONANT SIGN BA was
introduced to represent the non-spacing form. This made the original
distinction of <SAKOT, BA> and <SAKOT, PA> redundant, and most fonts
and writers seem to use <SAKOT, BA> for the spacing subscript form.
<SAKOT, BA> and <SAKOT, PA> generally yield the same glyph.

However, in the Noto Sans Tai Tham font, <SAKOT, BA> is rendered as a
non-spacing glyph, similar to that for <SIGN BA>.
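
To compare the three encodings side by side in a given font, something
like the following Python 3 sketch could be used.  It assumes that "PA"
above means TAI THAM LETTER HIGH PA, and it uses MA as the base only
because of the -mp- cluster mentioned earlier; both choices, and the
helper names, are illustrative rather than definitive.

```python
import unicodedata

def tham(name):
    """Look up a character by the part of its name after 'TAI THAM '."""
    return unicodedata.lookup("TAI THAM " + name)

SAKOT = tham("SIGN SAKOT")

# "PA" is taken to be HIGH PA here; that reading is an assumption.
SEQUENCES = {
    "<SAKOT, BA>": SAKOT + tham("LETTER BA"),
    "<SAKOT, PA>": SAKOT + tham("LETTER HIGH PA"),
    "<SIGN BA>": tham("CONSONANT SIGN BA"),
}

def code_points(s):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

def test_lines(base=tham("LETTER MA")):
    """One line per sequence: label, code points, base + sequence."""
    return [f"{label}\t{code_points(seq)}\t{base + seq}"
            for label, seq in SEQUENCES.items()]

if __name__ == "__main__":
    print("\n".join(test_lines()))
```

In a font following the usage described above, the first two lines
should look alike (spacing subscript) and only the third should show
the non-spacing form.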

The font uses mark2mark positioning only for tone marks.  As a result, a
subscript vowel will overstrike a subscript consonant resulting in an
illegible blob, e.g. in the word ᩉ᩠ᨾᩪ <HA, SAKOT, MA, UU> /muː/ 'pig'.

There are other issues with the font.  I was wondering how to organise
a bug report or set of bug reports on it when this thread arose.

Richard.


