Tamil Brahmi Compliance for Unicode Versions 13.0 and 14.0

Richard Wordingham richard.wordingham at ntlworld.com
Sun Jun 12 06:28:31 CDT 2022


I'm asking so as to get my facts straight; this is not intended as a
complaint about the standard.  The issues are probably inherent in
disunification.

Can a rendering system support Tamil Brahmi without being told whether
the text it is rendering is in Unicode 13.0 or 14.0?  I think the answer
is 'no', for the following reasons:

(a) In Unicode 14.0, rendering U+11034 BRAHMI LETTER LLA and
U+11075 BRAHMI LETTER OLD TAMIL LLA the same would be a violation of
character identity.  For example, Noto Sans Brahmi renders U+11034 with
a Tamil Brahmi-style glyph, which is compliant with Unicode 13.0, but
is a violation of character identity for Unicode 14.0.

(b) Likewise with U+11046 BRAHMI VIRAMA and U+11070 BRAHMI SIGN OLD
TAMIL VIRAMA, though less clearly.  Rendering the former as a pulli is
compliant with Unicode 13.0, but not with Unicode 14.0.

(c) Interpreting the four Unicode 13.0 vowel sequences
<LETTER/VOWEL_SIGN E/O, U+11046 BRAHMI VIRAMA> as calling for a
pulli-like element in the rendering does not appear to respect the
Unicode 14.0 character identity of U+11046.   HarfBuzz at least no
longer has a problem with vowel + virama/stacker sequences, which are
irremovable elements of the Sinhala script (two canonical
decompositions) and early documented features of the Khmer and Tai Tham
scripts, though only in the first is it part of the vowel symbol.
These combinations could have been grandfathered (compare Malayalam
chillus), but haven't been.
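
To make points (a) and (b) concrete, here is a minimal sketch of the
re-encoding that a Unicode 13.0 to 14.0 migration of Tamil Brahmi text
would need.  The function name and the is_tamil_brahmi flag are
hypothetical; the flag stands for exactly the out-of-band knowledge that
the renderer lacks.  The table covers only the two pairs named above,
and it deliberately does not handle the vowel + virama sequences of
point (c), for which a blind virama substitution would presumably be
wrong.

```python
# Hypothetical mapping: only the two character pairs named in (a) and (b).
UNICODE13_TO_14_OLD_TAMIL = {
    0x11034: 0x11075,  # BRAHMI LETTER LLA -> BRAHMI LETTER OLD TAMIL LLA
    0x11046: 0x11070,  # BRAHMI VIRAMA -> BRAHMI SIGN OLD TAMIL VIRAMA
}


def reencode_tamil_brahmi(text: str, is_tamil_brahmi: bool) -> str:
    """Re-encode 13.0-style Tamil Brahmi text with the 14.0 characters.

    Assumption: the caller already knows, from tagging or other metadata,
    whether the text is Tamil Brahmi.  Nothing in the text itself says so.
    The vowel sequences of point (c) are not treated specially here.
    """
    if not is_tamil_brahmi:
        # Ordinary Brahmi: U+11034 and U+11046 keep their own identities.
        return text
    return text.translate(UNICODE13_TO_14_OLD_TAMIL)
```

The same input string yields different output depending on the flag,
which is the ambiguity I am asking about.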

I think this is a case where having a non-compliant rendering
system would be the right thing to do.  Fortunately for me, the data in
Tamil Brahmi that I am concerned with is mostly tagged as being in
Old Tamil.

Have I got all this right?

Richard.

