Solution for Extended Tamil
James Kass
jameskass at code2001.com
Mon Jan 22 12:23:03 CST 2024
On 2024-01-22 11:01 AM, Shriramana Sharma via Unicode wrote:
> Please see the original attestations. I have noted that they always
> put the digit immediately after the consonant.
>
> There is not much meaning IMO in quoting online attestations or search
> results because when it doesn't display properly and throws a dotted
> circle, they will adjust it so that it doesn't display such junk.
> Speaking as one of the authors of a de facto Unicode-based
> transliteration scheme from Devanagari to Tamil which seems to be
> widely used (but we can't get assured statistics).
Quoting from
https://en.wiktionary.org/wiki/Module:sa-convert/testcases/Tamil :
"in most forms of Extended Tamil (including the Gita book mentioned
previously running to almost 420,000 copies) the diacritics are placed
between the consonant and any vowel signs placed to the right".
Maybe not always. For example, in "நாபி⁴ஜாநாதி", would the superscript
digit be expected to break the ligature?
As we know, when typing Tamil on a mechanical typewriter, for example,
U+0BC6 TAMIL VOWEL SIGN E was always typed before the consonant. But in
the standardized computer encoding for Tamil, U+0BC6 is always entered
after the consonant. In both cases, the display properly shows the
vowel sign on the left of the consonant.
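The difference between typing order and encoded order can be sketched with Python's standard unicodedata module. This is only an illustration: the consonant U+0B95 TAMIL LETTER KA is an arbitrary choice of mine, and the "typed" sequence is a list of keystrokes, not valid Unicode text.

```python
import unicodedata

# Logical (encoded) order for a syllable: consonant first, vowel sign second.
KA = "\u0B95"            # TAMIL LETTER KA (example consonant, an assumption)
VOWEL_SIGN_E = "\u0BC6"  # TAMIL VOWEL SIGN E
encoded = KA + VOWEL_SIGN_E

# Typewriter (visual) order, kept as a keystroke list rather than as text.
typed = [VOWEL_SIGN_E, KA]

encoded_names = [unicodedata.name(ch) for ch in encoded]
typed_names = [unicodedata.name(ch) for ch in typed]

print(encoded_names)
print(typed_names)
```

The renderer, not the stored order, is what puts the vowel sign on the left of the consonant when the encoded string is displayed.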
The original question here was about a standardized encoding order for
Extended Tamil, and the user community has apparently already chosen a
/de facto/ standardization. And the results are legible.
Placing the superscript digits next to the consonants instead of at the
end of the syllable appears to be a display issue. But superscript
digits are classified as "Number, other" (General_Category No) and "not
reordered" (canonical combining class 0), so the rendering system
won't automatically treat the digits as marks. Encoding clones of the
superscript digits to be treated as marks might not be practical. And,
after all, the character identity of those superscript digits is that
they are superscript digits.
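The properties mentioned above can be checked with Python's standard unicodedata module. This is a minimal sketch; the helper name describe is mine, and the two Tamil characters are included only for comparison.

```python
import unicodedata

def describe(ch):
    """Return (name, general category, canonical combining class) for one character."""
    return (unicodedata.name(ch), unicodedata.category(ch), unicodedata.combining(ch))

# U+2074 SUPERSCRIPT FOUR, U+0BC6 TAMIL VOWEL SIGN E, U+0BCD TAMIL SIGN VIRAMA
for ch in ("\u2074", "\u0BC6", "\u0BCD"):
    print(f"U+{ord(ch):04X}", *describe(ch))
```

Note that the spacing vowel sign also has combining class 0, so what separates it from the superscript digit here is the general category: Mc (a mark) versus No (a number).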
Has any effort been made to use OpenType to get the desired display,
for instance by classifying the superscript digits as "marks" in the GDEF
(glyph definition) table and then using GPOS (glyph positioning) for the
desired placement? Or has the user community accepted the plain-text
legibility of the /de facto/ standard encoding order and reconciled itself
to the fact that not all published books can be exactly rendered in plain text?