Proposed Expansion of Grapheme Clusters to Whole Aksharas - Implementation Issues
Martin J. Dürst via Unicode
unicode at unicode.org
Thu Dec 21 02:55:33 CST 2017
On 2017/12/15 07:40, Richard Wordingham via Unicode wrote:
> On Mon, 11 Dec 2017 21:45:23 +0000
> Cibu Johny (സിബു) <cibu at google.com> wrote:
>> Malayalam could be a similar story. In case of Malayalam, it can be
>> font specific because of the existence of traditional and reformed
>> writing styles. A conjunct might be a ligature in traditional; and it
>> might get displayed with explicit virama in the reformed style. For
>> example see the poster with word ഉസ്താദ് broken as [u, sa-virama,
>> ta-aa, da-virama] - as it is written in the reformed style. As per
>> the proposed algorithm, it would be [u, sa-virama-ta-aa, da-virama].
>> These breaks would be used by the traditional style of writing.
>
> Working round that seems to be tricky. The best I can think of is to
> have two different locales, traditional and reformed, and hope that the
> right font is selected. It doesn't seem at all straightforward to
> work out what the font is doing even from a character to glyph map
> without knowing what the glyphs are. I'm not sure how one should have
> the difference designated - language variants, or two scripts?

I'm not at all familiar with Malayalam, but my experience with typing
Japanese (where the average kana character requires two keystrokes for
input, but only one for deleting) would lead to different advice. When
typing, it is very helpful to know how many times one has to hit
backspace after making an error. This kind of knowledge is usually
assimilated into what one calls muscle memory, i.e. it is applied
without thinking about it. I would guess that it would be very
difficult to maintain two different kinds of muscle memory for typing
Malayalam. (My assumption is that the populations typing in the
traditional and reformed writing styles are not disjoint.)
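
To make the difference concrete, here is a minimal Python sketch of the
two segmentations Cibu describes for the example word. It is not the
UAX #29 algorithm and not the proposed rule set; it is a simplified
stand-in that attaches combining marks to the preceding character and,
optionally, attaches a consonant to a preceding virama. The function
name segment and the flag join_conjuncts are made up for illustration,
and the idea that one backspace deletes one cluster is an assumption.

```python
import unicodedata

VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA

# The example word from the quoted message:
# u, sa, virama, ta, aa, da, virama.
WORD = "\u0d09\u0d38\u0d4d\u0d24\u0d3e\u0d26\u0d4d"


def segment(text, join_conjuncts):
    """Split text into clusters (simplified; hypothetical helper)."""
    clusters = []
    for ch in text:
        category = unicodedata.category(ch)
        # Combining marks (virama, vowel signs) stay with the previous cluster.
        is_mark = category in ("Mn", "Mc")
        # Optionally keep a letter that follows a virama in the same cluster.
        after_virama = bool(clusters) and clusters[-1].endswith(VIRAMA)
        joins = join_conjuncts and after_virama and category == "Lo"
        if clusters and (is_mark or joins):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# Breaks as in the reformed style: [u, sa-virama, ta-aa, da-virama].
reformed = segment(WORD, join_conjuncts=False)
# Breaks as in the proposal: [u, sa-virama-ta-aa, da-virama].
traditional = segment(WORD, join_conjuncts=True)

# Assumption: one backspace removes one cluster.
print(len(reformed), reformed)
print(len(traditional), traditional)
```

Under that assumption, the same word would take four backspaces to
delete with one segmentation and three with the other, which is the
kind of difference a typist would have to keep track of.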
Regards, Martin.