[Lohit-devel-list] Handling Malayalam "NTA" issue for Lohit2

pravin.d.s at gmail.com pravin.d.s at gmail.com
Mon Jan 13 00:04:33 CST 2014


On 10 January 2014 17:54, Shriramana Sharma <samjnaa at gmail.com> wrote:

> On Fri, Jan 10, 2014 at 3:45 PM, pravin.d.s at gmail.com
> <pravin.d.s at gmail.com> wrote:
> >     In my humble opinion here one thing is very clear that Unicode forgot to
> > add normalization (backward compatibility) for newly added sequence in (B).
>
> Dear Pravin,
>
> If by normalization you mean
> http://www.unicode.org/glossary/#normalization -- then it is not
> possible in this case since the individually encoded chillus do not
> have canonical decomposition to their related consonants. Indeed, that
> would defeat the purpose of the separate encoding, which was to
> provide semantically distinct chillus!
>

OK, not normalization, but Unicode should at least document both the old
way of writing NTA and the new one introduced with the atomic chillus. That
would definitely help people working on NLP to handle data that contains
these two different sequences.
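
As a rough illustration of what that handling could look like on the NLP
side, here is a minimal Python sketch. It assumes that the two forms
discussed in this thread are <NA, VIRAMA, RRA> (U+0D28 U+0D4D U+0D31) and
<CHILLU N, VIRAMA, RRA> (U+0D7B U+0D4D U+0D31); the names NTA_OLD, NTA_NEW
and fold_nta are made up for this example and do not come from any library.

```python
import unicodedata

# Assumed code points for the two NTA spellings; forms (A) and (B) are
# defined earlier in the thread, so this mapping is an assumption.
NA, CHILLU_N, VIRAMA, RRA = "\u0d28", "\u0d7b", "\u0d4d", "\u0d31"
NTA_OLD = NA + VIRAMA + RRA        # spelling without the atomic chillu
NTA_NEW = CHILLU_N + VIRAMA + RRA  # spelling with the atomic chillu


def nfc_equal(a, b):
    """Return True if two strings are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def fold_nta(text, target=NTA_NEW):
    """Rewrite both assumed NTA spellings to a single target spelling."""
    for variant in (NTA_OLD, NTA_NEW):
        if variant != target:
            text = text.replace(variant, target)
    return text
```

Because the atomic chillus have no canonical decomposition, nfc_equal(NTA_OLD,
NTA_NEW) stays false, so a search or comparison step would need an explicit
fold like fold_nta before matching.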


>
> On a more serious note, I think it is important to adhere to the
> standard, as it is good for you in the long run even though it is
> difficult at first. If you delay the adoption of the standard, it only
> gets all the harder as time passes, since in the interim even more
> people continue to assume the old behaviour...
>

From the font perspective, the NTA sequence exists in both forms (A) and (B)
in the data out there, so we have to add the required rules for both.
Unfortunately, Unicode has not considered backward compatibility in this
case, but at least the Lohit project will definitely consider it.

So, to be on the safe side, I am now in favour of having both rules in the font.

Regards,
Pravin Satpute