From rick at unicode.org  Thu Jan  2 11:25:35 2014
From: rick at unicode.org (Rick McGowan)
Date: Thu, 02 Jan 2014 09:25:35 -0800
Subject: Mail list changes for 2014
In-Reply-To: <52C2F5D4.30909@unicode.org>
References: <529E6194.5020103@unicode.org> <52C2F5D4.30909@unicode.org>
Message-ID: <52C5A10F.7060501@unicode.org>

The Indic mail list has now been re-activated.
Regards,
Rick

On 12/31/2013 8:50 AM, Rick McGowan wrote:
> As mentioned, this list will be taken off-line shortly, and be 
> restored after the new year. (A note will be sent when it is back.)
> Regards,
>     Rick
>
> On 12/3/2013 2:56 PM, Rick McGowan wrote:
>> At the end of the year, we will be changing the mail list server for 
>> the public-access mail lists, including this one. The new system will 
>> be Gnu "Mailman", an interface familiar to many. This should make it 
>> easier for users to handle their subscriptions and options in one 
>> place, via the web interface.
>>
>> We will thus be shutting down the public mail lists over the "holiday 
>> break" in the final days of 2013, and re-open with the new system in 
>> January 2014.
>>
>> Affected mail lists are those listed on the Mail Lists page here:
>>     http://www.unicode.org/consortium/distlist.html
>> including Unicode, CLDR-Users, ULI-Users, and Indic.
>>
>> The new mail list system is documented here: 
>> http://www.gnu.org/software/mailman/
>>
>


From pravin.d.s at gmail.com  Fri Jan 10 04:15:00 2014
From: pravin.d.s at gmail.com (pravin.d.s at gmail.com)
Date: Fri, 10 Jan 2014 15:45:00 +0530
Subject: Handling Malayalam "NTA" issue for Lohit2
Message-ID: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>

Hi All,

    We are working on the lohit2 [1] project, which aims to create standard
and reusable OpenType tables with additional improvements. Lohit, as the
default system font in most open source distros, has always followed the
standards around language technology (font specifications, storage, and
language-related guidelines).

    Recently we started working on the Lohit Malayalam font [2] with some
planned improvements and came across a couple of bugs [3][4] related to the
well-known "NTA" issue introduced with the addition of atomic chillu
characters in Unicode 5.1.

    The dilemma is that a number of users are already using

     A. U+0D28 + U+0D4D + U+0D31 to get the NTA character, a practice that
        predates Unicode 5.1.

     B. But Unicode from 5.1 onward says (TUS 6.2 chapter 9.9 p 321) to use
        U+0D7B + U+0D4D + U+0D31 to get the same "NTA".

    In my humble opinion, one thing is very clear here: Unicode forgot
to add normalization (backward compatibility) for the newly added sequence
in (B). I have still not seen any improvement on this after a long time.

    The dilemma for lohit2 development is:

    - Lohit 1 has supported sequence (A) for a long time (even before
Unicode 5.1), so for backward compatibility lohit2 should support the
same.

    - Since Lohit follows standards, it is important to support sequence
(B) in order to follow Unicode 6.3. But following Unicode 6.3 in this case
clearly invites dual encoding without any normalization rules at hand.

    Good documentation on the NTA issue is available at [5].

    At present I am in favour of not supporting the Unicode-defined sequence
(B) in lohit2 and continuing to use (A), which has been used in the Lohit
font family for a long time.

    Please let me know your views on this. Is there any chance of getting
this mentioned in Unicode chapter 9? Is there any chance of a normalization
rule for this?


Regards,
Pravin Satpute


1. http://pravin-s.blogspot.in/2013/08/project-creating-standard-and-reusable.html
2. http://pravin-s.blogspot.in/2013/12/lohit2-lohit-malayalam-development-plans.html
3. https://bugzilla.redhat.com/show_bug.cgi?id=1016984
4. https://bugzilla.redhat.com/show_bug.cgi?id=1016989
5. http://thottingal.in/documents/Malayalam-NTA.pdf

From samjnaa at gmail.com  Fri Jan 10 06:24:46 2014
From: samjnaa at gmail.com (Shriramana Sharma)
Date: Fri, 10 Jan 2014 17:54:46 +0530
Subject: [Lohit-devel-list] Handling Malayalam "NTA" issue for Lohit2
In-Reply-To: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>
References: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>
Message-ID: <CAH-HCWVX3AjnR9=qR6ai02sYh7TWVHLGxKvVYuKPDitP7UJqcw@mail.gmail.com>

On Fri, Jan 10, 2014 at 3:45 PM, pravin.d.s at gmail.com
<pravin.d.s at gmail.com> wrote:
>     In my humble opinion here one thing is very clear that Unicode forgot to
> add normalization (backward compatibility) for newly added sequence in (B).

Dear Pravin,

If by normalization you mean
http://www.unicode.org/glossary/#normalization -- then it is not
possible in this case since the individually encoded chillus do not
have canonical decomposition to their related consonants. Indeed, that
would defeat the purpose of the separate encoding, which was to
provide semantically distinct chillus!
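
The point about decomposition can be checked directly. What follows is
only a minimal sketch using Python's standard unicodedata module; the
constant and function names are illustrative, and the two strings are the
NA + VIRAMA + RRA and CHILLU N + VIRAMA + RRA sequences discussed in this
thread.

```python
import unicodedata

# Illustrative names; the code points are those quoted in the thread.
LEGACY_NTA = "\u0D28\u0D4D\u0D31"    # (A) NA + VIRAMA + RRA
STANDARD_NTA = "\u0D7B\u0D4D\u0D31"  # (B) CHILLU N + VIRAMA + RRA
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def unified_by(form):
    """True if the given normalization form maps (A) and (B) to one string."""
    return (unicodedata.normalize(form, LEGACY_NTA)
            == unicodedata.normalize(form, STANDARD_NTA))


# An empty string here means CHILLU N carries no decomposition mapping.
print(repr(unicodedata.decomposition("\u0D7B")))

for form in FORMS:
    print(form, "unifies (A) and (B):", unified_by(form))
```

Since none of the four forms brings the two spellings together, any
equivalence between them has to be handled above the normalization layer,
in fonts or in application code.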

The recent additional chillus trickling into the standard seem to
indicate that one should have encoded a CHILLU MARKER back then, but
there's no going back now, so chillus galore! ;-)

On a more serious note, I think it is important to adhere to the
standard, as it is good for you in the long run even though it is
difficult at first. If you delay the adoption of the standard, it only
gets all the harder as time passes, since in the interim even more
people continue to assume the old behaviour...

-- 
Shriramana Sharma


From paivakil at gmail.com  Fri Jan 10 11:46:30 2014
From: paivakil at gmail.com (Mahesh T. Pai)
Date: Fri, 10 Jan 2014 23:16:30 +0530
Subject: Handling Malayalam "NTA" issue for Lohit2
In-Reply-To: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>
References: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>
Message-ID: <20140110174630.GA18104@localhost>

pravin.d.s at gmail.com said on Fri, Jan 10, 2014 at 03:45:00PM +0530,:
 >     - Lohit 1 is supporting sequence (A) from long time (even before
 > Unicode 5.1), so for the backward compatibility lohit2 should support the
 > same.
 > 

I believe that the UTC wanted to maintain compatibility with some
_beta_ version of some Microsoft software in making the choice that
it did regarding the /nta/ sequence.


 >     Presently i am in favour of not supporting Unicode defined
 > sequence (B) in lohit2 and keep on using (A) which is used in Lohit
 > fonts family from long time.

Allow me to go on a nostalgia trip. Almost a decade back, the then SMC
team came across what was an obvious lack of clarity in the UTS. They
decided, against my advice, to follow the suggestions in the OpenType
definition. To be fair, I had no alternative to offer then, except
not to implement the suggestion in the OpenType pages. Microsoft
ultimately waited for some clarity in the UTS before implementing
anything, and the community efforts went (mostly) in vain.

Right now, given a choice between supporting legacy data and
standards, I will choose the latter, with some kind of jugaad based on
the PUA / glyph name to enable support for legacy data. 

Not the ideal situation, but when politics gets the upper hand over
merits, efficiency and appropriateness always take a backseat.

-- 
Mahesh T. Pai   ||
free -  (adj) able to  act at will;  not hampered;
       not  under  compulsion  or restraint;  free
       from  obligations or  duties; not  bound to
       servitude; at liberty.

From pravin.d.s at gmail.com  Mon Jan 13 00:04:33 2014
From: pravin.d.s at gmail.com (pravin.d.s at gmail.com)
Date: Mon, 13 Jan 2014 11:34:33 +0530
Subject: [Lohit-devel-list] Handling Malayalam "NTA" issue for Lohit2
In-Reply-To: <CAH-HCWVX3AjnR9=qR6ai02sYh7TWVHLGxKvVYuKPDitP7UJqcw@mail.gmail.com>
References: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>
 <CAH-HCWVX3AjnR9=qR6ai02sYh7TWVHLGxKvVYuKPDitP7UJqcw@mail.gmail.com>
Message-ID: <CALuKHAcGNpqYLdurbWnnqiS+5=2APinPbKWS4Hs+Lt994qgyeQ@mail.gmail.com>

On 10 January 2014 17:54, Shriramana Sharma <samjnaa at gmail.com> wrote:

> On Fri, Jan 10, 2014 at 3:45 PM, pravin.d.s at gmail.com
> <pravin.d.s at gmail.com> wrote:
> >     In my humble opinion here one thing is very clear that Unicode
> forgot to
> > add normalization (backward compatibility) for newly added sequence in
> (B).
>
> Dear Pravin,
>
> If by normalization you mean
> http://www.unicode.org/glossary/#normalization -- then it is not
> possible in this case since the individually encoded chillus do not
> have canonical decomposition to their related consonants. Indeed, that
> would defeat the purpose of the separate encoding, which was to
> provide semantically distinct chillus!
>

OK, not normalization, but Unicode should at least document both the old
habit of writing NTA and the new one that came with the addition of atomic
chillus. That would definitely help people working on NLP to handle data
containing these two different sequences.
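
Until something like that is documented, NLP code can fold the two
spellings itself before comparing text. A minimal sketch in Python follows.
The helper names are hypothetical, this folding is not defined by Unicode,
and it assumes every NA + VIRAMA + RRA in the data is really meant as NTA.

```python
# Hypothetical application-level folding, not a Unicode normalization form.
LEGACY_NTA = "\u0D28\u0D4D\u0D31"    # (A) NA + VIRAMA + RRA
STANDARD_NTA = "\u0D7B\u0D4D\u0D31"  # (B) CHILLU N + VIRAMA + RRA


def fold_nta(text):
    """Rewrite the legacy spelling (A) as the Unicode 5.1 spelling (B)."""
    return text.replace(LEGACY_NTA, STANDARD_NTA)


def same_after_folding(a, b):
    """Compare two strings while ignoring which NTA spelling they use."""
    return fold_nta(a) == fold_nta(b)


print(same_after_folding(LEGACY_NTA, STANDARD_NTA))
```

Folding towards (B) is an arbitrary choice here; folding towards (A) would
serve equally well for matching, as long as one direction is used
consistently.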


>
> On a more serious note, I think it is important to adhere to the
> standard, as it is good for you in the long run even though it is
> difficult at first. If you delay the adoption of the standard, it only
> gets all the harder as time passes, since in the interim even more
> people continue to assume the old behaviour...
>

From the font perspective, NTA sequences are available in both forms
(A) & (B) in existing data, so we have to add the required rules for both.
Unfortunately, Unicode has not considered backward compatibility in this
case, but the Lohit project definitely does.

So, to be on the safer side, I am now in favour of having both rules in
the font.

Regards,
Pravin Satpute

From pravin.d.s at gmail.com  Mon Jan 13 00:28:52 2014
From: pravin.d.s at gmail.com (pravin.d.s at gmail.com)
Date: Mon, 13 Jan 2014 11:58:52 +0530
Subject: Handling Malayalam "NTA" issue for Lohit2
In-Reply-To: <20140110174630.GA18104@localhost>
References: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>
 <20140110174630.GA18104@localhost>
Message-ID: <CALuKHAeatXn4TkgpL-Qu8Uc+O41H8pwH+DTfNiK35wwnnytmAA@mail.gmail.com>

On 10 January 2014 23:16, Mahesh T. Pai <paivakil at gmail.com> wrote:

> pravin.d.s at gmail.com said on Fri, Jan 10, 2014 at 03:45:00PM +0530,:
>     - Lohit 1 is supporting sequence (A) from long time (even before
>  > Unicode 5.1), so for the backward compatibility lohit2 should support
> the
>  > same.
>  >
>
> I believe thet the UTC wanted to maintain compatibility with some
> _beta_ version of Microsoft's some software in making the choice that
> it did regarding the /nta/ sequence.
>
>
>  >     Presently i am in favour of not supporting Unicode defined
>  > sequence (B) in lohit2 and keep on using (A) which is used in Lohit
>  > fonts family from long time.
>
> Allow me to go on a nostalgia trip. Almost a decade back, the then SMC
> team came accross what was obvious lack of clarity in the UTS. They
> decided, against my advise, to follow the suggestions in OpenType
> definition. To be fair, then, I had no alternative to offer, except
> not to implement the suggestion in the OpenType pages. Microsoft
> ultimately waited for some clarity in the UTS before implementing
> anything. and the communimity efforts went (mostly) in vain.
>

I was wondering how ISCII was handling this.


>
> Right now, given a choice between supporting legacy data and
> standards, I will choose the latter, with some kind of jugaad based on
> the PUA / glyph name to enable support for legacy data.
>

Yeah, as said above, we will support both the legacy and the standard
sequences.


>
> Not the ideal situation, but when politics get the uppoer hand over
> merits, efficiency and appropriateness always takes a backseat.
>

That is the pain point of standardization activities.

Thanks & Regards,
Pravin Satpute

From cibucj at gmail.com  Mon Jan 13 00:32:16 2014
From: cibucj at gmail.com (സിബു സി ജെ)
Date: Sun, 12 Jan 2014 22:32:16 -0800
Subject: [Lohit-devel-list] Handling Malayalam "NTA" issue for Lohit2
In-Reply-To: <CALuKHAcGNpqYLdurbWnnqiS+5=2APinPbKWS4Hs+Lt994qgyeQ@mail.gmail.com>
References: <CALuKHAeWWYTv4NkH-1mgzA0J3qHBVmxyOCdKvFwu3P20swWs+Q@mail.gmail.com>
 <CAH-HCWVX3AjnR9=qR6ai02sYh7TWVHLGxKvVYuKPDitP7UJqcw@mail.gmail.com>
 <CALuKHAcGNpqYLdurbWnnqiS+5=2APinPbKWS4Hs+Lt994qgyeQ@mail.gmail.com>
Message-ID: <CAD8TiP4SejuB_Cqs9-XPW5ddN_KiYkGjn8=48+KTggTXLY-n7g@mail.gmail.com>

In fact, there is one more sequence to consider. Kartika in Windows uses
<NA, VIRAMA, ZWJ, RRA> for NTA. However, very little existing data uses
that sequence.

In the case of chillus, the standard asks display software to be prepared
for data in both sequences. I agree that it could likewise document NTA's
legacy vs. standard sequences.
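
To get a feel for how much data exists in each spelling, one could count
the variants in a corpus. The following is a rough sketch in Python; the
labels and the function name are made up for illustration, and the code
points are the ones named in this thread plus ZWJ (U+200D).

```python
# Illustrative labels for the three NTA spellings mentioned in the thread.
NTA_SEQUENCES = {
    "legacy": "\u0D28\u0D4D\u0D31",         # NA, VIRAMA, RRA
    "standard": "\u0D7B\u0D4D\u0D31",       # CHILLU N, VIRAMA, RRA
    "kartika": "\u0D28\u0D4D\u200D\u0D31",  # NA, VIRAMA, ZWJ, RRA
}


def count_nta_variants(text):
    """Count occurrences of each NTA spelling in the given text."""
    return {name: text.count(seq) for name, seq in NTA_SEQUENCES.items()}


sample = (NTA_SEQUENCES["legacy"] + " "
          + NTA_SEQUENCES["standard"] * 2 + " "
          + NTA_SEQUENCES["kartika"])
print(count_nta_variants(sample))
```

None of the three sequences is a substring of another, so the plain
substring counts do not interfere with each other.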



From pavanaja at vishvakannada.com  Sat Jan 18 06:38:25 2014
From: pavanaja at vishvakannada.com (Pavanaja U B)
Date: Sat, 18 Jan 2014 18:08:25 +0530
Subject: Tulu Unicode
Message-ID: <001501cf144a$316a28d0$943e7a70$@vishvakannada.com>

What are the steps involved to add Tulu language to Unicode?

 
Regards,

Pavanaja

 

From sisrivas at yahoo.com  Sat Jan 18 07:26:52 2014
From: sisrivas at yahoo.com (Sinnathurai Srivas)
Date: Sat, 18 Jan 2014 05:26:52 -0800 (PST)
Subject: Tulu Unicode
In-Reply-To: <001501cf144a$316a28d0$943e7a70$@vishvakannada.com>
References: <001501cf144a$316a28d0$943e7a70$@vishvakannada.com>
Message-ID: <1390051612.3615.YahooMailNeo@web125803.mail.ne1.yahoo.com>

I would like to interact with experts involved in encoding Tulu.

The use of the original scientific base for grammatising the alphabet, which is scalable and covers the entire spectrum with a simplified representation, needs to be considered, as Tulu is a branch of such original foundations.

Thanks
Sinnathurai Srivas


_______________________________________________
Indic mailing list
Indic at unicode.org
http://unicode.org/mailman/listinfo/indic

From samjnaa at gmail.com  Sun Jan 19 00:53:50 2014
From: samjnaa at gmail.com (Shriramana Sharma)
Date: Sun, 19 Jan 2014 12:23:50 +0530
Subject: Tulu Unicode
In-Reply-To: <001501cf144a$316a28d0$943e7a70$@vishvakannada.com>
References: <001501cf144a$316a28d0$943e7a70$@vishvakannada.com>
Message-ID: <CAH-HCWW06ZN=8ectXde6Kg+1yeW0sXf4ox3O9+ciaR3V9HhHXg@mail.gmail.com>

You cannot add a language to Unicode -- you can only add a script, for
which you need to prepare a technically correct proposal with
sufficient attestations.

Or do you mean adding data about Tulu language written in the Kannada
script (such as weekday names etc) to the related standard CLDR? See
the CLDR section on unicode.org. (I'm not very knowledgeable about
CLDR.)


-- 
Shriramana Sharma


From naa.ganesan at gmail.com  Sun Jan 19 01:05:29 2014
From: naa.ganesan at gmail.com (N. Ganesan)
Date: Sat, 18 Jan 2014 23:05:29 -0800
Subject: Tulu Unicode
In-Reply-To: <001501cf144a$316a28d0$943e7a70$@vishvakannada.com>
References: <001501cf144a$316a28d0$943e7a70$@vishvakannada.com>
Message-ID: <CAA+QEUdPahd4eTFo3i8=73zSCOQkroiv_WhW1HD4eLcpAYxb9g@mail.gmail.com>

On Sat, Jan 18, 2014 at 4:38 AM, Pavanaja U B <pavanaja at vishvakannada.com> wrote:

> What are the steps involved to add Tulu language to Unicode?
>
>
>

There is already a detailed proposal to add the Tulu script to the Unicode
Standard, M. Everson's document on Tulu encoding:
http://www.unicode.org/L2/L2011/11120-n4025-tulu.pdf

Tulu, like many Indian and other languages, is written in two scripts.
For example, Tevaram, the sacred scriptures in Tamil, is written in the
Tamil as well as the Grantha script.

Regards
N. Ganesan
