From samjnaa at gmail.com  Wed May  6 03:22:36 2015
From: samjnaa at gmail.com (Shriramana Sharma)
Date: Wed, 6 May 2015 13:52:36 +0530
Subject: Bengali Vedic characters
Message-ID: <CAH-HCWXph0aan04pKith5vhErjDzAJY0ERvjRoqLZhNy5oSCUg@mail.gmail.com>

This is w.r.t. Srinidhi's preliminary review of non-Devanagari Vedic
characters L2/15-101, my comments on it in L2/15-113, and the script
review committee's report L2/15-149 p 5.

I had located the Arcika (verse) part of the Kauthuma Sama Veda
printed in Bengali script via DLI:
http://www.dli.ernet.in/cgi-bin/DBscripts/allmetainfo.cgi?barcode=4990010095079.
(Obviously Srinidhi had obtained the samples from this same site but
had neglected to provide the link in his document.)

However no scans are available on DLI for the Gana (melody) part of
the same, whereas it is this part which requires a greater number of
svara markers.

First I was wondering whether the Gana part was published in Bengali
script at all, but I hunted down the phone number of a qualified
scholar of the Kauthuma Sama Veda who resides in Varanasi and who is a
native Bengali, and had a telephonic conversation with him an hour
ago.

He informs me that the entire Kauthuma Sama Veda including the Arcika
(verse) and Gana (melody) forms was published by Satyavrata Samashrami
in Calcutta in the previous century. However he has no printed copies
on hand and one has to go to the National Library in Calcutta to
locate them.

I am not sure when I will have the time and occasion to travel to
Calcutta from Tamil Nadu for this. If anyone can help in locating
digital copies of the Gana part, a comprehensive proposal for Bengali
Sama Vedic svara markers can be prepared...

-- 
Shriramana Sharma


From elie.roux at telecom-bretagne.eu  Tue May 12 08:28:04 2015
From: elie.roux at telecom-bretagne.eu (Élie Roux)
Date: Tue, 12 May 2015 15:28:04 +0200
Subject: Deterministic sorting impossible for Tibetan with current state
Message-ID: <5551FFE4.7070605@telecom-bretagne.eu>

Dear all,

I'm not sure I'm sending this mail to the correct list; please tell me
if I'm not.

I'm currently working on Tibetan sorting. It mostly works, except for
this case:

མངས་

This Unicode sequence can be interpreted in two very different ways,
both valid in terms of the Tibetan language:

- prefix མ, main letter ང, suffix ས
- main letter མ, suffix ང, second suffix ས

Both have their entries in a Tibetan dictionary: one in the entries for
letter ང, another (with a different meaning) in the entries for letter མ.
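
The two readings can be enumerated mechanically. The following is only a
minimal sketch in Python, assuming a bare syllable made of plain consonant
letters (no vowel signs, no stacked letters) and a simplified rule set; the
function name readings() and the constant names are hypothetical, and real
Tibetan orthography restricts which prefix may precede which main letter.

```python
TSHEG = "\u0f0b"  # TIBETAN MARK INTERSYLLABIC TSHEG (syllable delimiter)
GA, NGA, DA, NA, BA = "\u0f42", "\u0f44", "\u0f51", "\u0f53", "\u0f56"
MA, ACHUNG, RA, LA, SA = "\u0f58", "\u0f60", "\u0f62", "\u0f63", "\u0f66"

# Simplified inventories (assumption: the archaic second suffix DA is left out).
PREFIXES = {GA, DA, BA, MA, ACHUNG}
SUFFIXES = {GA, NGA, DA, NA, BA, MA, ACHUNG, RA, LA, SA}
SECOND_SUFFIXES = {SA}


def readings(syllable):
    """List every (prefix, root, suffix, second_suffix) reading of a bare syllable."""
    letters = syllable.rstrip(TSHEG)
    found = []
    for start in (0, 1):  # 0: no prefix, 1: the first letter is a prefix
        prefix = letters[:start]
        root = letters[start:start + 1]
        tail = letters[start + 1:]
        if not root or len(tail) > 2:
            continue
        if prefix and (prefix not in PREFIXES or not tail):
            continue  # a prefixed syllable without a vowel sign needs a suffix
        if tail and tail[0] not in SUFFIXES:
            continue
        if len(tail) == 2 and tail[1] not in SECOND_SUFFIXES:
            continue
        found.append((prefix, root, tail[:1], tail[1:]))
    return found


# MA + NGA + SA + tsheg is the string discussed in this message.
for reading in readings(MA + NGA + SA + TSHEG):
    print(reading)
```

Under these assumptions the sketch yields one reading with main letter MA
and one with main letter NGA, while a two-letter string such as DA + GA
yields a single reading.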

It is thus currently impossible to determine the place of the string
"མངས་" in a dictionary (Tibetans guess from the context).

Are there other languages where this indeterminacy occurs? Did they
solve that problem? If not, what I propose is a new invisible character
meaning "the previous letter is the main letter in case of
indeterminacy". This would, of course, not solve the problem entirely,
as the unmarked string "མངས་" would still be ambiguous, but at least
it would let users force one reading.

What do you think?

Thank you,
-- 
Elie Roux


From richard.wordingham at ntlworld.com  Tue May 12 13:24:11 2015
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Tue, 12 May 2015 19:24:11 +0100
Subject: Deterministic sorting impossible for Tibetan with current state
In-Reply-To: <5551FFE4.7070605@telecom-bretagne.eu>
References: <5551FFE4.7070605@telecom-bretagne.eu>
Message-ID: <20150512192411.2f7a4c69@JRWUBU2>

On Tue, 12 May 2015 15:28:04 +0200
Élie Roux <elie.roux at telecom-bretagne.eu> wrote:

> I'm currently working on Tibetan sorting. It mostly works, except for
> this case:
> 
> མངས་
> 
> This Unicode sequence can be interpreted in two very different ways,
> both valid in terms of the Tibetan language:
> 
> - prefix མ, main letter ང, suffix ས
> - main letter མ, suffix ང, second suffix ས
> 
> Both have their entries in a Tibetan dictionary: one in the entries
> for letter ང, another (with a different meaning) in the entries for
> letter མ.
> 
> It is thus currently impossible to determine the place of the string
> "མངས་" in a dictionary (Tibetans guess from the context).
> 
> Are there other languages where this indeterminacy occurs?

Certain examples are rare; it's been claimed that there are none in
Tibetan.  Welsh has this problem, but the closest I could come to a
contrasting pair is _englyna_ 'to compose', which sorts between eg- and
eh-, versus _engrafu_ 'to engrave', which sorts between enf- and enh-.

> Did they solve that problem?

Where one is a digraph, as with the Welsh letter 'ng', which comes
between 'g' and 'h', the Unicode Collation Algorithm recommends
inserting U+034F COMBINING GRAPHEME JOINER (CGJ) between the two
characters to stop them being treated as the digraph.  Soft hyphen will
often do as well, as in the Welsh place name Llangollen, which does not
include the letter 'ng'.

So for your example, I would suggest that, since in a lean Tibetan
collation table <U+0F58 TIBETAN LETTER MA, U+0F44 TIBETAN LETTER NGA>
would be a collating element, you write _mangs_ as <U+0F58, U+034F,
U+0F44, U+0F66 TIBETAN LETTER SA> and reserve <U+0F58, U+0F44, U+0F66>
for _mngas_.
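
As a rough illustration of the effect, here is a toy sort-key function in
Python. It is only a sketch under stated assumptions, not an implementation
of the Unicode Collation Algorithm: the CONTRACTIONS table, the use of code
point values as weights, and the name sort_key() are all hypothetical. It
treats <MA, NGA> as one collating element filed under NGA, and treats CGJ
and the tsheg as ignorable, so an intervening CGJ prevents the match.

```python
CGJ = "\u034f"    # COMBINING GRAPHEME JOINER
TSHEG = "\u0f0b"  # TIBETAN MARK INTERSYLLABIC TSHEG
MA, NGA, SA = "\u0f58", "\u0f44", "\u0f66"

# Hypothetical one-entry contraction table: prefix MA + main letter NGA
# is filed under NGA.
CONTRACTIONS = {MA + NGA: NGA}
IGNORABLE = {CGJ, TSHEG}


def sort_key(word):
    """Build a toy sort key; code point values stand in for real weights."""
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in CONTRACTIONS:
            # Contraction matched: weight of the main letter, flagged as prefixed.
            key.append((ord(CONTRACTIONS[pair]), 1))
            i += 2
        elif word[i] in IGNORABLE:
            i += 1  # contributes nothing, but has already blocked the pair match
        else:
            key.append((ord(word[i]), 0))
            i += 1
    return key


mngas = MA + NGA + SA        # contraction applies, files under NGA
mangs = MA + CGJ + NGA + SA  # CGJ blocks the contraction, files under MA

print(sorted([mangs, mngas], key=sort_key) == [mngas, mangs])
```

In this sketch the two spellings get different keys even though they render
identically, which is the point of the CGJ: the writer records which reading
is intended, and the collation table does the rest.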

Richard.


From ake.persson at mimer.se  Tue May 12 13:31:02 2015
From: ake.persson at mimer.se (Åke Persson)
Date: Tue, 12 May 2015 20:31:02 +0200
Subject: Deterministic sorting impossible for Tibetan with current state
In-Reply-To: <5551FFE4.7070605@telecom-bretagne.eu>
References: <5551FFE4.7070605@telecom-bretagne.eu>
Message-ID: <1DBC21ECC96246E7A0E59106A9BACEC7@upright.nu>

Dear Élie,

The combination
- prefix མ, main letter ང, suffix ས
does not exist in the dictionaries referenced from
http://developer.mimer.com/charts/tibetan.htm.

Where did you find it?

Best regards,
Åke Persson

> _______________________________________________
> Indic mailing list
> Indic at unicode.org
> http://unicode.org/mailman/listinfo/indic
> 


From elie.roux at telecom-bretagne.eu  Tue May 12 16:07:23 2015
From: elie.roux at telecom-bretagne.eu (Élie Roux)
Date: Tue, 12 May 2015 23:07:23 +0200
Subject: Deterministic sorting impossible for Tibetan with current state
In-Reply-To: <1DBC21ECC96246E7A0E59106A9BACEC7@upright.nu>
References: <5551FFE4.7070605@telecom-bretagne.eu>
 <1DBC21ECC96246E7A0E59106A9BACEC7@upright.nu>
Message-ID: <55526B8B.8090809@telecom-bretagne.eu>

> The combination
> - prefix མ, main letter ང, suffix ས
> does not exist in the dictionaries referenced from
> http://developer.mimer.com/charts/tibetan.htm.
> 
> Where did you find it?

There are a few examples of these on page 48 of "Manuel de Tibétain
Standard" by Nicolas Tournadre. It exists in English under the name
"Manual of Standard Tibetan", but the page number might differ.

The examples he cites are (in EWTS):

- dabs vs. dbas
- mangs vs. mngas
- dangs vs. dngas
- dgas vs. dags (this one is often disambiguated with dwags)

"mngas" seems rare indeed; I can only find it in the word "gzugs mngas".
But I'm no expert in Tibetan, so I can ask some people with more knowledge
if you want confirmation.
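
To make the collision concrete, here is a small Python sketch showing that
both members of each pair come out as the same sequence of Tibetan letters.
It assumes a tiny, hypothetical subset of EWTS (only the consonants used
above, with the inherent vowel "a" left unwritten); ewts_to_tibetan() is an
illustrative helper, not a real EWTS converter.

```python
# EWTS consonant -> Tibetan letter, limited to the letters in the examples.
EWTS_TO_TIBETAN = {
    "ng": "\u0f44",  # TIBETAN LETTER NGA
    "g": "\u0f42",   # TIBETAN LETTER GA
    "d": "\u0f51",   # TIBETAN LETTER DA
    "b": "\u0f56",   # TIBETAN LETTER BA
    "m": "\u0f58",   # TIBETAN LETTER MA
    "s": "\u0f66",   # TIBETAN LETTER SA
}


def ewts_to_tibetan(word):
    """Convert a simple EWTS syllable to Tibetan letters (toy subset only)."""
    out = []
    i = 0
    while i < len(word):
        if word[i] == "a":
            i += 1  # the inherent vowel is not written in Tibetan script
        elif word[i:i + 2] in EWTS_TO_TIBETAN:
            out.append(EWTS_TO_TIBETAN[word[i:i + 2]])  # digraph such as "ng"
            i += 2
        elif word[i] in EWTS_TO_TIBETAN:
            out.append(EWTS_TO_TIBETAN[word[i]])
            i += 1
        else:
            raise ValueError(f"letter not covered by this sketch: {word[i]!r}")
    return "".join(out)


PAIRS = [("dabs", "dbas"), ("mangs", "mngas"), ("dangs", "dngas"), ("dgas", "dags")]
for first, second in PAIRS:
    print(first, second, ewts_to_tibetan(first) == ewts_to_tibetan(second))
```

The transliteration keeps the two readings apart because it shows where the
vowel falls; the Tibetan spelling does not.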

Thank you,
-- 
Elie

From elie.roux at telecom-bretagne.eu  Tue May 12 16:19:01 2015
From: elie.roux at telecom-bretagne.eu (Élie Roux)
Date: Tue, 12 May 2015 23:19:01 +0200
Subject: Deterministic sorting impossible for Tibetan with current state
In-Reply-To: <20150512192411.2f7a4c69@JRWUBU2>
References: <5551FFE4.7070605@telecom-bretagne.eu>
 <20150512192411.2f7a4c69@JRWUBU2>
Message-ID: <55526E45.6010609@telecom-bretagne.eu>

> So for your example, I would suggest that, since in a lean Tibetan
> collation table <U+0F58 TIBETAN LETTER MA, U+0F44 TIBETAN LETTER NGA>
> would be a collating element, you write _mangs_ as <U+0F58, U+034F,
> U+0F44, U+0F66 TIBETAN LETTER SA> and reserve <U+0F58, U+0F44, U+0F66>
> for _mngas_.

You're right, I think this would work! I think I understand the
COMBINING GRAPHEME JOINER better now with your example.

Thank you very much for your help!
-- 
Elie