Fwd: L2/18-181

Anshuman Pandey via Unicode unicode at unicode.org
Wed May 16 17:41:12 CDT 2018

> On May 16, 2018, at 3:46 PM, Doug Ewell via Unicode <unicode at unicode.org> wrote:
> http://www.unicode.org/L2/L2018/18181-n4947-assamese.pdf
> This is a fascinating proposal to disunify the Assamese script from
> Bengali on the following bases:

‘Fascinating’ is not a term I’d use for this proposal.

If folks are interested in a valid proposal for disunification of
Bengali, please look at the proposal for Tirhuta.

> 1. The identity of Assamese as a script distinct from Bengali is in
> jeopardy.

This is not a technical matter. Moreover, it's typical rhetoric used by
various language communities in South Asia. Fairly standard fare for
those familiar with such issues.

The proposal needs to show how the two scripts differ, e.g. in
conjuncts, CV ligatures, etc. The number forms are similar to those
already encoded. Again, cf. Tirhuta.

> 2. Collation is different between the Assamese and Bengali languages,
> and code point order should reflect collation order.

The same issue applies to the dictionary orders for Hindi and Marathi,
which differ from the conventional Sanskrit order for Devanagari.
Orthographies for various languages place conjuncts and other units
that are not considered atomic letters at the end. Nothing special in
this regard for Assamese and Bengali.
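
To illustrate that collation order need not follow code point order,
here is a minimal Python sketch. The four-unit alphabet and its ordering
are hypothetical, chosen only to show a conjunct sorting at the end; they
are not taken from any real dictionary or from the proposal.

```python
# Hypothetical tailoring: the conjunct KA + VIRAMA + SSA is treated as a
# single collation unit that sorts after HA, regardless of code points.
ORDER = [
    "\u0915",              # DEVANAGARI LETTER KA
    "\u0937",              # DEVANAGARI LETTER SSA
    "\u0939",              # DEVANAGARI LETTER HA
    "\u0915\u094D\u0937",  # conjunct: KA + VIRAMA + SSA
]
RANK = {unit: i for i, unit in enumerate(ORDER)}
# Try longer units first so the conjunct wins over its first letter.
UNITS = sorted(ORDER, key=len, reverse=True)

def sort_key(word):
    """Split a word into collation units and return their ranks."""
    key = []
    i = 0
    while i < len(word):
        for unit in UNITS:
            if word.startswith(unit, i):
                key.append(RANK[unit])
                i += len(unit)
                break
        else:
            # Unlisted character: sort after all listed units.
            key.append(len(ORDER) + ord(word[i]))
            i += 1
    return key

words = ["\u0939", "\u0915\u094D\u0937", "\u0937", "\u0915"]
by_code_point = sorted(words)               # conjunct lands right after KA
by_tailoring = sorted(words, key=sort_key)  # conjunct lands at the end
```

The encoded order of the characters is untouched; only the sort key
changes. Real systems would normally get this from a locale-specific
collation tailoring rather than a hand-written key function.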

> 3. Keyboard design is more difficult because consonants like ক্ষ
> are encoded as conjunct forms instead of atomic characters.

Ignorant question on my part: is it difficult to use character
sequences as labels for keys? I see keys for both क्ष and ज्ञ on the
iOS Hindi keyboard, and त्र is tucked away under त.
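
For reference, the conjuncts mentioned here are each a sequence of three
code points (consonant, virama, consonant). The Python sketch below
prints them and shows a key table in which one key emits a whole
sequence. The key identifiers and the `KEYMAP` table are hypothetical and
do not describe any actual keyboard layout.

```python
import unicodedata

# Hypothetical key identifiers mapped to the text each key emits.
KEYMAP = {
    "ka": "\u0915",                      # single letter
    "ksha": "\u0915\u094D\u0937",        # Devanagari KA + VIRAMA + SSA
    "jnya": "\u091C\u094D\u091E",        # Devanagari JA + VIRAMA + NYA
    "tra": "\u0924\u094D\u0930",         # Devanagari TA + VIRAMA + RA
    "bengali_ksha": "\u0995\u09CD\u09B7",  # Bengali KA + VIRAMA + SSA
}

def describe(text):
    """List the code points and character names that make up a string."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

def type_keys(key_ids):
    """Concatenate the output of a series of key presses."""
    return "".join(KEYMAP[k] for k in key_ids)

for key_id, output in KEYMAP.items():
    print(key_id, output, describe(output))
```

A key's label and its output are just strings, so a single key press can
insert a multi-code-point sequence without the conjunct being encoded as
an atomic character.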

> 4. The use of a single encoded script to write two languages forces
> users to use language identifiers to identify the language.

The same applies to each of the 40+ varieties of Hindi, as well as
Marathi, etc. Another ignorant question: how does one identify the
various languages that use Arabic and Cyrillic?

> 5. Transliteration of Assamese into a different script is problematic
> because letters have different phonological value in Assamese and
> Bengali.

Transliteration or transcription? In any case, this applies to other
languages written using similar scripts: a Marathi speaker pronounces
ज and ऋ differently than a Hindi speaker does.

> It will be interesting to see where this proposal goes.

Hopefully, it does not go too far. What it proposes is contrary to
Unicode and redundant.

> Given that all
> or most of these issues can be claimed for English, French, German,
> Spanish, and hundreds of other languages written in the Latin script, if
> the Assamese proposal is approved we can expect similar disunification
> of the Latin script into language-specific alphabets in the future.

Fascinating. I mean, terrible.

All my best,
