Counting Devanagari Aksharas

Fri Apr 21 18:04:27 CDT 2017

On Thu, 20 Apr 2017 11:17:05 -0700
Manish Goregaokar via Unicode <unicode at unicode.org> wrote:

> On Wed, Apr 19, 2017 at 4:35 PM, Richard Wordingham via Unicode
> <unicode at unicode.org> wrote:

> > Is there consensus on how to count aksharas in the Devanagari
> > script? The doubts I have relate to a visible halant in
> > orthographic syllables other than the first.

> I don't think there's consensus.

I've found a related discussion at
https://lists.w3.org/Archives/Public/public-i18n-indic/.  The question
of how to count was raised there but not answered.

> I'm of the opinion that Unicode should start considering devanagari
> (and possibly other indic) consonant clusters as single extended
> grapheme clusters.

Do Hindi speakers really think of orthographic syllables as characters?

What may be useful is a definition of an orthographic syllable.  It
may be possible to get the information from a font - depending on the
renderer - but a locale-dependent definition should be possible for
use as a fall-back.  Devanagari rules won't work for Tamil, and I think
the rules for Hindi and Nepali will be slightly different -
<VIRAMA, ZWNJ> looks like a problem.
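
To make the fall-back idea concrete, here is a rough sketch in Python of
a rule-based splitter for Devanagari.  It is not an agreed definition:
the consonant ranges, the function names and the zwnj_breaks switch are
all assumptions made for illustration.  The switch stands in for the
locale-dependent choice of whether <VIRAMA, ZWNJ> (a visible halant)
ends an orthographic syllable or not.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halant)
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER


def is_consonant(ch):
    # Assumption: only the basic Devanagari consonants KA..HA and the
    # nukta forms QA..YYA are treated as consonants in this sketch.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def is_extender(ch):
    # Combining marks (matras, anusvara, nukta, virama) and the joiners
    # always stay with the preceding cluster.
    return unicodedata.category(ch) in ("Mn", "Mc") or ch in (ZWJ, ZWNJ)


def joins_next_consonant(cluster, zwnj_breaks):
    # A consonant continues the cluster after VIRAMA or <VIRAMA, ZWJ>.
    if cluster.endswith(VIRAMA) or cluster.endswith(VIRAMA + ZWJ):
        return True
    # After <VIRAMA, ZWNJ> the answer is the hypothetical locale choice.
    if cluster.endswith(VIRAMA + ZWNJ):
        return not zwnj_breaks
    return False


def split_aksharas(text, zwnj_breaks=True):
    clusters = []
    current = ""
    for ch in text:
        if not current:
            current = ch
        elif is_extender(ch):
            current += ch
        elif is_consonant(ch) and joins_next_consonant(current, zwnj_breaks):
            current += ch
        else:
            clusters.append(current)
            current = ch
    if current:
        clusters.append(current)
    return clusters
```

With these rules, KA VIRAMA SSA TA VIRAMA RA VOWEL-SIGN-I YA splits into
three clusters, and <KA, VIRAMA, ZWNJ, SSA> counts as two or one
depending on zwnj_breaks.  A visible halant produced by the font when it
lacks a conjunct is not modelled at all; that would need the font
information mentioned above.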

The concept is possibly not useful in some Indic scripts - it won't
work well for the Thai language, but it will work for Pali written in
the Thai script, in both Pali orthographies.

Richard.