Counting Devanagari Aksharas

Richard Wordingham via Unicode unicode at
Thu Apr 20 20:19:05 CDT 2017

On Thu, 20 Apr 2017 14:14:00 -0700
Manish Goregaokar via Unicode <unicode at> wrote:

> On Thu, Apr 20, 2017 at 12:14 PM, Richard Wordingham via Unicode
> <unicode at> wrote:

> > On Thu, 20 Apr 2017 11:17:05 -0700
> > Manish Goregaokar via Unicode <unicode at> wrote:

> >> I'm of the opinion that Unicode should start considering devanagari
> >> (and possibly other indic) consonant clusters as single extended
> >> grapheme clusters.

> > You won't like it if cursor movement granularity is reduced to one
> > extended grapheme cluster.  I'm grateful that Emacs allows me to

> I mean, we do the same for Hangul.

Hangul is generally a maximum of three characters, which is about the
border of tolerance. I find it irritating to have to completely retype
Thai grapheme clusters of consonant, vowel and tone mark.  When
preposed vowels were added to the Thai grapheme cluster and
implementations adopted the change, there were loud protests from the
Thais, and Unicode quickly removed them. Now imagine you're typing
Vedic Sanskrit, with its
clusters and pitch indicators.

> The main time you need intra-conjunct segmentation in Devanagari is
> when deleting something you just typed.

You'll typically be several words beyond by the time you notice, or by
the time a spell-checker spots a problem.
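
To make the granularity at stake concrete, here is a minimal Python
sketch. It is not code from this thread and it does not implement
UAX #29. It assumes that the clusters under discussion can be
approximated as "base character plus following combining marks", and
that an akshara can be approximated by joining such clusters wherever a
virama is followed by a consonant. The function names
(legacy_like_clusters, aksharas) are hypothetical, and ZWJ/ZWNJ and
other edge cases are ignored.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA


def is_consonant(ch):
    # Devanagari consonants KA..HA plus the nukta-composed QA..YYA.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def is_mark(ch):
    # Assumption: any combining mark (vowel sign, nukta, anusvara,
    # virama, Vedic accent, ...) attaches to the preceding base.
    return unicodedata.category(ch) in ("Mn", "Mc")


def legacy_like_clusters(text):
    """Approximate clusters: one base character plus trailing marks."""
    clusters = []
    for ch in text:
        if clusters and is_mark(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def aksharas(text):
    """Join clusters across virama + consonant (simplified conjuncts)."""
    out = []
    for cluster in legacy_like_clusters(text):
        if out and out[-1].endswith(VIRAMA) and is_consonant(cluster[0]):
            out[-1] += cluster
        else:
            out.append(cluster)
    return out


# "stri": SA, VIRAMA, TA, VIRAMA, RA, VOWEL SIGN II (six code points)
word = "\u0938\u094D\u0924\u094D\u0930\u0940"
print(len(legacy_like_clusters(word)), len(aksharas(word)))
```

Under these assumptions the six-code-point word above splits into three
mark-based clusters but forms a single akshara, so a cursor or
backspace that works per akshara would treat all six code points as one
unit.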

