Counting Devanagari Aksharas

Manish Goregaokar via Unicode unicode at unicode.org
Thu Apr 20 16:14:00 CDT 2017


I mean, we do the same for Hangul: a syllable block built from several
jamo is treated as a single grapheme cluster.

The main time you need intra-conjunct segmentation in Devanagari is
when deleting something you just typed. And backspace usually operates
on code points anyway (except for some weird cases like flag emoji,
though this isn't uniform across platforms). I don't see how
intra-conjunct selection would be useful otherwise.
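
To make the difference concrete, here is a rough Python sketch. It is not
the UAX #29 algorithm and not any platform's actual backspace logic. The
helper names (aksharas, backspace_code_point, backspace_akshara) are made
up for illustration, and the consonant test is a deliberate
simplification covering only the basic Devanagari consonant ranges. It
groups consonant + virama + consonant into one unit, which is roughly what
treating conjuncts as single clusters would mean, and contrasts deleting
one code point with deleting a whole unit.

```python
import unicodedata

VIRAMA = "\u094D"            # DEVANAGARI SIGN VIRAMA (halant)
JOINERS = "\u200C\u200D"     # ZWNJ, ZWJ


def is_consonant(ch):
    # Assumption: only KA..HA and the precomposed nukta letters count.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def aksharas(text):
    """Split text into akshara-like units (illustrative only)."""
    units = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        is_joiner = ch in JOINERS
        # A consonant directly after a virama continues the conjunct.
        continues = (
            bool(units) and is_consonant(ch) and units[-1].endswith(VIRAMA)
        )
        if units and (is_mark or is_joiner or continues):
            units[-1] += ch
        else:
            units.append(ch)
    return units


def backspace_code_point(text):
    # Delete the last code point, as backspace usually does.
    return text[:-1]


def backspace_akshara(text):
    # Delete the whole last akshara-like unit instead.
    return "".join(aksharas(text)[:-1])


# KA, VIRAMA, SSA, TA, VIRAMA, RA, VOWEL SIGN I, YA
word = "\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"
print(len(word), len(aksharas(word)))
```

With this grouping the eight code points of that word fall into three
units, and backspace_code_point applied to KA + VIRAMA + SSA leaves
KA + VIRAMA, i.e. you can still fix the tail of a conjunct you just typed.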
-Manish


On Thu, Apr 20, 2017 at 12:14 PM, Richard Wordingham via Unicode
<unicode at unicode.org> wrote:
> On Thu, 20 Apr 2017 11:17:05 -0700
> Manish Goregaokar via Unicode <unicode at unicode.org> wrote:
>
>> When given a rendered representation people seem to uniformly count
>> conjuncts as multiple aksharas if rendered with visible halant, and as
>> a single akshara if they are rendered conjoined.
>
> Now, that's what I expected.
>
>> I'm of the opinion that Unicode should start considering Devanagari
>> (and possibly other Indic) consonant clusters as single extended
>> grapheme clusters. Yes, sometimes it's not rendered as a single glyph,
>> but sometimes family emoji will not render as a single glyph either
>> (if you use skin tones or more than 4 family members) and we still
>> consider those EGCs.
>
> You won't like it if cursor movement granularity is reduced to one
> extended grapheme cluster.  I'm grateful that Emacs allows me to
> delete and replace the first NFC character of a grapheme cluster.
>
> Richard.

