Counting Devanagari Aksharas

Richard Wordingham via Unicode unicode at
Mon Apr 24 02:08:19 CDT 2017

On Mon, 24 Apr 2017 00:36:26 +0530
Naena Guru via Unicode <unicode at> wrote:

> The Unicode approach to Sanskrit and all Indic is flawed. Indic
> should not be letter-assembly systems.
> Sanskrit vyaakaraNa (grammar) explains the phonemes as the atoms of
> the speech. Each writing system then assigns a shape to the
> phonetically precise phoneme.
> The most technically and grammatically proper solution for Indic is 
> first to ROMANIZE the group of writing systems at the level of
> phonemes. That is, assign romanized shapes to vowels, consonants,
> prenasals, post-vowel phonemes (anusvara and visarjaniiya with its
> allophones) etc. This approach is similar to how European languages
> picked up Latin, improvised the script and even uses Simples and
> Capitals repertoire. Romanizing immediately makes typing easier and
> eliminates sometimes embarrassing ambiguity in Anglicizing -- you
> type phonetically on key layouts close to QWERTY. (Only four
> positions are different in Romanized Sinhala layout).
> If we drop the capitalizing rules and utilize caps to indicate the 
> 'other' forms of a common letter, we get an intuitively typed system
> for each language, and readable too. When this is done carefully,
> comparing phoneme sets of the languages, we can reach a common set of 
> Latin-derived SINGLE-BYTE letters completely covering all phonemes of 
> all Indic.

Unless this implies a spelling reform for many languages, I'd like to
see how this works for the Tai Tham script.  I'm not happy with the
Romanisation I use to work round hostile rendering engines.  (My
scheme is only documented in variable hack_ss02 in the last script
blocks of  For example,
there are several different ways of writing what one might naively
record as "ontarAy".

> Next, each native script can be obtained by making orthographic smart 
> fonts that display the SBCS codes in the respective shapes of the
> native scripts.

That sounds like a letter-assembly system.
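
As a rough illustration of why, here is a toy sketch in Python. The
romanization table is hypothetical and covers only a handful of letters;
it is not the scheme described above, and the function name
to_devanagari is made up for this example. Even at this size, the
renderer has to choose between independent and dependent vowel forms and
insert a virama between adjacent consonants, which is letter assembly
moved into the font or converter.

```python
# Toy converter from a HYPOTHETICAL romanization to Devanagari.
# The tables are illustrative assumptions, not any published scheme.
VIRAMA = "\u094d"

CONSONANTS = {
    "k": "\u0915", "t": "\u0924", "n": "\u0928",
    "m": "\u092e", "r": "\u0930", "s": "\u0938",
}

# Each vowel maps to (independent form, dependent form).
# Capital "A" stands for the long vowel; short "a" is inherent,
# so its dependent form is empty.
VOWELS = {
    "a": ("\u0905", ""),
    "A": ("\u0906", "\u093e"),
    "i": ("\u0907", "\u093f"),
    "u": ("\u0909", "\u0941"),
}


def to_devanagari(text: str) -> str:
    out = []
    pending = False  # True while a consonant still awaits its vowel
    for ch in text:
        if ch in CONSONANTS:
            if pending:
                # Two consonants in a row: suppress the inherent vowel.
                out.append(VIRAMA)
            out.append(CONSONANTS[ch])
            pending = True
        elif ch in VOWELS:
            independent, dependent = VOWELS[ch]
            out.append(dependent if pending else independent)
            pending = False
        else:
            raise ValueError(f"letter not in the toy table: {ch!r}")
    if pending:
        # Word-final consonant with no vowel.
        out.append(VIRAMA)
    return "".join(out)


if __name__ == "__main__":
    for word in ("mantra", "kim", "Asana"):
        print(word, to_devanagari(word))
```

The same phoneme "a" comes out as a full letter, as nothing at all, or
(when long) as a dependent sign, depending purely on its neighbours.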

So how does your scheme help one split words into orthographic

> I have successfully romanized Sinhala and revived the full repertoire
> of Sinhala + Sanskrit orthography losing nothing. Sinhala script is
> perhaps the most complex of all Indic because it is used to write
> both Sanskrit and Pali.

What complication does Pali impose on top of Sanskrit?  As far as I'm
aware, it just needs one extra letter, usually called LLA, which you
will already have if 'Sanskrit' includes Vedic Sanskrit.

> See this: (It's all SBCS underneath).
> Test here:

All I get for these are blank pages.  Perhaps there's an unreported
communication failure in the network,

