Romanized Singhala got great reception in Sri Lanka

Naena Guru naenaguru at
Sun Mar 16 21:44:08 CDT 2014


All you said about ISCII is probably right. So, it has given you guys a lot
of pain. I neither created it nor followed it.

As for Japanese (and also for Indic) I have read the warnings in RFC 1815:

I am not creating a transcoding table as you say. I assume you think I take
Unicode Sinhala to be a legitimate encoding for Singhala that I am mapping
to SBCS for the love of SBCS. No. And I don't know what concepts I am
mixing. I am trained in Computer Science, I have taught it at college
level, and have done years of consulting work and written project proposals
for a pretty good size one for the Federal Government too.

I believe that you need to understand the problem at hand to find a
solution for it. You cannot make solutions for Indic not knowing Indic.
Starting blindly with ISCII was a mistake. It is useless at least for

========= STORY OF UNICODE SINHALA ==========
The first draft for the Sinhala chart was handwritten by Andy Daniels. He
mentioned some doubts about some letters in it. He had a good instinct on
that. It sat there, with people wondering where he got his information. He
said from Germany. Someone said that it came from a $300 book. I suspect
that it is Rev. Fr. A.M.Gunasekara's book (1891).

Then came the Lion of Unicode Michael Everson (down in this thread). He was
making fonts by the dozen; he took Daniels' draft and certified the letters
side of it, not having a nicely printed set of the digits. This certificate
was countersigned by a Mettavihari for users. I know Ven. Mettavihari. He
is a Danish man that researched and put up the most comprehensive
Tripitaka, the Buddhist canon. This irreproachable man denies that he
endorsed the standard on behalf of the Singhalese saying obviously he is
not Singhalese. (Actually, I think he is more Singhalese than me). Who
signed as him, a forgery?

When the code chart came to Lanka, the closest to a computer that they knew
was the IBM Selectric typewriter. When they did not do anything about it,
the World Bank offered a $83 scheme to bring Lanka to the computer age all
the way so the village fellow could communicate with the government online.
They set up the IT agency ICTA and got the academics gathered there doing
'projects'. They even paid a fellow to come over and read the OpenType
specification for them. I understand that the kingpin of the operations
there is one person who studied in the US. He is the adviser to the President,
the top Colombo University and the ICTA itself. He is one consultant that
does most projects.

When Everson wanted to add the digits apparently finding Fr. Gunasekara's
book, the Lankans denied such existed. When he showed them, they said they
are not necessary. Now this everybody's consultant announced at my
presentation that they are going to add them.
============ END STORY OF UNICODE SINHALA ==========

Unicode Singhala violates Singhala / Sanskrit grammar. Unicode Singhala is
not compatible with Sanskrit, an integral part of the Singhala script. That
also applies to Pali whose native script is Singhala. Unicode Sinhala
further helps kill Singhala by making it very difficult to type and
impossible to obtain the entire repertoire of letters and limiting the
applications and OSs that it can be used in.

Typing Unicode Sinhala requires you to learn a key map that is entirely
different from the familiar English keyboard, while losing some marks and
signs too. There is a program called Helabasa by Keyman typing system that
printers use to type it. There is a physical keyboard too. Then there is
Google transliteration - very inadequate and another one by Colombo
University found on a web page. These last two allow you to type
phonetically but not entirely. The result is that very few people type
Unicode Singhala: only those whose jobs require it.

I did the same thing English and Western European languages did; very
close. I mapped the well-known 58+2 Singhala-Sanskrit phonemes in the SBCS.
The reason is that Singhala then gets to use all those applications
perfected over decades that most Westerners here enjoy. That set covers all
letters necessary for Singhala, Sanskrit and Pali, the three languages
that use the Singhala script.

See it here displayed using the first orthographic smartfont:

Let's look at this as a lay person (whose interest is our ultimate goal)

English was fully romanized from fuþark by about 600 AD. Romanizing is
writing by using letters of the Latin alphabet plus many, many others added
to it. When Europeans became fully Christianized / literate, they all
adopted Latin letters and extended them as they pleased. This set has
branched off as Latin script and Cyrillic script. The printing industry
standardized the greater part of the alphabets.

Singhala has a well defined phoneme chart called hodiya. It is an extension
of the Sanskrit hodiya. Rev. Fr. Theodore G. Perera's grammar book (1932)
and Rev. Fr. A. M. Gunasekara's book (1891) that dug up sinking Singhala
fully describe the writing system. Like most other languages, including
English before printing arrived in England, it is written phonetically.

Singhala was first romanized in the 1860s by Rhys Davids, to print Pali
(Magadhi) in the Latin script. This scheme is called PTS Pali. It is similar
to IAST Sanskrit. It requires letters with bars (macrons) and dots not found
in common fonts, and it is impossible to type these on the regular keyboard.
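
To make the point about bars and dots concrete, here is a minimal Python
sketch. It assumes that "the SBCS" means the ISO-8859-1 (Latin-1) range
U+0000..U+00FF; the letter list is a representative sample of PTS / IAST
letters chosen for illustration, not a list taken from this message.

```python
# Representative PTS / IAST letters (illustrative sample, an assumption):
# a-macron, i-macron, u-macron, n-dot-above, n-tilde,
# t-dot-below, d-dot-below, n-dot-below, l-dot-below, m-dot-below
PTS_LETTERS = "\u0101\u012b\u016b\u1e45\u00f1\u1e6d\u1e0d\u1e47\u1e37\u1e43"


def fits_single_byte(ch: str) -> bool:
    """Return True if ch can be encoded as one Latin-1 byte."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Split the sample into letters inside and outside the Latin-1 range.
inside = [ch for ch in PTS_LETTERS if fits_single_byte(ch)]
outside = [ch for ch in PTS_LETTERS if not fits_single_byte(ch)]

for ch in PTS_LETTERS:
    where = "Latin-1" if fits_single_byte(ch) else "outside Latin-1"
    print(f"U+{ord(ch):04X} {ch} {where}")
```

Of this sample only n-tilde (U+00F1) falls inside Latin-1; the macron and
dotted letters sit in Latin Extended-A and Latin Extended Additional.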

I freshly romanized Singhala by mapping its phonemes to the SAME area 13
Western European languages mapped their alphabetic letters within the
following Unicode code charts:

So, if that is "creating a transcoding table" all Europeans did it and I do
it too.
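
As a rough illustration of what "single byte" means here, the sketch below
compares byte counts. It assumes Latin-1 as the single-byte encoding and
UTF-8 as the storage form for Unicode Sinhala; the romanized spelling is a
hypothetical placeholder, not output of the actual Dual-script mapping.

```python
# The word "Sinhala" in Unicode Sinhala: five code points from U+0D80..U+0DFF.
UNICODE_SINHALA_WORD = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"

# Hypothetical romanized spelling, used only to show the byte arithmetic.
ROMANIZED_WORD = "singhala"


def byte_length(text: str, encoding: str) -> int:
    """Number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


# Each code point in the Sinhala block takes three bytes in UTF-8.
utf8_len = byte_length(UNICODE_SINHALA_WORD, "utf-8")

# Each character in the Latin-1 range takes exactly one byte in Latin-1.
latin1_len = byte_length(ROMANIZED_WORD, "latin-1")

print(f"Unicode Sinhala, UTF-8: {utf8_len} bytes")
print(f"Romanized, Latin-1:     {latin1_len} bytes")
```

The comparison shows only storage size per character; it says nothing about
rendering, which in either case depends on the font.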

On Sun, Mar 16, 2014 at 12:36 AM, Philippe Verdy <verdy_p at> wrote:

> Don't you realize that what you are trying to create is completely out of
> topic of Unicode, as it is simply another new 8-bit encoding similar to
> what ISCII does for supporting multiple Indic scripts with a common
> encoding/transcoding table?
> The ISCII standard has shown its limitations, it cannot be enough to
> support all scripts correctly and completely, it has lots of unsolved
> ambiguities for tricky cases or historic orthographies, or newer
> orthographies, that the UCS encoding better supports due to its larger
> character set and more precise character properties and algorithms.
> You are in fact creating a transcoding table... Except that you are mixing
> the concepts; and the Unicode and ISO technical committees working on the
> UCS don't need to handle new 8-bit encodings. And you'll soon experience
> the same problems as in ISCII and all other legacy 8-bit encodings: very
> poor INTEROPERABILITY due to version tracking or complex contextual rules...
> You may still want to promote it at some government or education
> institution, in order to promote it as a national standard, except that
> there's little chance it will ever happen when all countries in ISO have
> stopped working on standardization of new 8-bit encodings (only a few ones
> are maintained; but these are the most complex ones used in China and Japan).
> Well in fact only Japan now seems to be actively updating its legacy JIS
> standard; but only with the focus of converging it to use the UCS and solve
> ambiguities or solve some technical problems (e.g. with emojis used by
> mobile phone operators). Even China stopped updating its national standard
> by publishing a final mapping table to/from the full UCS (including for
> characters still not encoded in the UCS): this simplified the work because
> only one standard needs to be maintained instead of 2.
> Note that as long as there is no national standard supporting your
> proposed encoding, there is no chance that the font standards will adopt it.
> You may still want to register your encoding in the IANA registry, but
> you'll need to pass the RFC validation. And there are lots of technical
> details missing in your proposal so that it can work for supporting it with
> a standard mapping in fonts.
> There is a better chance for you to promote it only as a transliteration
> scheme, or as an input method for keyboard layout (both are also not in the
> scope of the Unicode and ISO/IEC 10646 standards though, they could be in
> the scope of the CLDR project, which is not by itself a standard but just a
> repository of data, supported by a few standards)... Think about it.
> 2014-03-16 5:12 GMT+01:00 Naena Guru <naenaguru at>:
>> I made a presentation demonstrating Dual-script Singhala at National
>> Science Foundation of Sri Lanka. Most of the attendees were government
>> employees and media representatives; a few private citizens came too.
>> Dual-script Singhala means romanized Singhala that can be displayed
>> either in the Latin script or in the Singhala script using an Orthographic
>> Smart Font. It is easy to input (phonetically) using a keyboard layout
>> slightly altered from QWERTY. The font uses Standard Ligature feature
>> <liga> of OpenType / OpenFont standard to display glyphs of Sanskrit
>> ligatures as well as many Singhala letters. The font is supported across
>> all OSs: Windows, Macintosh, Linux, iOS and Android. Dual-script Singhala
>> is the proper and complete solution on the computer for the Singhala script
>> used to write Singhala, Sanskrit and Pali languages. The same solution can
>> be applied for all Indic languages.
>> The government ministries, media and people welcomed it with enthusiasm
>> and relief that there is something practical for Singhala. The response in
>> the country was singularly positive, except for the person that
>> filibustered the Q&A session of the presentation that spoke about the hard
>> work done on Unicode Sinhala, clearly outside the subject matter of the
>> presentation.
>> The result of the survey passed around was 100% as below (translated from
>> Singhala):
>>    1. I believe that Dual-script Singhala is convenient to me as it is
>>    implemented similar to English - Yes
>>    2. Today everyone uses Unicode Sinhala. It is easy and has no
>>    problems - No
>>    3. The cost of Unicode Sinhala should be eliminated by switching to
>>    Dual-script Singhala - Yes
>>    4. We should amend Pali text in the Tripitaka according to rulings of
>>    SLS1134 - No
>>    5. Digitizing old books is a very important thing - Yes
>>    6. We should focus on making this easy-to-use Dual-script Singhala
>>    method a standard - Yes
>> Please comment or send questions.