Romanized Singhala got great reception in Sri Lanka

Naena Guru naenaguru at gmail.com
Tue Mar 18 00:23:31 CDT 2014


Thank you, Ken.

You analyzed it very nicely. I said that the signs might pop out
because I have had complaints of that happening. I think this is because
proper rendering is not yet fully implemented on some systems.

On input, I tried to make a layout that is close to QWERTY, but failed
because it needed too many key combinations. Keyman uses the old
typewriter keyboard layout, Wijesekara. I saw a better one for Singhala on
the front page but could not find it further inside. Marc would know, of
course.

Anyway, my complaint is that Unicode Singhala is incomplete and wrong, and
that it has a deleterious effect on the language, one of the oldest in the
world. What's aggravating is that they institutionalize errors as correct.
Rev. Fr. Perera warned against this 80 years ago. I suppose I wouldn't have
much to say if the 58 phonemes were used to replace the ones there now. It
will not happen.


On Mon, Mar 17, 2014 at 8:36 PM, Whistler, Ken <ken.whistler at sap.com> wrote:

> Well, I actually don’t see. I took a look at the Sinhala you inserted in
> this email. I cannot tell what you did at your input end (about “inserted
> all joiners”), but there are no actual joiners in the text itself. It
> displayed just fine in my email (including the correct conditional
> formatting of the –u vowel applied to the ra in pu*ru*kee), without me
> doing anything special (or installing any hacked font). Why? Because it
> was transmitted in plain Unicode.
>
> I cut and pasted that Unicode Sinhala string into a Word document, and
> it worked just fine. The boundaries for all the syllables were correctly
> detected.
>
> I saved it as a plain text UTF-8 file, and it worked just fine. I even
> then read the plain text UTF-8 file into a UTF-8 aware programming
> editor, and it worked just fine. (In a programming editor, which doesn’t
> attempt complex script rendering, the vowels don’t apply to the
> consonants and no reordering is done, so the display isn’t correct, but
> each character is correctly preserved, and if I write it back out to a
> document and read it in Word or some other tool that has access to proper
> rendering, it is still fine.) And all that interoperability works, why?
> Because this is plain Unicode.
>
> So while I don’t doubt that people may be having serious issues with
> input methods for Sinhala, I tend to agree with Marc Durdin that you are
> confusing encoding with input methods. Yes, I know you know the
> difference, but it appears to me that the inescapable conclusion from
> your argumentation is that the highest priority for the design of an
> encoding system should be to make the design of input methods as simple
> as possible. And in my estimation, that is confusing encoding with input
> methods.
>
> The art of input methods is to hide encoding details from users, and
> instead to provide them with an abstraction that they find easy to use
> and which accords with their general understanding of the writing system
> they are using. If done correctly, then the details of the input method
> *also* recede into the background, and users then simply do what they
> want: write and edit text easily on their devices.
>
> --Ken
>
> P.S. Here is an octal dump of that text (after I inserted a closing
> parenthesis in the editor). Sinhala sequence highlighted. Plain Unicode in
> UTF-8, no fancy stuff, and works just fine.
>
> 0000000000    EF  BB  BF  62  61  6C  75  20  76  61  6C  69  67  65  65  C2
> 0000000020    A0  75  C2  B5  61  20  70  75  72  75  6B  65  65  C2  A0  C3
> 0000000040    B0  61  61  6C  61  61  20  68  C3  A6  C3  B0  75  76  61  C3
> 0000000060    BE  20  6E  C3  A6  C3  A6  20  C3  A6  C3  B0  65  65  20  C3
> 0000000100    A6  72  65  6E  6E  65  65  0D  0A  28  E0  B6  B6  E0  B6  BD
> 0000000120    E0  B7  94  20  E0  B7  80  E0  B6  BD  E0  B7  92  E0  B6  9C
> 0000000140    E0  B7  9A  20  E0  B6  8B  E0  B6  AB  20  E0  B6  B4  E0  B7
> 0000000160    94  E0  B6  BB  E0  B7  94  E0  B6  9A  E0  B7  9A  20  E0  B6
> 0000000200    AF  E0  B7  8F  E0  B6  BD  E0  B7  8F  20  E0  B7  84  E0  B7
> 0000000220    90  E0  B6  AF  E0  B7  94  E0  B7  80  E0  B6  AD  E0  B7  8A
> 0000000240    20  E0  B6  B1  E0  B7  91  20  E0  B6  87  E0  B6  AF  E0  B7
> 0000000260    9A  20  E0  B6  87  E0  B6  BB  E0  B7  99  E0  B6  B1  E0  B7
> 0000000300    8A  E0  B6  B1  E0  B7  9A  29  0D  0A  0D  0A
>
> As you see, this is a terrible mess and cannot be straightened, granted
> few people use it, and there'll be more. What other choice do they have
> except Anglicizing? In Singhala, they say, "balu valigee uµa
> purukee ðaalaa hæðuvaþ nææ æðee ærennee" (බලු වලිගේ උණ පුරුකේ දාලා හැදුවත්
> නෑ ඇදේ ඇරෙන්නේ <- I inserted all joiners, but can't guarantee if vowel
> signs would pop out). It means you cannot straighten a dog's tail even if
> you put it in a piece of bamboo. You cannot fix Unicode Singhala and
> sadly, it is bringing down the language with it.
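
The byte dump quoted above can be checked mechanically. What follows is a
minimal Python sketch, not something from the original exchange: the hex
string is retyped from the dump, and the helper names (decode_dump,
sinhala_part, has_joiners) are made up for illustration. It decodes the
bytes as UTF-8, pulls out the parenthesized Sinhala sequence, and looks for
the two joiner characters (U+200C ZERO WIDTH NON-JOINER and U+200D ZERO
WIDTH JOINER).

```python
import unicodedata

# Bytes retyped from the dump in the quoted message (16 per row, last row 12).
DUMP_HEX = """
EF BB BF 62 61 6C 75 20 76 61 6C 69 67 65 65 C2
A0 75 C2 B5 61 20 70 75 72 75 6B 65 65 C2 A0 C3
B0 61 61 6C 61 61 20 68 C3 A6 C3 B0 75 76 61 C3
BE 20 6E C3 A6 C3 A6 20 C3 A6 C3 B0 65 65 20 C3
A6 72 65 6E 6E 65 65 0D 0A 28 E0 B6 B6 E0 B6 BD
E0 B7 94 20 E0 B7 80 E0 B6 BD E0 B7 92 E0 B6 9C
E0 B7 9A 20 E0 B6 8B E0 B6 AB 20 E0 B6 B4 E0 B7
94 E0 B6 BB E0 B7 94 E0 B6 9A E0 B7 9A 20 E0 B6
AF E0 B7 8F E0 B6 BD E0 B7 8F 20 E0 B7 84 E0 B7
90 E0 B6 AF E0 B7 94 E0 B7 80 E0 B6 AD E0 B7 8A
20 E0 B6 B1 E0 B7 91 20 E0 B6 87 E0 B6 AF E0 B7
9A 20 E0 B6 87 E0 B6 BB E0 B7 99 E0 B6 B1 E0 B7
8A E0 B6 B1 E0 B7 9A 29 0D 0A 0D 0A
"""

JOINERS = {"\u200c", "\u200d"}  # ZWNJ and ZWJ


def dump_bytes(hex_text: str) -> bytes:
    """Turn the whitespace-separated hex pairs into raw bytes."""
    return bytes.fromhex("".join(hex_text.split()))


def decode_dump(hex_text: str) -> str:
    """Decode strictly as UTF-8; 'utf-8-sig' drops the leading BOM."""
    return dump_bytes(hex_text).decode("utf-8-sig")


def sinhala_part(text: str) -> str:
    """Return the text between the first '(' and the following ')'."""
    start = text.index("(") + 1
    return text[start:text.index(")", start)]


def has_joiners(text: str) -> bool:
    """True if the text contains U+200C or U+200D."""
    return any(ch in JOINERS for ch in text)


if __name__ == "__main__":
    sinhala = sinhala_part(decode_dump(DUMP_HEX))
    print("joiners present:", has_joiners(sinhala))
    # List each code point with its Unicode character name.
    for ch in sinhala:
        if ch != " ":
            print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Because the decode is strict, any byte sequence that is not well-formed
UTF-8 would raise an error instead of being silently replaced. Encoding the
extracted string back to UTF-8 and decoding it again gives the same code
points, which is the round trip Ken describes between the editor and Word.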