Romanized Singhala got great reception in Sri Lanka
Naena Guru
naenaguru at gmail.com
Tue Mar 18 00:23:31 CDT 2014
Thank you, Ken.
You analyzed it very nicely. I said that the signs might pop out
because I have had complaints of that happening. I think this is because
the implementation of proper rendering lags behind on some systems.
On input, I tried to make a layout that is close to QWERTY, but failed
because it needed too many combination keys. Keyman uses the old
typewriter keyboard layout, Wijesekara. I saw a better one for Singhala on
the front page but could not find it further inside. Marc would know, of
course.
Anyway, my complaint is that Unicode Singhala is incomplete and wrong, and
that it has a deleterious effect on the language, one of the oldest in the
world. What's aggravating is that they institutionalize errors as correct.
Rev. Fr. Perera warned against this 80 years ago. I suppose I wouldn't have
much to say if the 58 phonemes were used to replace the ones there. It will
not happen.
On Mon, Mar 17, 2014 at 8:36 PM, Whistler, Ken <ken.whistler at sap.com> wrote:
> Well, I actually don’t see. I took a look at the Sinhala you inserted in
> this email. I cannot tell what you did at your input end (about “inserted
> all joiners”), but there are no actual joiners in the text itself. It
> displayed just fine in my email (including the correct conditional
> formatting of the –u vowel applied to the ra in pu*ru*kee), without me
> doing anything special (or installing any hacked font). Why? Because it
> was transmitted in plain Unicode.
>
> I cut and pasted that Unicode Sinhala string into a Word document, and
> it worked just fine. The boundaries for all the syllables were correctly
> detected.
>
> I saved it as a plain text UTF-8 file, and it worked just fine. I even
> then read the plain text UTF-8 file into a UTF-8 aware programming
> editor, and it worked just fine. (In a programming editor, which doesn’t
> attempt complex script rendering, the vowels don’t apply to the
> consonants and no reordering is done, so the display isn’t correct, but
> each character is correctly preserved, and if I write it back out to a
> document and read it in Word or some other tool that has access to proper
> rendering, it is still fine.) And all that interoperability works, why?
> Because this is plain Unicode.
>
> So while I don’t doubt that people may be having serious issues with
> input methods for Sinhala, I tend to agree with Marc Durdin that you are
> confusing encoding with input methods. Yes, I know you know the
> difference, but it appears to me that the inescapable conclusion from
> your argumentation is that the highest priority for the design of an
> encoding system should be to make the design of input methods as simple
> as possible. And in my estimation, that is confusing encoding with input
> methods.
>
> The art of input methods is to hide encoding details from users, and
> instead to provide them with an abstraction that they find easy to use
> and which accords with their general understanding of the writing system
> they are using. If done correctly, then the details of the input method
> *also* recede into the background, and users then simply do what they
> want: write and edit text easily on their devices.
>
> --Ken
>
> P.S. Here is an octal dump of that text (after I inserted a closing
> parenthesis in the editor). Sinhala sequence highlighted. Plain Unicode
> in UTF-8, no fancy stuff, and works just fine.
>
> 0000000000 EF BB BF 62 61 6C 75 20 76 61 6C 69 67 65 65 C2
> 0000000020 A0 75 C2 B5 61 20 70 75 72 75 6B 65 65 C2 A0 C3
> 0000000040 B0 61 61 6C 61 61 20 68 C3 A6 C3 B0 75 76 61 C3
> 0000000060 BE 20 6E C3 A6 C3 A6 20 C3 A6 C3 B0 65 65 20 C3
> 0000000100 A6 72 65 6E 6E 65 65 0D 0A 28 E0 B6 B6 E0 B6 BD
> 0000000120 E0 B7 94 20 E0 B7 80 E0 B6 BD E0 B7 92 E0 B6 9C
> 0000000140 E0 B7 9A 20 E0 B6 8B E0 B6 AB 20 E0 B6 B4 E0 B7
> 0000000160 94 E0 B6 BB E0 B7 94 E0 B6 9A E0 B7 9A 20 E0 B6
> 0000000200 AF E0 B7 8F E0 B6 BD E0 B7 8F 20 E0 B7 84 E0 B7
> 0000000220 90 E0 B6 AF E0 B7 94 E0 B7 80 E0 B6 AD E0 B7 8A
> 0000000240 20 E0 B6 B1 E0 B7 91 20 E0 B6 87 E0 B6 AF E0 B7
> 0000000260 9A 20 E0 B6 87 E0 B6 BB E0 B7 99 E0 B6 B1 E0 B7
> 0000000300 8A E0 B6 B1 E0 B7 9A 29 0D 0A 0D 0A
>
> As you see, this is a terrible mess and cannot be straightened. Granted,
> few people use it, and there'll be more. What other choice do they have
> except Anglicizing? In Singhala, they say, "balu valigee uµa
> purukee ðaalaa hæðuvaþ nææ æðee ærennee" (බලු වලිගේ උණ පුරුකේ දාලා හැදුවත්
> නෑ ඇදේ ඇරෙන්නේ <- I inserted all joiners, but can't guarantee that vowel
> signs won't pop out). It means you cannot straighten a dog's tail even if
> you put it in a piece of bamboo. You cannot fix Unicode Singhala and,
> sadly, it is bringing down the language with it.
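
The byte dump quoted above can be checked mechanically. What follows is only a minimal sketch in Python, assuming the hex values were copied correctly from the dump (offsets dropped); the helper names decode_dump, sinhala_part and joiners_found are made up for this illustration and come from no tool mentioned in the thread. It decodes the bytes as UTF-8, looks for ZERO WIDTH JOINER (U+200D) and ZERO WIDTH NON-JOINER (U+200C), and lists the code points between the parentheses.

```python
import unicodedata

# Hex bytes copied from the dump quoted above (octal offsets omitted).
DUMP = """
EF BB BF 62 61 6C 75 20 76 61 6C 69 67 65 65 C2
A0 75 C2 B5 61 20 70 75 72 75 6B 65 65 C2 A0 C3
B0 61 61 6C 61 61 20 68 C3 A6 C3 B0 75 76 61 C3
BE 20 6E C3 A6 C3 A6 20 C3 A6 C3 B0 65 65 20 C3
A6 72 65 6E 6E 65 65 0D 0A 28 E0 B6 B6 E0 B6 BD
E0 B7 94 20 E0 B7 80 E0 B6 BD E0 B7 92 E0 B6 9C
E0 B7 9A 20 E0 B6 8B E0 B6 AB 20 E0 B6 B4 E0 B7
94 E0 B6 BB E0 B7 94 E0 B6 9A E0 B7 9A 20 E0 B6
AF E0 B7 8F E0 B6 BD E0 B7 8F 20 E0 B7 84 E0 B7
90 E0 B6 AF E0 B7 94 E0 B7 80 E0 B6 AD E0 B7 8A
20 E0 B6 B1 E0 B7 91 20 E0 B6 87 E0 B6 AF E0 B7
9A 20 E0 B6 87 E0 B6 BB E0 B7 99 E0 B6 B1 E0 B7
8A E0 B6 B1 E0 B7 9A 29 0D 0A 0D 0A
"""


def decode_dump(dump: str) -> str:
    """Turn the hex dump into text; 'utf-8-sig' drops the leading BOM."""
    raw = bytes.fromhex("".join(dump.split()))
    return raw.decode("utf-8-sig")  # strict: raises on malformed UTF-8


def sinhala_part(text: str) -> str:
    """Return the text between the first '(' and the following ')'."""
    start = text.index("(") + 1
    return text[start:text.index(")", start)]


def joiners_found(text: str) -> list:
    """Return every ZWNJ (U+200C) or ZWJ (U+200D) present in the text."""
    return [ch for ch in text if ch in ("\u200c", "\u200d")]


if __name__ == "__main__":
    text = decode_dump(DUMP)
    print("joiners:", joiners_found(text))
    for ch in sinhala_part(text):
        if ch != " ":
            print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Each character in the parenthesized run is listed in logical (typed) order, which is the order a non-rendering programming editor would show and the order a rendering engine starts from.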