Editing Sinhala and Similar Scripts
verdy_p at wanadoo.fr
Sat Mar 22 17:37:49 CDT 2014
2014-03-22 20:50 GMT+01:00 Richard Wordingham <
richard.wordingham at ntlworld.com>:
> > But it won't apply to "diacritics" (combining characters or joiner
> > controls like CGJ, ZWJ and ZWNJ, and possibly even some other format
> > controls) that have combining class 0, because their encoding order is
> > significant for knowing where to stop the effect of Backspace.
> Your approach recommends input methods that separate combining
> marks of different combining classes by CGJ for easier editing!
NO. I certainly do not recommend it! This is a false assertion.
> > I see absolutely no reason why Backspace would arbitrarily delete
> > only the last encoded character when users cannot even count them and
> > may not have input them separately, or could expect them to have been
> > typed in a different order.
> > So yes, entering:
> > <CEDILLA DEADKEY, ACUTE DEADKEY, C, BACKSPACE>, or
> > <ACUTE DEADKEY, CEDILLA DEADKEY, C, BACKSPACE>, or
> > <ACUTE DEADKEY, C WITH CEDILLA, BACKSPACE>, or
> > <CEDILLA DEADKEY, C WITH ACUTE, BACKSPACE>
> > should all result in keeping only the letter C in the backing store.
> > And with an IME supporting a Compose key this will also be true:
> > <COMPOSE, C, CEDILLA, ACUTE, BACKSPACE>, or
> > <COMPOSE, C, ACUTE, CEDILLA, BACKSPACE>, or
> > <COMPOSE, C WITH CEDILLA, ACUTE, BACKSPACE>, or
> > <COMPOSE, C WITH ACUTE, CEDILLA, BACKSPACE>
> Your input methods suggest that there is something unitary about the
> result - which makes sense if their output is U+1E08 LATIN CAPITAL
> LETTER C WITH CEDILLA AND ACUTE. Would you make the same arguments if
> 'C' were replaced with 'S'? There is no character LATIN CAPITAL
> LETTER S WITH CEDILLA AND ACUTE.
I have NOT said that such a character existed (look at the separating
This is a false interpretation.
> It will be distinctly unpleasant and unnatural with an input method
> that allows separate input of all three characters - C,
> COMBINING CEDILLA and COMBINING ACUTE - one by one. Your suggestion
> that typing THAI CHARACTER RO RUA, THAI CHARACTER SARA UU, THAI
> CHARACTER MAI THO, BACKSPACE should result in just THAI CHARACTER RO RUA
> is unlikely to be welcome to Thais.
> I believe our sharply opposing opinions arise because of different
> views of the clusters. You are seeing characters that are composed of
> multiple elements. I am seeing groups of characters that, in general,
> happen not to be arranged in a line of constant direction.
This is a pragmatic consideration: canonical equivalence should also be
respected when editing text. The same key, pressed at the same logical
position in canonically equivalent texts, should produce canonically
equivalent results.
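To make this concrete, here is a minimal Python sketch, not a description of any existing editor or IME. The function names are hypothetical, and the rule shown (delete the whole trailing run of marks with non-zero combining class, stopping at the first class-0 code point such as a base letter or CGJ) is only one possible reading of the behaviour argued for here. It relies on the standard unicodedata module.

```python
import unicodedata


def backspace_last_code_point(text: str) -> str:
    """Delete only the last encoded code point (the behaviour argued against)."""
    return text[:-1]


def backspace_reorderable_marks(text: str) -> str:
    """Hypothetical Backspace that respects canonical equivalence.

    Works on the NFD form so that precomposed and decomposed spellings are
    treated alike, then removes the trailing run of marks whose canonical
    combining class is non-zero. It stops at the first class-0 code point
    (base letter, CGJ, ZWJ, ZWNJ, ...), whose encoding order is significant.
    """
    nfd = unicodedata.normalize("NFD", text)
    end = len(nfd)
    while end > 0 and unicodedata.combining(nfd[end - 1]) != 0:
        end -= 1
    if end == len(nfd):
        # No trailing reorderable marks: fall back to deleting one code point.
        end = max(end - 1, 0)
    return unicodedata.normalize("NFC", nfd[:end])


# Four canonically equivalent spellings of C with cedilla and acute.
VARIANTS = [
    "C\u0327\u0301",   # C, COMBINING CEDILLA, COMBINING ACUTE ACCENT
    "C\u0301\u0327",   # C, COMBINING ACUTE ACCENT, COMBINING CEDILLA
    "\u00C7\u0301",    # C WITH CEDILLA, COMBINING ACUTE ACCENT
    "\u0106\u0327",    # C WITH ACUTE, COMBINING CEDILLA
]

if __name__ == "__main__":
    for v in VARIANTS:
        result = backspace_reorderable_marks(v)
        print([f"U+{ord(c):04X}" for c in v], "->",
              [f"U+{ord(c):04X}" for c in result])
```

With this rule, all four spellings leave a plain C after Backspace. Deleting only the last code point leaves C with cedilla for the first spelling and C with acute for the second, and those two results are not canonically equivalent.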
> > Canonical equivalence should be respected in visual editing modes.
> > Deleting only the "last" encoding diacritic should only be done in
> > specific non-visual editing modes (with "visible controls") and it is
> > not expected that most users will like this editing mode.
> For users who know what characters should be there, it makes a lot of
> sense to enter a non-visual editing mode - ideally of limited scope
> - when editing a previously typed cluster.
As long as the IME (or keyboard driver) has not transmitted the characters
to the edited document, it may record the sequence of keystrokes used. But
clicking anywhere in the document, or pressing any cursor movement key, will
reset the IME to its initial state. If an advanced IME is used to allow
editing the content of a cluster before the cursor position, it will
require a specific dialog to decompose the characters and render the
cluster in the IME as a sequence of characters rendered in isolation in "view
Most text editors do not support such a separate IME panel, and in fact users
do not like seeing these IME popups appear on top of the edited text.
They want to be able to input text directly in the WYSIWYG window. The IME
panel is an advanced editing mode which requires specific support in the
application (and an integration similar to the panels used by spell
IME popups also cause severe difficulties for accessibility: the previewed
text is separated from the text edited in the panel, it is difficult to
navigate in these popups with the keyboard, and the popup obscures the rest
of the text (complicating the
And on small screens below 5 inches (like smartphones), it is really
difficult to fit the IME panel, make it easy to use with fingers, and
still allow reading a complete sentence, without greatly reducing the size
of the touchable buttons and the font sizes, making the text very
difficult to read.
That's why so many people over the age of 40 really hate composing any text
on smartphones and prefer larger tablets: their smartphone is used
only to view short texts. This is a problem of vision (presbyopia) and of
finger size: the screen is too small to fit an IME editor other than a T9
one with 12 keys, used only to compose very short messages such as SMS or
Facebook status updates. For this usage, people do not care much about
composing advanced diacritics; they will compose only the basic letters and
will even drop correct punctuation except spaces, and they won't care about
capitalisation if the spell checker of the T9 editor does not guess it.