Re: Fwd: Combined Yorùbá characters with dot below and tonal diacritics

dzo at bisharat.net
Fri Apr 10 13:00:05 CDT 2015


Hi Luis, This harks back to discussions some years ago on this list and the old A12n-collaboration list. The short answer, which I assume is still valid, is that Unicode will not encode more "precomposed" characters such as you propose. 

That said, you highlight ongoing issues with what I've called "category 4" Latin orthographies, which include extended Latin characters plus combining diacritics. 

It has been a while, but one workaround proposed was for glyphs representing a base character plus combining diacritics. Perhaps someone else has more recent information than I do re that concept. 

Don Osborn




-----Original Message-----
From: Luis de la Orden <webalorixa at gmail.com>
Sender: "Unicode" <unicode-bounces at unicode.org>
Date: Fri, 10 Apr 2015 18:24:04
To: <unicode at unicode.org>
Subject: Fwd: Combined Yorùbá characters with dot below and tonal diacritics

Hi to all in the list,

This is my first post and apologies in advance if I make any mistakes.
Today I enrolled as an individual member seeking to support the Unicode
effort. I would like to congratulate you all for the good work you are doing.
You make the world a much better and easier place for many people out there.

My journey into Unicode started a bit more than four years ago when I
started playing with the creation of keyboard layouts that allowed me to
write Brazilian Portuguese, my mother tongue, on a British keyboard by
unlocking the Latin accents already present in that keyboard layout to
write accented Portuguese characters.

My interest widened to African languages, which nowadays find themselves in
the same situation even in their own geography. To make a long story
short, I also enabled the output of Yorùbá characters from a UK keyboard
(Mac and PC): the ALT/ALT GR keys make e, o and s output ẹ, ọ
and ṣ, which can then be tonalised with the accents on the UK keyboard,
changed to combining diacritics or dead key combinations: ọ̀, ẹ́, ń, etc.

Whilst working on the creation of the layouts I realised that combined
characters (diacritic + character combined in one code) made life much
easier as dead key outputs than using combining diacritics.

The advantages and challenges I discovered are:

1. Dead keys (pressing accent key and then letter) are the way most
(perhaps all) European keyboards work. Combining diacritics work the other
way around, first one types the letter then the combining diacritic. There
is an element of familiarity that is lost with using combining characters;

2. The dead key layout system prevents diacritics piling up on top of a
character if you press them more than once, something essential for less
technology-savvy typists as it limits the number of mistakes one could
make. You also avoid accenting any and every character, as would
happen with combining diacritics;

3. In the African techno-social context where local languages have to be
typed from a European keyboard, if one decides to use the single quote as
a rising tonal, making it a combining acute, they will lose the single
quote forever. As a dead key the single quote will behave as an acute or
tonal acute if it is followed by the vowels and consonants you choose;
otherwise, if followed by space, it works as a single quote again.

4. In Windows 8 and probably earlier, combining diacritics (one code) added
to a character (another code) misalign when cut and pasted from one
document to another. If I typed Ẹ́ (capital letter e with dot below
and combining acute) in MS Word and copied to Excel or vice-versa, the
rendering would display something like Ẹ'.

5. Both Windows and Mac sometimes re-adjust the line spacing and
consequently length when one uses combining diacritics, which makes the
line shrink or expand. Terrible if you are dyslexic.
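
The difference between the two input styles in points 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any operating system implements its keyboard layouts: the table DEAD_KEY_TABLE, both function names, the choice of the single quote as the accent key, and the fallback for unmapped combinations are all hypothetical.

```python
# Hypothetical dead-key table: (dead key, following key) -> single output character.
DEAD_KEY_TABLE = {
    ("'", "a"): "\u00e1",  # á
    ("'", "e"): "\u00e9",  # é
    ("'", "n"): "\u0144",  # ń
    ("'", " "): "'",       # dead key then space gives the plain single quote back
}
DEAD_KEYS = {"'"}


def type_with_dead_keys(keys):
    """Simulate a dead-key layout: accent key first, then the letter."""
    out = []
    pending = None  # dead key waiting for the next keystroke
    for key in keys:
        if pending is not None:
            # Assumed fallback: an unmapped pair is emitted literally.
            out.append(DEAD_KEY_TABLE.get((pending, key), pending + key))
            pending = None
        elif key in DEAD_KEYS:
            pending = key
        else:
            out.append(key)
    if pending is not None:
        out.append(pending)
    return "".join(out)


def type_with_combining(keys):
    """Simulate remapping the single quote to COMBINING ACUTE ACCENT (U+0301)."""
    return "".join("\u0301" if key == "'" else key for key in keys)


print(type_with_dead_keys("'a"))           # one precomposed character
print(type_with_dead_keys("it' s"))        # the single quote is still reachable
print(ascii(type_with_combining("a''")))   # two accents stacked on one letter
```

With the dead-key table, repeated accent presses never stack on a letter and the single quote stays available, whereas the combining remap both stacks accents and makes the plain quote unreachable from that key.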

Needless to say, in my experience so far, dead keys are the most
friendly, familiar and supported way to produce accented or tonalised
characters.

You might be asking, so why don't you go on and use dead keys from now on
and be happy?

There is a limitation with dead keys: the combination of two keystrokes
(accent and character) can only output one code. In the case of Yorùbá, I
could go on setting the dead key combinations for á, é, í, ó, ú and even ń,
as they have one single code for the tonalised/accented character, but I
wouldn't be able to create a dead key for ẹ́, ọ́, etc., as they don't
have a single code combining the character (e with dot below) and the
diacritic (combining acute). We need a single code for (e with dot below
with acute) in order to make this work well for Yorùbá.
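
The gap described above can be checked with Python's standard unicodedata module. The sketch below normalises a few sequences to NFC, the form that prefers precomposed characters, and lists the code points that remain. The helper name nfc_code_points is hypothetical and only there for illustration.

```python
import unicodedata


def nfc_code_points(text):
    """Return the code points of text after NFC normalisation, as U+XXXX strings."""
    composed = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in composed]


# e + combining acute composes into one code point (é).
print(nfc_code_points("e\u0301"))
# n + combining acute also composes into one code point (ń).
print(nfc_code_points("n\u0301"))
# e + combining dot below + combining acute cannot be composed fully:
# e with dot below gets its own code point, but the acute stays separate.
print(nfc_code_points("e\u0323\u0301"))
# The same holds for capital O with dot below and grave.
print(nfc_code_points("O\u0323\u0300"))
```

The sequences that stay at two code points after NFC are exactly the ones a one-output-per-combination dead key cannot produce.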

If you are still reading this I would like to submit a proposal for the
creation of the following:

ẹ́ - LATIN SMALL LETTER E WITH DOT BELOW WITH ACUTE TONE MARK
ọ́ - LATIN SMALL LETTER O WITH DOT BELOW WITH ACUTE TONE MARK
Ẹ́ - LATIN CAPITAL LETTER E WITH DOT BELOW WITH ACUTE TONE MARK
Ọ́ - LATIN CAPITAL LETTER O WITH DOT BELOW WITH ACUTE TONE MARK
ẹ̀ - LATIN SMALL LETTER E WITH DOT BELOW WITH GRAVE TONE MARK
ọ̀ - LATIN SMALL LETTER O WITH DOT BELOW WITH GRAVE TONE MARK
Ẹ̀ - LATIN CAPITAL LETTER E WITH DOT BELOW WITH GRAVE TONE MARK
Ọ̀ - LATIN CAPITAL LETTER O WITH DOT BELOW WITH GRAVE TONE MARK

Would you be so kind as to provide any advice on whether you think this
would be acceptable for submission?

Many thanks to all,

Luis Morais
