Possible to add new precomposed characters for local language in Togo?

Philippe Verdy verdy_p at wanadoo.fr
Tue Feb 16 02:14:12 CST 2016

Note that several years ago I also produced my own keyboard layout,
containing almost all characters or sequences needed for African languages
in the Latin script, and likewise based on an extension of the French
(AZERTY) keyboard. It also contains additions for other European languages
(notably German, Dutch, Spanish, Scholar Latin, Czech, Serbian Latin...) and
for romanizations of other languages (notably Japanese), so it also includes
the macron (for Japanese Romaji), breve (for Scholar Latin), caron (Slavic
languages), dot below (for Maltese), and additional letters (ij/IJ ligatures
for Dutch, o/O with stroke, and other letters used in IPA). I have not
extended it, though, to Chinese romanizations (there are several conventions
for tone marks) or to Vietnamese (which needs two diacritics: tone marks in
addition to vowel modifiers).

2016-02-16 9:00 GMT+01:00 Philippe Verdy <verdy_p at wanadoo.fr>:

> 2016-02-16 0:32 GMT+01:00 Mats Blakstad <mats.gbproject at gmail.com>:
>> I've worked to upload a keyboard for local languages in Togo to the XKB
>> project. It is a combination keyboard based on the French keyboard and
>> extended to make it possible to write all the local languages in Togo.
>> However, many of the languages have several tones and even use combined
>> tones, and when I tried to update the composer to make it work, it seems
>> the composer can only give back a precomposed character and not a string
>> of combined characters.
>> I now wonder, generally, is it best to add new precomposed characters to
>> Unicode? Should there be a Unicode character for each combination used?
>> What is best practice? I ask because I see that some Unicode characters
>> are precomposed. I'm not sure why they are useful, but if they are, maybe
>> we should add these too?
> You don't need that.
> Keyboard layouts MUST generate the combining sequence. (It's then up to
> text editors and other software to adapt to the possibility that a single
> keystroke could generate multiple characters/code points, and to handle
> text selection and correction by grapheme cluster rather than by single
> character/code point: this is already done in much software, including
> for the Latin script.)
> However, Unicode could standardize names for these "common" combinations
> (without assigning new code points, which is not needed). There's already a
> supplementary datafile for them.
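
To illustrate the point about combining sequences, here is a minimal Python
sketch using only the standard unicodedata module. The sample letters (open e
U+025B with combining acute U+0301 and circumflex U+0302) are chosen here for
illustration and are not taken from the Togo layout itself. The helper
simple_clusters is a hypothetical name, and it only approximates grapheme
clusters by attaching combining marks to the preceding base letter; it is not
the full segmentation algorithm of UAX #29.

```python
import unicodedata

# Illustrative input: open e (U+025B) followed by a combining acute (U+0301).
# Unicode has no precomposed code point for this pair.
open_e_acute = "\u025B\u0301"

# NFC composes only where a precomposed character exists, so this text
# remains a two-code-point combining sequence after normalization.
nfc_open_e = unicodedata.normalize("NFC", open_e_acute)

# By contrast, plain e + combining acute does compose, to U+00E9.
nfc_plain_e = unicodedata.normalize("NFC", "e\u0301")


def simple_clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplified approximation of grapheme clusters (an assumption of this
    sketch): a character with a non-zero canonical combining class is
    attached to the previous cluster.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# A letter carrying two stacked tone marks still counts as one unit for
# selection or deletion, even though it is three code points long.
sample = "a\u025B\u0301\u0302b"
units = simple_clusters(sample)
print(len(sample), len(units))
```

An editor that moves the cursor or deletes by the units returned here, rather
than by individual code points, treats each tone-marked letter as one
character from the user's point of view, which is the behaviour described
above.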