Possible to add new precomposed characters for local language in Togo?

Marcel Schneider charupdate at orange.fr
Sun Nov 6 01:22:25 CST 2016


On Sun, 6 Nov 2016 05:40:59 +0100, Philippe Verdy wrote:

> Another use case: being able to type Bopomofo along with Cyrillic or 
> Kanas...; and new extensions will be needed for the 2012 German layout and 
> other layouts made according to the ISO standard (you cannot do all that 
> you want with just a few modifier bits and Windows only implementing a Kana 
> modifier key and limiting the number of modifiers supported even below the 
> capacity of the WORD ModificationNumber!) 

This does not match my experience. Iʼm actually using modifiers 0x10, 0x20, 
0x40 and 0x80 too, and kbd.h even has names for most of them: [kbd.h(51)]

/*
* Keyboard Shift State defines. These correspond to the bit mask defined
* by the VkKeyScan() API.
*/
#define KBDBASE 0
#define KBDSHIFT 1
#define KBDCTRL 2
#define KBDALT 4
// three symbols KANA, ROYA, LOYA are for FE
#define KBDKANA 8
#define KBDROYA 0x10
#define KBDLOYA 0x20
#define KBDGRPSELTAP 0x80

0x40 proves to be usable too. What I cannot understand, and others 
are puzzled too, is the name KBDGRPSELTAP. It sounds as if it were an 
acronym of “GRouP SELecTor APing” or the like, hence my suspicion that 
the developers were asked to ape the *then new* ISO/IEC 9995-3 group 
selector by implementing it as a dead key, as a *remnant* group selector.
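
To make the bit arithmetic concrete, here is a minimal sketch of how a layout can map shift-state bitmasks to columns. It is not the Windows lookup code: the name KBD_BIT_40, the table column_plus_one, the function lookup_column and the column assignments are all made up for illustration; only the KBD* values come from kbd.h.

```c
#include <assert.h>

/* Shift-state bits as quoted from kbd.h above. */
#define KBDBASE       0
#define KBDSHIFT      1
#define KBDCTRL       2
#define KBDALT        4
#define KBDKANA       8
#define KBDROYA       0x10
#define KBDLOYA       0x20
#define KBDGRPSELTAP  0x80
/* Hypothetical name: kbd.h as quoted has no symbol for 0x40. */
#define KBD_BIT_40    0x40

/* Column number + 1 for each shift-state bitmask; 0 marks a combination
 * the layout does not use. The column assignments are illustrative only. */
static const unsigned char column_plus_one[256] = {
    [KBDBASE]                 = 1,
    [KBDSHIFT]                = 2,
    [KBDCTRL | KBDALT]        = 3,  /* AltGr */
    [KBDROYA]                 = 4,
    [KBDLOYA]                 = 5,
    [KBD_BIT_40]              = 6,
    [KBDGRPSELTAP]            = 7,
    [KBDSHIFT | KBDGRPSELTAP] = 8,
};

/* Returns the column for a shift state, or -1 if the layout leaves it unused. */
int lookup_column(unsigned shift_state)
{
    assert(shift_state <= 0xFF);  /* this sketch handles eight modifier bits */
    return (int)column_plus_one[shift_state] - 1;
}
```

With this table, lookup_column(KBD_BIT_40) yields column 5 and lookup_column(KBDKANA) yields -1, since the sketch assigns no column to the Kana bit.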

Thatʼs about the name only. Much more annoying is that Iʼve been unable 
to get any result from the application of the related attribute: [kbd.h(364)]

#define CAPLOK 0x01
#define SGCAPS 0x02
#define CAPLOKALTGR 0x04
// KANALOK is for FE
#define KANALOK 0x08
#define GRPSELTAP 0x80

And there is not even a comment on it, as only the first two attributes 
are mentioned in the preceding comment: [kbd.h(46)]

* Special values for Attributes:
* CAPLOK - The CAPS-LOCK key affects this key like SHIFT
* SGCAPS - CapsLock uppercases the unshifted char (Swiss-German)

So I added 0x80 to the attribute of a key, expecting that this would 
make it sensitive to the CapsLock toggle key VK_CAPITAL, because this 
would match the ISO/IEC 9995 intent of having a secondary group that is 
subject to CapsLock. But it did not work.
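
What I expected can be written down as a toy model. This is only a sketch of the one rule that kbd.h documents, not the Windows implementation: the function effective_shift_state is hypothetical, and treating CAPLOK as inverting the Shift bit is my assumption about what “like SHIFT” means. GRPSELTAP is deliberately given no effect here, since kbd.h documents none and I observed none.

```c
#include <assert.h>

/* Attribute bits as quoted from kbd.h above. */
#define CAPLOK      0x01
#define SGCAPS      0x02
#define CAPLOKALTGR 0x04
#define KANALOK     0x08
#define GRPSELTAP   0x80

/* Shift-state bit from kbd.h. */
#define KBDSHIFT    1

/* Toy model: applies only the documented CAPLOK rule ("the CAPS-LOCK key
 * affects this key like SHIFT"), modelled as inverting the Shift bit.
 * Every other attribute bit, GRPSELTAP included, is ignored. */
unsigned effective_shift_state(unsigned attributes, unsigned shift_state,
                               int caps_lock_on)
{
    assert((shift_state & ~0xFFu) == 0);  /* eight modifier bits at most */
    if (caps_lock_on && (attributes & CAPLOK))
        shift_state ^= KBDSHIFT;
    return shift_state;
}
```

In this model a key with attribute GRPSELTAP alone stays in the base state when CapsLock is on, which is the behaviour I got, whereas a key with CAPLOK moves to the Shift column.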

Thank you for the instructions below. I hope that the programmers on 
this List know exactly how they must be translated into C so that they 
compile and the API can read the compiled binaries, and that 
Microsoft will make and ship the kernel-level update you mention below 
with one of the very next Windows Updates, so that all users whose 
Windows version is still maintained will be able to use keyboard layouts 
that can input WCHAR strings through dead keys.

Best regards,

Marcel

On Sun, 6 Nov 2016 05:37:12 +0100, Philippe Verdy wrote:

> Note: such extension is absolutely necessary for scripts not encoded in 
> the BMP (e.g. Gothic or Deseret, or larger scripts that will absolutely 
> need mechanisms like dead keys if they want to have a usable keyboard 
> layout !) 
> 
> 2016-11-06 5:32 GMT+01:00 Philippe Verdy : 
> 
>> 
>> 
>> 2016-11-06 4:11 GMT+01:00 Marcel Schneider : 
>> 
>>> On Fri, 04 Nov 2016 15:30:48 -0700, Doug Ewell wrote: 
>>> 
>>> — And with LATIN CAPITAL LETTER OPEN E? Why not this way (as has been 
>>> suggested): 
>>> /*TILDE&AIGU */ DEADTRANS( 0x0190 ,0x1e4d ,{0x0190,0x0303,0x0301} ,DKF_0 ), // *LATIN CAPITAL LETTER OPEN E WITH TILDE AND ACUTE 
>>> 
>> 
>> This snippet cannot work as is, because the DEADTRANS() macro 
>> generates an 8-byte structure that has only a single WCHAR for storing the 
>> result of the map of a (VKEY+modifier number): 
>> 
>> typedef struct _DEADKEY { 
>> DWORD dwBoth; 
>> WCHAR wchComposed; 
>> USHORT uFlags; 
>> } DEADKEY, *PDEADKEY; 
>> 
>> So it will need to map a WCH_LGTR instead, and then use a "ligature" 
>> table to store the string containing the 3 code units you want. 
>> 
>> Then there's an unused BYTE in the DEADKEY structure for the flags, 
>> that can be used (specifically for entries mapped to WCH_LGTR) to pass 
>> flags to the LIGATURE(n) table (where there's also a free BYTE in the 
>> indexing key, which allows passing an identifier needed for the lookup in the 
>> LIGATURE(n) table); alternatively, instead of mapping WCH_LGTR (a PUA), you 
>> could as well map another PUA there in 0xE001..0xE0FF for passing a byte for 
>> the deadkey state into the lookup of ligatures: 
>> 
>> #define TYPEDEF_LIGATURE(i) \ 
>> typedef struct _LIGATURE ## i { \ 
>> BYTE VirtualKey; \ 
>> WORD ModificationNumber; \ 
>> WCHAR wch[i]; \ 
>> } LIGATURE ## i, *PLIGATURE ## i; 
>> 
>> which can safely be changed to: 
>> 
>> #define TYPEDEF_LIGATURE(i) \ 
>> typedef struct _LIGATURE ## i { \ 
>> BYTE VirtualKey, DeadKeyState; \ 
>> WORD ModificationNumber; \ 
>> WCHAR wch[i]; \ 
>> } LIGATURE ## i, *PLIGATURE ## i; 
>> 
>> (in the current definition the extra byte is implicit, for the 
>> alignment, but not declared explicitly; it is implicitly filled with zeroes 
>> by C compilers when declaring the structure, but in my opinion this extra 
>> byte should have been declared explicitly.) 
>> 
>> But now it's up to the OS to support it. Maybe it works already, if the 
>> lookup in the LIGATURE(n) table already scans for values of a DWORD 
>> including this free padding byte; however, there's a need to change some 
>> kernel-level code to check the PUA values mapped in DEADKEY 
>> structures and extract a DeadKeyState from them. 
>> 
>> The alternative is to map the combination of two deadkeys to a bit in the 
>> modifier number (this can be instructed by the uFlags, which will set the 
>> modifier bit number specified in the mapped PUA). In all cases there's 
>> still space for extension there. 
>> 
>> The last alternative is to extend the KBDTABLES structure to append new 
>> members for a table of extended DEADKEYs, and a separate table of LIGATUREs 
>> for DEADKEYs (the KBDTABLES structure does not specify its own size, but it has a 
>> fLocaleFlags field just before the table of ligatures, which can indicate 
>> the presence of these extensions). 
>>
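
The size and padding arguments in the quoted message can be checked with a small C sketch. It assumes local stand-ins for the Windows types (so that it compiles without the WDK) and a typical ABI where 16-bit members are 2-byte aligned. LIGATURE3_DK and layouts_match are illustrative names, and DeadKeyState is the member proposed above, not something kbd.h defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the Windows types; assumptions for this sketch. */
typedef uint8_t  BYTE;
typedef uint16_t WORD;
typedef uint16_t WCHAR;
typedef uint16_t USHORT;
typedef uint32_t DWORD;

/* DEADKEY as quoted: room for one composed WCHAR only. */
typedef struct _DEADKEY {
    DWORD  dwBoth;
    WCHAR  wchComposed;
    USHORT uFlags;
} DEADKEY;

static_assert(sizeof(DEADKEY) == 8, "DEADKEY is expected to be 8 bytes");

/* Current layout, as TYPEDEF_LIGATURE(3) would expand it. */
typedef struct _LIGATURE3 {
    BYTE  VirtualKey;
    WORD  ModificationNumber;
    WCHAR wch[3];
} LIGATURE3;

/* Proposed layout: DeadKeyState takes the place of the padding byte. */
typedef struct _LIGATURE3_DK {
    BYTE  VirtualKey, DeadKeyState;
    WORD  ModificationNumber;
    WCHAR wch[3];
} LIGATURE3_DK;

/* Returns 1 if the proposed struct keeps the size and the member
 * offsets of the current one, 0 otherwise. */
int layouts_match(void)
{
    return sizeof(LIGATURE3) == sizeof(LIGATURE3_DK)
        && offsetof(LIGATURE3, ModificationNumber)
               == offsetof(LIGATURE3_DK, ModificationNumber)
        && offsetof(LIGATURE3, wch) == offsetof(LIGATURE3_DK, wch);
}
```

If layouts_match() returns 1 on the target compiler, existing layout binaries keep the same table layout, and the new member simply lands at offset 1. Whether the OS reads that byte is a separate question that this check cannot answer.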


