Windows 10 release (was: Re: WORD JOINER vs ZWNBSP)

Marcel Schneider charupdate at
Sun Aug 2 07:26:45 CDT 2015

On 30 Jul 2015 at 20:56, Doug Ewell  wrote:

> Marcel Schneider wrote:

>>> Unfortunately that doesnʼt work on at least one recent version of
>>> Windows. An unambigous bug was due to the presence of 0x2060 in the
>>> Ligatures table. This has cost me a whole workday to retrieve, fix,
>>> and verify.

The Windows bug I ran into at the end of July has now been definitely identified and reproduced. After compiling ninety-five drivers since the bug appeared, I can say this much: the problem is related to the length of the so-called ligatures. When MSKLC was built, ligatures were limited to four characters on Windows (see the glossary in the MSKLC Help). On my machine the maximum length is 16 characters. The problem is that this limit is not the same on all shift states, and perhaps not on all keys. Roughly, I can put five characters on modification number 3, which is normally AltGr, but not on number 4 (Shift+AltGr). Attributing the problem to the presence of 0x2060 was a misinterpretation.
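To make the modification numbers concrete, here is a minimal sketch of how a driver-based layout might map a modifier combination to a modification number. The bit values and the table contents are assumptions chosen to match the numbering used in this message (3 for Ctrl+Alt, that is AltGr, and 4 for Shift+Ctrl+Alt); the names MOD_SHIFT, MOD_CTRL, MOD_ALT, MOD_INVALID and modification_number are hypothetical and not taken from the actual driver source.

```c
#include <assert.h>

/* Modifier bits (assumed values, for illustration only). */
enum { MOD_SHIFT = 0x01, MOD_CTRL = 0x02, MOD_ALT = 0x04 };

/* Marker for a modifier combination that has no column (assumed value). */
#define MOD_INVALID 0x0F

/* Index = modifier bitmask, value = modification number (column). */
const unsigned char mod_number_table[8] = {
    0,            /* no modifier      */
    1,            /* Shift            */
    2,            /* Ctrl             */
    MOD_INVALID,  /* Shift+Ctrl       */
    MOD_INVALID,  /* Alt              */
    MOD_INVALID,  /* Shift+Alt        */
    3,            /* Ctrl+Alt (AltGr) */
    4             /* Shift+Ctrl+Alt   */
};

/* Look up the modification number for a combination of modifier bits. */
unsigned char modification_number(unsigned mask)
{
    assert(mask < sizeof mod_number_table);
    return mod_number_table[mask];
}
```

With such a table, a ligature row tagged with 3 fires on Ctrl+Alt and a row tagged with 4 fires on Shift+Ctrl+Alt, which is the distinction that matters for the length problem described above.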

[About why five characters: an ellipsis made of three PERIOD characters often looks better, or seems to, *and* is part of all fonts, *and* doesnʼt break when a server enforces Latin-1 even on the UTF-8 pages it sends itself (see last monthʼs thread “UTF-8 display”). The complete sequence is a bracketed ellipsis, which is more useful in the context of quotations. I wanted the bracketed U+2026 on Ctrl+Alt+Period, and the bracketed three periods on Shift+Ctrl+Alt+Period. Now itʼs the other way round.]

The following source lines show the sole difference between a faulty driver (first pair) and a driver that works fine (second pair):

{VK_OEM_PERIOD /*T33 B08*/ ,3 ,'[' ,0x2026 ,']' ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE }, // 
{VK_OEM_PERIOD /*T33 B08*/ ,4 ,'[' ,'.' ,'.' ,'.' ,']' ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE }, //

{VK_OEM_PERIOD /*T33 B08*/ ,3 ,'[' ,'.' ,'.' ,'.' ,']' ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE }, // 
{VK_OEM_PERIOD /*T33 B08*/ ,4 ,'[' ,0x2026 ,']' ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE ,NONE }, //
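For readers who want to check their own ligature rows, the following is a rough sketch of how the effective length of such a row could be counted and compared with a limit. It is not the driver source: the LigatureRow type, the helper names, and the filler value 0xF000 for NONE are assumptions made for illustration, and the limit is passed in by the caller because the actual limit per shift state is exactly what remains uncertain.

```c
#include <assert.h>
#include <stddef.h>

#define LIG_MAX 16      /* character slots per row, as in the rows quoted above */
#define NONE    0xF000  /* assumed filler value for unused slots */

/* Hypothetical mirror of one ligature row: key, modification number, characters. */
typedef struct {
    unsigned char  vk;            /* virtual key code, e.g. 0xBE for VK_OEM_PERIOD */
    unsigned short modification;  /* modification number (3, 4, ...) */
    unsigned short wch[LIG_MAX];  /* UTF-16 code units, padded with NONE */
} LigatureRow;

/* Build a row from a short character list, padding the remaining slots with NONE. */
LigatureRow make_row(unsigned char vk, unsigned short modification,
                     const unsigned short *chars, size_t count)
{
    LigatureRow row;
    size_t i;
    assert(count <= LIG_MAX);
    row.vk = vk;
    row.modification = modification;
    for (i = 0; i < LIG_MAX; i++)
        row.wch[i] = (i < count) ? chars[i] : NONE;
    return row;
}

/* Number of characters before the first NONE filler. */
size_t ligature_length(const LigatureRow *row)
{
    size_t n = 0;
    assert(row != NULL);
    while (n < LIG_MAX && row->wch[n] != NONE)
        n++;
    return n;
}

/* Nonzero if the row is no longer than the limit supplied by the caller. */
int ligature_fits(const LigatureRow *row, size_t limit)
{
    return ligature_length(row) <= limit;
}
```

Applied to the rows above, the bracketed U+2026 counts as three characters and the bracketed three periods as five, so only the latter exceeds the original four-character limit mentioned in the MSKLC glossary.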

I was about to catalogue the cases depending on shift states (and possibly key scan codes and so on), but I encountered so many keyboard bugs (a Windows keypress added while the key was not pressed; arrow keys disabled; Backspace disabled; and so on) that I decided not to waste more than the past week on this problem.

BTW, I have come to realize that the so-called Windows 7 Starter is not really Windows 7 but a sort of restyled Windows Vista. Thatʼs why its version number is 6.1. I think debugging Windows Vista isnʼt worthwhile, as today we have Windows 10, which as the ultimate Windows version must have fixed all those bugs. Iʼm hopeful and I expect people will ♥ it.

>>> The effect of the bug was that Word, Excel, Firefox and Zotero were
>>> unstartable.

Only when the faulty driver is that of the default keyboard layout at Windows startup. Otherwise the applications arenʼt blocked; instead, the faulty keyboard layout is disabled in Word, though not in the built-in Notepad.

>>> As a result, the WORD JOINER cannot be implemented on a driver based
>>> keyboard layout for general use on Windows. By contrast, the ZWNBSP
>>> can.

Thatʼs complete nonsense, sorry. Both can be implemented in the driver.

> and:

>> The so-called ligatures, by contrast, must not be constructed with
>> 0x2060. This however was the case of three items:
>> - A justifying no-break space emulation 0x2060 0x0020 0x2060, for use
>> in word processors where the NBSP is not justifying, unlike as in
>> desktop publishing and high-end editing software as Philippe Verdy
>> referred to, where U+00A0 is justifying. It not being in word
>> processing is consistent with the need of using U+00A0 along with
>> punctuations in French, and the lack of U+202F in many fonts.
>> - A colon with such a justifying no-break space, for use in documents
>> that imitate the usage of at least a part, if not mainstream, old-
>> fashioned typography: 0x2060 0x0020 0x2060 0x003a.
>> - A punctuation apostrophe emulation 0x2060 0x0027 0x2060, mapped to
>> Kana + I.

There is a mistake in my e-mail: the curly punctuation apostrophe is emulated using the letter apostrophe. This sequence runs:
0x2060 0x02bc 0x2060
Iʼm not sure, however, whether this is useful, as such sequences can be obtained through autocorrect where the word joiners are really useful, while in English the letter apostrophe is preferable (whereas other languages can use U+2019 unambiguously).
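As a small illustration, the three word-joiner sequences (with the apostrophe corrected as above) can be written out as UTF-16 code units and measured. This is only a sketch; the array and function names are hypothetical and do not come from the layout source.

```c
#include <assert.h>
#include <stddef.h>

#define WJ 0x2060  /* U+2060 WORD JOINER */

/* Zero-terminated UTF-16 sequences as described in this thread. */
const unsigned short nbsp_emulation[]       = { WJ, 0x0020, WJ, 0 };          /* justifying no-break space */
const unsigned short colon_with_nbsp[]      = { WJ, 0x0020, WJ, 0x003A, 0 };  /* same, followed by a colon */
const unsigned short apostrophe_emulation[] = { WJ, 0x02BC, WJ, 0 };          /* letter apostrophe between word joiners */

/* Count the code units before the terminating zero. */
size_t sequence_length(const unsigned short *seq)
{
    size_t n = 0;
    assert(seq != NULL);
    while (seq[n] != 0)
        n++;
    return n;
}
```

The sequences are three, four, and three code units long, so none of them exceeds the four-character limit mentioned in the MSKLC glossary.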

>> I'm about to test on another Windows Edition. I wonder if there is a
>> real issue or not, as you are suggesting. Nevertheless I believe that
>> no such bugs must occur in whatever version and edition of Windows.

That remains true, as the versions weʼre talking about are known to be unstable. But nobodyʼs perfect, and everybodyʼs invited to improve, notably on keyboard layouts, which are traditionally neglected in favor of upper-level tools and high-end programs.

> I created, installed, and activated an MSKLC keyboard with the three WJ
> sequences described above, mapped for convenience to AltGr+Z, AltGr+X,
> and AltGr+C respectively 

Thank you again. Curiously, the idea had not even occurred to me; perhaps the missing dead-key chaining and some other limitations led me to rely on the WDK instead, ever since I became aware of its existence (despite its mention in the MSKLC glossary) through an explanatory web page.

> (not the Kana key, which I don't have), 

Iʼm using the standard keyboard on my netbook and wouldnʼt have any Kana key either, were it not for the Windows Driver Kit, which allows adding it as a modifier and as a toggle. Using Kana as the main third level helps limit the confusion of Ctrl+Alt with AltGr. I unmapped the latter and am using Ctrl+Alt in a few cases, like this one.

> and had no trouble opening or using any applications on Windows 7, including 
> the four mentioned above (except Zotero, which I don't use). KLC source
> available on request.

Thank you for the offer. Your test has even given me the idea of making a patch of the layout Iʼm working on, so I took a subset and rebuilt it from scratch in MSKLC. Thatʼs much safer and easier to install. KanaLock is emulated using SGCaps. Compose could be emulated using other applications. But for a number of non-English languages, which SGCaps is intended for, CapsLock and easy input of letters with multiple diacritics are missing.

> I wouldn't have wasted the 15 minutes but for the continuing, tiresome
> rhetoric about Windows bugs.

Iʼm sorry. As stated above, Windows made me waste not just fifteen minutes but about fifty hours. And Iʼm not even talking about all the other cases and my list of well over one thousand noted desiderata.

Best regards,

Marcel Schneider