Implementing SMP on a UTF-16 OS

Marcel Schneider charupdate at
Mon Aug 10 15:53:11 CDT 2015

On Mon, 10 Aug 2015, at 22:33, Richard Wordingham  wrote:

> Non-BMP characters must be entered as 'ligatures'.

This is bad news for a universal Latin keyboard layout, where a number of SMP characters should be available through dead keys, or Compose.
We can implement Compose as a dead key chaining tree, but it seems to be limited to the BMP.
The mathematical letters are part of the symbols, and it would be handy to get them too with dead keys, as Compose, &, &, for the script alphabet. 
But the deadtrans combined character argument must be one code unit, not one character. So there seems to be no place for SMP.
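
To make the code unit vs. character point concrete, here is a minimal sketch in plain C of the UTF-16 encoding rule. It is not taken from any keyboard driver header; the names utf16_encode and fits_one_code_unit are hypothetical, chosen only for illustration. It shows that a BMP character such as U+212C (SCRIPT CAPITAL B) fits in one 16-bit code unit, while an SMP character such as U+1D49C (MATHEMATICAL SCRIPT CAPITAL A) needs a surrogate pair and therefore cannot fit in a slot that is one code unit wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode scalar value as UTF-16.
 * Writes 1 or 2 code units to out and returns how many were written.
 * Returns 0 for surrogate code points and for values above U+10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        /* BMP: the code point is its own single code unit. */
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Supplementary planes: split the 20-bit offset into two halves. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high (lead) surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low (trail) surrogate */
    return 2;
}

/* Hypothetical helper: nonzero if the character fits in a field that
 * holds exactly one UTF-16 code unit, as a dead key result must. */
int fits_one_code_unit(uint32_t cp)
{
    uint16_t units[2];
    return utf16_encode(cp, units) == 1;
}
```

With this sketch, utf16_encode(0x1D49C, units) yields the pair D835 DC9C, so fits_one_code_unit(0x1D49C) is 0, whereas fits_one_code_unit(0x212C) is 1.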

This is clearly a Unicode implementation problem. C and C++ should be standardized for the handling of UTF-16. IMO we cannot consider that Windows supports UTF-16 for internal use if it does not support surrogate pairs except through workarounds using ligatures.

I may be wrong, but that's how I see the problem now.

Best regards,
