Support for Latin ligature IJ (was another thread)

Marcel Schneider charupdate at orange.fr
Wed Mar 30 15:12:27 CDT 2016


On Wed, 30 Mar 2016 00:14:59 +0100, Kent Karlsson wrote [in the thread “Re: Swapcase for Titlecase characters”]:

[…]

> I still think ĳ should have the "soft-dotted" property (and that
> that property is finally implemented properly in various systems...).

[Refers to:
Re: Case for letters j and J with acute from Kent Karlsson on 2016-02-09
http://www.unicode.org/mail-arch/unicode-ml/y2016-m02/0044.html]

For ‘ĳ’ that may be unambiguous, but ‘i’ needs locale-dependent tailoring, since for Lithuanian it should be hard-dotted.

> I've heard that old typewriters used to have a key for Ĳ ĳ.

Iʼve read it on Wikipedia, though Iʼve been unable to find any image of such a typewriter on the internet.
This one is Dutch but has no such key:

https://www.bing.com/images/search?q=typewriter+dutch&view=detailv2&id=5473CA1D2B05879CE21B98CD9F729EE838A49E69&selectedindex=31&ccid=wLABJru4&simid=608029570327776271&thid=OIP.Mc0b00126bbb87be9b1d849df9b11a201o0&mode=overlay&first=1

These machines have lowercase ĳ only, while the uppercase position is given the florin sign:

https://img1.etsystatic.com/062/0/5543707/il_570xN.794019731_fiyd.jpg

http://www.tiptopvintage.co.uk/wp-content/uploads/2015/05/Brown-Vendex-Typewriter-7.jpg

> Maybe it should be reintroduced for Dutch computer keyboards,

I am in favor. To achieve this, it would be sufficient to have an ISO/IEC 9995-3 compliant keyboard layout for the Netherlands, and one for Belgium, as there already is one for Canada and one for Germany (given that ‘Ĳ’ and ‘ĳ’ are included on T3).

And to complete the job, all of these could be added to CLDR.

> as well as used
> (for Dutch) in autocorrects (IJ -> Ĳ, ij -> ĳ) or spell correctors
> (looking at the whole word rather than just two letters, and then
> not restricted to Dutch per se, but certain Dutch names regardless
> of the language for the surrounding text).

Itʼs urgent to spell the names correctly, notably because search engines have insufficient equivalence classes. Correctly spelled ‘Ĳsselmeer’ (with the ligature) and misspelled ‘IJsselmeer’ (with two separate letters) yield different numbers of results:

Bing Search: 2 850 000 vs 886 000
Google Search: 343 000 vs 345 000

while DuckDuckGo, Startpage and Yahoo do not state the number of results (which in any case is mainly theoretical, since only the top 500 are currently displayable).
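
The missing equivalence can be sketched in Python. This is only an illustration under my own assumptions: the name search_key is hypothetical, and I do not know how any of these search engines actually index text. It relies on the fact that U+0132 has a compatibility decomposition to the letters I and J, so that NFKC normalization followed by case folding maps both spellings to the same key:

```python
import unicodedata


def search_key(text: str) -> str:
    """Return a comparison key that ignores compatibility and case differences."""
    # NFKC replaces U+0132/U+0133 by the two-letter sequences IJ/ij;
    # casefold() then removes the case distinction.
    return unicodedata.normalize("NFKC", text).casefold()


ligature = "\u0132sselmeer"   # spelled with U+0132 LATIN CAPITAL LIGATURE IJ
two_letters = "IJsselmeer"    # spelled with the separate letters I and J

print(ligature == two_letters)                          # False
print(search_key(ligature) == search_key(two_letters))  # True
```

An engine that compared such keys rather than raw strings would place both spellings in one equivalence class.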

> That, in turn, would
> probably be a better approach than trying to have some special
> handling of the sequence "ij" in case mapping (for Dutch alone).

Current understanding seems unclear on whether the ‘Ĳ’ ligatures are to be used or are deprecated. The mere fact that they are compatibility decomposable is cited[1] along with rule D21 to justify encoding them as the separate letters ‘IJ’. TUS indeed seems to support that POV when it declares Dutch as supported by the Latin-1 Supplement. One page further on, the ‘Ĳ’ ligatures are discussed as compatibility characters, which does not imply deprecation. And indeed, their replacement by two-letter sequences is pointed out as a mere matter of fact.

While atomic typing of ‘ĳ’ seems to be a relic from the ISO/IEC 646 era, I’m puzzled not to find any related autocorrect in the word processor when Dutch is on (no instances found in MSO1043.acl of 2010), whereas French ‘œ’ is supported in the French ACL.
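
What such an autocorrect entry would do can be sketched as follows. This is a deliberately naive illustration, not how any word processor implements its ACL: the table and the function name autocorrect_dutch are hypothetical, and the sketch replaces every ‘IJ’ and ‘ij’ without checking the word, which is exactly why a spell corrector looking at whole words would be preferable.

```python
import re

# Hypothetical autocorrect table: two-letter sequence -> ligature character.
AUTOCORRECT = {
    "IJ": "\u0132",  # LATIN CAPITAL LIGATURE IJ
    "ij": "\u0133",  # LATIN SMALL LIGATURE IJ
}


def autocorrect_dutch(text: str) -> str:
    """Replace every 'IJ' and 'ij' by the corresponding ligature (naive)."""
    return re.sub(r"IJ|ij", lambda m: AUTOCORRECT[m.group(0)], text)


corrected = autocorrect_dutch("IJsselmeer en ijs")
print(corrected == "\u0132sselmeer en \u0133s")  # True
```

Mixed-case ‘Ij’ is left untouched here, and loanwords in which i and j belong to different syllables would be wrongly converted.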

As for special case mapping for ‘ij’, it is increasingly implemented, but yes, it remains a workaround that wonʼt be needed any longer as soon as people switch to ISO/IEC 9995-3 keyboard layouts. In the era of globalization, there is pretty much no other choice.
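
The difference between the workaround and the ligature can be sketched in Python. The function titlecase_dutch is a hypothetical helper written for this illustration, not the API of any library; it only handles a single word and a word-initial ‘ij’.

```python
def titlecase_dutch(word: str) -> str:
    """Titlecase one word, uppercasing a leading 'ij' as a unit (Dutch tailoring)."""
    if word[:2].lower() == "ij":
        return "IJ" + word[2:].lower()
    return word.capitalize()


# Untailored case mapping capitalizes only the first letter of the sequence:
print("ijsselmeer".capitalize())       # Ijsselmeer
# The tailored workaround treats the two letters as one unit:
print(titlecase_dutch("ijsselmeer"))   # IJsselmeer
# With the ligature U+0133, default case mapping already yields U+0132:
print("\u0133sselmeer".capitalize() == "\u0132sselmeer")  # True
```

With the ligature, no language-specific rule is needed at all, which is the point made above.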

Hopefully,

Marcel

[1] https://en.wikipedia.org/wiki/IJ_(digraph)#cite_note-15


