Global apostrophe solution? (Part of: A new take on the English apostrophe in Unicode; Keyman Developer for free?; Input methods at the age of Unicode)

Marcel Schneider charupdate at
Wed Jul 22 02:22:58 CDT 2015

On Mon, Jun 15, 2015 at 10:19 AM, Mark Davis ☕️  wrote: 

> More seriously, it is not all so black and white. 

This applies to apostrophe recommendations too. The thread about the English apostrophe was biased because it (I) ended up discussing Unicodeʼs general apostrophe recommendation, while the scope of the thread was originally limited to one language. Above all, the discussion was somewhat biased by not taking into consideration the following TUS statement (§ 6.2 Punctuation, Apostrophe):

| The semantics of U+2019 are therefore context dependent. For example, if surrounded by
| letters or digits on both sides, it behaves as an in-text punctuation character and does not
| separate words or lines.
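
As a rough illustration of that rule (my own sketch in Python, not anything taken from the Standard), the function below finds the occurrences of U+2019 that have a letter or digit on both sides. The name in_text_apostrophes is hypothetical, and real word and line breaking applies more rules than this single test.

```python
RIGHT_SINGLE_QUOTE = "\u2019"


def in_text_apostrophes(text):
    """Return the indices of U+2019 that have a letter or digit on both sides."""
    hits = []
    for i, ch in enumerate(text):
        if ch != RIGHT_SINGLE_QUOTE:
            continue
        # An empty string stands in for "no neighbour" at either end of the text.
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before.isalnum() and after.isalnum():
            hits.append(i)
    return hits
```

In a string such as "I’m ‘quoting’ you", only the U+2019 in "I’m" is reported; the one that closes the quotation is followed by a space and is therefore left out.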

I may be wrong, of course, but I now think that U+02BC is not needed to prevent word separation. As U+02BC is missing in most fonts and on all native Latin Windows keyboards, it cannot be used, even as a letter, before we have resolved some problems.
Please see the advice of User:Gholton in the very last paragraph of

Moreover, if it exists in a given font, U+02BC looks mostly like U+2019, slanted if this is slanted (as in Tahoma, Segoe UI, Open Sans, Sakkal Majalla), and thus does not match some expectations as stated on a web page I already cited:

The only fonts I found where U+02BC is a bit smaller than U+2019 are Linux Biolinum G, Gentium Basic, and Gentium Book Basic. If this difference in size matches the preferences of native English readers, U+02BC could be preferred in English typography.

Another bias of the Apostrophe thread was that it focussed on disambiguation for text processing only, whereas disambiguation is more generally a human readersʼ issue, which needs to be resolved on a glyphic level, and which goes back far, very far into the past. See again the last section, where Potential Problems are resolved.

Along with adding some missing information to the Standard about disambiguating quotation quotes and scare quotes, weʼll end up with language-specific recommendations for the apostrophe, as for the quotation marks. As for the mixup between scare quotes and quotation quotes, my last sentence yesterday contained a lot of quotes that looked like scare quotes but marked quotations. Letʼs take this handy example:

> I hope that some macro could enable "webmasters" to rapidly update websites, because resolving this "funny" "scenario" has cost me some "effort" today!

Iʼm not going to put webmasters between scare quotes! The quotes in _"webmasters"_ indicate that Iʼm quoting somebody whoʼs started talking about webmasters.
That goes on with "funny", a word that is often scare-quoted, but here it is simply a quotation from “Re: a mug”, where that kind of phenomenon looked rather funny (on a mug), I was told.
Again, "scenario" and "effort" are two more quotations from the e-mail I was responding to.

Straightforward: In English we should follow the example of the French and German people, who distinguish quotations and scares by using angle quotation marks for the former and comma quotation marks for the latter. The latter are considered “English” (Iʼm quoting) in France, so it is primarily French typographers who are reluctant to use them, thus generating exactly the same irritating mixup where one is often unsure whether the author is serious or not. But serious journalism systematically differentiates «quotations» and “scares”. This is common usage in print and web news media products from roughly all publishers.

In actual French and German usage, single quotes are nearly nonexistent, despite U+2019 being unambiguously an apostrophe in German. Primary quotations are always in «double quotes» (or »this way«), and a nested close-quote (› or ‹) never looks like an apostrophe. When the goal is to help text reading and text handling, would using angle quotation marks for quotations not be a good idea? I would add that personally I consider these marks more respectful towards authors who are quoted, as well as towards readers, who need to understand unambiguously how itʼs meant.

Eventually there could be different recommendations. For example, in German, U+2019 is preferred for the apostrophe, and in French it is too, so the use of U+2018 should be strongly discouraged. It should be discouraged in English too when U+2019 is preferred for the apostrophe; otherwise, following user preferences, U+02BC can be preferred for this. U+00AB and U+00BB would be preferred for quotations, U+2039 and U+203A for nested quotations, and U+201C and U+201D for markup that does not mean a quotation. The same should be recommended for all languages that donʼt already visually differentiate the two meanings of quotation marks because they donʼt already use both angle quotes and comma quotes.
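
To make the proposed assignment easier to check, here is a small Python sketch of it. The role names ("quotation", "nested", "scare"), the table QUOTE_ROLES and the helper wrap are hypothetical labels of mine; only the code points come from the paragraph above.

```python
import unicodedata

# Proposed roles (an assumption drawn from this message, not an official
# Unicode recommendation): opening and closing mark for each use.
QUOTE_ROLES = {
    "quotation": ("\u00ab", "\u00bb"),  # primary quotations
    "nested": ("\u2039", "\u203a"),     # quotations inside quotations
    "scare": ("\u201c", "\u201d"),      # markup that is not a quotation
}


def wrap(text, role):
    """Surround text with the pair of marks proposed for the given role."""
    opening, closing = QUOTE_ROLES[role]
    return f"{opening}{text}{closing}"


def describe(role):
    """Return the Unicode character names of the pair for the given role."""
    return tuple(unicodedata.name(mark) for mark in QUOTE_ROLES[role])
```

With this table, wrap("webmasters", "quotation") gives «webmasters», while wrap("funny", "scare") gives “funny”, so the two meanings can no longer be confused.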

For input, rather than a layout with U+02BC on E00 (as I previously intended), a smart keyboard layout is needed, because that key is too peripheral for an often used character, and the grave accent is used in TeX. Such a layout would have an *apostrophe toggle* that lets the user get U+0027, U+2019 or U+02BC alternately on the same apostrophe key, and another independent or related toggle that makes the < and > keys produce the « and » quotes. Such keyboards can be programmed using Keyman Developer. Keyman uses a powerful language to define flexible layouts including an unlimited number of toggles, which may have more than two states. See
Keyman does what I expected a keyboard layout to do, which is very hard (or even impossible) to achieve with the OS-related keyboard drivers that I program for Windows.
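
The toggle behaviour described above can be simulated in a few lines of Python. This is only a model of the idea, not Keyman code and not a real keyboard driver; the class ToggleKeyboard and its method names are hypothetical.

```python
# The three characters the apostrophe key cycles through, in order.
APOSTROPHES = ("\u0027", "\u2019", "\u02bc")
# What the < and > keys produce while the angle-quote toggle is on.
ANGLE_QUOTES = {"<": "\u00ab", ">": "\u00bb"}


class ToggleKeyboard:
    """Simulated layout with a three-state apostrophe toggle and a quote toggle."""

    def __init__(self):
        self.apostrophe_state = 0  # index into APOSTROPHES
        self.angle_quotes = False  # two-state toggle for < and >

    def toggle_apostrophe(self):
        # Advance to the next state, wrapping around after the last one.
        self.apostrophe_state = (self.apostrophe_state + 1) % len(APOSTROPHES)

    def toggle_angle_quotes(self):
        self.angle_quotes = not self.angle_quotes

    def press(self, key):
        """Return the character produced by a key in the current state."""
        if key == "'":
            return APOSTROPHES[self.apostrophe_state]
        if self.angle_quotes and key in ANGLE_QUOTES:
            return ANGLE_QUOTES[key]
        return key
```

Here the two toggles are independent; a related variant could switch both at once, for example when the user changes the input language.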

As a keyboard layout framework, I recommend Keyman.

Best regards,
