Another take on the English apostrophe in Unicode

Markus Scherer markus.icu at gmail.com
Thu Jun 4 16:34:27 CDT 2015


Looks all wrong to me.

"don’t" is a contraction of two words; it is not one word.

In English, that squiggle is taught as punctuation, not as a letter.
(Unlike, say, the Hawaiian ʻokina
<http://en.wikipedia.org/wiki/%CA%BBOkina>.)

You can't use simple regular expressions to find word boundaries. That's
why we have UAX #29.
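
To make the point concrete, here is a minimal Python sketch. It assumes
Python's built-in re module, where \w matches letters, digits, and the
underscore. The helper names are made up for illustration, and the second
pattern only loosely imitates rules WB6/WB7 of UAX #29 (letter, apostrophe,
letter stays together). It is not an implementation of UAX #29.

```python
import re

CURLY = "\u2019"     # RIGHT SINGLE QUOTATION MARK, the usual curly apostrophe
MODIFIER = "\u02bc"  # MODIFIER LETTER APOSTROPHE


def naive_words(text):
    """The 'simple regular expression' approach: runs of word characters."""
    return re.findall(r"\w+", text)


# Hypothetical pattern: let ' or U+2019 join two runs of word characters.
# This covers only the apostrophe case, nothing else from UAX #29.
APOSTROPHE_AWARE = re.compile(r"\w+(?:['\u2019]\w+)*")


def apostrophe_aware_words(text):
    """Keep an apostrophe that sits between two runs of word characters."""
    return APOSTROPHE_AWARE.findall(text)


# U+2019 is punctuation, so the naive pattern splits the contraction.
print(naive_words("don" + CURLY + "t stop"))
# U+02BC is a letter, so the naive pattern keeps it inside the word.
print(naive_words("don" + MODIFIER + "t stop"))
# The apostrophe-aware pattern keeps the U+2019 contraction whole.
print(apostrophe_aware_words("don" + CURLY + "t stop"))
```

The naive pattern splits the U+2019 spelling into "don" and "t", while the
U+02BC spelling survives only because that character is classified as a
letter. Handling the punctuation case takes an explicit rule, which is the
kind of thing UAX #29 specifies.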

As for the confusion between the apostrophe and the quotation mark: blame
the scribe who came up with the ambiguous use, not the people who gave it a
number.

If anything, Unicode might have made a mistake in encoding two of these
that look identical. How are normal users supposed to find both U+2019 and
U+02BC on their keyboards, and how are they supposed to deal with incorrect
usage?
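
For anyone who wants to check which of the lookalikes a given text actually
contains, a small sketch using Python's standard unicodedata module follows.
The describe() helper is a made-up name for illustration. It prints the code
point, the character name, and the general category, which is where the
punctuation-versus-letter distinction lives.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return "U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.category(ch)


# U+2019 and U+02BC are the two lookalikes; U+02BB is the ʻokina.
for ch in ("\u2019", "\u02bc", "\u02bb"):
    print(describe(ch))
```

U+2019 reports category Pf (final quote punctuation), while U+02BC and
U+02BB both report Lm (modifier letter). The glyphs may be indistinguishable
on screen, but the properties are not.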

markus