Ancient Greek apostrophe marking elision

Richard Wordingham via Unicode unicode at unicode.org
Sun Jan 27 12:19:28 CST 2019


On Sun, 27 Jan 2019 12:38:39 -0500
"Mark E. Shoulson via Unicode" <unicode at unicode.org> wrote:

> On 1/27/19 11:08 AM, Michael Everson via Unicode wrote:
> > It is a letter. In “can’t” the apostrophe isn’t a letter. It’s a
> > mark of elision.  I can double-click on the three words in this
> > paragraph which have the apostrophe in them, and they are all
> > whole-word selected.  
> 
> That doesn't work when I try it: I double-click on the "a" in "can’t" 
> and get only the "can" selected.
> 
> This does not necessarily prove anything; my software (Thunderbird)
> is arguably doing it wrong.

Except that Unicode-compliant processes aren't required to follow the
scheme of TR29 Unicode Text Segmentation.  However, that scheme only
requires the whole word to be selected because the U+2019 is followed
by a letter.  TR29 prescribes different behaviour for "dogs'" with
U+2019 (interpret as two 'words') and U+02BC (interpret as one word).
The GTK-based email client I'm using makes that distinction, but also
fails with "don't" unless one uses U+02BC.

LibreOffice, however, treats "don't" as a single word for U+0027,
U+02BC and U+2019, but "dogs'" as a single word only for U+02BC.  This
complies with TR29.  I'm not surprised, as LibreOffice uses, or has
used, ICU.
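
The behaviour described above can be illustrated with a rough sketch of
word-boundary rules WB5 to WB7 of Unicode Text Segmentation.  This is
not ICU and not a full implementation: it assumes that str.isalpha() is
a good enough stand-in for the letter class (it is true for U+02BC,
which is a modifier letter), and that U+0027 and U+2019 are the only
mid-word punctuation of interest.  The names no_break and segments are
made up for the illustration.

```python
# Simplified sketch of word-boundary rules WB5-WB7 only.
# Assumptions: letters are approximated by str.isalpha(), and the only
# mid-word punctuation considered is U+0027 and U+2019.
MID_LETTER_QUOTES = {"\u0027", "\u2019"}


def is_letter(ch):
    return ch.isalpha()


def no_break(text, i):
    """Return True if WB5-WB7 forbid a break before text[i]."""
    prev, cur = text[i - 1], text[i]
    # WB5: never break between two letters.
    if is_letter(prev) and is_letter(cur):
        return True
    # WB6: letter x quote, provided a letter follows the quote.
    if is_letter(prev) and cur in MID_LETTER_QUOTES:
        return i + 1 < len(text) and is_letter(text[i + 1])
    # WB7: quote x letter, provided a letter precedes the quote.
    if prev in MID_LETTER_QUOTES and is_letter(cur):
        return i >= 2 and is_letter(text[i - 2])
    # Everything else: a break is allowed.
    return False


def segments(text):
    """Split text wherever the simplified rules allow a break."""
    if not text:
        return []
    out, start = [], 0
    for i in range(1, len(text)):
        if not no_break(text, i):
            out.append(text[start:i])
            start = i
    out.append(text[start:])
    return out


for word in ("don't", "don\u02bct", "don\u2019t",
             "dogs'", "dogs\u02bc", "dogs\u2019"):
    print(ascii(word), "->", len(segments(word)), "segment(s)")
```

With these assumptions "don't" stays in one piece for all three
characters, while "dogs'" stays in one piece only with U+02BC, because
a trailing U+0027 or U+2019 has no following letter to satisfy WB6.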

Richard.


