Ancient Greek apostrophe marking elision

Richard Wordingham via Unicode unicode at unicode.org
Sat Jan 26 19:15:18 CST 2019


On Sat, 26 Jan 2019 15:45:54 +0000
James Kass via Unicode <unicode at unicode.org> wrote:

> Perhaps I'm not understanding, but if the desired behavior is to 
> prohibit both line and word breaks in the example string, then...
> 
> In Notepad, replacing U+0020 with U+00A0 removes the line-break.

I believe the problem is that "δ’ αρχαια" should have 2 non-blank
*words*.  With U+2019, one gets 3.  Line-break suppressing spaces don't
help with word-breaking, because they are not treated as letters.
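
To make the count concrete, here is a minimal Python sketch.  It is only
a rough stand-in for a real word breaker, not an implementation of the
UAX #29 rules: it assumes a token is either a run of word characters or
a run of non-space punctuation.  The names TOKEN and rough_words are
made up for this illustration.

```python
import re

# Rough stand-in for a word breaker (an assumption, NOT UAX #29):
# a token is a run of word characters, or a run of characters that are
# neither word characters nor white space.
TOKEN = re.compile(r"\w+|[^\w\s]+")

def rough_words(text):
    """Return the non-blank tokens of text under the rough rule above."""
    return TOKEN.findall(text)

with_space = "δ\u2019 αρχαια"       # U+2019 followed by U+0020
with_nbsp = "δ\u2019\u00a0αρχαια"   # U+2019 followed by U+00A0

# Both strings yield three tokens: the apostrophe is split off either
# way, because the kind of space after it has no bearing on whether
# U+2019 counts as part of the word.
print(rough_words(with_space))
print(rough_words(with_nbsp))
```

Under this rule the no-break space changes nothing: the split falls
between the delta and U+2019, not at the space.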

A clunky solution would be to have a sequence <delta,
control-joining-words, U+2019>.  However, there is no such
thing as a 'control-joining-words' if one complies with the TUS
injunction in Section 23.3, "The word joiner should be ignored in
contexts other than line breaking".  A robust, trainable spell-checker
will treat this institutionally racist injunction with the contempt it
deserves.
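
As an illustration of what such a spell-checker's tokeniser might do,
here is a small Python sketch.  It is purely hypothetical: it assumes
U+2060 WORD JOINER is being used as the 'control-joining-words', which
is exactly the use the quoted injunction discourages, and it uses a
rough regular-expression rule rather than the UAX #29 rules.  The names
JOINING and joined_words are invented for the example.

```python
import re

# Hypothetical rule (an assumption, not standard behaviour): U+2060
# glues the following non-space character onto the current word.
# Anything else that is neither a word character nor white space
# forms its own token.
JOINING = re.compile(r"(?:\w|\u2060\S)+|[^\w\s]+")

def joined_words(text):
    """Tokenise text, treating <letter, U+2060, X> as one word."""
    return JOINING.findall(text)

plain = "δ\u2019 αρχαια"          # <delta, U+2019>
glued = "δ\u2060\u2019 αρχαια"    # <delta, U+2060, U+2019>

print(len(joined_words(plain)))   # apostrophe is split off
print(len(joined_words(glued)))   # apostrophe stays with the delta
```

With the joiner present the sketch returns two tokens, the first being
the delta together with its apostrophe; without it, three.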

It's interesting that the spellings "'bus" and "'phone" have died.
They would once have hit the word-boundary problems when "bus" and
"phone" were rejected.

Richard.


