Hyphenation Markup

Martin J. Dürst via Unicode unicode at unicode.org
Sat Jun 2 19:26:40 CDT 2018

Hello Richard,

On 2018/06/02 20:37, Richard Wordingham via Unicode wrote:

>> On 2018-06-02 at 06:44, Richard Wordingham via Unicode wrote:
>>> In Latin text, one can indicate permissible line break opportunities
>>> between grapheme clusters by inserting U+00AD SOFT HYPHEN.  What
>>> low-end schemes, if any, exist for such mark-up within grapheme
>>> clusters?

> 1) In the sequence
> <letter-0, character-1, ZWSP, character-2, letter-1>
> realisation of the break should definitely result in <letter-0,
> character-1> on one line and in <character-2, letter-1> on the next
> line, whereas in visual order, character-2 should precede character-1.
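
To make the quoted case concrete, here is a minimal sketch in Python of
the break it describes. The function name break_at_zwsp and the
placeholder strings are hypothetical and used only for illustration; the
sketch models the split in logical (stored) order only and does not
attempt the visual reordering of character-1 and character-2.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def break_at_zwsp(sequence):
    """Split a logical-order sequence at the first ZWSP, dropping the ZWSP.

    Hypothetical helper: returns a list of lines, each a list of items.
    If no ZWSP is present, the whole sequence stays on one line.
    """
    if ZWSP not in sequence:
        return [list(sequence)]
    i = sequence.index(ZWSP)
    return [list(sequence[:i]), list(sequence[i + 1:])]

# The sequence from the quoted message, with placeholder names.
logical = ["letter-0", "character-1", ZWSP, "character-2", "letter-1"]
lines = break_at_zwsp(logical)
for line in lines:
    print(line)
```

When the break is realised, the first line holds letter-0 and
character-1 and the second holds character-2 and letter-1, as the quoted
text requires.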

My question goes a bit further than Doug's: Why would you want to do
such a thing? Are there actual scripts/languages where line breaks
within grapheme clusters occur? If yes, which are they? Can you show
actual examples, e.g. scans of documents?

In writing systems, there are almost always exceptions to simple rules, 
but in general, breaking a line *within* a grapheme cluster seems to be 
a bad idea.

Regards,   Martin.
