Hyphenation Markup
Richard Wordingham via Unicode
unicode at unicode.org
Sat Jun 2 06:37:45 CDT 2018
On Sat, 2 Jun 2018 11:06:43 +0200
Otto Stolz via Unicode <unicode at unicode.org> wrote:
> Am 2018-06-02 um 06:44 schrieb Richard Wordingham via Unicode:
> > In Latin text, one can indicate permissible line break opportunities
> > between grapheme clusters by inserting U+00AD SOFT HYPHEN. What
> > low-end schemes, if any, exist for such mark-up within grapheme
> > clusters?
>
> What about U+200B ZWSP?
> > this character is intended for invisible word
> > separation and for line break control; it has no
> > width, but its presence between two characters
> > does not prevent increased letter spacing in
> > justification
Thanks for the suggestion, but it's not likely to work:
Within a word, and given a proper layout implementation, using ZWSP
would be worse than having <character-1, SHY, character-2> in the
backing store.
1) In the sequence
<letter-0, character-1, ZWSP, character-2, letter-1>
realisation of the break should definitely result in <letter-0,
character-1> on one line and in <character-2, letter-1> on the next
line, whereas in visual order, character-2 should precede character-1.
2) Use of ZWSP will usually result in a dotted circle even when the
break does not occur.
3) ZWSP will result in a mandatory word boundary. That will cause
problems with the spell checker.
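Point 3 can be illustrated with a rough sketch. UAX #29 word
segmentation ignores U+00AD as a Format character but excludes U+200B
from that class, so a word boundary falls on either side of ZWSP. The
Python below is not a UAX #29 implementation; naive_words and its
"letters and marks only" rule are hypothetical, made up purely to show
the difference a spell checker's tokeniser might see.

```python
import unicodedata

SHY = "\u00AD"   # SOFT HYPHEN
ZWSP = "\u200B"  # ZERO WIDTH SPACE


def naive_words(text):
    """Hypothetical, much-simplified word splitter (NOT UAX #29).

    Assumption: a word is a run of letters (L*) and marks (M*).
    SHY is treated as transparent; every other character, ZWSP
    included, ends the current word.
    """
    words, current = [], []
    for ch in text:
        if ch == SHY:
            continue  # dropped; the word carries on
        if unicodedata.category(ch)[0] in "LM":
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words
```

Under these assumptions, naive_words("hy" + SHY + "phen") yields a
single word, while naive_words("hy" + ZWSP + "phen") yields two
fragments, neither of which a dictionary lookup is likely to accept.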
I've experimented
(http://wrdingham.co.uk/lanna/renderer_test.htm#test_and_tell) with the
combination <letter, right matra> where there is a default grapheme
cluster boundary between the two characters. I get generally better
results with SHY than ZWSP. The downside was that the rendering
systems I tried seemed to insist on inserting the glyph of U+002D or
U+2010, rather than the glyph of U+00AD.
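For anyone wanting to repeat the comparison, the two backing-store
sequences can be built roughly as follows. The choice of U+1A20 and
U+1A63 is an assumption for illustration and not necessarily the pair
used on the test page; U+1A63 is one of the spacing marks that UAX #29
excludes from SpacingMark, so a default grapheme cluster boundary
precedes it. The helper names are hypothetical.

```python
import unicodedata

SHY = "\u00AD"          # SOFT HYPHEN
ZWSP = "\u200B"         # ZERO WIDTH SPACE
LETTER = "\u1A20"       # assumed example: a Tai Tham consonant
RIGHT_MATRA = "\u1A63"  # assumed example: a Tai Tham right-side vowel sign


def with_break(letter, matra, mark):
    """Return the backing store <letter, mark, matra>."""
    return letter + mark + matra


def describe(text):
    """List each code point with its General_Category, for inspection."""
    return [f"U+{ord(ch):04X} {unicodedata.category(ch)}" for ch in text]


shy_sample = with_break(LETTER, RIGHT_MATRA, SHY)
zwsp_sample = with_break(LETTER, RIGHT_MATRA, ZWSP)
```

Pasting shy_sample and zwsp_sample into the renderer under test, in a
column narrow enough to force the break, should show whether a dotted
circle appears and which hyphen glyph is drawn.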
Incidentally, does CLDR define the rendering of soft hyphen, or is one
entirely at the mercy of the application?
Richard.