Soft Hyphens in Complex and East Asian Scripts

Richard Wordingham richard.wordingham at
Sun Apr 27 17:46:09 CDT 2014

I'm trying to assess the impact of what I regard as a word-processing
bug, and this forum seems to be the best source of information.

What writing systems using 'complex' or 'East Asian' scripts use U+00AD
SOFT HYPHEN in a manner that is potentially visually distinct from
U+200B ZERO WIDTH SPACE?

The only good example I have is Thai, and it seems remiss that most of
the 8-bit encodings for Thai don't support invisible line-breaking
opportunities at all.

I do have two probable examples from a book in Tai Khuen (Tai Tham
script) published in Thailand, but they may result from poor editing
or, possibly, be plain hyphens. Both words appear to be proper nouns.
The book has several examples of clear words broken across lines without
any hyphenation.

Are there any 'complex' or 'East Asian' scripts where U+00AD and U+200B
have the same visual effect but are used for different semantics?  An
obvious example would be for U+200B to mark word boundaries but for
U+00AD to mark line break opportunities within a word.
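
To make the distinction concrete, here is a minimal Python sketch of a
greedy line breaker that treats both characters as break opportunities
but lets them render differently at a break. Everything in it is an
assumption for illustration: the function name break_lines and the
shy_mark parameter are hypothetical, columns are counted as code points
(which is wrong for scripts with combining marks), and it is not a model
of any particular word processor.

```python
SHY = "\u00ad"   # U+00AD SOFT HYPHEN
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def break_lines(text, width, shy_mark="-"):
    """Greedy line breaking at U+00AD and U+200B (hypothetical helper).

    A break taken at U+00AD appends shy_mark to the line; a break taken
    at U+200B appends nothing. Unused break opportunities are invisible.
    Assumption: one code point is one column, and shy_mark is allowed to
    overhang the width.
    """
    # Split the text into (chunk, break character that follows it) pairs.
    chunks = []
    current = ""
    for ch in text:
        if ch in (SHY, ZWSP):
            chunks.append((current, ch))
            current = ""
        else:
            current += ch
    chunks.append((current, ""))

    lines = []
    line = ""
    pending = ""  # break character sitting at the end of `line`
    for chunk, brk in chunks:
        if line and len(line) + len(chunk) > width:
            # Take the pending break; only U+00AD leaves a visible mark.
            lines.append(line + (shy_mark if pending == SHY else ""))
            line = ""
        line += chunk
        pending = brk
    if line:
        lines.append(line)
    return lines


sample = "abc" + SHY + "def" + ZWSP + "ghi"
print(break_lines(sample, 4))               # visible mark at the U+00AD break
print(break_lines(sample, 4, shy_mark=""))  # both characters look the same
```

Passing shy_mark="" models the case asked about above, where the two
characters have the same visual effect and differ only in what they
mean to software that processes the text.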

