proposal for new character 'soft/preferred line break'

Jukka K. Korpela jkorpela at
Tue Feb 11 01:25:35 CST 2014

2014-02-10 22:30, Philippe Verdy wrote:

> No I make no confusion: <wbr> is a formatting HTML element, SHY (or
> &shy; in HTML syntax for the defined entity) is a character. Both play
> equivalent roles in HTML,

Not at all.

> except that &shy; has a defined behavior to
> insert an hyphen at end of broken lines, where <wbr> would adopt a
> language-dependant behavior (not all languages use hyphens at end of
> lines to mark words that have been split by breaking lines).

Quite the opposite. The effect of SOFT HYPHEN is expected to be
language-dependent (though it hardly is in web browsers). Normally, it
permits hyphenation at that point, with a hyphen inserted when
appropriate. This is quite different from a direct break opportunity,
which is what <wbr> means in browser practice and is being standardized
in HTML5. There is nothing language-dependent about <wbr>, in theory or
in practice. It is never expected to result in the addition of a hyphen,
and it never does that.

When Netscape invented <wbr> long ago, they chose a cryptic name, which, 
when expanded (to “word break”), has seriously misled many people into 
thinking that it is for suggesting breaks inside human-language words. 
Its primary use is for breaks inside strings that are *not* words. 
(Exceptionally, it is sometimes useful inside words: you might wish to
write e.g. tax-<wbr>free, but there the point is that a simple string
break is OK, since the “-” is part of the word and no hyphen needs to be
added when the word is divided.)
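
To make the distinction concrete, here is a toy model in Python. It is
only a sketch under stated assumptions, not how any browser implements
line breaking: the function name break_once is made up for illustration,
ZERO WIDTH SPACE (U+200B) stands in for <wbr> because plain text has no
elements, and the soft hyphen case always adds a plain "-", which ignores
the language-dependent behavior mentioned above.

```python
SHY = "\u00ad"  # SOFT HYPHEN: invisible unless the line is broken here
WBR = "\u200b"  # ZERO WIDTH SPACE, an assumed stand-in for <wbr>


def visible(s: str) -> str:
    """Drop the invisible break hints, leaving what a reader would see."""
    return s.replace(SHY, "").replace(WBR, "")


def break_once(text: str, width: int) -> list[str]:
    """Split text into at most two lines at the last hint that fits.

    Hypothetical helper for illustration only. A break at SHY adds a
    hyphen to the first line; a break at WBR adds nothing.
    """
    if len(visible(text)) <= width:
        return [visible(text)]  # fits on one line, hints stay invisible

    best = None
    for i, ch in enumerate(text):
        if ch not in (SHY, WBR):
            continue
        head = visible(text[:i])
        if ch == SHY:
            head += "-"  # simplification: real hyphenation may differ by language
        if len(head) <= width:
            best = [head, visible(text[i + 1:])]

    # No usable break opportunity: let the line overflow unbroken.
    return best if best is not None else [visible(text)]
```

For example, break_once("hyphen" + SHY + "ation", 8) gives
["hyphen-", "ation"], whereas break_once("tax-" + WBR + "free", 5) gives
["tax-", "free"] with no hyphen added, since the "-" is already part of
the word.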

