Hyphenation
Jukka K. Korpela
jkorpela at cs.tut.fi
Wed Feb 4 12:57:38 CST 2015
2015-02-04, 19:58, Cameron Dutro wrote:
> It is often the case, especially on smaller screens, that long words
> must be hyphenated so they wrap in a natural way. As far as I can tell,
> the CLDR data set does not define hyphenation rules.
That is correct. And they cannot really be described using the
techniques currently deployed in CLDR.
> I'm not even really
> sure what the hyphenation rules should be for English.
They vary by variety of English (British versus American, for
instance) and by authority.
> The implementation I've seen uses a dictionary - maybe it's identifying
> potential breaks at syllable boundaries?
Some simple hyphenators are dictionary-driven. This does not work well
even for English, since any word missing from the dictionary remains
unhyphenated. It works even worse for languages that have, say, a
thousand inflected forms for each verb or noun, yet may have simple
algorithmic rules for hyphenation.
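
To illustrate the contrast, here is a minimal Python sketch, not a real
hyphenator. Everything in it is an assumption made for illustration: the
dictionary entries are made-up break points, the function names are
hypothetical, and the rule (break before a consonant that is directly
followed by a vowel) is only a rough simplification of how
syllable-based hyphenation can work. It ignores vowel sequences,
compound words and minimum fragment lengths. The sample words are
Finnish inflected forms, chosen as one example of a heavily inflected
language.

```python
# Toy lookup table: hypothetical break points, not from any real dictionary.
HYPHENATION_DICT = {
    "hyphenation": "hy-phen-ation",
    "dictionary": "dic-tion-ary",
}

VOWELS = set("aeiouyäö")


def hyphenate_by_dictionary(word):
    """Return the stored hyphenation, or the word unchanged if it is unknown."""
    return HYPHENATION_DICT.get(word.lower(), word)


def hyphenate_by_rule(word):
    """Mark a break before each consonant that is directly followed by a vowel.

    The first letter is never a break point, so no fragment is empty.
    This is a deliberately simplified rule, not a complete rule set
    for any language.
    """
    parts = []
    start = 0
    for i in range(1, len(word) - 1):
        is_consonant = word[i].lower() not in VOWELS
        next_is_vowel = word[i + 1].lower() in VOWELS
        if is_consonant and next_is_vowel:
            parts.append(word[start:i])
            start = i
    parts.append(word[start:])
    return "-".join(parts)


if __name__ == "__main__":
    for w in ["hyphenation", "talossa", "taloissamme"]:
        print(w, hyphenate_by_dictionary(w), hyphenate_by_rule(w))
```

The dictionary lookup returns "talossa" and "taloissamme" unchanged,
because neither inflected form is listed, while the rule handles both
without needing any word list. The same rule gives poor results for
English words, which is the point: the strategy has to match the
language.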
Hyphenation strategies vary greatly by language. At present, the best
you can do is to try to find suitable hyphenation software for the
languages that are relevant to you.
Yucca