CLDR proposal: Unicode algorithms should fall back to root, not to unrelated default locale
markus.icu at gmail.com
Thu Apr 3 22:01:40 CDT 2014
On Thu, Apr 3, 2014 at 1:21 PM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:
> Would language matching data take preference over either?
Language matching should happen earlier. You would match a desired language
against the list of known available languages. Then, when you open a service
object with the resulting language, you don't get into this situation.
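As a rough illustration of "match first, then open the service", here is a minimal Python sketch. It assumes simple subtag truncation as the matching strategy, which is much cruder than CLDR's language matching data, and the function name match_language is hypothetical, not an ICU or CLDR API.

```python
def match_language(desired, available):
    """Return the closest available language tag for `desired`, or None.

    Assumption: matching is plain truncation of trailing subtags
    (de-CH -> de). Real CLDR language matching uses distance data.
    """
    # Normalize underscores to hyphens and compare case-insensitively.
    by_lower = {tag.replace("_", "-").lower(): tag for tag in available}
    tag = desired.replace("_", "-")
    while tag:
        hit = by_lower.get(tag.lower())
        if hit is not None:
            return hit
        # Drop the last subtag; yields "" when no hyphen is left.
        tag = tag.rpartition("-")[0]
    return None


# Match once, up front; only the result is used to open a service object.
available_languages = ["en", "de", "fr"]
matched = match_language("de-CH", available_languages)
```

With this ordering, the service is only ever opened for a language that is known to be available, so the question of which fallback wins does not come up at that point.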
> How are break iteration rules meant to interact with dictionary-based
> word and line-breakers?
In CLDR and ICU, the rules specify the set of characters that need
dictionary support. (It's triggered by script, not by language.)
I expect that, in general, more languages will have data for
language-specific exceptions, overrides and such than will have their own
character-level segmentation rules. Those low-level rules should always fall
back to root when there is no language-specific data. I think the
higher-level exceptions should probably also avoid going through some
default language.
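The fallback being proposed can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the parent chain is built by plain truncation (real CLDR inheritance has additional parent-locale rules), the names fallback_chain and load_rules are hypothetical, and the data dictionary holds placeholder strings rather than real rules.

```python
ROOT = "root"


def fallback_chain(locale_id):
    """List the locales to try, most specific first, always ending at root.

    Assumption: parents are found by dropping the last "_"-separated
    subtag. The process default locale is deliberately never consulted.
    """
    if locale_id == ROOT:
        return [ROOT]
    chain = []
    tag = locale_id
    while tag:
        chain.append(tag)
        tag = tag.rpartition("_")[0]
    chain.append(ROOT)
    return chain


def load_rules(locale_id, data):
    """Return (locale actually used, rules) for the first chain entry with data."""
    for candidate in fallback_chain(locale_id):
        if candidate in data:
            return candidate, data[candidate]
    raise LookupError("no root data available")


# Placeholder data: only root and "en" carry segmentation rules.
rules_data = {"root": "root rules", "en": "English rules"}
used, rules = load_rules("de_CH", rules_data)
```

Here a request for de_CH ends up at root even though "en" data exists and "en" might happen to be the default locale, which is the behavior argued for above.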