tg-Cyrl-TJ and tk-TM

Philippe Verdy verdy_p at
Mon Feb 2 21:35:58 CST 2015

Note that the same applies to "en", whose likely value is "en-Latn-US"
(although you could argue that "en" represents just the international form
of English, without US-specific jargon): once you have data for "en", you
don't need separate data for "en-Latn", "en-Latn-US" or "en-US".
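
As a rough illustration of that point, here is a minimal Python sketch. The
table is a hand-written, hypothetical one-entry excerpt (it is not loaded from
CLDR), and the helper names parse_tag, add_likely_subtags and needs_own_data
are invented for this example. It handles only simple
language[-Script][-REGION] tags and a table keyed by language alone, which is
simpler than the real likely-subtags algorithm.

```python
# Hypothetical one-entry excerpt of a likely-subtags table (not CLDR data).
LIKELY_SUBTAGS = {
    "en": ("en", "Latn", "US"),
}


def parse_tag(tag):
    """Split a simple language[-Script][-REGION] tag into three parts."""
    parts = tag.split("-")
    language, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()      # script subtags: 4 letters, e.g. Latn
        elif len(part) == 2 and part.isalpha():
            region = part.upper()      # region subtags: 2 letters, e.g. US
    return language, script, region


def add_likely_subtags(tag):
    """Fill in a missing script and/or region from the table."""
    language, script, region = parse_tag(tag)
    _, likely_script, likely_region = LIKELY_SUBTAGS[language]
    return "-".join([language, script or likely_script, region or likely_region])


def needs_own_data(tag):
    """True if the tag expands to something other than its bare language does."""
    language = parse_tag(tag)[0]
    return add_likely_subtags(tag) != add_likely_subtags(language)


for tag in ("en", "en-Latn", "en-US", "en-Latn-US", "en-GB"):
    print(tag, "->", add_likely_subtags(tag), "| own data:", needs_own_data(tag))
```

With this table, "en", "en-Latn", "en-US" and "en-Latn-US" all expand to the
same tag, so only a tag such as "en-GB" would call for data of its own.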

The case of "zh" is similar, but with more branches:

(1) Its likely value is "zh-Hani-CN", but more precisely "cmn-Hans-CN"
(Mandarin being the predominant language in the Chinese macrolanguage, and
predominantly written with the simplified sinographic script variant).
So with "zh" data you don't need additional data for "zh-Hani",
"zh-Hani-CN", "zh-Hans", "zh-Hans-CN", "zh-CN", "cmn", "cmn-Hani",
"cmn-Hani-CN", "cmn-Hans", "cmn-Hans-CN" or "cmn-CN".

(2) But you can have specific data for "zh-Hant", further specialized with
additional data:

* for either "zh-Hani-TW" or "zh-Hant-TW", or just "zh-TW" (given that the
likely script variant in Taiwan is traditional)
* for either "zh-Hani-MO" or "zh-Hant-MO", or just "zh-MO" (given that
the likely script variant in Macao is traditional)
* for either "zh-Hani-SG" or "zh-Hans-SG", or just "zh-SG" (given that
the likely script variant in Singapore is simplified)

Many combinations of BCP47 subtags can be used in localization data,
but CLDR data concentrates on the default for the root of all branches, and
provides specialized data only for the specific branches that need it (it
assumes that you'll use that data with the standard fallback resolution
mechanism of BCP47). So you should understand how the fallback mechanism
works: as soon as "likely" subtags are registered in the IANA database for
BCP47, they remove the need to make many specializations for various
combinations (and this is the best role of these "likely" declarations).
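
To make the fallback idea concrete, here is a minimal Python sketch of
truncation-style fallback (in the spirit of the RFC 4647 lookup scheme). The
DATA and ALIASES tables and their values are hypothetical and exist only for
this example, as do the function names fallback_chain and lookup. The
currency entries just serve as sample region-specific data. CLDR's real
inheritance rules are more elaborate than plain truncation, so treat this as
an illustration of the principle only.

```python
# Hypothetical data store: only the branches that need their own data.
DATA = {
    "zh": {"script": "simplified", "currency": "CNY"},
    "zh-Hant": {"script": "traditional"},
    "zh-Hant-TW": {"currency": "TWD"},
    "zh-Hant-MO": {"currency": "MOP"},
    "zh-Hans-SG": {"currency": "SGD"},
}

# Hypothetical aliases derived from the likely script of each region,
# so that the shorter or generic-script tags reach the right branch.
ALIASES = {
    "zh-TW": "zh-Hant-TW", "zh-Hani-TW": "zh-Hant-TW",
    "zh-MO": "zh-Hant-MO", "zh-Hani-MO": "zh-Hant-MO",
    "zh-SG": "zh-Hans-SG", "zh-Hani-SG": "zh-Hans-SG",
}


def fallback_chain(tag):
    """Return the tag followed by its progressively truncated forms."""
    parts = ALIASES.get(tag, tag).split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


def lookup(tag, key):
    """Return the first value found for key along the fallback chain."""
    for candidate in fallback_chain(tag):
        if key in DATA.get(candidate, {}):
            return DATA[candidate][key]
    return None


for tag in ("zh-CN", "zh-TW", "zh-Hani-MO", "zh-SG"):
    print(tag, fallback_chain(tag), lookup(tag, "script"), lookup(tag, "currency"))
```

Here "zh-CN" and "zh-Hans-CN" need no entries of their own because they fall
back to "zh", while "zh-TW" is routed to the "zh-Hant-TW" branch and picks up
the script from "zh-Hant".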