Question on BCP 47

Philippe Verdy via CLDR-Users cldr-users at unicode.org
Sat Nov 18 12:26:57 CST 2017


Where are these "BCP47"-like codes documented on Windows as equivalent to
the Windows internal LCIDs? (They are not really conforming, since the
collation subtags are encoded as if they were language variant codes.)
Maybe it's up to Microsoft to clean up its MSDN documentation and add notes
about legacy codes that should no longer be used in applications and should
be replaced by the registered BCP47 collation subtags.
In BCP47, collation codes should use the locale extension subtags.
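
To illustrate what "use the locale extension subtags" means in practice,
here is a minimal Python sketch that appends a "-u-co-<type>" keyword to a
plain language tag. The function name with_collation is hypothetical, and
the sketch assumes the base tag carries no extension or private-use part
already; it only checks the shape of a single-subtag collation type, not
whether that type is actually listed in CLDR.

```python
import re

# A single collation type subtag in the "u" extension is 3 to 8 alphanumerics.
_CO_TYPE = re.compile(r"^[A-Za-z0-9]{3,8}$")

# A one-character subtag (singleton) marks an extension or private-use part.
_SINGLETON = re.compile(r"(?:^|-)[0-9a-z](?:-|$)")


def with_collation(base_tag: str, collation: str) -> str:
    """Return base_tag with a "-u-co-<collation>" keyword appended.

    Assumption: base_tag is a plain language tag (language, script, region,
    variants) without any existing extension or private-use subtags.
    """
    if not _CO_TYPE.match(collation):
        raise ValueError(f"not a well-formed collation type: {collation!r}")
    if _SINGLETON.search(base_tag.lower()):
        raise ValueError(f"tag already has an extension or private use part: {base_tag!r}")
    return f"{base_tag}-u-co-{collation.lower()}"


print(with_collation("de-DE", "phonebk"))
```

With the German phonebook case from the list quoted below, this yields
"de-DE-u-co-phonebk" rather than the Windows-style "de-DE_phoneb".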

Same question about legacy locale codes used in Unix/Linux (also using
non-conforming extensions such as "@charset").

2017-11-18 18:48 GMT+01:00 Patrick Andries via CLDR-Users <
cldr-users at unicode.org>:

>
> In Windows, certain cultures have an "alternate sort" order (collation).
> According to the LCID reference, the list is:
>
> LCID      Language tag    Collation type
> 0x1007F   x-IV-mathan     Math alphanumeric
> 0x10407   de-DE_phoneb    Phonebook
> 0x1040E   hu-HU_tchncl    Technical
> 0x10437   ka-GE_modern    Modern
> 0x20804   zh-CN_stroke    Stroke count
> 0x21404   zh-MO_stroke    Stroke count
> 0x21004   zh-SG_stroke    Stroke count
> 0x30404   zh-TW_pronun    Pronunciation
> 0x40404   zh-TW_radstr    Radical/stroke
> 0x40411   ja-JP_radstr    Radical/stroke
> 0x40C04   zh-HK_radstr    Radical/stroke
> 0x41404   zh-MO_radstr    Radical/stroke
>
> Some of these collations have equivalent entries among the collation
> identifiers defined in the "Unicode locale extension" (see
> <https://www.unicode.org/repos/cldr/tags/latest/common/bcp47/collation.xml>):
>
> Identifier  Description
> big5han     Pinyin ordering for Latin, big5 charset ordering for CJK
>             characters (used in Chinese)
> compat      A previous version of the ordering, for compatibility
> dict        Dictionary style ordering
> ducet       The default Unicode collation element table order
> emoji       Recommended ordering for emoji characters
> eor         European ordering rules
> gb2312      Pinyin ordering for Latin, gb2312han charset ordering for CJK
>             characters (used in Chinese)
> phonebk     Phonebook style ordering (such as in German)
> phonetic    Phonetic ordering (sorting based on pronunciation)
> pinyin      Pinyin ordering for Latin and for CJK characters (used in
>             Chinese)
> reformed    Reformed ordering (such as in Swedish)
> search      Special collation type for string search
> searchjl    Special collation type for Korean initial consonant search
> standard    Default ordering for each language
> stroke      Pinyin ordering for Latin, stroke order for CJK characters
>             (used in Chinese)
> trad        Traditional style ordering (such as in Spanish)
> unihan      Pinyin ordering for Latin, Unihan radical-stroke ordering for
>             CJK characters (used in Chinese)
> zhuyin      Pinyin ordering for Latin, zhuyin order for Bopomofo and CJK
>             characters (used in Chinese)
>
>
> The question is thus the following: if one wants to create a BCP 47 string
> representing the locale and the options a user has chosen in a Windows
> environment, one should be able to represent the "Windows alternate sorts"
> in BCP 47 syntax. Some, such as "phoneb", have equivalent entries
> ("phonebk"), but some apparently do not.
>
> If some Windows alternate sorts have no equivalent entry, should we
> request that they be added to CLDR, use a "variant" subtag instead,
> or use a "private use" subtag in the BCP 47 format?
>
> Patrick Andries
>
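
As a rough sketch of the conversion asked about in the quoted question, the
Python below turns a Windows alternate-sort name into a BCP 47 string. The
names WINDOWS_TO_CLDR and windows_sort_to_bcp47 are hypothetical. Only the
"phoneb" to "phonebk" pairing is stated in the thread; the "stroke" pairing
is an assumption based on the two lists sharing that name. For sorts with no
known CLDR equivalent, the sketch falls back to a private-use subtag, which
is just one of the three options raised in the question, not a
recommendation.

```python
# Hypothetical mapping from Windows sort suffixes to CLDR collation types.
# "phoneb" -> "phonebk" comes from the thread; "stroke" is an assumption.
WINDOWS_TO_CLDR = {
    "phoneb": "phonebk",
    "stroke": "stroke",
}


def windows_sort_to_bcp47(name: str) -> str:
    """Convert e.g. "de-DE_phoneb" into a BCP 47 style tag.

    Names without an underscore (such as "x-IV-mathan") are returned as-is.
    """
    base, sep, sort = name.partition("_")
    if not sep:
        return name
    collation = WINDOWS_TO_CLDR.get(sort)
    if collation is not None:
        # Known equivalent: use the Unicode locale extension.
        return f"{base}-u-co-{collation}"
    # No known equivalent: private-use fallback (illustrative choice only).
    return f"{base}-x-{sort}"


for windows_name in ("de-DE_phoneb", "zh-CN_stroke", "hu-HU_tchncl", "x-IV-mathan"):
    print(windows_name, "->", windows_sort_to_bcp47(windows_name))
```

If CLDR later gains entries for the remaining sorts, only the mapping table
would need to change.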