adding all of iso639_3 to either en.xml or root.xml

Philippe Verdy verdy_p at wanadoo.fr
Tue Jul 15 12:20:06 CDT 2014


ISO 639-3 does not matter for our goal. What we need is the names in the
IANA language subtag registry (which does not necessarily agree with
ISO 639-3 either); CLDR focuses on "locales", not really on "languages" as
ISO 639 defines them, so CLDR (like almost all computing and networking
protocols and languages) is based on BCP 47.

Leave ISO 639 to bibliographic classification: it is not stable enough for
our goals, and it does not evolve with backward compatibility, with clear
paths for data migration (where possible), or with a way to handle the
ambiguities that remain across epochs and evolutions of languages and their
so-called "dialects". It is not usable for localisation or for preserving
data tagging in archived documents (even many bibliophiles do not like
ISO 639, as it requires too much maintenance from them).

Sometimes it's hard to let people know that ISO 639 is not important. BCP 47
is less well known because it has too often been referenced by its changing
RFC numbers. ISO 639 is well known for its complete lack of interoperability
and for its own contradictions and instability; it is best to forget it
completely here for the CLDR project (notably because, like BCP 47, we will
ignore many incoherent parts of ISO 639).
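
To make the back-fill idea concrete, here is a minimal sketch in Python of
taking names from the IANA language subtag registry when CLDR has no display
name for a code. It is only an illustration, not how the CLDR tooling does
it: the function names (parse_registry, language_names, display_name) are
hypothetical, SAMPLE is a short hand-written excerpt in the registry's
record layout (records separated by "%%", folded continuation lines) rather
than the real file, and cldr_en stands in for whatever is already in en.xml.

```python
from collections import defaultdict

# Hand-written sample in the registry's record layout; not the real file.
SAMPLE = """File-Date: 2014-07-01
%%
Type: language
Subtag: en
Description: English
%%
Type: language
Subtag: aaa
Description: Ghotuo
%%
Type: language
Subtag: ia
Description: Interlingua (International Auxiliary Language
  Association)
"""


def parse_registry(text):
    """Split registry text into records, each a dict of field -> list of values."""
    records, current, last_key = [], defaultdict(list), None
    for line in text.splitlines():
        if line.strip() == "%%":
            # "%%" closes the current record.
            if current:
                records.append(dict(current))
            current, last_key = defaultdict(list), None
        elif line[:1].isspace() and last_key:
            # A line starting with whitespace continues the previous field.
            current[last_key][-1] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current[last_key].append(value.strip())
    if current:
        records.append(dict(current))
    return records


def language_names(records):
    """Map each language subtag to its first Description."""
    return {
        r["Subtag"][0]: r["Description"][0]
        for r in records
        if r.get("Type") == ["language"] and "Subtag" in r and "Description" in r
    }


def display_name(code, cldr_names, registry_names):
    """Prefer the CLDR name, then the registry name, then the bare code."""
    return cldr_names.get(code) or registry_names.get(code) or code


registry = language_names(parse_registry(SAMPLE))
cldr_en = {"en": "English"}  # stand-in for names already present in en.xml
print(display_name("aaa", cldr_en, registry))
```

The lookup order (CLDR first, registry second, code last) is a choice made
for this sketch; an application could just as well keep the two sources
separate and let the user interface decide.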


2014-07-15 18:16 GMT+02:00 John Emmons <emmo at us.ibm.com>:

>  Another potential problem here is that en.xml and iso639-3 don't always
> agree 100% on the names.  Maybe in root - but I think it is definitely
> going to be hard to maintain.
> I put it on the agenda for the next TC mtg.
>
>
> Regards,
>
> John C. Emmons
> Globalization Architect & Unicode CLDR TC Chairman
> IBM Software Group
> Internet: emmo at us.ibm.com
>
>
>
> From: "Mckenna, Mike" <mimckenna at paypal.com>
> To: Mark Davis ☕️ <mark at macchiato.com>
> Cc: Martin Hosken <martin_hosken at sil.org>, "cldr-users at unicode.org" <
> cldr-users at unicode.org>, "Steven R. Loomis" <srl at icu-project.org>
> Date: 07/15/2014 10:30 AM
> Subject: Re: adding all of iso639_3 to either en.xml or root.xml
> Sent by: "CLDR-Users" <cldr-users-bounces at unicode.org>
> ------------------------------
>
>
>
> I know we at PayPal would certainly be fans of getting all of iso639-3 in
> CLDR. We are currently cobbling lists together in English and then
> translating to target languages. I would have no problem with having the
> English names in root since these and the French are the official ISO
> entries.
>
> We use the lists for pull-downs on postal address entry forms and need to
> present them in user language for selection, local language for domestic
> delivery and English for international postal mail.
>
> Thanks,
>
> Mike___
>
> Sent from my iPhone
>
>
> On Jul 14, 2014, at 11:56 PM, "Mark Davis ☕️" <mark at macchiato.com>
> wrote:
>
>    I'm not sure it would be worth it. People can always pick up a copy of
>    the language subtag registry and use it to back-fill.
>
>    We do keep a copy of the registry in our tooling data directory, and
>    that's what we do in our tooling, such as myCldrFile.getName(language).
>
>
>    *Mark* <https://google.com/+MarkDavis>
>
>    *— Il meglio è l’inimico del bene —*
>
>
>    On Tue, Jul 15, 2014 at 6:52 AM, Steven R. Loomis
>    <srl at icu-project.org> wrote:
>       If anything, it should be in en and not root.
>
>       Wonder if it could go into seed/en or something.
>
>       It's not in en right now because of translation burden. But I'd
>       think we could set controls via coverage.
>
>       En.xml is hand curated now, that would be another distinction.
>
>       Steven
>
>       Sent from our iPhone.
>
>       On Jul 14, 2014, at 9:47 PM, Martin Hosken
>       <martin_hosken at sil.org> wrote:
>        Dear All,
>
>          I notice that en.xml only contains
>          localeDisplayNames/languages/language entries for a subset of iso639-3. Is
>          there a case for filling out the list based on iso639-3 reference names so
>          that people don't have to fall back to data not in the CLDR? Or, given
>          iso639 has these reference names, is there a case for putting them into the
>          root? I realise it's a bit odd to put what amounts to English names into
>          root.xml. OTOH these are the official reference names and so act as
>          fallback for all languages, so perhaps it would be appropriate. I'm happy
>          either way. But I think CLDR would benefit from having the complete
>          reference name mapping of iso639-3 in it.
>
>          Yours,
>          Martin
>          _______________________________________________
>          CLDR-Users mailing list
>          CLDR-Users at unicode.org
>          http://unicode.org/mailman/listinfo/cldr-users
>
>