adding all of iso639_3 to either en.xml or root.xml

Tue Jul 15 16:27:24 CDT 2014

And this problem is being progressively solved, code by code. It should
not have taken so long to fix that language with a private-use code
or a code that does not conflict with standard BCP 47 code formats (codes
starting with subtags longer than 3 characters are reserved, but Wikimedia
uses them). But this is not just a question of fixing a language code:
there's a need to maintain domain names for a while, to fix all other wikis
so that they resolve the new code, and to fix templates and pages in lots
of places.

But this Wiki code has spread further than expected, outside Wikimedia
(e.g. in OSM databases, and in other projects translated in
translatewiki.net).

As for all major sites that have histories to preserve, this is a slow
process, and everyone has to manage their own legacy usages. That's why
there's now a Language committee to approve new codes, and why admins no
longer accept new project codes at the first request without reviewing them
and asking for comments.

Wikimedia is not alone: most users of ISO 639 invented their own local
codes (including for bibliographic purposes) before ISO 639-3 was
published and BCP 47 was revised with stricter rules for extensions. For
example, "be-x-old" is still used instead of the newer "be-tarask", but at
least it conforms to BCP 47 and causes few problems. The same goes for
"zh-classical", even if "lzh" is preferred, for "de-formal" instead of
"de-x-formal", and for "simple" instead of "en-x-simple": they cause few
problems but are still used in domain names. "zh-yue" is still used as a
domain name, but the preferred "yue" code is also recognized as an alias,
and both are valid, so that problem is solved.
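
To make the distinction concrete, here is a rough Python sketch that maps
the legacy codes listed above to the preferred tags and checks whether a
tag is syntactically well-formed. It is only an illustration under stated
assumptions: the names LEGACY_TO_PREFERRED, is_well_formed and
preferred_tag are hypothetical, the mapping contains only the pairs quoted
in this message, and the pattern is a simplified reading of the BCP 47
"langtag" grammar (no grandfathered tags, no whole-tag private use, no
lookup in the subtag registry).

```python
import re

# Legacy Wikimedia codes and the preferred tags quoted in this message.
# Illustrative only; this is not an official Wikimedia table.
LEGACY_TO_PREFERRED = {
    "be-x-old": "be-tarask",
    "zh-classical": "lzh",
    "de-formal": "de-x-formal",
    "simple": "en-x-simple",
    "zh-yue": "yue",
}

# Simplified BCP 47 "langtag" syntax (assumption: grandfathered tags and
# tags consisting only of private-use subtags are deliberately left out).
LANGTAG = re.compile(
    r"""
    (?: [a-z]{2,3} (?: -[a-z]{3} ){0,3}   # language, optional extlangs
      | [a-z]{4}                          # reserved for future use
      | [a-z]{5,8} )                      # registered language subtag
    (?: -[a-z]{4} )?                      # script
    (?: -(?: [a-z]{2} | [0-9]{3} ) )?     # region
    (?: -(?: [a-z0-9]{5,8} | [0-9][a-z0-9]{3} ) )*   # variants
    (?: -[0-9a-wy-z] (?: -[a-z0-9]{2,8} )+ )*        # extensions
    (?: -x (?: -[a-z0-9]{1,8} )+ )?       # private use
    """,
    re.VERBOSE | re.IGNORECASE,
)


def is_well_formed(tag: str) -> bool:
    """Return True if the tag matches the simplified syntax above."""
    return LANGTAG.fullmatch(tag) is not None


def preferred_tag(code: str) -> str:
    """Map a legacy code to its preferred tag, or return it unchanged."""
    return LEGACY_TO_PREFERRED.get(code.lower(), code)


if __name__ == "__main__":
    for legacy, new in LEGACY_TO_PREFERRED.items():
        print(legacy, is_well_formed(legacy), "->", new, is_well_formed(new))
```

With this simplified pattern, "zh-classical" is the only legacy code that
fails the syntax check, because "classical" has nine characters. Note that
"de-formal" and "simple" pass it only because the check is purely
syntactic: as far as I know, neither "formal" nor "simple" is a registered
subtag, so a real validator would also need to consult the IANA Language
Subtag Registry.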

I'm not pragmatic to the point of proposing to adopt "nrm" as used in
Wikimedia and translatewiki.net. But at least Wikimedia admins know the
problem and have to solve it progressively with the community. This is the
only severe conflict remaining (if it has still not been solved, it's
because the language is still not encoded in standards, and admins don't
want to migrate the sites twice). I've asked them to request an allocation
for Norman, but they could not get decisive opinions about its dialects
(notably with Jersiais being official as a language in Jersey). Maybe a
separate code should be requested for Jersiais itself, even if Norman gets
its own code as a macrolanguage encompassing Jersiais, Guernésiais and
Continental Norman.

The other problem is that Continental Norman is still considered a
variant of French (unlike Picard, which includes its Ch'timi variant in
French Flanders and is closely related to Walloon in Belgium). Linguists
have different points of view. But Picard is also considered a variant of
French by the same people who think Norman is French and who associate
Jersiais directly with French.

Maybe "fr" should be considered a macrolanguage too, to encompass its
regional or historic variants (including "frc" = "French Cajun", spoken in
Louisiana, USA), and "standard modern Parisian French in France" would
then have its own new code within that macrolanguage.

2014-07-15 21:50 GMT+02:00 Doug Ewell <doug at ewellic.org>:

> Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
>
> > Let's be pragmatic and use the best tools. Even if you don't like
> > Wikipedia itself for its content (or the tone of its local
> > discussions), it does not mean everything is bad. I personally like
> > this diversity, which permits technical innovations to appear, and very
> > bright things like Wiktionary, which evolves at the same time as
> > people in the world using the languages we would like to coordinate.
>
> Thank goodness CLDR doesn't apply the Wikipedia model of inventing new
> "Standard X" code elements that step on the reserved code space of
> Standard X, ignoring any private-use mechanism built into Standard X, as
> Wikipedia does with language codes and ISO 639-3 and BCP 47.
>
> The hijacking of 'nrm' by Wikipedia for Norman, described by Philippe
> earlier, is a perfect example of this. In ISO 639-3 and in BCP 47, 'nrm'
> is the code element for Narom, spoken in Malaysia.
>
> I'm otherwise a fan of Wikipedia, but this example of "Wikipedia
> exceptionalism" is just about the worst possible approach for either
> stability or interoperability.
>
> --
> Doug Ewell | Thornton, CO, USA
> http://ewellic.org | @DougEwell