Missing locales (lrc, hrx)

Philippe Verdy verdy_p at wanadoo.fr
Sun Jun 15 01:19:06 CDT 2014


2014-06-15 7:07 GMT+02:00 Steven R. Loomis <srl at icu-project.org>:

> Cldr doesn't handle "autonyms" separately, they are just a part of core
> data.
>

You don't understand what I mean: an "autonym" is a language name defined
in that language itself.

Defining an autonym for a language requires support for its associated
locale.

This is not required when translating language names into other languages
such as English, so the English name "Northern Luri" can easily be defined in
the existing survey (just by adding the language code to the list of
languages in the root locale), and the language can be named in all languages
that have a supported locale (e.g. Persian here), but you won't be able to
input the autonym without the locale.

So yes, an autonym requires some core data to be filled in: the language
code, the initial suggested language name (before it is surveyed), the script
(and implicitly its direction), the exemplar characters (which can be
surveyed too), the set of digits and some basic punctuation (which can also
be surveyed), a few supplementary data (population by country, with some very
rough estimation; in fact not essential for creating a locale), and the
plural rule (which cannot be surveyed: in PO/POT locales, this is in fact the
only core data really needed).

So I think that you don't really need more than the plural rule and the
numbering system used. However, some validation tests in the Survey Tool try
to check the characters in the data:
* this check could be skipped by starting without the exemplar or auxiliary
  set;
* after the initial survey, if there's still no agreement on the exemplar
  set, all the data would remain in "draft" state (not published in the
  release), because the data would not have been checked for using the
  recommended subset of their script, or because there could have been
  disagreements about the script to use, meaning that locale variants may be
  needed, or that data for transliterations should be specified.

Even the country location could be left out if the CLDR Survey Tool did not
use it to sort locales into groups: there could be an "other" group, or a
locale could even start with no country at all, meaning that there won't
initially be per-country variants of the base locale for that language.
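
To make the relaxed character check described above more concrete, here is a
rough sketch in Python. It is only an illustration under my own assumptions:
the function name `check_value` and the three status strings are
hypothetical and do not come from the Survey Tool, and I assume the exemplar
set is given as lowercase letters.

```python
from typing import Optional, Set


def check_value(value: str, exemplars: Optional[Set[str]]) -> str:
    """Return a hypothetical status for one submitted string.

    "draft"   : no exemplar set yet, so the value is accepted unchecked
    "ok"      : every letter of the value is in the exemplar set
    "flagged" : at least one letter is outside the exemplar set
    """
    if not exemplars:
        # Relaxed mode: there is nothing to check against, so the data
        # is accepted but stays in draft state.
        return "draft"
    # Only letters are compared; digits, spaces and punctuation are ignored.
    letters = {ch for ch in value.lower() if ch.isalpha()}
    return "ok" if letters <= exemplars else "flagged"
```

With an exemplar set of `{"a", "b", "c"}`, the string "cab" would pass and
"cabd" would be flagged; with no exemplar set at all, any string is accepted
as draft.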

The initial submission for a new language could use the same CLDR tool, but
with the input checks relaxed, and probably in a separate draft database.

All that would be required would be to define the language code and the
number and type of plural forms. Even the language name would not be
necessary (but it will likely be prefilled with a suggested English name, for
easy selection of the locale, and with the suggested autonym). But both will
be surveyed: the English name (or French, German, Arabic, Persian, etc.) will
be surveyed in the main database (because English has a supported locale),
and the autonym would be surveyed in the draft database.
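
As a rough sketch of this proposal (in Python, under my own assumptions), the
initial request and the choice of database could look like the following.
The names `NewLocaleRequest`, `survey_database` and `SUPPORTED` are
hypothetical, the list of supported locales is only illustrative, and the
plural categories shown for "lrc" are placeholders, not checked data.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class NewLocaleRequest:
    """Hypothetical minimal request: only the first two fields are required."""
    language_code: str
    plural_categories: Tuple[str, ...]
    suggested_english_name: Optional[str] = None   # prefilled, then surveyed
    suggested_autonym: Optional[str] = None        # prefilled, then surveyed


# Illustrative set of locales already supported in the main database.
SUPPORTED: Set[str] = {"en", "fr", "de", "ar", "fa"}


def survey_database(new_code: str, name_locale: str, supported: Set[str]) -> str:
    """Say where a name of language `new_code`, written in `name_locale`, is surveyed."""
    if name_locale in supported:
        return "main"    # e.g. the English or Persian name
    if name_locale == new_code:
        return "draft"   # the autonym, in the new locale itself
    raise ValueError(f"no locale available for {name_locale!r}")


# Placeholder values: the plural categories are NOT verified for this language.
request = NewLocaleRequest("lrc", ("one", "other"),
                           suggested_english_name="Northern Luri")
```

Here `survey_database("lrc", "en", SUPPORTED)` gives "main", while
`survey_database("lrc", "lrc", SUPPORTED)` gives "draft".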

The benefit would be less administrative cost to initiate a new locale, and
less difficulty for the initial requester in providing the initial data
you're asking for (which would be much less than the "Core" data we see in
the Survey!).