Locale bringup and barriers for entry
Marcel Schneider via CLDR-Users
cldr-users at unicode.org
Tue Sep 25 14:11:49 CDT 2018
Thanks for the links to documentation. The first page:
http://cldr.unicode.org/index/cldr-spec/plural-rules
contains new instructions stating that gender is irrelevant except if
two nouns of different gender are needed to cover all plural categories.
This results in replacing “Prenez la {0}re à droite; Prenez le {0}er à droite”
with a sentence like you suggested: “Prenez au {0}er feu à droite puis la {0}re à droite”
Still I don’t understand why information is to be packed into arbitrary phrases instead
of being stored in a more formal way, using appropriate data structures differentiating
the values by transparent criteria, like what is already done for number with data
stored in the supplemental/ directory:
https://www.unicode.org/repos/cldr/tags/latest/common/supplemental/plurals.xml
https://www.unicode.org/repos/cldr/tags/latest/common/supplemental/ordinals.xml
which is what I looked for.
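To make the idea concrete, here is a minimal sketch in Python of what such a formal structure could look like for French ordinal abbreviations. The table layout, the gender keys and the function names are assumptions made up for illustration; nothing like this is claimed to exist in CLDR. The only rule borrowed from ordinals.xml is the French one ("one" for 1, "other" for everything else).

```python
# Hypothetical gender-aware suffix table, keyed by (ordinal category, gender).
# This is an illustration only, not CLDR data.
FR_ORDINAL_SUFFIX = {
    ("one", "masculine"): "er",
    ("one", "feminine"): "re",
    ("other", "masculine"): "e",
    ("other", "feminine"): "e",
}


def fr_ordinal_category(n: int) -> str:
    """French ordinal rule: "one" for 1, "other" otherwise."""
    return "one" if n == 1 else "other"


def fr_ordinal(n: int, gender: str) -> str:
    """Return the abbreviated French ordinal for n in the given gender."""
    suffix = FR_ORDINAL_SUFFIX[(fr_ordinal_category(n), gender)]
    return f"{n}{suffix}"


print("Prenez la " + fr_ordinal(1, "feminine") + " à droite")
print("Tournez au " + fr_ordinal(1, "masculine") + " feu à droite")
```

With such a table the message pattern only needs a placeholder plus the gender of the noun, instead of two nouns packed into one phrase.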
Values like "few" and "many" seem to be used as convenient labels to get more categories.
E.g. the Gujarati ordinal rules have "two" for 2 and 3, "few" for 4, and "many" for 6, while 5 and 7 upwards
are "other". Understandably, "many" is used for Italian to label the category dedicated
to numbers whose name starts with a vowel.
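As a rough illustration of how these labels work, the two ordinal rule sets can be written out as a small Python function. The function name is hypothetical, and the rules are transcribed by hand (the "one" category for Gujarati 1 and the Italian values 8, 11, 80 and 800 are as I read them in ordinals.xml), so they should be checked against the file rather than trusted.

```python
def ordinal_category(locale: str, n: int) -> str:
    """Return the ordinal plural category label for n (hand-transcribed rules)."""
    if locale == "gu":  # Gujarati
        if n == 1:
            return "one"
        if n in (2, 3):
            return "two"
        if n == 4:
            return "few"
        if n == 6:
            return "many"
        return "other"  # 5, and 7 upwards
    if locale == "it":  # Italian: "many" marks ordinals taking the elided article l'
        return "many" if n in (8, 11, 80, 800) else "other"
    raise ValueError(f"no rules transcribed for locale {locale!r}")


for n in range(1, 8):
    print(n, ordinal_category("gu", n))
```

The labels carry no meaning of their own here: "few" and "many" are just names for additional buckets, which is the point made above.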
The supplemental/ folder contains many things, among which I stumbled over
attributeValueValidity.xml. The 2ⁿᵈ through 4ᵗʰ comments in this file
contradict the very subject of this thread, so I suggest removing them
PRIOR to the v34 release…
Regards,
Marcel
On 25/09/18 13:21 Philippe Verdy wrote:
>
Note that the supplemental data is OK for the "cardinal" and "range" types of categories, but largely fails almost everywhere for the "ordinal" type.
E.g. in French: "Prenez la 1re à droite" (this assumes the feminine gender, which is OK for "rue"="street", "avenue", or "sortie"="exit", but wrong for "feu"="traffic light" or "stop",
which are masculine, as in "Tournez au 1er feu à droite", where "1er" and "1re" change depending on the gender of the explicit or implicit noun).
>
Yes, ordinals (but also fractions) need derivation by gender (as well as grammatical case), including for abbreviated forms (e.g. in French, Italian, Spanish, but even in English
with inflected leading articles like "a" vs. "an", which depend on the numeric value of the ordinal).
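A rough sketch of the English case, in Python: the article depends on how the number is read aloud, so "an 8th" and "an 11th" but "a 1st". The function names and the heuristic are assumptions for illustration only (it looks at the leading three-digit group and assumes readings like "one thousand one hundred" rather than "eleven hundred"); it is not taken from CLDR.

```python
def english_ordinal_suffix(n: int) -> str:
    """Return st/nd/rd/th for a positive integer."""
    if 10 <= n % 100 <= 20:  # 11th, 12th, 13th and the other teens
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def indefinite_article_for(n: int) -> str:
    """Heuristic: "an" if the spoken number starts with a vowel sound."""
    digits = str(n)
    # The leading group (1 to 3 digits) decides how the spoken form begins.
    lead = digits[: len(digits) % 3 or 3]
    # "eight...", "eleven" and "eighteen" start with a vowel sound.
    if lead.startswith("8") or lead in ("11", "18"):
        return "an"
    return "a"


def with_article(n: int) -> str:
    return f"{indefinite_article_for(n)} {n}{english_ordinal_suffix(n)}"


for n in (1, 8, 11, 18, 21, 80, 110, 11000):
    print(with_article(n))
```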
>
And I see little use for these "ordinal" types except in strict isolation, assuming a nominal use (outside of the real sentences where they will be inserted) without any relation to the
noun (or nominal group) to which they refer. Note that this noun or nominal group may be outside the current isolated "paragraph", such as a column heading, or other info such as
the resulting ranks in a sports competition for women vs. the same table for men.
>
Basically this means that CLDR just provides basic data that still needs to be tuned and localized again for specific applications, even if this tuning is generic. What CLDR can do,
however, is to monitor whether there are stable applications wishing to interchange their localized data containing gender or case differences: if their localisation data is large enough to
cover enough locales for a significant part of the world and they want to interoperate, they will create a de facto standard that can be integrated (after being proposed to CLDR
with enough exemplar data and open licensing).
>
Such applications already exist (notably across wikis, even if much work is still required to have them cooperate to stabilize some issues and agree on some common
formats, to efficiently track the remaining translation problems and manage the remaining inconsistencies, and to accept some deviations for specific uses in
more specific pages they don't want to break).
>
On Tue, Sep 25, 2018 at 13:02, Philippe Verdy wrote:
>
>
>
On Tue, Sep 25, 2018 at 11:32, Marcel Schneider wrote:
>
On 25/09/18 10:00 Philippe Verdy wrote:
> > Plural rules are documented. These are defined as minimal data needed to start any new locale.
>
> That seems to be one of those barriers that Steven is now questioning, or even the main barrier for entry.
> For me that would remain a barrier as long as I cannot get clear insight nor see straightforward structures to fill in.
>
>
See the documentation:
http://cldr.unicode.org/index/cldr-spec/plural-rules
>
And the supplemental data which gives a list per locale:
http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html
>
>