Locale bringup and barriers for entry

Marcel Schneider via CLDR-Users cldr-users at unicode.org
Tue Sep 25 14:11:49 CDT 2018


Thanks for the links to documentation. The first page:

http://cldr.unicode.org/index/cldr-spec/plural-rules

contains new instructions stating that gender is irrelevant except if 
two nouns of different gender are needed to cover all plural categories.

This results in replacing “Prenez la {0}re à droite; Prenez le {0}er à droite”
with a sentence like you suggested: “Prenez au {0}er feu à droite puis la {0}re à droite”

Still I don’t understand why information is to be packed into arbitrary phrases instead 
of being stored in a more formal way, using appropriate data structures differentiating 
the values by transparent criteria, like what is already done for number with data 
stored in the supplemental/ directory:

https://www.unicode.org/repos/cldr/tags/latest/common/supplemental/plurals.xml
https://www.unicode.org/repos/cldr/tags/latest/common/supplemental/ordinals.xml

which is what I looked for.
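
To illustrate what I mean by a more formal structure, here is a minimal Python sketch. It is purely hypothetical: the table layout and the names FR_ORDINAL_SUFFIX, fr_ordinal_category and fr_ordinal are assumptions made for this example, not anything CLDR defines. It also assumes that the French ordinal rule only distinguishes "one" (n = 1) from "other", and that every other ordinal is abbreviated with a plain "e".

```python
# Hypothetical table, not a CLDR structure:
# (ordinal category, grammatical gender) -> abbreviated ordinal suffix.
FR_ORDINAL_SUFFIX = {
    ("one", "masculine"): "er",
    ("one", "feminine"): "re",
    ("other", "masculine"): "e",
    ("other", "feminine"): "e",
}


def fr_ordinal_category(n: int) -> str:
    """Assumed French ordinal rule: 'one' for n = 1, 'other' otherwise."""
    return "one" if n == 1 else "other"


def fr_ordinal(n: int, gender: str) -> str:
    """Return the abbreviated ordinal for n, agreeing with the noun's gender."""
    suffix = FR_ORDINAL_SUFFIX[(fr_ordinal_category(n), gender)]
    return f"{n}{suffix}"


# "rue" is feminine, "feu" is masculine.
print("Prenez la " + fr_ordinal(1, "feminine") + " à droite")
print("Tournez au " + fr_ordinal(1, "masculine") + " feu à droite")
```

With such a table the gender becomes an explicit lookup key instead of being implied by the wording of a sample phrase.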

Values like "few" and "many" seem to be used as convenient labels to get more categories.
E.g. Gujarati has "two" for 2 and 3, "few" for 4, and "many" for 6, while 5 and 7 upwards
are "other". Understandably "many" is used for Italian to label the category dedicated 
to numbers starting with a vowel.
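
As a rough sketch of how I read these two ordinal rule sets, in Python: the function names are hypothetical, the "one" category for Gujarati 1 and the exact Italian set {8, 11, 80, 800} are what I recall from ordinals.xml rather than something stated above, so treat both as assumptions to be checked against the file.

```python
def gu_ordinal_category(n: int) -> str:
    """Ordinal category for Gujarati, as described above."""
    if n == 1:
        return "one"   # assumption: recalled from ordinals.xml, not stated above
    if n in (2, 3):
        return "two"
    if n == 4:
        return "few"
    if n == 6:
        return "many"
    return "other"     # 5, and 7 upwards


def it_ordinal_category(n: int) -> str:
    """Ordinal category for Italian: 'many' marks numbers whose name starts with a vowel."""
    # Assumed set: otto, undici, ottanta, ottocento.
    return "many" if n in (8, 11, 80, 800) else "other"


for n in range(1, 9):
    print(n, gu_ordinal_category(n), it_ordinal_category(n))
```

This shows that the labels are just names for buckets: nothing about 6 is "many" in the everyday sense.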


The supplemental/ folder contains many things, among which I stumbled upon 
attributeValueValidity.xml. The 2ⁿᵈ through 4ᵗʰ comments in this file 
contradict the very subject of this thread, so I suggest removing them
PRIOR to the v34 release…

Regards,

Marcel

On 25/09/18 13:21 Philippe Verdy wrote:

>
Note that the supplemental data is OK for the "cardinal" and "range" types of categories, but largely fails almost everywhere for the "ordinal" type.
E.g. in French: "Prenez la 1re à droite" (this assumes the feminine gender, which is OK for "rue"="street", "avenue", or "sortie"="exit", but wrong for "feu"="traffic light" or "stop", 
which are masculine, as in "Tournez au 1er feu à droite", where "1er" and "1re" change depending on the gender of the explicit or implicit noun).

>
Yes, ordinals (but also fractions) need derivation by gender (as well as grammatical case), including for abbreviated forms (e.g. in French, Italian, Spanish, but even in English 
with inflected leading articles like "a" vs. "an", which depend on the numeric value of the ordinal).

>
And I see little use of these "ordinal" types except in strict isolation, assuming a nominal use (outside of real sentences where they will be inserted) without any relation to the 
noun (or nominal group) to which they refer (note: this noun or nominal group may be outside the current isolated "paragraph", such as a column heading, or other info such as 
the resulting ranks in a sports competition for women, vs. the same table for men).

>
Basically this means that CLDR just provides basic data that still needs to be tuned and localized again for specific applications, even if this tuning is generic. What CLDR can do, 
however, is to monitor whether there are stable applications wanting to interchange their localized data containing gender or case differences: if their localization data is large enough to 
cover enough locales for a significant part of the world and they want to interoperate, they will create a de facto standard that can be integrated (after being proposed to CLDR 
with enough exemplar data and open licensing).

>
Such applications already exist (notably across wikis, even if this still requires much work to have them cooperate to stabilize some issues, agree on some common 
formats, efficiently track the remaining translation problems, and manage the remaining inconsistencies, as well as accept some deviations for specific uses in 
more specific pages they don't want to break).

>
On Tue, 25 Sep 2018 at 13:02, Philippe Verdy wrote:
>
> 
>
On Tue, 25 Sep 2018 at 11:32, Marcel Schneider wrote:
>
On 25/09/18 10:00 Philippe Verdy wrote:
> > Plural rules are documented. These are defined as minimal data needed to start any new locale.
> 
> That seems to be one of those barriers that Steven is now questioning, or even the main barrier for entry.
> For me that would remain a barrier as long as I cannot get clear insight nor see straightforward structures to fill in.
> 
>
See the documentation:
http://cldr.unicode.org/index/cldr-spec/plural-rules

>
And the supplemental data which gives a list per locale:
http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html