Locale bringup and barriers for entry

Mark Davis ☕️ via CLDR-Users cldr-users at unicode.org
Mon Sep 24 15:18:26 CDT 2018


Mark


On Mon, Sep 24, 2018 at 12:52 PM Marcel Schneider via CLDR-Users <
cldr-users at unicode.org> wrote:

> On 24/09/18 18:52 Steven R. Loomis wrote:
> […]
> > I'm not sure what is meant by 'extensions to the DTD'.  In any event,
> CLDR pluralization has proven to be largely successful in practice.
> > Do you have any specific concern about CLDR plurals? Is there a bug
> filed?
>
> I’d filed this bug about French plurals:
>
> https://unicode.org/cldr/trac/ticket/11302
> Ordinal minimal pairs for French
>
> Although as noted there, most other locales are unaffected.
> I’ve just extrapolated from this that some issues may be awaiting new
> locales, and that when facing barriers,
> getting them out of the way may require the DTD to be extended, so
> submitters should be ready to file tickets,
> as we’re often prompted to do by the SurveyTool information panel.
>
> […]
> > >
> > > Before including this functionality in SurveyTool, where it belongs
> in, I think that the spec should be redesigned, and the documentation
> updated
> > > accordingly. That could eventually result in extended language support
> by CLDR/ICU, which would do no harm but only raise the product value.
> >
> > Redesigned how? Again - do you have any specific concern about CLDR
> plurals? Is there a bug filed?
>
> My concern is that CLDR seems not to take gender into account when
> providing plural rules, but I was told that gender is not inside the scope.
> The fact is that nouns may inflect differently depending on whether they
> are feminine or masculine.
>

The focus for plurals in CLDR is "what would change if I change a number to
another number in a placeholder". So if I have a message with a masculine
noun, I have two versions:

one: "{number} libro è selezionato"
other: "{number} libri sono selezionati"

versus two more versions with a feminine noun:

one: "{number} nota è selezionata"
other: "{number} note sono selezionate"
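
To make the "change one number to another" point concrete, here is a minimal
Python sketch of picking between the two masculine versions. The function
names are made up for illustration, and the rule is a simplified,
integer-only stand-in for the Italian rule, not the CLDR data itself.

```python
# Simplified, integer-only stand-in for the Italian plural rule (assumption):
# 1 maps to "one", every other integer maps to "other".
def italian_plural_category(n: int) -> str:
    return "one" if n == 1 else "other"


# One message version per plural category; the noun and its gender are fixed.
MESSAGES = {
    "one": "{number} libro è selezionato",
    "other": "{number} libri sono selezionati",
}


def format_selected(n: int, messages=MESSAGES) -> str:
    """Pick the version for n's plural category and fill the placeholder."""
    return messages[italian_plural_category(n)].format(number=n)


print(format_selected(1))
print(format_selected(3))
```

Only the number changes between calls; the noun and its gender are baked
into each version by the translator.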

Now, there are some languages (e.g. Russian) that only exhibit differences
for one of the plural categories when a certain gender is involved. So
the plural categories themselves need to be the maximal partition across
the possible genders, cases, and other features.
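
A rough sketch of what "maximal partition" means here: two numbers can share
a plural category only if no gender (or case, etc.) tells them apart. The two
rules below are invented toy rules, NOT real Russian or CLDR data, and
`common_refinement` is a hypothetical helper name.

```python
from collections import defaultdict


# Toy rule for one gender (invented for illustration): only 1 is special.
def toy_masculine_rule(n: int) -> str:
    return "one" if n == 1 else "other"


# Toy rule for another gender (invented): 2..4 also behave differently.
def toy_feminine_rule(n: int) -> str:
    if n == 1:
        return "one"
    if 2 <= n <= 4:
        return "few"
    return "other"


def common_refinement(rules, sample):
    """Group numbers that every rule treats alike.

    Two numbers land in the same class only if all rules agree on them,
    so the result is the coarsest partition that is safe for every rule.
    """
    classes = defaultdict(list)
    for n in sample:
        key = tuple(rule(n) for rule in rules)
        classes[key].append(n)
    return dict(classes)


parts = common_refinement([toy_masculine_rule, toy_feminine_rule], range(8))
for key, numbers in parts.items():
    print(key, numbers)
```

With these toy rules the combined partition has three classes, even though
one of the two genders on its own only needs two.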

What is NOT in scope for CLDR at this time is to change both gender and
number. Typically that requires many other changes in the rest of the text.

one: "{number} {thing} è selezionata"
...

ICU has a mechanism for doing a SELECT using gender, but there the gender
has to be supplied as a parameter, and a sub-message supplied for each of
the (say) 3 genders x 4 plural-categories.
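
As a sketch of that shape (plain Python, not the ICU API): the caller passes
the gender in, and there is one sub-message per gender and plural category.
The table, the function names, and the two-gender/two-category setup are
assumptions for illustration, reusing the Italian examples above.

```python
# One sub-message per (gender, plural category) pair. With 3 genders and
# 4 plural categories this table would need 12 entries.
SUBMESSAGES = {
    "masculine": {
        "one": "{number} {thing} è selezionato",
        "other": "{number} {thing} sono selezionati",
    },
    "feminine": {
        "one": "{number} {thing} è selezionata",
        "other": "{number} {thing} sono selezionate",
    },
}


# Simplified, integer-only plural rule (assumption, not the CLDR data).
def plural_category(n: int) -> str:
    return "one" if n == 1 else "other"


def format_message(number: int, thing: str, gender: str) -> str:
    """Select by the supplied gender first, then by plural category.

    The gender is a parameter; nothing here tries to detect it from `thing`,
    and the caller must pass the noun already inflected for the number.
    """
    template = SUBMESSAGES[gender][plural_category(number)]
    return template.format(number=number, thing=thing)


print(format_message(1, "nota", "feminine"))
print(format_message(2, "libri", "masculine"))
```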

Actually detecting the gender of nouns and modifying sentences on that
basis is out of scope (and a very tricky problem in general).


> > > >  - allowing some locales to 'get started' without plural rules?
> > >
> > > I think that any locale may get started in CLDR when providing date
> and time formats, while correctly displaying a reminder of a shopping cart
> > > may be left over for a later stage.
> >
> > That's the general idea. (And a good way to put it, as a 'shopping
> cart'.)
>
> The idea isn’t mine. Here is the documentation locus where I got it from:
>
>
> http://cldr.unicode.org/index/cldr-spec/plural-rules#TOC-Non-inflecting-Nouns-Pronouns
>
> > Perhaps any data item that depends on plurals ( currency category,
> compact decimal category, etc. )
> > would be 'locked' until it is unlocked by the input of plural data.
>
> Provided that “locking” an item won’t cause a blank or another sort of
> bug.
> When a user sees an item not pluralized where it is expected to be plural,
> then simply inferring that pluralization isn’t ready might be
> straightforward.
> There will surely be some IF in the code to prevent the app from crashing.
>

What we have considered (there is a ticket for this somewhere) is
disallowing any data/votes to be entered in a row with a "count" or
"ordinal" attribute until the rules (resp. plural or ordinal) are supplied.
The row would either be grayed out or just omitted.

So data could be entered in the locale for other fields, but the locale
couldn't reach moderate or modern coverage without the rules. So
applications not requiring that coverage level could include the locale,
but those requiring that coverage level would omit it.
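
Roughly, the row check could look like the following Python sketch. This is
not Survey Tool code; the `Row` type, the function name, and the example
paths are hypothetical, and only the "count"/"ordinal" attribute test comes
from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    """A data row, identified by a path and its attributes (hypothetical)."""
    path: str
    attributes: dict = field(default_factory=dict)


def is_row_open(row: Row, has_plural_rules: bool, has_ordinal_rules: bool) -> bool:
    """Return False for rows that should be grayed out or omitted.

    A row with a "count" attribute needs the plural rules; a row with an
    "ordinal" attribute needs the ordinal rules. All other rows stay open.
    """
    if "count" in row.attributes and not has_plural_rules:
        return False
    if "ordinal" in row.attributes and not has_ordinal_rules:
        return False
    return True


date_row = Row("dateFormat")  # no plural dependency
currency_row = Row("currency/displayName", {"count": "one"})

# Locale with no rules yet: dates can be entered, plural-dependent rows cannot.
print(is_row_open(date_row, False, False))
print(is_row_open(currency_row, False, False))
print(is_row_open(currency_row, True, False))
```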

>
> Glad that the discussion has restarted. Perhaps I was too impatient.
>
> Regards,
>
>
> Marcel
>
>
> _______________________________________________
> CLDR-Users mailing list
> CLDR-Users at unicode.org
> http://unicode.org/mailman/listinfo/cldr-users
>