From kipcole9 at gmail.com Fri Dec 2 03:52:31 2016
From: kipcole9 at gmail.com (Kip Cole)
Date: Fri, 2 Dec 2016 20:52:31 +1100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
Message-ID: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com>

I am writing some software using the CLDR. I am stuck on working out the right semantics for formatting a number using RBNF when there is no matching rule within a specified ruleset.

For example (I'm writing this in Elixir but I think the intent is clear), formatting the negative integer -50 in the locale "hr":

    iex> Cldr.Rbnf.Spellout.spellout_ordinal_neuter(-50, "hr")

returns an error because in the locale "hr", the ruleset for spellout-ordinal-neuter has the following rules (in 30.0.2, using the JSON GitHub content):

    "%spellout-ordinal-neuter": {
      "0": "=%%spellout-ordinal-base=o;",
      "3": "=%%spellout-ordinal-base=e;",
      "4": "=%%spellout-ordinal-base=o;"
    }

So, by my understanding, a negative number can't be formatted in this ruleset for this locale. The nearest understanding I can get is from http://www.icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html which says:

  • If the number is negative, use the negative-number rule.
  • If the number has a fractional part and is greater than 1, use the improper fraction rule.
  • If the number has a fractional part and is between 0 and 1, use the proper fraction rule.
  • Binary-search the rule list for the rule with the highest base value less than or equal to the number. If that rule has two substitutions, its base value is not an even multiple of its divisor, and the number is an even multiple of the rule's divisor, use the rule that precedes it in the rule list. Otherwise, use the rule itself.

Given a negative integer in this context, then:

1. There is no negative-number rule.
2. There is no rule that satisfies "Binary-search the rule list for the rule with the highest base value less than or equal to the number."
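The selection steps quoted above can be sketched roughly as follows. This is a minimal illustration in Python, not the Elixir library's code and not ICU's implementation. The function name `select_rule_key` is hypothetical, the rule set is assumed to be a plain mapping shaped like the JSON shown above, only integers are handled (the fraction rules are left out), and the "two substitutions" rollback step is omitted.

```python
from bisect import bisect_right


def select_rule_key(ruleset, number):
    """Return the key of the rule that would format `number`, or None.

    `ruleset` is assumed to map keys such as "-x" or "0", "3", "4"
    (base values as strings) to rule text, as in the JSON data.
    Only integer input is handled in this sketch.
    """
    if number < 0:
        # Step 1: a negative number needs the negative-number rule "-x".
        return "-x" if "-x" in ruleset else None

    # Step 4: find the highest base value <= number by binary search.
    bases = sorted(int(key) for key in ruleset if key.isdigit())
    index = bisect_right(bases, number)
    if index == 0:
        return None  # every base value is greater than the number
    return str(bases[index - 1])


# The "hr" rule set quoted in the message.
HR_NEUTER = {
    "0": "=%%spellout-ordinal-base=o;",
    "3": "=%%spellout-ordinal-base=e;",
    "4": "=%%spellout-ordinal-base=o;",
}

print(select_rule_key(HR_NEUTER, 50))
print(select_rule_key(HR_NEUTER, -50))
```

Under these assumptions, 50 selects the rule keyed "4", while -50 selects nothing, because the rule set has no "-x" entry and no base value at or below -50.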
Are there any additional semantics intended to cover this case, or is an error the appropriate response?

Many thanks.

From verdy_p at wanadoo.fr Fri Dec 2 10:45:46 2016
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 2 Dec 2016 17:45:46 +0100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
In-Reply-To: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com>
References: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com>
Message-ID:

The rule for "4" matches the "Binary-search the rule list for the rule with the highest base value less than or equal to the number." In fact, all the rules for "0", "3" and "4" have a base value less than or equal to the number 50. The binary search will land just past the rule for "4", which has the highest base value and is the one to use.

From verdy_p at wanadoo.fr Fri Dec 2 10:47:30 2016
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 2 Dec 2016 17:47:30 +0100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
In-Reply-To:
References: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com>
Message-ID:

Note that there should be a negative-number rule there to format the sign and, separately, the absolute value 50.

From verdy_p at wanadoo.fr Fri Dec 2 10:48:58 2016
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 2 Dec 2016 17:48:58 +0100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
In-Reply-To:
References: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com>
Message-ID:

But in fact this negative rule is inherited from the default locale, which will insert the negative sign. You should then find a matching negative-number rule in the inherited locales.
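The fallback suggested here (look for a negative-number rule in the parent locales, then let it format the sign and the absolute value separately) could be sketched as below. This is only an illustration under stated assumptions: the locale names "child" and "parent", the rule set name "%example", the rule text "minus >>;" and both function names are hypothetical, and the lookup assumes the parent locale carries a rule set with the same name. The sketch also assumes that ">>" in a negative-number rule stands for the absolute value formatted by the containing rule set.

```python
def find_negative_rule(locale_chain, rules_by_locale, ruleset_name):
    """Return the first "-x" rule text found along the locale chain, else None."""
    for locale in locale_chain:
        ruleset = rules_by_locale.get(locale, {}).get(ruleset_name, {})
        if "-x" in ruleset:
            return ruleset["-x"]
    return None


def apply_negative_rule(rule_text, number, format_positive):
    """Expand a negative-number rule around the formatted absolute value.

    `format_positive` is whatever formats a non-negative number with
    the rule set in question; it is passed in to keep the sketch small.
    """
    return rule_text.rstrip(";").replace(">>", format_positive(abs(number)))


# Hypothetical data: the child locale has no "-x" rule, the parent does.
RULES = {
    "child": {"%example": {"0": "=#,##0=;"}},
    "parent": {"%example": {"-x": "minus >>;", "0": "=#,##0=;"}},
}

rule = find_negative_rule(["child", "parent"], RULES, "%example")
print(apply_negative_rule(rule, -50, str))
```

If no locale in the chain defines the rule set by that name, `find_negative_rule` returns None and the caller still has to choose between raising an error and some other default.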
From verdy_p at wanadoo.fr Fri Dec 2 10:51:26 2016
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 2 Dec 2016 17:51:26 +0100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
In-Reply-To:
References: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com>
Message-ID:

Additionally, "ordinal numbers" are very strange if you look for them with negative values. I think we are in a very fuzzy use case: ordinals have only really been tested for integers higher than 0, excluding negative numbers, zero, and fractional parts.
From hugh_paterson at sil.org Fri Dec 2 11:28:51 2016
From: hugh_paterson at sil.org (Hugh Paterson)
Date: Fri, 2 Dec 2016 09:28:51 -0800
Subject: Dataset for all ISO639 code sorted by country/territory?
In-Reply-To:
References: <488D0FBC-4540-4B62-968D-54537B85F919@icu-project.org> <520C6D97-128E-405F-BCAF-FAFA126DD244@icu-project.org>
Message-ID:

I was poking around in a library published by SIL under the MIT license in their GitHub repo. It has a nice list of countries with the languages spoken in them. I don't think this is a direct relicensing of the Ethnologue tables. There might be some alteration in the library relative to the Ethnologue tables. (Internally to the corporation the data source may be the same, but the manifestations and expressions are different and are released under different licenses.)

Here is a link to the file I am referencing:
https://raw.githubusercontent.com/sillsdev/libpalaso/master/SIL.WritingSystems/Resources/LanguageIndex.txt
Here is a link to the library:
https://github.com/sillsdev/libpalaso
Here is a link to the documentation:
https://github.com/sillsdev/libpalaso/wiki/SIL.WritingSystems

- Hugh

On Thu, Nov 24, 2016 at 2:31 PM, Mats Blakstad wrote:

> On 24 November 2016 at 19:28, Chris Leonard wrote:
>
>> Just so you know there are other sources of indigenous language data
>> that is locally developed for First Languages Australia.
>>
>> http://firstlanguages.org.au/
>>
>> at
>>
>> http://gambay.com.au/map
>
> Thank you for these tips!
>
> I also started to check around for other data sets that can be used to try
> to validate or elaborate on the data from Glottolog, so other suggestions
> are also helpful.
From kipcole9 at gmail.com Fri Dec 2 13:32:27 2016
From: kipcole9 at gmail.com (Kip Cole)
Date: Sat, 3 Dec 2016 06:32:27 +1100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
In-Reply-To:
References: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com>
Message-ID: <6A14F4C5-5FF6-4C81-A38E-FBA7DEB1A02F@gmail.com>

Philippe, thank you, overall that makes sense. I'm far from being comfortable with the CLDR inheritance rules, but it seems that the ldml2json conversion merges most (all?) of the inheritance chain, leaving only the locale called "root" as a potential parent for all locales. Is that a correct understanding?

If I look in the root.json rbnf data, I see for the rule group OrdinalRules:

    "OrdinalRules": {
      "%digits-ordinal": {
        "-x": "−>>;",
        "0": "=#,##0=.;"
      }
    },

which unsurprisingly doesn't have a rule set called "spellout_ordinal_neuter". Therefore I'm not sure:

1. whether "hr" can be said to inherit from "root" (this being the JSON data)
2. what rule set "spellout_ordinal_neuter" would be considered to inherit from

I recognise that ordinal negative numbers might not be a normative case; it's an example for my learning as much as anything.

From verdy_p at wanadoo.fr Fri Dec 2 19:31:31 2016
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sat, 3 Dec 2016 02:31:31 +0100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
In-Reply-To: <6A14F4C5-5FF6-4C81-A38E-FBA7DEB1A02F@gmail.com>
References: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com> <6A14F4C5-5FF6-4C81-A38E-FBA7DEB1A02F@gmail.com>
Message-ID:

You're right: there is no fallback defined for "spellout_ordinal_neuter" in Croatian for such borderline cases (negative, zero, or non-integer values). One would be needed there, because the "spellout_ordinal_neuter" form does not exist in all languages (just consider the "neuter" grammatical criterion).

It should be noted that what we call "ordinal" is the adjective form (possibly substantivized and used as a noun without necessarily adding a pronoun such as "one", as in English "The first is..." ~ "The first one is..."). But numeric date elements are also considered ordinals; they are usually counted inclusively from their base in most calendars, forward or backward in time. There are, however, derived calendars (in scientific contexts or in internal computations) that use them as cardinal values (with a zero base and negative values, instead of counting backward in time).
This concerns the year, month (or moon), day, week number, sometimes the weekday (for some languages that do not assign them distinctive names, or in the ISO 8601 format using only "W1".."W7" with such embedded ordinals), and other counts based on religious feasts (not forgetting the Roman Republican calendar, which also counted dates backward inclusively relative to the current or next calends, ides or nones). These ordinals do not use the adjective form but the same (numeral or spelled-out) form as cardinals.

From verdy_p at wanadoo.fr Fri Dec 2 19:39:16 2016
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sat, 3 Dec 2016 02:39:16 +0100
Subject: RBNF rule semantics for a ruleset when there is no negative number rule
In-Reply-To:
References: <380C325D-7377-4B4A-8CE1-F54138A0FAEC@gmail.com> <6A14F4C5-5FF6-4C81-A38E-FBA7DEB1A02F@gmail.com>
Message-ID:

Additionally, some languages have other variants in their use of ordinals, notably for titling chapters or for naming kings, queens, emperors and popes: the ordinal form may be used only for the first element, but not for the following ones. This is the most frequent case in French:

- "chapitre premier", "article premier"...
  (ordinal form) instead of "chapitre un" (cardinal form, correct but less frequent), but then "chapitre deux" (cardinal form) instead of "chapitre deuxième" (ordinal form, correct but rarely used)
- "François Premier" (ordinal form) and never "François Un", but then "François Deux" (cardinal form) and never "François Deuxième"...

These special-case ordinals are not handled in CLDR. The special cases for date elements, however, are in CLDR with the other date formatting items for each language and calendar.
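The French convention described above (ordinal form for the first element only, cardinal form afterwards) can be illustrated with a small sketch. This is not CLDR data or any library's API: the function name `french_regnal_name` and the tiny cardinal table are hypothetical, and only the masculine form "Premier" from the examples above is covered.

```python
# Hypothetical, deliberately tiny table of French cardinals for illustration.
FRENCH_CARDINALS = {2: "Deux", 3: "Trois", 4: "Quatre"}


def french_regnal_name(name, number):
    """Name a monarch in the French style: ordinal for 1, cardinal otherwise."""
    if number == 1:
        return f"{name} Premier"  # ordinal form, used only for the first
    return f"{name} {FRENCH_CARDINALS[number]}"  # cardinal form from 2 upward


print(french_regnal_name("François", 1))
print(french_regnal_name("François", 2))
```

A general ordinal rule set cannot express this on its own, since the choice between ordinal and cardinal depends on the context of use rather than on the number alone.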
From mats.gbproject at gmail.com Sat Dec 3 10:49:35 2016
From: mats.gbproject at gmail.com (Mats Blakstad)
Date: Sat, 3 Dec 2016 17:49:35 +0100
Subject: Dataset for all ISO639 code sorted by country/territory?
In-Reply-To:
References: <488D0FBC-4540-4B62-968D-54537B85F919@icu-project.org> <520C6D97-128E-405F-BCAF-FAFA126DD244@icu-project.org>
Message-ID:

Thank you so much for these links - really a lot of great open data there!
There is also some very interesting mapping of languages to scripts:
https://github.com/sillsdev/libpalaso/blob/master/SIL.WritingSystems/Resources/alltags.txt

We now have 2 data sets we could use for initial mapping of languages to
territories; I've updated the ticket:
http://unicode.org/cldr/trac/ticket/9915

We managed to dig up a lot of statistics for language use in the Nordic
countries now:
http://unicode.org/cldr/trac/ticket/9919

However, with this number of languages mapped to territories we should
really clarify how we map their status within the different territories:
http://unicode.org/cldr/trac/ticket/9916

On 2 December 2016 at 18:28, Hugh Paterson wrote:

> I was poking around in a library published by SIL under MIT license in
> their github repo. It has a nice list of countries with the languages
> spoken in them. I don't think this is a direct relicensing of the
> Ethnologue tables. There might be some alteration in the library from
> the Ethnologue tables. (Corporation internal, the data source may be the
> same, but the manifestation and expressions are different and released
> under different licenses.)
>
> Here is a link to the file I am referencing:
> https://raw.githubusercontent.com/sillsdev/libpalaso/master/SIL.WritingSystems/Resources/LanguageIndex.txt
> Here is a link to the library: https://github.com/sillsdev/libpalaso
> Here is a link to the documentation:
> https://github.com/sillsdev/libpalaso/wiki/SIL.WritingSystems
>
> - Hugh
>
> On Thu, Nov 24, 2016 at 2:31 PM, Mats Blakstad wrote:
>
>> On 24 November 2016 at 19:28, Chris Leonard wrote:
>>
>>> Just so you know there are other sources of indigenous language data
>>> that is locally developed for First Languages Australia.
>>>
>>> http://firstlanguages.org.au/
>>>
>>> at
>>>
>>> http://gambay.com.au/map
>>
>> Thank you for these tips!
>>
>> I also started to check around for other data sets that can be used to
>> try to validate or elaborate on the data from Glottolog, so other
>> suggestions are also helpful.

From kipcole9 at gmail.com Wed Dec 7 03:44:08 2016
From: kipcole9 at gmail.com (Kip Cole)
Date: Wed, 7 Dec 2016 15:14:08 +0530
Subject: RBNF for money spellout
Message-ID:

From time to time it's useful to be able to spell out a currency (think writing a cheque/check; not that they're so common any more). For example: "Two thousand and two dollars and twenty-three cents". I don't see that defined in any of the CLDR rule groups but would like to check the community thinking on this before I define my own rule group.

Many thanks for any advice or input.

From kent.karlsson14 at telia.com Wed Dec 7 09:27:26 2016
From: kent.karlsson14 at telia.com (Kent Karlsson)
Date: Wed, 07 Dec 2016 16:27:26 +0100
Subject: RBNF for money spellout
In-Reply-To:
Message-ID:

For which combinations? It comes out as very many if you take all. Note that in several languages different currencies have different grammatical class (of some kind). E.g. "pund" is neuter, and "dollar" is reale in Swedish; likewise "krona" is reale, and "öre" is neuter.

So the spellouts are (or should be) already there, but the combination with currency (and fractional currency) names should be done outside of the RBNF mechanism. It is similar to the now deprecated time spellout, which used to be part of the RBNF mechanism. We should not introduce, or actually much worse. This kind of combination, while useful both for cheques/similar and for general speech generation, should be kept outside of the RBNF machinery itself.

/Kent Karlsson

On 2016-12-07 10:44, "Kip Cole" wrote:

> From time to time it's useful to be able to spell out a currency (think writing
> a cheque/check; not that they're so common any more).
> For example: "Two
> thousand and two dollars and twenty-three cents". I don't see that defined
> in any of the CLDR rule groups but would like to check the community thinking
> on this before I define my own rule group. Many thanks for any advice or
> input.

From cjl at sugarlabs.org Wed Dec 7 15:46:22 2016
From: cjl at sugarlabs.org (Chris Leonard)
Date: Wed, 7 Dec 2016 16:46:22 -0500
Subject: RBNF for money spellout
In-Reply-To:
References:
Message-ID:

Are you talking about ISO 4217? That will also list the country of issuance.

Some already have localization at

http://pkg-isocodes.alioth.debian.org/

from

http://translationproject.org/domain/iso_4217.html

On Wed, Dec 7, 2016 at 4:44 AM, Kip Cole wrote:
> From time to time it's useful to be able to spell out a currency (think writing a cheque/check; not that they're so common any more). For example: "Two thousand and two dollars and twenty-three cents". I don't see that defined in any of the CLDR rule groups but would like to check the community thinking on this before I define my own rule group.
>
> Many thanks for any advice or input.
> _______________________________________________
> CLDR-Users mailing list
> CLDR-Users at unicode.org
> http://unicode.org/mailman/listinfo/cldr-users
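
The suggestion made earlier in this thread, to keep the combination with currency names outside the RBNF machinery, could be sketched roughly as follows in Python. Everything named here is hypothetical: `spell_cardinal` is a toy English speller standing in for a real RBNF spellout rule set, `CURRENCY_NAMES` is a hand-written table rather than CLDR data, and the sketch assumes 100 minor units per major unit. The speller is passed in as a parameter so that a language with grammatical classes could supply a different spellout per currency name.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()


def spell_cardinal(n):
    """Toy English speller for 0..999999 (stand-in for an RBNF spellout)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " and " + spell_cardinal(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    head = spell_cardinal(thousands) + " thousand"
    if rest == 0:
        return head
    return head + (" and " if rest < 100 else " ") + spell_cardinal(rest)


# Hypothetical name table: (singular, plural) for major and minor units.
CURRENCY_NAMES = {"USD": (("dollar", "dollars"), ("cent", "cents"))}


def spell_money(minor_units, currency, spell=spell_cardinal):
    """Combine number spellouts with currency names outside the speller."""
    major_names, minor_names = CURRENCY_NAMES[currency]
    major, minor = divmod(minor_units, 100)  # assumes 100 minor per major

    def unit(count, names):
        return f"{spell(count)} {names[0] if count == 1 else names[1]}"

    return f"{unit(major, major_names)} and {unit(minor, minor_names)}"


print(spell_money(200223, "USD"))
# two thousand and two dollars and twenty-three cents
```

Taking the amount as an integer count of minor units avoids floating-point rounding when splitting dollars from cents.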