en_GB.xml Gregorian Date Formats

Philippe Verdy via CLDR-Users cldr-users at unicode.org
Wed Mar 14 18:53:08 CDT 2018


I don't think these fallbacks are structured in the right direction: the
mapping should go from each child to its single parent, not from a parent
to the full list of its children (a list that serves no purpose for
localization).

We should not need to update the "root" locale, for example: it would simply
be the only locale with no parent specified. Nor should we need to enumerate
all locales in the "parentLocales" supplemental data, which could be
deprecated completely. It is in fact not needed at all if each locale
specifies its own "parent" locale, or an ordered list of candidate
fallbacks to search first, before recursively searching each candidate
together with its BCP 47 parents (except "root"), and finally searching
"root" itself.
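
To make that search order concrete, here is a minimal sketch in Python of
one possible reading of it. It is an illustration only, not the CLDR
algorithm and not the Module:Fallback code: the table EXPLICIT_FALLBACKS and
the function fallback_chain are hypothetical names, and the depth-first
order is an assumption on my part.

```python
from typing import Dict, List

# Hypothetical per-locale declarations: each child lists its own preferred
# fallbacks, in order. The values mirror parents mentioned in this thread.
EXPLICIT_FALLBACKS: Dict[str, List[str]] = {
    "en_GB": ["en_001"],
    "en_AT": ["en_150"],
    "en_150": ["en_001"],
}


def fallback_chain(locale: str, explicit: Dict[str, List[str]]) -> List[str]:
    """Return the lookup order for `locale`, always ending with "root"."""
    chain: List[str] = []
    seen = set()

    def visit(loc: str) -> None:
        # "root" is never visited here; it is appended once at the very end.
        if loc == "root" or loc in seen:
            return
        seen.add(loc)
        chain.append(loc)
        # 1. Candidates declared by the child itself, in declared order.
        for candidate in explicit.get(loc, []):
            visit(candidate)
        # 2. The BCP 47 style parent obtained by dropping the last subtag.
        parts = loc.split("_")
        if len(parts) > 1:
            visit("_".join(parts[:-1]))

    visit(locale)
    chain.append("root")
    return chain


if __name__ == "__main__":
    print(fallback_chain("en_GB", EXPLICIT_FALLBACKS))
    print(fallback_chain("en_AT", EXPLICIT_FALLBACKS))
```

With this table, en_GB resolves through en_001, then en, then root, and
en_AT resolves through en_150, en_001, en, then root. A locale with no
declaration, such as fr_CA, simply falls back by truncation to fr and then
root.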

Note: this comes from some work I did on Wikimedia Commons to use the BCP 47
fallback mechanism more coherently and to make fallbacks easier to tune.
Resolution always starts from a child locale specifying its preferred
fallbacks. There are interesting discussions about this in Module:Fallback.
It is still a sandbox version, not yet fully deployed, but the tests
successfully handle all cases: tuning fallbacks locally for a project (here
Wikimedia Commons), then using more generic fallback mechanisms across
diverse wikis via the MediaWiki default fallbacks, then enforcing BCP 47
conformance, then using a "root" = "default" locale for specific needs
(basically for handling missing translations and tracking them), then some
safe local default (the content language of the local wiki, then basic
English as the last resort).

In all projects, we use locale fallbacks in the direction from child to
parent, never the reverse, which is not maintainable.
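
For anyone who has to consume the current supplemental data as it is, the
parent-to-children lists can be inverted into a child-to-parent table in a
few lines. The sketch below is only an assumption about how one might do
it: child_to_parent is a hypothetical helper, and SAMPLE is a short excerpt
of the <parentLocales> element quoted further down in this thread, not the
full data.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Excerpt of the <parentLocales> element quoted later in this thread.
SAMPLE = """<parentLocales>
<parentLocale parent="en_150" locales="en_AT en_CH en_DE en_DK en_FI en_NL
en_SE en_SI"/>
<parentLocale parent="zh_Hant_HK" locales="zh_Hant_MO"/>
</parentLocales>"""


def child_to_parent(xml_text: str) -> Dict[str, str]:
    """Invert parent -> children lists into a child -> parent mapping."""
    mapping: Dict[str, str] = {}
    root = ET.fromstring(xml_text)
    for element in root.iter("parentLocale"):
        parent = element.get("parent")
        # The "locales" attribute is a whitespace-separated list of children.
        for child in element.get("locales", "").split():
            mapping[child] = parent
    return mapping


if __name__ == "__main__":
    parents = child_to_parent(SAMPLE)
    print(parents["en_AT"])
    print(parents["zh_Hant_MO"])
```

Each child then has exactly one entry, which is the shape a per-locale
"parent" declaration would give directly.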


2018-03-15 0:04 GMT+01:00 Steven R. Loomis via CLDR-Users <
cldr-users at unicode.org>:

> You're quoting from
> https://www.unicode.org/reports/tr35/tr35.html#Parent_Locales
>
> > <identity>…    <parent locale="en_001"/> …</identity>
>
> > There are many things in the locale files that are not strictly
> > localizable. Here's an example:
> > <dayPeriodWidth type="narrow">
>
>  "narrow" here is a distinguishing attribute  ( see
> https://www.unicode.org/reports/tr35/tr35.html#Definitions ) and is part
> of the identity of the element content that follows.
>
> I think the point of the quote is that the "parent locale" is structural
> and not part of the identity of the specific xml file. If you look at the
> parent locales in supplemental, they are organized from the point of view
> of the parent, for setting "which locales inherit from en-150?"
>
> Parsing the supplementalData is critical to processing CLDR data.   The
> CLDR Java tooling is available in the source repository, it could be a
> source of comparison for file handling.
>
> Steven
>
>
> <parentLocales>
> <parentLocale parent="root" locales="az_Arab az_Cyrl bm_Nkoo bs_Cyrl
> en_Dsrt en_Shaw ha_Arab iu_Latn mn_Mong ms_Arab pa_Arab shi_Latn sr_Latn
> uz_Arab uz_Cyrl vai_Latn zh_Hant yue_Hans"/>
> <parentLocale parent="en_001" locales="en_150 en_AG en_AI en_AU en_BB
> en_BE en_BM en_BS en_BW en_BZ en_CA en_CC en_CK en_CM en_CX en_CY en_DG
> en_DM en_ER en_FJ en_FK en_FM en_GB en_GD en_GG en_GH en_GI en_GM en_GY
> en_HK en_IE en_IL en_IM en_IN en_IO en_JE en_JM en_KE en_KI en_KN en_KY
> en_LC en_LR en_LS en_MG en_MO en_MS en_MT en_MU en_MW en_MY en_NA en_NF
> en_NG en_NR en_NU en_NZ en_PG en_PH en_PK en_PN en_PW en_RW en_SB en_SC
> en_SD en_SG en_SH en_SL en_SS en_SX en_SZ en_TC en_TK en_TO en_TT en_TV
> en_TZ en_UG en_VC en_VG en_VU en_WS en_ZA en_ZM en_ZW"/>
> <parentLocale parent="en_150" locales="en_AT en_CH en_DE en_DK en_FI en_NL
> en_SE en_SI"/>
> <parentLocale parent="es_419" locales="es_AR es_BO es_BR es_BZ es_CL es_CO
> es_CR es_CU es_DO es_EC es_GT es_HN es_MX es_NI es_PA es_PE es_PR es_PY
> es_SV es_US es_UY es_VE"/>
> <parentLocale parent="pt_PT" locales="pt_AO pt_CH pt_CV pt_GQ pt_GW pt_LU
> pt_MO pt_MZ pt_ST pt_TL"/>
> <parentLocale parent="zh_Hant_HK" locales="zh_Hant_MO"/>
> </parentLocales>
>
>
> On Wed, Mar 14, 2018 at 2:47 PM, George S. via CLDR-Users <
> cldr-users at unicode.org> wrote:
>
>> Personally, I find the reasoning to be circular:
>>
>> "Since parentLocale information is not localizable on a per locale basis,
>> the parentLocale information is contained in CLDR’s supplemental data."
>> <http://unicode.org/reports/tr35/tr35-info.html>
>> There are many things in the locale files that are not strictly
>> localizable. Here's an example:
>>
>> <dayPeriodWidth type="narrow">
>>
>> Saying you're not going to put the parent locale in because it's not
>> localizable is kind of silly when you have lots of data in the file that's
>> not localizable.
>>
>> I'm suggesting:
>>
>> <identity>
>>     <version number="$Revision: 13722 $"/>
>>     <language type="en"/>
>>     <territory type="GB"/>
>>     <parent locale="en_001"/>
>> </identity>
>>
>> But it's your guys' project.
>>
>>
>> On 3/14/2018 3:26 PM, Steven R. Loomis wrote:
>>
>> George,
>> > It would be nice if the en_GB.xml file referenced its parent
>>
>> I appreciate the idea, however, the XML files are not designed to
>> be looked at in isolation. That's why we put this notice at the top:
>>
>> " CLDR data files are interpreted according to the LDML specification (
>> http://unicode.org/reports/tr35/) "
>>
>> Is there a better way to word this?
>>
>> Please also see the Implementer's guide and FAQ at
>> https://github.com/unicode-org/cldr-implementers-guide/  - if you think
>> this would be a good FAQ can you open an issue, or better yet a pull
>> request there?
>>
>>
>> On Wed, Mar 14, 2018 at 1:57 PM, George S. via CLDR-Users <
>> cldr-users at unicode.org> wrote:
>>
>>> Thanks for responding. I knew I'd gone down this road before. Drat.
>>>
>>> I'll make the same comment I made three years ago:
>>>
>>> It would be nice if the en_GB.xml file referenced its parent so that
>>> mortals might have some idea of where to look. Having the relationship
>>> squirreled away in a file in another directory with a non-obvious name
>>> isn't very handy.
>>>
>>>
>>>
>>> On 3/14/2018 2:38 PM, Peter Edberg wrote:
>>>
>>> en_GB inherits from en_001, not from en.
>>>
>>> - Peter E
>>>
>>> On Mar 14, 2018, at 12:48 PM, George S. via CLDR-Users <
>>> cldr-users at unicode.org> wrote:
>>>
>>> I'm looking at the file common/main/en_GB.xml and I'm really confused. I'm
>>> looking at the Gregorian calendar section, and there's no
>>>
>>> dateFormats / dateFormatLength=short
>>>
>>> the value in en.xml is
>>>
>>> M/d/yy
>>>
>>> If I look at en_AU.xml there is an entry with a value of "d/M/yy".
>>>
>>> Similarly, en_IE.xml there is no short dateFormatLength value.
>>>
>>> Can anyone help me understand how this all works? I'm using a library
>>> that generates its localization files from LDML, and it's coming up with a
>>> lot of wrong answers. Before I go to them, I'd like to understand why
>>> things are formatted in this way.
>>>
>>>
>>> --
>>> George S.
>>> *MH Software, Inc.*
>>> Voice: 303 438 9585
>>> http://www.mhsoftware.com
>>> _______________________________________________
>>> CLDR-Users mailing list
>>> CLDR-Users at unicode.org
>>> http://unicode.org/mailman/listinfo/cldr-users
>>>