LDML data for en_IE

Philippe Verdy verdy_p at wanadoo.fr
Sat Feb 14 09:11:42 CST 2015


The custom inheritance of "en-IE" from "en-GB" instead of "en" is a bit
questionable.
It may look convenient for the current needs within the CLDR data itself,
but it is an exception to the default inheritance from "en" that one would
expect for more general data.
I fear that routing the inheritance of "en-IE" through "en-GB" before
"en" could cause unexpected issues in other applications that have
heavily customized their own "en-GB" data (outside the CLDR data) in a way
that is not compatible with "en-IE".
Is there a way in the CLDR data to indicate that this custom inheritance is
purely internal to CLDR, and that it does not apply as a standard for
all kinds of data that applications may need to keep separate for
"en-GB" and "en-IE" (as described and assumed *by default* in the standard
BCP 47 fallback resolution mechanism)?
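
To make the difference concrete, here is a rough sketch of the two fallback
chains. It is not CLDR or ICU code: the function names and the
EXPLICIT_PARENTS table are made up for illustration, and the only override in
it is the "en-IE" -> "en-GB" one discussed in this thread.

```python
# Hypothetical stand-in for the explicit parent overrides discussed in this
# thread; only the en-IE -> en-GB entry is taken from the discussion.
EXPLICIT_PARENTS = {"en-IE": "en-GB"}


def truncation_parent(locale):
    """Default fallback: drop the last subtag, ending at 'root'."""
    if locale == "root":
        return None
    head, sep, _ = locale.rpartition("-")
    return head if sep else "root"


def fallback_chain(locale, explicit_parents=None):
    """List the locales consulted for `locale`, most specific first."""
    explicit_parents = explicit_parents or {}
    chain = []
    current = locale
    while current is not None:
        chain.append(current)
        # An explicit parent, when present, replaces simple truncation.
        current = explicit_parents.get(current, truncation_parent(current))
    return chain


# Chain with the custom inheritance, then the plain truncation chain.
print(fallback_chain("en-IE", EXPLICIT_PARENTS))
print(fallback_chain("en-IE"))
```

An application that only knows the truncation rule never visits "en-GB" for
"en-IE", which is why the two chains can give different data.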

If there is no way to indicate that this is a purely internal inheritance
for the CLDR data itself, we had better duplicate the necessary data entries
from "en-GB" into "en-IE" and maintain them separately (both will still
inherit by default from "en", and "en" itself from "root"). This is safer
for long-term maintenance even if there is some data duplication (most
duplication is already avoided, since "en-GB" inherits from "en" and
"en" inherits from "root").
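
As an illustration only, the sketch below resolves the short date format
along both chains, with and without the duplicated entry. The data table is
a toy: the "en" pattern and the resolved "en-IE" value are the ones quoted
in this thread, while the "en-GB" entry and the key name "dateFormat.short"
are my assumptions.

```python
# Toy locale data. The 'en' pattern and the value expected for en-IE
# ("dd/MM/y") are quoted in this thread; the en-GB entry is an assumption.
DATA = {
    "root": {},
    "en": {"dateFormat.short": "M/d/yy"},
    "en-GB": {"dateFormat.short": "dd/MM/y"},
    "en-IE": {},
}

CUSTOM_CHAIN = ["en-IE", "en-GB", "en", "root"]   # inheritance via en-GB
DEFAULT_CHAIN = ["en-IE", "en", "root"]           # plain truncation


def resolve(key, chain, data):
    """Return (value, source_locale) from the first locale defining `key`."""
    for locale in chain:
        value = data.get(locale, {}).get(key)
        if value is not None:
            return value, locale
    return None, None


# Same data, but with the entry duplicated into en-IE itself.
DUPLICATED = {**DATA, "en-IE": {"dateFormat.short": "dd/MM/y"}}

for label, data in (("inherited", DATA), ("duplicated", DUPLICATED)):
    for chain in (CUSTOM_CHAIN, DEFAULT_CHAIN):
        print(label, chain, resolve("dateFormat.short", chain, data))
```

With the duplicated entry, both chains return the same value from "en-IE",
so the result no longer depends on which chain the application follows.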

Duplication at least also lets us state that the data, instead of being
merely inherited (and so having only a local draft status), is "confirmed" in
that locale. And rather than duplicating the data value itself, we could just
insert an entry whose only purpose is to confirm that the value in that
specialization comes from another referenced locale.

So in the top level element of the "en-IE" locale:

    <use status="default" fromLocale="en" />
    <use status="draft" fromLocale="en-GB" />

and for a specific entry:

    <dateFormatLength type="short">
        <dateFormat status="default" fromLocale="en" />
        <dateFormat status="draft" fromLocale="en-GB" />
        <dateFormat status="unconfirmed">
            <pattern>M/d/yy</pattern>
        </dateFormat>
    </dateFormatLength>

Or something similar (for completeness only, I added the entries with
status="default" above, but they should be implicit with the BCP 47 rules
and are not really needed).
The idea is to be able to track the confirmation status with a high level of
granularity (not just for the whole locale), and to maintain alternate
proposals with "unconfirmed" status alongside the one with "draft" status
(still not formally confirmed, but having the best votes for now).
Applications may decide to discard "unconfirmed" entries, or could use them
only as alternate solutions when there is no success with the normal entries
that have implicit confirmed status or default status, for example when
trying to parse dates with a lenient parser; a strict date input parser
would always reject input not matching the implicit "confirmed" format or
the "default" format.
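
A minimal sketch of that strict/lenient distinction, under several
assumptions of mine: the status names follow the proposal above, the
candidate list is hypothetical, and the formats are written as Python
strptime patterns rather than LDML patterns to keep the example runnable.

```python
from datetime import datetime

# Hypothetical candidates for one locale: (status, strptime format).
CANDIDATES = [
    ("confirmed", "%d/%m/%Y"),
    ("unconfirmed", "%m/%d/%y"),
]

# A strict parser only trusts confirmed/default entries; a lenient one
# also tries draft and unconfirmed entries, in that order.
STRICT = ("confirmed", "default")
LENIENT = STRICT + ("draft", "unconfirmed")


def parse_date(text, candidates, lenient=False):
    """Return (date, status) for the first allowed format matching `text`."""
    allowed = LENIENT if lenient else STRICT
    for status in allowed:
        for candidate_status, fmt in candidates:
            if candidate_status != status:
                continue
            try:
                return datetime.strptime(text, fmt).date(), status
            except ValueError:
                continue  # this format does not match, try the next one
    raise ValueError(f"no allowed format matches {text!r}")
```

Here a strict call to parse_date() raises ValueError for input that only
matches the "unconfirmed" format, while a call with lenient=True accepts it
and reports which status the matching format had.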


2015-02-06 17:21 GMT+01:00 George Sexton <georges at mhsoftware.com>:

>
> On 2/6/2015 8:59 AM, Rafael Xavier wrote:
>
>  Thus the value would come from en.xml, which would be:
>
>
>  Shouldn't it be en_GB.xml, which is its parent locale?
>
>
> Gosh, I looked through the en_IE.xml file and there's no parentLocale
> element in the file? Surely, the standard is better than to have some
> useful inheritance data that's required squirreled away in some uselessly
> named file like "supplementalData.xml" in an entirely different directory.
>
> Seriously, parentLocale should be part of the identity block in the
> common/main/ll_CC.xml file. Not having it there is silly.
>
> However, it would appear you're right.
>
>
>
> On Fri, Feb 6, 2015 at 12:26 AM, George Sexton <georges at mhsoftware.com>
> wrote:
>
>>  I'm looking at the LDML data for common/main/en_IE.xml.  In this file,
>> in the gregorian section there is only a full date format entry.
>>
>> As documented somewhat ironically in section 4 of Unicode Technical
>> Standard #35 Unicode Locale Data Markup Language (LDML), a lookup for
>> dateFormatLength short should follow inheritance. Thus the value would come
>> from en.xml, which would be:
>>
>>  <dateFormatLength type="short">
>>      <dateFormat>
>>          <pattern>M/d/yy</pattern>
>>      </dateFormat>
>>  </dateFormatLength>
>>
>>
>> However examining the JSON file of cldr data, main/en-IE/ca-gregorian.js,
>> it contains:
>>
>>  "short": "dd/MM/y"
>>
>>  I've also had a person who is a native of that country inform me that
>> M/d/yy is not correct.
>>
>> Can someone help me understand why the LDML data implicitly contains (to
>> my understanding) an incorrect definition of the short date format for the
>> en-IE locale?
>>
>>
>>
>> --
>> George Sexton
>> *MH Software, Inc.*
>> Voice: 303 438 9585
>> http://www.mhsoftware.com
>>
>> _______________________________________________
>> CLDR-Users mailing list
>> CLDR-Users at unicode.org
>> http://unicode.org/mailman/listinfo/cldr-users
>>
>>
>
>
> --
>  +55 (16) 98138-1582, +1 (415) 568-5854, skype: rxaviers
> http://rafael.xavier.blog.br
>
>
>
>
> --
> George Sexton
> *MH Software, Inc.*
> Voice: 303 438 9585
> http://www.mhsoftware.com
>
> _______________________________________________
> CLDR-Users mailing list
> CLDR-Users at unicode.org
> http://unicode.org/mailman/listinfo/cldr-users
>
>