en_GB.xml Gregorian Date Formats

Philippe Verdy via CLDR-Users cldr-users at unicode.org
Fri Mar 16 22:18:23 CDT 2018


Me too. This format also causes a constant chicken-and-egg problem for
versioning. Remember how locale data is maintained and updated: not all
locales change at the same time, and one may still want to tune one locale
without touching the rest. With the current format, each locale depends on
this global file, which itself depends on all locales. It is impossible to
get a coherent view when we just want to update one locale without
rebuilding or rechecking all the others, because of this "stupid" backward
dependency.

I also don't see why we could not mix locale data files from several
versions of CLDR, e.g. updating all locales except a few that a project has
decided to tune specifically and that were based on previous versions,
possibly depending on different parent/fallback locales. A locale could
have initially depended on "root", then been changed to depend on "en",
then on "cy", then back to "en". Fallbacks are subject to change, and each
project may have its own preferences, depending on the public it is
targeting and the amount of translation/localisation done in specific
locales. Some newer versions of CLDR data may also depend on features not
yet implemented in a project's locale libraries, or may contain characters
its renderer does not support.

Making the parent/fallback locale part of each locale's own data allows
much better flexibility, and removes the undesired requirement that all
LDML files belong to the same CLDR dataset version assumed by the single
supplemental file. Any project that wants to remove this stupid dependency
needs to parse the supplemental file only once, to integrate a single
"fallback" element into each LDML file **before** versioning it for the
intended project. If CLDR later changes the supplemental file, the change
will be ignored for locales that have already been used and integrated,
even when they are refreshed: the fallback will not be overwritten from the
new supplemental file, since that could cause severe problems or irritate
end users.
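
To illustrate that one-time parsing step, here is a minimal Python sketch
that reverses the parent-to-children lists of the supplemental file into a
per-locale child-to-parent table. The sample XML is a hypothetical excerpt
in the shape of the parentLocales data, and the function name
invert_parent_locales is mine, not part of any CLDR tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt shaped like the supplemental parentLocales data.
SAMPLE = """<supplementalData>
  <parentLocales>
    <parentLocale parent="en_001" locales="en_150 en_GB en_IN"/>
    <parentLocale parent="es_419" locales="es_AR es_MX"/>
  </parentLocales>
</supplementalData>"""


def invert_parent_locales(xml_text):
    """Return {child: parent}; fail if a child is listed under two parents."""
    child_to_parent = {}
    root = ET.fromstring(xml_text)
    for elem in root.iter("parentLocale"):
        parent = elem.get("parent")
        # The "locales" attribute is a space-separated list of child codes.
        for child in (elem.get("locales") or "").split():
            previous = child_to_parent.get(child)
            if previous is not None and previous != parent:
                raise ValueError(
                    f"{child} has two parents: {previous} and {parent}"
                )
            child_to_parent[child] = parent
    return child_to_parent


PARENTS = invert_parent_locales(SAMPLE)
print(PARENTS["en_GB"])
```

Each entry of the resulting table is what would be written once into the
corresponding per-locale file; the conflict check catches a code assigned
to two different parents.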

And in any case, this single global dependency certainly does not help
the maintenance of CLDR itself when experimenting with new locales to add
or integrate (or possibly to remove or put back to "draft" because they are
not complete enough or because too many problems have been reported in core
elements). You cannot easily create custom "branches" in the project for
tests and then reintegrate a branch afterwards, as there's no clear way to
decide which fallbacks from different versions of the global file to keep
for all other locales! It's much simpler to decide that for the single
locale being tested.

This does not prevent the CLDR project from creating an assembly later
that contains the generated supplemental file, and from compressing the
data by removing from child locales any data identical to what they would
inherit anyway. Inheritance goes first through the designated
parent/fallback (and then recursively through the grandparent), then
through the standard BCP47 fallback mechanism from the target locale
(inferred only from the format of BCP47 locale codes), then again
recursively on every parent and grandparent, until we reach the "root"
locale. "root" is ignored in all previous steps but is the last locale
tested; past that point, an application may opt for some arbitrary locale
such as the website's default language, the OS default localisation, or
the language used natively by the programmers or in the first
non-localized versions of the application.
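
The lookup order described above can be sketched roughly as follows. This
is a simplified Python illustration, not an actual CLDR or ICU
implementation: it assumes underscore-separated locale codes, treats
"dropping the last subtag" as the BCP47-style fallback, and takes the
designated parents from a plain dictionary. The function name
fallback_chain is hypothetical.

```python
def fallback_chain(locale, explicit_parents):
    """List the locales to try, from `locale` up to "root".

    explicit_parents maps a locale to its designated parent; when a locale
    has no entry, the parent is inferred by dropping the last subtag.
    """
    chain = [locale]
    seen = {locale}
    current = locale
    while current != "root":
        if current in explicit_parents:
            # Designated parent/fallback wins over the inferred one.
            parent = explicit_parents[current]
        elif "_" in current:
            # Inferred purely from the shape of the code: en_GB -> en.
            parent = current.rsplit("_", 1)[0]
        else:
            # A bare language code falls back to "root", tested last.
            parent = "root"
        if parent in seen:
            raise ValueError(f"fallback cycle at {current} -> {parent}")
        seen.add(parent)
        chain.append(parent)
        current = parent
    return chain


print(fallback_chain("en_GB", {"en_GB": "en_001"}))
print(fallback_chain("fr_CA", {}))
```

Anything past "root" (website default, OS default, and so on) is an
application decision and is deliberately left out of the sketch.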


2018-03-17 3:44 GMT+01:00 Martin Hosken <martin_hosken at sil.org>:

> Dear All,
>
> I would like to add my support to this proposal of moving the parentLocale
> information back into the ldml file. When dealing with adding a new locale,
> it is very helpful if everything about that locale is stored in the one
> file and so can be edited independently and submitted as a unit rather than
> having to change other files in order to add a new locale. In my analysis
> and limited experience of working with LDML I would suggest that there are
> two areas where we have overlaid a solution using our namespace:
>
> 1. Parent locale
>
> It may be convenient from an overall database perspective to have the
> parent child relationships stored as supplemental data, but I think this
> was a retrograde step and would love to see the <fallback> element
> reinstated. Adding a new locale then is simply a process of adding a new
> file rather than adding a file and editing the supplemental data and
> merging and managing that. Supplemental data should be supplemental not
> required for the interpretation / flattening of an LDML file.
>
> I am thinking that perhaps the parentLocale was pulled out of the ldml
> file and into supplementalData because it was thought to be useful in
> language tag processing. I'm not sure that it is and as mentioned is more
> trouble than it is worth being there rather than back in the ldml file.
>
> 2. Language names
>
> When adding a new locale, very often the language name or more likely the
> variant for that language, needs to be given for key languages like English
> and French. Changing these core locales is not something we want to
> encourage those creating emerging locales to be involved in. So instead we
> have added the ability to specify those strings for just the locale itself,
> in the locale itself. I realise that this is doing the opposite of what I
> said in the previous section. But it allows the new locale to be self
> contained. At the point the locale gets integrated, then it is a simple
> matter to move the information out of the new locale into the other locales
> where it really belongs. So this is an interim solution that again works to
> keep a single LDML file editable as a unit rather than seeing changes to a
> locale as edits to a large database of files.
>
> I realise that philosophically to the CLDR technical team, the CLDR is
> just one big database and that it is the integrity and management of that
> database as a whole that is key. But for many language groups, their view
> of that database is their LDML file and they would like to have full
> control over information from that one file rather than needing to be given
> write access to globally shared files.
>
> Yours,
> Martin
>
>
>
> > No duplication at all: in one case there's a supplemental file and we
> > always need to infer the reverse data.
> > You can do the opposite: put this data directly in the per-locale files,
> > and then infer the supplemental file by generating it only for
> > compatibility, but most tools will never need that supplemental file,
> > which will just be informational.
> > You'll also avoid a source of errors in the existing file (e.g. missing
> > codes in the lists, duplicate codes assigned to the same parent or to
> > different parents).
> >
> > It will just be simpler to validate each locale separately without
> > editing a long line of codes in the supplemental file. If new locales are
> > added by teams, there is no need to synchronize the work: one team can
> > update one locale while another updates and validates another one. No one
> > needs to touch this supplemental file, which will just be automatically
> > inferred after collecting the dataset for all locales to publish.
> > But tools will no longer need this file (they will not need it at all if
> > the parent locale is already found and specified in per-locale files).
> >
> > No need to parse all the content of the supplemental file (which is
> > completely unusable without parsing it completely and reversing the
> > mapping). This supplemental file does not work like the normal BCP47
> > fallback resolution mechanism; it works in the incorrect direction.
> >
> > So yes: deprecate it, make it only informational but no longer required.
> > Announce that in some future version (e.g. 5 years after notice) it
> > will be completely removed. No application at all really needs it!
> >
> >
> > 2018-03-17 0:14 GMT+01:00 Steven R. Loomis <srl at icu-project.org>:
> >
> > >
> > > On Thu, Mar 15, 2018 at 2:33 PM, Philippe Verdy via CLDR-Users <
> > > cldr-users at unicode.org> wrote:
> > >
> > >>
> > >> So I'm also in favor of deprecating the old supplemental file and
> > >> integrating what it currently contains directly within the data of each
> > >> relevant child locale: this will be clearer, and immediately usable by
> > >> all CLDR-using applications and libraries, with less maintenance too.
> > >> Maintenance is complex in a separate global file containing long lists
> > >> of codes: some may be forgotten, the lists are not coherent with BCP 47
> > >> fallback mechanisms, and they are unnecessarily long to process when an
> > >> application just needs the ONE parent locale specific to each locale,
> > >> without processing **all** of the supplemental data file to find out
> > >> whether a locale has a parent.
> > >>
> > >
> > > This means duplicating data, which makes maintenance more complicated
> > > for no real benefit, as well as breaking existing consumers. As I wrote,
> > > there are tools which already generate fully resolved locale data with
> > > all the inheritance filled in. Perhaps that would be of more interest in
> > > consumption than the unresolved source data.
> > >
> > >
> > >
> > >> 2018-03-15 21:35 GMT+01:00 George S. via CLDR-Users <
> > >> cldr-users at unicode.org>:
> > >>
> > >>> On 3/14/2018 5:04 PM, Steven R. Loomis via CLDR-Users wrote:
> > >>>
> > >>> You're quoting from
> > >>> https://www.unicode.org/reports/tr35/tr35.html#Parent_Locales
> > >>>
> > >>> > <identity>…    <parent locale="en_001"/> …</identity>
> > >>>
> > >>> > There are many things in the locale files that are not strictly
> > >>> > localizable. Here's an example:
> > >>> > <dayPeriodWidth type="narrow">
> > >>>
> > >>>  "narrow" here is a distinguishing attribute (see
> > >>> https://www.unicode.org/reports/tr35/tr35.html#Definitions ) and is
> > >>> part of the identity of the element content that follows.
> > >>>
> > >>> I think the point of the quote is that the "parent locale" is
> > >>> structural and not part of the identity of the specific xml file.
> > >>>
> > >>>
> > >>> I can think of few things more structural than where does this
> > >>> locale's defaults originate from. Without that identity, the child
> > >>> file's definition is incomplete. Placing the relationship data in
> > >>> another file in a different directory entirely requires novices like
> > >>> myself to do a tremendous amount of research to understand what's
> > >>> going on. Even though the maintainers may have had really excellent
> > >>> reasons for this structure, from the developer standpoint it's not
> > >>> sensible.
> > >>>
> > >>> If you look at the parent locales in supplemental, they are organized
> > >>> from the point of view of the parent, for setting "which locales
> > >>> inherit from en-150?"
> > >>>