Clarifying the LDML data model

Mark Davis ☕️ mark at macchiato.com
Mon May 19 13:51:58 CDT 2014


A couple of quick notes


Mark <https://google.com/+MarkDavis>

 *— Il meglio è l’inimico del bene —*


On Mon, May 19, 2014 at 7:11 PM, Steven R. Loomis <srl at icu-project.org> wrote:

> On 05/19/2014 09:31 AM, Jon Skeet wrote:
> > Hi folks,
> >
> > I'm trying to get my head round the LDML data model in as clear a way
> > as possible, and I have a few questions - basically around how to
> > interpret TR-35 part 1. For the moment I'm only interested in
> > non-blocking elements (although at some point I'm going to need to get
> > my head round the exact meaning of serialElements and blockingItems...)
>
> Diagrams might help in explaining serialElements and blockingItems.
>
> > 1. Are nondistinguishing attributes ever valid for non-end-nodes? (I
> > can imagine the draft attribute being one exception to this.)
>
> Yes, "draft" as you noted, and for example "references" on
> <collation>. Some normalization happens as part of the CLDR release;
> however, the question here is what is valid for LDML.
>
> > It makes life easier if we can think of the "value" for a node as
> > being the non-distinguishing attributes of just the /deepest/ element
> > in the chain, along with the text content of that element.
>
> I don't know what the context of your processing is, but you might not
> want to consider the non-distinguishing attributes at all; they are for
> informative purposes only.


Be careful here. While the non-distinguishing attributes are "mostly"
informative for common/main, they are vital for essentially all other
files, like supplemental.
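
To make that concrete, here is a minimal sketch in Python. The XML fragment is only modelled on the shape of supplemental currency data, and treating iso4217 as the sole distinguishing attribute is an assumption made for illustration; the authoritative list comes from the DTD and the supplemental metadata. The names SUPPLEMENTAL, DISTINGUISHING and split_attributes are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only, modelled on supplemental currency data.
SUPPLEMENTAL = """
<supplementalData>
  <currencyData>
    <fractions>
      <info iso4217="JPY" digits="0" rounding="0"/>
      <info iso4217="DEFAULT" digits="2" rounding="0"/>
    </fractions>
  </currencyData>
</supplementalData>
"""

# Assumption for this sketch: only iso4217 identifies the element.
DISTINGUISHING = {"iso4217"}


def split_attributes(element):
    """Split an element's attributes into (distinguishing, non-distinguishing)."""
    key = {k: v for k, v in element.attrib.items() if k in DISTINGUISHING}
    value = {k: v for k, v in element.attrib.items() if k not in DISTINGUISHING}
    return key, value


root = ET.fromstring(SUPPLEMENTAL)
infos = list(root.iter("info"))
rows = [split_attributes(info) for info in infos]

for key, value in rows:
    print(key, value)
```

In this fragment the elements have no text content at all, so a reader that drops the non-distinguishing attributes is left with keys and no data.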

> Or, perform your processing ignoring
> non-distinguishing attributes, and then look up the non-distinguishing
> attributes on an as needed basis.
>

Again, mostly valid only for common/main.

>
> > 2. Is it valid for a nondistinguishing attribute to occur on an element
> > whose content is an <alias> element? If so, do the nondistinguishing
> > attributes of that element override those in the target of the alias?
> > As an example, consider:
> >   <foo>
> >     <element type="x" bar="bar-on-x">
> >       <alias source="locale" path="../element[@type='y']" />
> >     </element>
> >     <element type="y" bar="bar-on-y">
> >       text value
> >     </element>
> >   </foo>
> >
> > If I ask for the nondistinguishing attribute "bar" on
> > //foo/element[@type='x'] would I get bar-on-x or bar-on-y?
> >
> This seems at first glance not to be defined by the spec.
>
> > 3. Is there any way to tell the difference between an end-node with an
> > empty text value and a node which /could/ have child elements, but
> > happens not to for a specific locale?
> >
> > As an aside, while the spec talks about a locale data file as being a
> > /list/ of element-chain/value pairs, I'm finding it hard to shake the
> > idea of it being a tree (or possibly a forest). If anyone has any
> > feedback about whether that's likely to cause me problems later, I'd
> > welcome it.
> >
>
> Those are actually related questions. Quoting 4.2.1 Definitions
> http://www.unicode.org/reports/tr35/#Definitions/  - "An LDML file can
> be thought of as an ordered list of element pairs: <element chain,
> data>, where the element chains are all the chains for the end-nodes.
> (This works because of restrictions on the structure of LDML, including
> that it does not allow mixed content.) The ordering is the ordering that
> the element chains are found in the file, and thus determined by the DTD."
>
> No, there's no way to tell the difference by inspecting the XML, BUT
> the DTD, and especially the supplemental metadata, will tell you what is
> valid at that level.
>
> So,
>
> <ldml><identity><!-- ... OMITTED ...--></identity>
>         <localeDisplayNames/>
> </ldml>
>
> .. is valid by the DTD, but
>
> <ldml><identity><!-- ... OMITTED ...--></identity>
>         <localeDisplayNames>Foo</localeDisplayNames>
> </ldml>
>
> .. is not.
>
> So if you were to represent the first (valid) example as <element chain,
> data> you would completely omit the "<localeDisplayNames/>" - it has no
> value as LDML, basically, and I think that CLDR tools would strip it
> away completely.
>
> Hope this helps.
>
> --
>
> IBMer but all opinions are mine.
> https://www.ohloh.net/accounts/srl295 // fingerprint @
> https://ssl.icu-project.org/trac/wiki/Srl
>
>
>
> _______________________________________________
> CLDR-Users mailing list
> CLDR-Users at unicode.org
> http://unicode.org/mailman/listinfo/cldr-users
>
>
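
The <element chain, data> view quoted above can be sketched roughly as follows. This is not how the CLDR tools do it, just a small Python illustration under stated assumptions: DISTINGUISHING and CONTAINER_ELEMENTS are hypothetical hand-written sets standing in for what the DTD and supplemental metadata would tell you, and the sample file is made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the element names only loosely follow LDML.
LDML = """
<ldml>
  <identity><language type="en"/></identity>
  <localeDisplayNames/>
  <characters>
    <exemplarCharacters draft="contributed">[a b c]</exemplarCharacters>
  </characters>
</ldml>
"""

# Assumptions standing in for DTD / supplemental metadata knowledge.
DISTINGUISHING = {"type"}
CONTAINER_ELEMENTS = {"localeDisplayNames"}  # may hold children, never text


def element_pairs(element, chain=()):
    """Yield (element chain, data) pairs for end-nodes, in document order."""
    step = element.tag + "".join(
        f"[@{k}='{v}']"
        for k, v in sorted(element.attrib.items())
        if k in DISTINGUISHING
    )
    chain = chain + (step,)
    children = list(element)
    if children:
        for child in children:
            yield from element_pairs(child, chain)
    elif element.tag not in CONTAINER_ELEMENTS:
        # End-node: data is the text plus the non-distinguishing attributes.
        value_attrs = {
            k: v for k, v in element.attrib.items() if k not in DISTINGUISHING
        }
        yield "//" + "/".join(chain), ((element.text or "").strip(), value_attrs)
    # An empty container contributes no pair and is dropped.


pairs = list(element_pairs(ET.fromstring(LDML)))
for path, data in pairs:
    print(path, data)
```

Note that the empty <localeDisplayNames/> disappears from the list, while the empty <language type="en"/> end-node is kept; the XML alone cannot tell those two cases apart, which is why the sketch needs the CONTAINER_ELEMENTS set.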