Clarifying the LDML data model

Steven R. Loomis srl at icu-project.org
Mon May 19 12:11:47 CDT 2014


On 05/19/2014 09:31 AM, Jon Skeet wrote:
> Hi folks,
>
> I'm trying to get my head round the LDML data model in as clear a way
> as possible, and I have a few questions - basically around how to
> interpret TR-35 part 1. For the moment I'm only interested in
> non-blocking elements (although at some point I'm going to need to get
> my head round the exact meaning of serialElements and blockingItems...)

Diagrams might help in explaining serialElements and blockingItems.

> 1. Are nondistinguishing attributes ever valid for non-end-nodes? (I
> can imagine the draft attribute being one exception to this.)

Yes: "draft", as you noted, and also, for example, "references" on
<collation>. Some normalization happens as part of the CLDR release;
however, the question here is what is valid for LDML.

> It makes life easier if we can think of the "value" for a node as
> being the non-distinguishing attributes of just the /deepest/ element
> in the chain, along with the text content of that element.

I don't know what the context of your processing is, but you might not
want to consider the non-distinguishing attributes at all; they are for
informative purposes only. Or perform your processing ignoring
non-distinguishing attributes, and then look up the non-distinguishing
attributes on an as-needed basis.
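
As a rough illustration of that second approach, here is a minimal Python
sketch. The NON_DISTINGUISHING set and the helper names are assumptions
made up for this example; the real list of non-distinguishing attributes
comes from the LDML DTD and the CLDR metadata, not from a hard-coded set.

```python
import xml.etree.ElementTree as ET

# Assumption: a hand-picked set for illustration only. The authoritative
# list of non-distinguishing attributes lives in the DTD / CLDR metadata.
NON_DISTINGUISHING = {"draft", "references"}


def split_attributes(element):
    """Return (distinguishing, non_distinguishing) attribute dicts."""
    dist = {k: v for k, v in element.attrib.items()
            if k not in NON_DISTINGUISHING}
    info = {k: v for k, v in element.attrib.items()
            if k in NON_DISTINGUISHING}
    return dist, info


def step(element):
    """Format one element-chain step using only distinguishing attributes."""
    dist, _ = split_attributes(element)
    predicates = "".join(f'[@{k}="{v}"]' for k, v in sorted(dist.items()))
    return element.tag + predicates


# Illustrative input: one element carrying both kinds of attribute.
sample = ET.fromstring(
    '<language type="en" draft="unconfirmed">English</language>'
)
key = step(sample)                   # used for matching and lookup
info = split_attributes(sample)[1]   # consulted only when needed
```

Here `key` identifies the item without reference to "draft", and `info`
keeps the informative attributes available for later lookup.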

> 2. Is it valid for a nondistinguishing attribute to occur on an element
> whose content is an <alias> element? If so, do the nondistinguishing
> attributes of that element override those in the target of the alias?
> As an example, consider:
>   <foo>
>     <element type="x" bar="bar-on-x">
>       <alias source="locale" path="../element[@type='y']" />
>     </element>
>     <element type="y" bar="bar-on-y">
>       text value
>     </element>
>   </foo>
>
> If I ask for the nondistinguishing attribute "bar" on
> //foo/element[@type='x'] would I get bar-on-x or bar-on-y?

This seems, at first glance, not to be defined by the spec.

> 3. Is there any way to tell the difference between an end-node with an
> empty text value and a node which /could/ have child elements, but
> happens not to for a specific locale?
>
> As an aside, while the spec talks about a locale data file as being a
> /list/ of element-chain/value pairs, I'm finding it hard to shake the
> idea of it being a tree (or possibly a forest). If anyone has any
> feedback about whether that's likely to cause me problems later, I'd
> welcome it.
>

Those are actually related questions. Quoting 4.2.1 Definitions
(http://www.unicode.org/reports/tr35/#Definitions): "An LDML file can
be thought of as an ordered list of element pairs: <element chain,
data>, where the element chains are all the chains for the end-nodes.
(This works because of restrictions on the structure of LDML, including
that it does not allow mixed content.) The ordering is the ordering that
the element chains are found in the file, and thus determined by the DTD."

No, there's no way to tell the difference by inspecting the XML, BUT
the DTD, and especially the supplemental metadata, will tell you what is
valid at that level.

So,

<ldml><identity><!-- ... OMITTED ...--></identity>
        <localeDisplayNames/>
</ldml>

.. is valid by the DTD, but

<ldml><identity><!-- ... OMITTED ...--></identity>
        <localeDisplayNames>Foo</localeDisplayNames>
</ldml>

.. is not.

So if you were to represent the first (valid) example as <element chain,
data> pairs, you would completely omit the "<localeDisplayNames/>": it
has no value as LDML, basically, and I think that CLDR tools would strip
it away completely.
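
To make the <element chain, data> view concrete, here is a minimal Python
sketch, not how the CLDR tools actually do it. The function name and the
leaf_tags parameter are hypothetical: leaf_tags stands in for what the
DTD would tell you about which elements are end-nodes. For brevity the
sketch also treats every attribute as distinguishing.

```python
import xml.etree.ElementTree as ET


def element_pairs(root, leaf_tags=frozenset()):
    """Yield (element chain, data) pairs for end-nodes, in document order.

    leaf_tags is a hypothetical stand-in for DTD knowledge: tags that
    are end-nodes even when their text content is empty.
    """
    def walk(element, chain):
        # Simplification: every attribute is treated as distinguishing.
        predicates = "".join(
            f'[@{k}="{v}"]' for k, v in sorted(element.attrib.items())
        )
        chain = chain + [element.tag + predicates]
        children = list(element)
        text = (element.text or "").strip()
        if children:
            for child in children:
                yield from walk(child, chain)
        elif text or element.tag in leaf_tags:
            yield "//" + "/".join(chain), text
        # Otherwise the element is an empty non-end-node: it carries no
        # data, so it is omitted from the list.

    yield from walk(root, [])


doc = ET.fromstring(
    "<ldml><identity><language type='en'/></identity>"
    "<localeDisplayNames/></ldml>"
)
pairs = list(element_pairs(doc, leaf_tags={"language"}))
```

With this input, `pairs` contains a single entry for the <language>
end-node, and the empty <localeDisplayNames/> does not appear at all.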

Hope this helps.

-- 

IBMer but all opinions are mine.
https://www.ohloh.net/accounts/srl295 // fingerprint @ https://ssl.icu-project.org/trac/wiki/Srl



