Transform Rule Syntax

Cameron Dutro cameron at lumoslabs.com
Wed Dec 16 18:25:21 CST 2015


Hey cldr-users,

I'm working with the CLDR transform rules and finding myself flummoxed.
Specifically I'm looking at this rule
<http://unicode.org/cldr/trac/browser/tags/release-28-d05/common/transforms/es-es_FONIPA.xml#L138>
in the es-es_FONIPA transform rule set. In this rule, we see what appears
to be a Unicode set or character class from a regular expression: [-\ ]
Either way, this does not appear to be valid syntax. Hyphens are used in
character classes to denote ranges of characters, for example [a-z].
Literal hyphens must be escaped. The hyphen in question is neither part of
a range nor escaped. Why is this allowed? The character class also appears
to contain an escaped space character, even though space characters are not
required to be escaped in character classes.
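
For comparison, here is a minimal sketch of how one ordinary regex engine
(Python's re module) treats the same class. This is only a stand-in: the
transform rules are not Python regexes, and the names SEPARATORS, LOWERCASE
and find_separators are made up for illustration. At least in this engine, a
leading hyphen is accepted as a literal and the escaped space is harmless, so
the regex reading alone does not settle how CLDR interprets the rule.

```python
import re

# The class from the es-es_FONIPA rule, written as a Python regex.
# Assumption: Python's regex dialect is only a point of comparison;
# CLDR transform rules may parse this set differently.
SEPARATORS = re.compile(r"[-\ ]")

# A hyphen between two characters denotes a range.
LOWERCASE = re.compile(r"[a-z]")


def find_separators(text):
    # Return every character in text that the SEPARATORS class matches.
    return SEPARATORS.findall(text)


if __name__ == "__main__":
    # The leading hyphen is taken literally; the escaped space matches a space.
    print(find_separators("buenos-dias amigo"))
    # The range matches a letter but not a literal hyphen.
    print(bool(LOWERCASE.fullmatch("m")), bool(LOWERCASE.fullmatch("-")))
```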

My suspicion is that this syntax is to be treated in a special way since it
is used in the context of transformation rules. Please let me know if this
is the case. I have been unable to find any documentation regarding the
special treatment of hyphens in UTS #35 or other documents.

Thanks!

-Cameron