CLDR proposal: Move collator CLDR settings into ICU format
Markus Scherer
markus.icu at gmail.com
Fri Apr 3 15:59:50 CDT 2015
Dear CLDR team & users,
I would like to propose the following spec & data changes for CLDR 28.
Please provide *feedback by next Thursday, 2015-apr-09*.
CLDR ticket: http://unicode.org/cldr/trac/ticket/8289
Proposal:
- Deprecate XML elements under <collation>:
import, settings, suppress_contractions, optimize
together with their specific attributes
- Change the CLDR collation tailorings data to
replace the use of these XML elements with equivalent ICU syntax
For example:
<settings caseFirst="upper"/>
<import source="da" type="standard"/>
<suppress_contractions>[เ-ไ ເ-ໄ ꪵ ꪶ ꪹ ꪻ ꪼ]</suppress_contractions>
<settings normalization="on" alternate="shifted" reorder="Thai"/>
->
[caseFirst upper]
[import da-u-co-standard]
[suppressContractions [เ-ไ ເ-ໄ ꪵ ꪶ ꪹ ꪻ ꪼ]]
[normalization on][alternate shifted][reorder Thai]
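As a rough illustration of the mapping above (a sketch only, not the actual CLDR tooling), the conversion can be written in a few lines of Python. The function name to_icu_syntax and the <collation> wrapper string are hypothetical. The handling of <optimize> and of an <import> without a type attribute is an assumption, since the example above does not cover those cases.

```python
import xml.etree.ElementTree as ET


def to_icu_syntax(collation_xml):
    """Convert deprecated <collation> child elements to ICU-syntax strings.

    Hypothetical helper: returns one string per XML element, in document order.
    """
    root = ET.fromstring(collation_xml)
    rules = []
    for elem in root:
        if elem.tag == "settings":
            # Each attribute becomes its own [name value] setting.
            rules.append(
                "".join(f"[{name} {value}]" for name, value in elem.attrib.items())
            )
        elif elem.tag == "import":
            source = elem.get("source")
            co_type = elem.get("type")
            # Assumption: with no type attribute, only the source locale is emitted.
            locale = f"{source}-u-co-{co_type}" if co_type else source
            rules.append(f"[import {locale}]")
        elif elem.tag == "suppress_contractions":
            rules.append(f"[suppressContractions {elem.text.strip()}]")
        elif elem.tag == "optimize":
            # Assumption: mirrors suppressContractions; not shown in the example.
            rules.append(f"[optimize {elem.text.strip()}]")
    return rules
```

Feeding it the four example elements wrapped in a single <collation> element should yield the four ICU-syntax lines shown above, one list entry per source element.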
Rationale:
The LDML collation spec
<http://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Element>
provides two ways to specify parametric settings and special rules in
collation tailoring data: via special XML elements, or as part of the ICU
syntax rules in <cr><![CDATA[...]]></cr>. See the elements from import
through optimize in the following line copied from the spec:
<!ELEMENT collation (alias | ( import*, settings?, suppress_contractions?,
optimize?, cr*, special*)) >
- Two ways of doing the same thing lead to inconsistencies.
- CLDR tools and tests would not have to convert these elements to ICU syntax
any more.
- The spec would be simpler.
- This change makes it clearer that the settings get *import*ed too, not just
the rules.
Note that CLDR 24
<http://www.unicode.org/reports/tr35/tr35-33/tr35.html#Modifications>
deprecated the XML syntax for rules and replaced the XML syntax rules data
with equivalent ICU syntax rules.
Sincerely,
markus