Locale bringup and barriers for entry
Marcel Schneider via CLDR-Users
cldr-users at unicode.org
Sat Sep 22 06:07:29 CDT 2018
I didn’t set out to do what I ended up doing, i.e. summing up a bunch of tickets already being processed,
in a thread launched to welcome newcomers and to show ways of expanding CLDR support to all of
the world’s locales.
Indeed, amid the details I forgot my first thoughts:
On 22/09/18 02:37 Steven R. Loomis via CLDR-Users wrote:
[…]
> - what are the best ways to coordinate efforts between the language users and different technical experts?
I can only encourage everyone to first make up their own minds by taking a close look at the latest Charts,
and especially, for learning how to include *new* locales, at the By-Type overviews of the set of locales that
have already made it into CLDR:
http://cldr.unicode.org/index/downloads
http://www.unicode.org/cldr/charts/latest/
https://www.unicode.org/cldr/charts/latest/by_type/index.html
https://www.unicode.org/cldr/charts/latest/by_type/core_data.alphabetic_information.html
https://www.unicode.org/cldr/charts/latest/by_type/core_data.alphabetic_information.main.html
https://www.unicode.org/cldr/charts/latest/by_type/core_data.alphabetic_information.punctuation.html
and so on.
Another important step is to read through the Information Hub for Linguists, the main documentation resource:
http://cldr.unicode.org/translation
from where we can access the detailed pages linked also from the information pane in SurveyTool.
E.g. about plurals:
http://cldr.unicode.org/translation/plurals
I have myself started uninformed discussions before noticing that the documentation already provided
sufficient instructions, or before sorting out what was already covered and what clarifications I needed…
A good way to prepare — if not already done — is also to learn XML and more specifically LDML, the
Unicode Locale Data Markup Language, in order to be able to read and submit data in that format:
http://cldr.unicode.org/index/cldr-spec
linking:
http://www.unicode.org/reports/tr35/
E.g. to understand how inheritance works:
http://www.unicode.org/reports/tr35/#Locale_Inheritance
That is key knowledge for understanding what we see when working in SurveyTool,
and for detecting possible inheritance display bugs, though these are unlikely to occur anymore.
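To make the idea concrete, here is a minimal Python sketch of the simple truncation part of inheritance, where a value missing in fr_CA is looked up in fr and then in root. It is only an illustration: the function names and the toy data are my own assumptions, and it deliberately ignores the explicit parent-locale exceptions and other refinements described in TR35.

```python
def fallback_chain(locale_id):
    """Return the simple truncation chain for a locale ID, ending in 'root'.

    Simplified sketch: explicit parent-locale overrides are NOT handled.
    """
    if locale_id == "root":
        return ["root"]
    parts = locale_id.replace("-", "_").split("_")
    # Drop one trailing subtag at a time: fr_CA -> fr
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    chain.append("root")
    return chain


def lookup(data, locale_id, key):
    """Return (value, source_locale) from the first locale that defines key."""
    for loc in fallback_chain(locale_id):
        if key in data.get(loc, {}):
            return data[loc][key], loc
    return None, None


# Toy data with made-up keys (not real CLDR paths), for illustration only.
TOY_DATA = {
    "root": {"sample_a": "value from root"},
    "fr": {"sample_b": "value from fr"},
    "fr_CA": {},
}
```

With this toy data, lookup(TOY_DATA, "fr_CA", "sample_b") finds the value in fr, which is the kind of inherited value SurveyTool shows when a locale has no entry of its own.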
Now we’re ready to look at the raw data, as downloaded or found in the online repository:
http://www.unicode.org/repos/cldr/tags/latest/
https://www.unicode.org/repos/cldr/tags/latest/common/
https://www.unicode.org/repos/cldr/tags/latest/common/main/
where we may wish to pick the locale that is closest to our new data, or that we know best among
the precursors, or simply English for reference:
https://www.unicode.org/repos/cldr/tags/latest/common/main/en.xml
(Emoji-related data are in a separate directory:
https://www.unicode.org/repos/cldr/tags/latest/common/annotations/en.xml
)
I think it is best to download a whole set of data as a zipped folder; the latest as of now is in:
http://www.unicode.org/Public/cldr/33.1/
and then open the relevant files in a text editor with syntax highlighting and an XML syntax checker.
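If no such editor is at hand, a basic well-formedness check can be sketched in a few lines of Python using only the standard library. Note that this is my own illustration, the function name is hypothetical, and it only checks that the XML parses; it does not validate the file against the LDML DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_data):
    """Return None if xml_data (str or bytes) parses, else an error message.

    This checks well-formedness only, not validity against the LDML DTD.
    """
    try:
        ET.fromstring(xml_data)
    except ET.ParseError as err:
        return str(err)
    return None


# Possible usage on a downloaded file (the path is an assumption):
#     with open("common/main/fr.xml", "rb") as f:
#         print(check_well_formed(f.read()) or "well-formed")
```

The error message from the parser includes the line and column of the first problem, which is usually enough to find an unclosed or mismatched tag.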
Here, finally, is my answer to the quoted question about how to coordinate efforts between users and experts:
All interested people may communicate by any available means all year round, given that the SurveyTool forums have
limited access and accept posts only during surveys, while being read-only for accredited people the rest of
the time. Likewise, SurveyTool submission forms are read-only except during relatively short windows of
opportunity lasting 4 to 7 weeks, twice a year.
Results of discussions may then be committed to a file in LDML/XML format. The easiest way is to take
the English files, cut out any parts not yet reviewed, and replace the English content with locale content.
The resulting files may then be submitted individually by each coordinated vetter using
the SurveyTool bulk data upload feature:
http://cldr.unicode.org/index/survey-tool
http://cldr.unicode.org/index/survey-tool/guide
http://cldr.unicode.org/index/survey-tool/guide#TOC-Advanced-Features
http://cldr.unicode.org/index/survey-tool/upload
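The file preparation described above (take the English data, drop what has not been reviewed, replace the rest) could be sketched in Python as follows. This is only an illustration under assumptions: the sample is a heavily trimmed fragment in the shape of en.xml (real files also carry a DOCTYPE, version information and much more), the function name is mine, and I have not verified what exactly the bulk upload accepts.

```python
import xml.etree.ElementTree as ET

# Trimmed fragment in the shape of en.xml, for illustration only.
ENGLISH_SAMPLE = """<ldml>
  <identity><language type="en"/></identity>
  <localeDisplayNames>
    <languages>
      <language type="de">German</language>
      <language type="ja">Japanese</language>
    </languages>
  </localeDisplayNames>
</ldml>"""


def localize(english_xml, locale, names):
    """Replace English language names with agreed ones; drop the rest."""
    root = ET.fromstring(english_xml)
    # Point the identity block at the target locale.
    root.find("identity/language").set("type", locale)
    languages = root.find("localeDisplayNames/languages")
    for el in list(languages):  # iterate over a copy while removing
        code = el.get("type")
        if code in names:
            el.text = names[code]   # reviewed: use the locale's content
        else:
            languages.remove(el)    # not yet reviewed: cut it out
    return ET.tostring(root, encoding="unicode")


# Only "de" has been agreed on so far in this made-up example.
french_draft = localize(ENGLISH_SAMPLE, "fr", {"de": "allemand"})
```

Here the entry for "ja" is removed from the draft because no agreed value exists yet, so no English text is left behind in the file to be submitted.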
I think we’ll consider trying this out for French / fr-FR when the next rush starts on December 1ˢᵗ.
Good luck!
Marcel