Dataset for all ISO 639 codes sorted by country/territory?

Doug Ewell doug at ewellic.org
Sun Nov 20 16:29:13 CST 2016


Mats Blakstad wrote:

> Why does not every language have a locale in CLDR? And should they not
> have one?

Um, because gathering this data takes a lot of time and effort? More
than most people and organizations can justify for the 676th or 1,000th
or 7,000th most commonly spoken language in the world?

If this data were easy to gather and organize and there were few
controversies surrounding the data, I imagine much of this work would
have been done already.

Suggesting that this data should be made "open source" -- which means,
among other things, that anyone could change the data and the criteria
for inclusion and release the changed version without restriction --
does not change the amount of effort required to do this right. There
are surprisingly few people with the knowledge and expertise to collect
and present this sort of information about a language spoken in a single
remote village in Myanmar.

> Are the locales used not just the same ones used in the IANA subtag
> registry?

There is a lot more to locale data than the language tag. Much, much
more. That would be like saying if I know your name, I know everything
about you.
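
A rough Python sketch of the distinction, offered only as an illustration:
`parse_tag` and `LOCALE_DATA_CATEGORIES` are hypothetical names, not part
of CLDR or any registry tooling, and the pattern handles only the simple
language[-Script][-REGION] shape rather than the full language tag syntax.

```python
import re

# Simplified pattern: language[-Script][-REGION]. Real language tags allow
# more (variants, extensions, private use), which this sketch ignores.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)

def parse_tag(tag):
    """Split a simple language tag into its subtags (hypothetical helper)."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unsupported or malformed tag: {tag!r}")
    # Keep only the subtags that were actually present in the tag.
    return {k: v for k, v in match.groupdict().items() if v is not None}

# Some of the kinds of data a locale needs beyond the tag itself.
# Category names only; filling in the values is what takes the expertise.
LOCALE_DATA_CATEGORIES = [
    "date and time formats",
    "month and day names",
    "number and currency formats",
    "plural rules",
    "collation (sort order)",
    "names of languages and territories",
]

print(parse_tag("sr-Latn-RS"))
print(parse_tag("my"))
```

The tag yields at most a handful of identifiers. Nothing in it says how
dates, numbers, plurals, or sort order work for that language; that is
the part that has to be gathered by someone who knows.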

> What are the criteria for a language to be included in CLDR?

You should start by reading the main CLDR page (cldr.unicode.org) and
the Process page.
 
--
Doug Ewell | Thornton, CO, US | ewellic.org

More information about the CLDR-Users mailing list