names, addresses, phone numbers
Edwin Hoogerbeets
ehoogerbeets at gmail.com
Thu Apr 21 18:34:35 CDT 2016
Chris, you can see the data at:
https://sourceforge.net/p/i18nlib/code/HEAD/tree/trunk/js/data/locale/
Under there are
https://sourceforge.net/p/i18nlib/code/HEAD/tree/trunk/js/data/locale/und/<countrycode>
directories, which contain the phone files for 22 countries. The phone
files are phonefmt.json, for the progressive formats used to format
partial and full numbers as digits are dialed in a phone UI;
numplan.json, for the basic numbering plan information; states.json, which
is a generated trie used for parsing area codes; and area.json, which
maps area codes to geolocations. A special case is the North American
Numbering Plan (NANP) countries (Canada, US, Bermuda, and many Caribbean
nations) which are all configured together in the
https://sourceforge.net/p/i18nlib/code/HEAD/tree/trunk/js/data/locale/und/US
directory for convenience.
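
To give a rough idea of why a trie is handy for parsing area codes, here
is a minimal sketch in Python. It is not ilib's code and does not use the
real states.json layout; the nested-dict structure, the function names
build_trie and split_area_code, and the sample codes are all assumptions
made up for illustration. The point is only that walking the digits one
at a time finds the longest known prefix, even when area codes have
different lengths.

```python
# Hypothetical sketch: NOT the states.json layout, just the general idea
# of walking a digit trie to find where the area code ends.

def build_trie(area_codes):
    """Build a nested-dict trie keyed by digit; "$" marks the end of a code."""
    root = {}
    for code in area_codes:
        node = root
        for digit in code:
            node = node.setdefault(digit, {})
        node["$"] = code
    return root


def split_area_code(trie, digits):
    """Return (area_code, rest) for the longest matching prefix.

    Returns (None, digits) when no known area code matches.
    """
    node = trie
    best = None  # length of the longest complete code seen so far
    for i, digit in enumerate(digits):
        if not digit.isdigit():
            break
        node = node.get(digit)
        if node is None:
            break
        if "$" in node:
            best = i + 1
    if best is None:
        return None, digits
    return digits[:best], digits[best:]


# Sample codes of differing lengths, for illustration only.
SAMPLE_TRIE = build_trie(["30", "40", "221", "2222"])
```

With that sample data, split_area_code(SAMPLE_TRIE, "30123456") separates
"30" from "123456", and a number with no known prefix comes back with
None as the area code.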
Mike M, I can imagine that the area codes and geolocations change
fairly often, but the formats do not. "(XXX) XXX-XXXX" has been the de
facto standard American format for many, many years, for example. ilib
contains multiple styles of format as well, since the format is often a
matter of user preference rather than government mandate. See
https://sourceforge.net/p/i18nlib/code/HEAD/tree/trunk/js/data/locale/und/DE/phonefmt.json
for a country with 5 different possible styles.
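
As an illustration of what formatting "while dialing" can look like, the
following Python sketch fills a template such as "(XXX) XXX-XXXX" with
whatever digits have been typed so far. This is only a sketch under my
own assumptions: the function name format_progressively and the template
handling are hypothetical, and the real phonefmt.json files are
structured differently and carry several styles per country.

```python
def format_progressively(digits, template="(XXX) XXX-XXXX"):
    """Fill a template with the digits typed so far.

    Each "X" consumes one digit. Literal characters are held back until a
    digit follows them, so a partial number never ends in dangling
    punctuation.
    """
    out = []
    pending = []  # literal characters waiting for the next digit
    remaining = iter(digits)
    for ch in template:
        if ch == "X":
            digit = next(remaining, None)
            if digit is None:
                break  # ran out of typed digits
            out.extend(pending)
            pending.clear()
            out.append(digit)
        else:
            pending.append(ch)
    # Digits beyond the end of the template are appended unformatted.
    out.extend(remaining)
    return "".join(out)


if __name__ == "__main__":
    typed = ""
    for key in "4155551212":
        typed += key
        print(format_progressively(typed))
```

Holding back the literals is one possible design choice: after three
digits the sketch shows "(415" and only adds ") " once a fourth digit
arrives. A real UI might prefer to close the parenthesis earlier.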
Also under
https://sourceforge.net/p/i18nlib/code/HEAD/tree/trunk/js/data/locale/und/<countrycode>
are the address.json files. These contain metadata plus a list of
regular expressions and hard-coded lists used to parse the addresses. The
parser doesn't get it right all the time (the US one has problems with
two-word localities like "San Francisco", for example), but it gets
reasonably close, and pretty much every country in the world is covered.
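
For readers who have not seen this style of parsing, here is a small
Python sketch of the general approach: a regular expression plus a
hard-coded list. Everything in it is an assumption for illustration (the
pattern, the tiny STATES list, and the function name parse_us_last_line);
it is not taken from the address.json files and only handles the last
line of a US-style address.

```python
import re

# Hypothetical hard-coded list; a real one would cover every state.
STATES = {"CA", "NY", "TX"}

# Hypothetical pattern for a last line such as "San Francisco, CA 94105".
LAST_LINE = re.compile(
    r"^(?P<locality>.+?),?\s+"          # locality, optional comma
    r"(?P<region>[A-Z]{2})\s+"          # two-letter region code
    r"(?P<postal>\d{5}(?:-\d{4})?)$"    # ZIP or ZIP+4
)


def parse_us_last_line(line):
    """Split a last address line into locality, region, and postal code.

    Returns None when the line does not match the pattern or the region
    is not in the hard-coded list.
    """
    match = LAST_LINE.match(line.strip())
    if match is None or match.group("region") not in STATES:
        return None
    return match.groupdict()
```

On an isolated last line the multi-word locality is easy, because the
region code and postal code anchor the end. The hard part starts when
the whole address is on one line with no delimiter between the street
and the locality, since a pattern alone cannot tell where one ends and
the other begins.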
Under 55 of the locale dirs are the name.json files, which configure the
name formats and settings for those languages. The top level contains a
Western-centric fallback file used when the language doesn't have its
own parser:
https://sourceforge.net/p/i18nlib/code/HEAD/tree/trunk/js/data/locale/name.json.
An example of Asian formats:
https://sourceforge.net/p/i18nlib/code/HEAD/tree/trunk/js/data/locale/ja/name.json
Almost all of the phone data was gleaned either from the documents on
the International Telecommunication Union site, which has the officially
published numbering plan documents for many countries, or from
Wikipedia, which has information about the formats. The address and name
information is gleaned almost exclusively from Wikipedia.
Edwin
On 04/20/2016 11:27 PM, Chris Leonard wrote:
> On Thu, Apr 21, 2016 at 1:34 AM, Edwin Hoogerbeets
> <ehoogerbeets at gmail.com> wrote:
>> I heard talk 2 or 3 years ago about a proposal to add name, address, and
>> phone number formats to CLDR. What ever happened to those efforts? I don't
>> really see data in CLDR 29 about those.
>>
>> In my i18n library for JS called "ilib", I have data about the address
>> formats for practically every country in the world, as well as the phone
>> formats and name formats for many countries. I would love to contribute this
>> data to CLDR and then later leverage other people's local knowledge to fill
>> in the gaps where my data is lacking...
>>
>> Can someone direct me to the folks who are working on these? Thanks,
>>
>
>
> Dear Edwin.
>
>
> I'd be interested in comparing your data to that in the glibc locales.
>
> Is there a link to your repo you can provide?
>
> cjl