UN/LOCODE perspective on character sets

Mark Davis ☕️ mark at macchiato.com
Fri Dec 18 01:26:44 CST 2015

Haven't looked it over in detail, but here is the notice:


From a quick scan: they've added latitude/longitude (to the minute, ~2km);
that's great because often the names of locations are ambiguous.

They still have deviations from the IATA codes, and various strange
omissions. And (as you note) they don't include the native name unless it
can be spelled with a *subset* of Latin-1 characters (ugh). They list the
ISO subdivision code sometimes, but there are no consistent inclusion
relations for other codes (e.g., they do have that San Francisco is in
California, but they miss many other similar relations in other countries).
And the latitude/longitude is often missing.
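
For a sense of what "to the minute" means in distance, here is a rough
sketch in Python. It assumes a spherical Earth with a mean radius of
6371 km (an assumption for illustration, not anything taken from the
UN/LOCODE files), and the function name is made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the real Earth is not a sphere


def arc_minute_km(latitude_deg=0.0):
    """Return (north-south, east-west) length in km of one arc minute."""
    one_minute_rad = math.radians(1.0 / 60.0)
    north_south = EARTH_RADIUS_KM * one_minute_rad
    # Meridians converge toward the poles, so a minute of longitude shrinks
    # with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    for lat in (0, 38, 60):
        ns, ew = arc_minute_km(lat)
        print(f"lat {lat:2d}: {ns:.2f} km north-south, {ew:.2f} km east-west")
```

So a coordinate given to the whole minute pins a place down to a cell a
bit under 2 km on a side, narrower east-west at higher latitudes.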

More at http://www.unece.org/cefact/locode/welcome.html


On Thu, Dec 17, 2015 at 10:19 PM, Doug Ewell <doug at ewellic.org> wrote:

> UN/LOCODE version 2015-2 has been released [1], and the Manual still
> contains the following about character sets:
> "27. Place names in UN/LOCODE are given in their national language
> versions as expressed in the Roman alphabet using the 26 characters of
> the character set adopted for international trade data interchange, with
> diacritic signs, when practicable (cf. Paragraph 3.2.2 of the UN/LOCODE
> Manual). International ISO Standard character sets are laid down in ISO
> 8859-1 (1987) and ISO10646-1 (1993). (The standard United States
> character set (437), which conforms to these ISO standards, is also
> widely used in trade data interchange)."
> Spot the errors.
> [1] http://www.unece.org/cefact/codesfortrade/codes_index.html
> --
> Doug Ewell | http://ewellic.org | Thornton, CO
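
One of the claims in the quoted paragraph can be checked mechanically:
whether code page 437 "conforms to" ISO 8859-1. A small Python sketch,
using only the standard `cp437` and `latin-1` codecs; the helper name is
hypothetical.

```python
def latin1_chars_missing_from_cp437():
    """List ISO 8859-1 characters (0xA0-0xFF) that code page 437 cannot encode."""
    missing = []
    for byte in range(0xA0, 0x100):
        ch = bytes([byte]).decode("latin-1")
        try:
            ch.encode("cp437")
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    # The same character lands on different byte values in the two sets.
    print("é".encode("latin-1"), "é".encode("cp437"))
    # Part of the upper half of ISO 8859-1 has no cp437 equivalent at all.
    missing = latin1_chars_missing_from_cp437()
    print(len(missing), "".join(missing))
```

The byte values disagree where the two sets overlap, and some Latin-1
letters (such as Ã) are absent from 437 altogether, so text written in one
set and read as the other comes out garbled.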