Representing Additional Types of Flags

Mark Davis ☕️ mark at
Thu Jul 2 11:10:50 CDT 2015

I'll try to answer a few of these.

Mark

*— Il meglio è l’inimico del bene —*

On Tue, Jun 30, 2015 at 11:57 PM, Doug Ewell <doug at> wrote:

> Re-posting my comments and questions on this PRI to the list. I've
> already submitted them as formal feedback.
> .
> I support this proposal. I have the following questions:
> 1. The existing RIS-based flag mechanism is based on ISO 3166-1 (TUS 7.0
> §22.10). In this proposal, "valid" tag sequences would instead be
> determined by CLDR data and LDML specification. Is there any precedent
> for CLDR to define the validity of Unicode character sequences?

We already have, in tr51, the unicode_region_codes being used for validity
testing of flags:

Those are typically the same as the ISO codes, but do add XK.
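
To make the validity test concrete, here is a minimal sketch of how a tool might build and check an RIS flag. The region set is a tiny illustrative subset that I made up for the example (real data would come from CLDR), and the function names are hypothetical.

```python
# Illustrative subset only; a real tool would load the region codes from CLDR.
VALID_REGION_CODES = {"US", "GB", "FR", "DE", "XK"}

RIS_BASE = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A


def ris_flag(region: str) -> str:
    """Spell a two-letter region code with regional indicator symbols."""
    if len(region) != 2 or not region.isascii() or not region.isalpha():
        raise ValueError(f"not a two-letter region code: {region!r}")
    return "".join(chr(RIS_BASE + ord(c) - ord("A")) for c in region.upper())


def is_valid_flag(seq: str, valid=VALID_REGION_CODES) -> bool:
    """True if seq is exactly two RIS characters spelling a code in `valid`."""
    if len(seq) != 2:
        return False
    letters = []
    for ch in seq:
        offset = ord(ch) - RIS_BASE
        if not 0 <= offset < 26:
            return False
        letters.append(chr(ord("A") + offset))
    return "".join(letters) in valid
```

With this subset, is_valid_flag(ris_flag("XK")) is true even though XK is not an ISO 3166-1 code, and is_valid_flag(ris_flag("UK")) is false.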

> 2. What is the policy on generating flag tags with deprecated
> unicode_region_subtag or unicode_subdivision_subtag values, such as
> "[flag]UK"? How "discouraged" would such a tag be? Should tools allow
> users to create such a tag?

CLDR treats UK as deprecated. When a code is deprecated, we strongly
discourage its use in new data, but normally allow it for old data. But the
UK is somewhat different, since it really shouldn't ever be valid as it
stands. The purpose for UK in CLDR metadata is so that locale ID
canonicalization can map en-UK (which occurs quite often) to en-GB, and so
on. (We do this also for overlong codes like eng-GB => en-GB.)
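
As a rough sketch of the kind of mapping I mean (not the actual CLDR algorithm), the two alias tables below contain only the examples from this message, and the function handles nothing beyond a language subtag followed by a two-letter region. All names here are hypothetical.

```python
# Hypothetical alias tables holding only the examples mentioned above;
# the real data lives in CLDR's supplemental metadata.
LANGUAGE_ALIASES = {"eng": "en"}  # overlong code -> short code
REGION_ALIASES = {"UK": "GB"}     # deprecated code -> replacement


def canonicalize(locale_id: str) -> str:
    """Map deprecated or overlong subtags in a simple language-region ID."""
    parts = locale_id.replace("_", "-").split("-")
    lang = parts[0].lower()
    out = [LANGUAGE_ALIASES.get(lang, lang)]
    for sub in parts[1:]:
        # Treat any two-letter alphabetic subtag as a region code.
        if len(sub) == 2 and sub.isalpha():
            sub = sub.upper()
            sub = REGION_ALIASES.get(sub, sub)
        out.append(sub)
    return "-".join(out)
```

Under these assumptions, both canonicalize("en-UK") and canonicalize("eng-GB") give "en-GB", while "en-US" passes through unchanged.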

But you're right; we need to be able to distinguish this case (and ones
like it). I filed


> 3. The subdivisions.xml file contains a "subtype" hierarchy, reflecting
> the "parent subdivision" relationship in ISO 3166-2. So region 'FR'
> contains subdivision 'J' (Île-de-France), which itself contains
> subdivision '75' (Paris). Is there any significance to the "subtype"
> hierarchy as far as flag tags are concerned, or are "[flag]FRJ" and
> "[flag]FR75" equally valid?

No, there isn't. But see also E.5 in

> 4. The entry for "001" in subdivisions.xml contains each of the
> two-letter codes for regions (countries) that have their own
> subdivisions. This is less than the set of all regions; for example,
> Anguilla (AI) does not have ISO 3166-2 subdivisions and so is not
> listed. This implies that a tag like "[flag]001US" is valid (and
> equivalent to "US" spelled with RIS, which is preferred) but
> "[flag]001AI" is not valid. Is this intended? If not, can it be
> clarified?

Good catch; the 001 shouldn't even exist in the subdivisionContainment.
This is now fixed in trunk.

(The subdivision addition will only be final in September, so feedback on
it now would be great. People can file tickets at

> 5. Will any preliminary examples of CLDR 4-character subdivision codes
> be made available before any such codes are actually assigned?

The only purpose for the 4-character subdivision codes is stability. So
let's suppose that Colorado decides to join Canada (thereby deprecating CO
in ISO 3166-2), and British Columbia decides to join the US (getting the
code CO in ISO 3166-2). In that case, CLDR would keep the old code CO (but
deprecated) and create a new 4-letter code for BC, such as XXCO. This is
just for illustration, of course; I've heard no rumors about either political

> .
> The PRI #299 mechanism is clearly and intentionally oriented toward
> representing flags of well-defined geopolitical entities.
> Any proposal to extend the mechanism to cover the many other types of
> flags -- for historical regions, NGOs, maritime, sports, or social or
> political causes -- must be systematic and well-planned, not ad-hoc or
> haphazard, to assure interoperability and extensibility.

Firmly agreed.

> The documentation for the PRI #299 mechanism should state clearly that
> (e.g.) the Confederate battle flag, the Olympic flag, the Esperanto
> flag, the LGBT rainbow flag, and the naval flags used to spell out
> "ENGLAND EXPECTS" can be represented only via a proper extension to the
> mechanism, not by ad-hoc means such as the use of unassigned or
> private-use combinations. This is at least as important as ensuring the
> stable coding of geopolitical flags.

Yes, again a good point.

> 6. What is the policy on generating flag tags with unicode_region_subtag
> values corresponding to private-use BCP 47 subtags, other than those
> given special semantics by CLDR? Are they invalid or merely discouraged?
> Should tools allow users to create such a tag? Is there any provision
> for a "private agreement," similar to that defined in Unicode for PUA
We'll have to address that. My view is that they should not be valid: if
someone wants a PU flag, of any source, they have over 130,000 Unicode PU
characters to play with.
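
For anyone who wants to check that figure, here is a quick sketch that adds up the three private-use ranges defined in the Unicode Standard. The function name is just for illustration.

```python
# The three private-use ranges in the Unicode Standard (inclusive bounds).
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
]


def pua_size(ranges=PUA_RANGES) -> int:
    """Count the code points covered by the inclusive ranges."""
    return sum(hi - lo + 1 for lo, hi in ranges)
```

That is 6,400 + 65,534 + 65,534 = 137,468 code points.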


> --
> Doug Ewell | | Thornton, CO