[OT] RE: Flag tags with U+1F3F3 and subtypes

Philippe Verdy verdy_p at wanadoo.fr
Mon May 18 17:25:33 CDT 2015

2015-05-18 23:55 GMT+02:00 Doug Ewell <doug at ewellic.org>:

> Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
> > If ever the country codes used in BCP47 become full (all pairs of
> > letters used), just some time before this happens, we could see new
> > prefixes added before a new range of code. It is possible to use a
> > 1-letter prefix for new country/territory code extensions, but with
> > some maintenance of BCP47 parsing rules (notably the letter used
> > should not be reordered with other singleton prefixes)
> This would be a major revision to BCP 47, it would have nothing to do
> with reordering,

It would have to do with reordering, because all subtags after the primary
language subtag in BCP47 are optional, and you can distinguish them only by
their length *or* by the role assigned to specific singletons. There is
already the "x" singleton exception (which is ordered at the end); the other
singletons are currently described as using a canonical order, but they are
used only for encoding variants unrelated to region subtags or even to the
languages.
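
As a rough illustration of that point, here is a minimal Python sketch of how
a parser can tell the optional subtags apart purely by length, position and
singletons. It is only a simplified approximation of the BCP47 syntax: the
function name "classify" and the labels are made up for this example, and it
ignores grandfathered tags, the limit on the number of extlang subtags, and
any validation against the registry.

```python
import re

# Shapes of the optional subtags, simplified from the BCP47 syntax.
EXTLANG = re.compile(r"[A-Za-z]{3}")
SCRIPT = re.compile(r"[A-Za-z]{4}")
REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")
VARIANT = re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}")


def classify(tag):
    """Label each subtag of a tag using only its length and position.

    Hypothetical helper: returns a list of (subtag, label) pairs and
    raises ValueError when a subtag fits no role at its position.
    """
    parts = tag.split("-")
    if not re.fullmatch(r"[A-Za-z]{2,3}", parts[0]):
        raise ValueError("unsupported primary subtag: %r" % parts[0])
    out = [(parts[0], "language")]
    stage = 0  # 0: nothing yet, 1: script seen, 2: region seen, 3: variant seen
    i = 1
    while i < len(parts):
        s = parts[i]
        if len(s) == 1:
            # A singleton switches the parsing rules: "x" swallows the
            # rest of the tag, any other letter runs to the next singleton.
            kind = "privateuse" if s.lower() == "x" else "extension"
            out.append((s, kind + "-singleton"))
            i += 1
            while i < len(parts) and (kind == "privateuse" or len(parts[i]) > 1):
                out.append((parts[i], kind))
                i += 1
            continue
        if stage == 0 and EXTLANG.fullmatch(s):
            out.append((s, "extlang"))
        elif stage < 1 and SCRIPT.fullmatch(s):
            out.append((s, "script"))
            stage = 1
        elif stage < 2 and REGION.fullmatch(s):
            out.append((s, "region"))
            stage = 2
        elif VARIANT.fullmatch(s):
            out.append((s, "variant"))
            stage = 3
        else:
            raise ValueError("cannot classify subtag %r at position %d" % (s, i))
        i += 1
    return out
```

With these rules "de-CH-1996" comes out as language, region, variant, while
something like "en-US-GB" is rejected: a second 2-letter subtag has no role
left to take, which is why any new kind of region code would need either a
different length or a dedicated singleton in front of it.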

Very few singletons are in fact used. The singleton subtags occurring at the
start of the tag are also treated separately from the others, and this could
also be used to support new syntaxes for BCP47 tags, but for now we just
have "i-", deprecated but still valid, and "x-" for private use. For all
other letters there's no parsing defined for now: their syntax is unknown
and they are not interchangeable without a standard, so they are used only
for private use.

Another constraint comes from the length limit of subtags: the first subtag
is either a special singleton or a primary language code using 2 or 3
letters for now. Some BCP47 tags use an empty first subtag, i.e. the tag
starts with a hyphen. Double hyphens could be used as extensions to change
the parsing rules locally and possibly return to the next logical subtag,
and they could be used to encode international organizations without
needing a formal "exceptional reservation" in ISO 3166-1. For example,
"*-EU" could have been encoded as "--O-EU", and we could have the same
system for NATO, EEA, EFTA... There's still ample space for extensions of
parsing rules in BCP47, but not in ISO 3166.

ISO 3166 also encodes some 4-letter codes but they are not used in BCP47
(so there's no confusion with 4-letter script codes).