Use of tag characters in a private encoding - is it valid please?
Erik Carvalhal Miller
ecm.unicode at gmail.com
Mon May 6 22:00:39 CDT 2024
On Fri, May 3, 2024 at 5:22 AM William_J_G Overington via Unicode <
unicode at corp.unicode.org> wrote:
> The analysis in this post shows that the encoding that I am using for the
> glyph that I designed is plain text and therefore in principle the encoding
> could, if the Unicode Technical Committee so decides, be encoded as plain
> text in The Unicode Standard as a sequence of a base character, some tag
> characters, and a CANCEL TAG.
>
“Could” and “should” are very different animals. Assuming the UTC does end
up deciding to accept your symbol (presumably a distinct symbol character,
not merely a glyph, for Unicode encodes characters, not glyphs) for
encoding (after considering a proposal fulfilling the usual applicable
criteria, submitted in the prescribed manner), why should it choose the
elaborate encoding you describe instead of a single code point?

Currently there is only one variety of valid tag sequences, that of the
regional (subnational) flags such as the Welsh flag you cited. I donʼt
know much about the decision process that was involved, but I take it that
the encoding is a compromise born partly of the desire to keep the UTC out
of some rather political and potentially never‐ending business by taking
advantage of an existing international standard thatʼs beyond Unicodeʼs
purview. The encoding has some advantages and some disadvantages, the
latter including length. There are other cases in which Unicode has chosen
a code‐point sequence, rather than a single code point, to represent a
single character; but single code points are by far the norm. What would
be the rationale for a nine‐code‐point sequence for your single character,
and an unusually arbitrary‐looking sequence at that?
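
For concreteness, here is a minimal Python sketch of how a tag sequence of
the kind discussed above is put together: a base character, then tag
characters (U+E0020 through U+E007E, which mirror printable ASCII), then
CANCEL TAG (U+E007F). The helper name tag_sequence and the constant names
are illustrative choices of mine, not anything defined by Unicode, and the
only sequence shown is the existing Welsh flag, whose tag specification is
"gbwls".

```python
# Illustrative sketch only: the names below are hypothetical, not a Unicode API.
TAG_OFFSET = 0xE0000          # tag character = ASCII code point + this offset
CANCEL_TAG = "\U000E007F"     # U+E007F CANCEL TAG terminates the sequence
BLACK_FLAG = "\U0001F3F4"     # U+1F3F4 WAVING BLACK FLAG, the base for flag sequences


def tag_sequence(base: str, spec: str) -> str:
    """Return base + one tag character per ASCII character of spec + CANCEL TAG."""
    if not all(0x20 <= ord(ch) <= 0x7E for ch in spec):
        raise ValueError("spec must consist of printable ASCII characters")
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in spec)
    return base + tags + CANCEL_TAG


# The Welsh flag: base character, the tags for "gbwls", then CANCEL TAG.
wales = tag_sequence(BLACK_FLAG, "gbwls")
code_points = [f"U+{ord(ch):04X}" for ch in wales]
print(len(wales), " ".join(code_points))
```

Even this existing flag comes to seven code points for what a reader sees
as one symbol, which illustrates the length disadvantage mentioned above.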