Use of tag characters in a private encoding - is it valid please?

William_J_G Overington wjgo_10009 at btinternet.com
Fri May 3 04:17:04 CDT 2024


A glyph of The Welsh Flag is encoded in Unicode as a sequence of a base 
character followed by some tag characters and a CANCEL TAG.
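
As a minimal sketch of that structure, the sequence can be built in Python. The base character U+1F3F4 WAVING BLACK FLAG and the tag string "gbwls" are not stated above; they are my understanding of the published emoji tag sequence for the flag of Wales. The function name to_tag_sequence is hypothetical.

```python
TAG_OFFSET = 0xE0000   # tag characters mirror printable ASCII at this offset
CANCEL_TAG = 0xE007F   # U+E007F CANCEL TAG ends the sequence


def to_tag_sequence(base: str, spec: str) -> str:
    """Return base + the tag versions of each character in spec + CANCEL TAG."""
    for ch in spec:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"no tag character corresponds to {ch!r}")
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in spec)
    return base + tags + chr(CANCEL_TAG)


# Assumed values: U+1F3F4 as the base character, "gbwls" as the tag string.
welsh_flag = to_tag_sequence("\U0001F3F4", "gbwls")
code_points = [f"U+{ord(c):04X}" for c in welsh_flag]
print(" ".join(code_points))
```

The whole sequence is seven code points long: one base character, five tag characters, and the CANCEL TAG.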
 
As far as I am aware, this is regarded as a plain text encoding. I am 
not aware of that encoding ever having been referred to as markup.
 
U+1F91F is listed in
 
https://www.unicode.org/charts/PDF/U1F900.pdf 
 
as I LOVE YOU HAND SIGN and, in the same document, U+1F98B is listed as 
BUTTERFLY.
 
If those two characters occur in a block of text, not necessarily next 
to each other, and the text is to be transcribed as purely alphanumeric 
text, how should they be transcribed? How should they be rendered by a 
text-to-speech system? And if the original document is in French, how 
should those two pictographs be transcribed or spoken?
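
For reference, the formal names quoted above can be retrieved programmatically. The sketch below uses Python's standard unicodedata module and assumes a Python version whose character database includes both characters. The helper name formal_name is hypothetical. The names returned are English-only identifiers, so this is not offered as an answer to the transcription question, only as a way to look the names up.

```python
import unicodedata


def formal_name(ch: str) -> str:
    """Return the formal Unicode character name, or a placeholder if unnamed."""
    return unicodedata.name(ch, "<no name in this database>")


# The two characters discussed above.
for ch in ("\U0001F91F", "\U0001F98B"):
    print(f"U+{ord(ch):04X} {formal_name(ch)}")
```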
 
Please consider an encoding of a glyph that has been designed and 
assigned a meaning by an artist. (Yes, a hobbyist artist, but artists 
and novelists are not expected to represent an organization, so their 
output is "recognized" rather than being discriminated against.)
 
There is such a glyph. It has been assigned the following meaning, 
intended for use in seeking information about relatives and friends 
after a disaster:
 
Is there any information about the following person please?
 
Suppose that that glyph is encoded as follows.
 
U+10FFFD followed by the tag versions of !313125 and a CANCEL TAG.
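
Spelled out code point by code point, and assuming that "tag version" means the tag character at offset U+E0000 from the corresponding ASCII character, the sequence can be sketched in Python as follows. The function names encode_private and decode_private are hypothetical and exist only for this illustration.

```python
TAG_OFFSET = 0xE0000            # assumed mapping: tag character = ASCII + U+E0000
CANCEL_TAG = "\U000E007F"       # U+E007F CANCEL TAG
PRIVATE_BASE = "\U0010FFFD"     # the private-use code point named above


def encode_private(spec: str) -> str:
    """Build base + tag versions of spec + CANCEL TAG."""
    if not all(0x20 <= ord(ch) <= 0x7E for ch in spec):
        raise ValueError("spec must be printable ASCII")
    return PRIVATE_BASE + "".join(chr(TAG_OFFSET + ord(ch)) for ch in spec) + CANCEL_TAG


def decode_private(seq: str) -> str:
    """Recover the ASCII spec from a sequence of the form built above."""
    if len(seq) < 2 or seq[0] != PRIVATE_BASE or seq[-1] != CANCEL_TAG:
        raise ValueError("not a sequence of the assumed form")
    tags = seq[1:-1]
    if not all(0xE0020 <= ord(t) <= 0xE007E for t in tags):
        raise ValueError("unexpected non-tag character inside the sequence")
    return "".join(chr(ord(t) - TAG_OFFSET) for t in tags)


sequence = encode_private("!313125")
print(" ".join(f"U+{ord(c):04X}" for c in sequence))
# U+10FFFD U+E0021 U+E0033 U+E0031 U+E0033 U+E0031 U+E0032 U+E0035 U+E007F
print(decode_private(sequence))
```

The round trip through decode_private shows that the sequence carries the string !313125 without any out-of-band information.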
 
Then it seems to me that that is a plain text encoding, based on the 
precedents of the encoding of the glyph of The Welsh Flag and of the 
encoding of a glyph with a meaning not obvious from its appearance.
 
The glyph is displayed in Chapter 42 of my first novel, on page 2.
 
http://www.users.globalnet.co.uk/~ngo/novel_plus.htm 
 
That novel was completed in 2019 and there have been some developments 
since then, but the chapter contains many symbols that may be of 
interest in showing how information about the assigned meanings is 
packed into the various glyphs.
 
The analysis in this post shows that the encoding that I am using for 
the glyph that I designed is plain text. Therefore, in principle, the 
glyph could, if the Unicode Technical Committee so decides, be encoded 
as plain text in The Unicode Standard as a sequence of a base 
character, some tag characters, and a CANCEL TAG.
 
William Overington
 
Friday 3 May 2024
 
 