What should or should not be encoded in Unicode? (from Re: Egyptian Hieroglyph Man with a Laptop)
firstname.lastname@example.org via Unicode
unicode at unicode.org
Thu Feb 13 09:41:41 CST 2020
Hans Åberg >>> From the point of view of Unicode, it is simpler: If the
character is in use or has had use, it should be included somehow.
Shawn Steele >> That bar, to me, seems too low. Many things are only
used briefly or in a private context that doesn't really require
Hans Åberg > That is a private use area for more special use.
I have used the Private Use Area quite a lot over many years.
I have a licence for a fontmaking program, FontCreator. A good feature
of the Windows operating system is that all installed fonts can be used
in most installed programs. Private Use Area code points are official
Unicode code points. These three factors together allow me to design and
produce TrueType fonts for new symbols each encoded at a Private Use
Area code point (a different code point for each such novel symbol),
install the fonts, and use them in various programs, including a desktop
publishing program and thereby make PDF (Portable Document Format)
documents that include both ordinary text and the novel symbols. These
PDF documents are then suitable for placing on the web and for Legal
Deposit with The British Library.
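As a rough illustration of what "Private Use Area code point" means in
practice, here is a minimal Python sketch. It assumes only the three
private-use ranges defined by the Unicode Standard and the standard
library's unicodedata module; the function names is_private_use and
private_use_chars are made up for this example and are not part of any
font tool mentioned above.

```python
import unicodedata

# The three private-use ranges defined by the Unicode Standard.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(code_point):
    """Return True if code_point lies in one of the private-use ranges."""
    return any(lo <= code_point <= hi for lo, hi in PRIVATE_USE_RANGES)


def private_use_chars(text):
    """List the private-use characters in text, in U+XXXX notation.

    Uses the general category "Co" (Other, private use) reported by
    unicodedata as an independent check on the ranges above.
    """
    return [f"U+{ord(ch):04X}" for ch in text
            if unicodedata.category(ch) == "Co"]
```

A document mixing ordinary text with novel symbols would show up here as
a string in which some characters pass is_private_use and the rest do
not.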
Yet a Private Use Area encoding at a particular code point is not
unique. Thus, unless care is taken amongst people who are aware of the
particular encoding, there is none of the interoperability that
regular Unicode encoded characters have.
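The lack of uniqueness can be sketched in a few lines of Python. The two
"agreements" below are entirely hypothetical, invented for this example:
they stand for two groups of people who have each privately assigned a
meaning to the same code point, U+E000.

```python
# Two hypothetical private agreements that both use U+E000.
AGREEMENT_A = {0xE000: "<symbol meaning 'greeting'>"}
AGREEMENT_B = {0xE000: "<symbol meaning 'weather report'>"}


def interpret(text, agreement):
    """Read text under one private agreement.

    Characters covered by the agreement are replaced by their agreed
    meaning; all other characters are passed through unchanged.
    """
    return [agreement.get(ord(ch), ch) for ch in text]


message = "Hi \uE000"

# The same stored text is read differently by the two groups.
reading_a = interpret(message, AGREEMENT_A)
reading_b = interpret(message, AGREEMENT_B)
```

The ordinary characters "H", "i" and the space come out the same under
both readings; only the private-use character differs, which is exactly
where interoperability is lost.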
However, faced with a need for interoperability for my research project,
I have found a solution making use of the Glyph Substitution capability
of an OpenType font.
The solution is to invent my own encoding space. It sits on top of
Unicode and could perhaps be called markup, but it works!
I am hoping that at some future time the results of my research will
become encoded as an International Standard, and that my encoding space
will then after that become integrated into Unicode, thus achieving
fully standardized unique interoperable encoding as part of Unicode.
Quite a dream, but the way to achieve such a fully standardized unique
interoperable encoding as part of Unicode is, from a technological point
of view, quite straightforward. There are details of this in the
Accumulated Feedback on Public Review Issue #408.
Yet having my encoding space in this manner is just something that I
have done on my own initiative. Anybody can have his or her own encoding
space if he or she so chooses. With a little care and consideration for
others, these encodings need not clash with one another, and all could
even coexist in one document.
Having my own encoding space has enabled me to make progress with my
Thursday 13 February 2020