What should or should not be encoded in Unicode? (from Re: Egyptian Hieroglyph Man with a Laptop)

wjgo_10009@btinternet.com via Unicode unicode at unicode.org
Thu Feb 13 09:41:41 CST 2020


Hans Åberg >>> From the point of view of Unicode, it is simpler: If the 
character is in use or has had use, it should be included somehow.

Shawn Steele >> That bar, to me, seems too low.  Many things are only 
used briefly or in a private context that doesn't really require 
encoding.

Hans Åberg > That is a private use area for more special use.

I have used the Private Use Area quite a lot over many years.

I have a licence for a fontmaking program, FontCreator. A good feature 
of the Windows operating system is that all installed fonts can be used 
in most installed programs. Private Use Area code points are official 
Unicode code points. These three factors together allow me to design and 
produce TrueType fonts for new symbols each encoded at a Private Use 
Area code point (a different code point for each such novel symbol), 
install the fonts, and use them in various programs, including a desktop 
publishing program and thereby make PDF (Portable Document Format) 
documents that include both ordinary text and the novel symbols. These 
PDF documents are then suitable for placing on the web and for Legal 
Deposit with The British Library.
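As a minimal sketch of what "encoded at a Private Use Area code point" 
means in practice, the following Python lists the Private Use code 
points found in a piece of text. The three ranges are those the Unicode 
Standard reserves for private use; the function names and the sample 
string are only illustrative assumptions, not anything from my fonts.

```python
import unicodedata

# The three Private Use ranges reserved by the Unicode Standard.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area (Basic Multilingual Plane)
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(code_point: int) -> bool:
    """Return True if the code point lies in one of the Private Use ranges."""
    return any(lo <= code_point <= hi for lo, hi in PUA_RANGES)


def private_use_chars(text: str) -> list[str]:
    """List the Private Use code points in text, in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text if is_private_use(ord(ch))]


# Illustrative sample: ordinary text mixed with two Private Use characters.
sample = "ordinary text \uE000 and \uE001"
print(private_use_chars(sample))  # ['U+E000', 'U+E001']

# The same characters carry the general category "Co" (private use).
print(unicodedata.category("\uE000"))  # Co
```

A check of this kind says only that a character is for private use. It 
cannot say which symbol is meant, because that depends on the font.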

Yet a Private Use Area encoding at a particular code point is not 
unique. Thus, except with care amongst people who are aware of the 
particular encoding, there is no interoperability of the kind that 
regular Unicode encoded characters provide.
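The lack of uniqueness can be shown with a small sketch in Python. The 
two tables below are hypothetical private agreements invented for this 
illustration; each assigns its own meaning to the same code point.

```python
# Two hypothetical private agreements (invented for illustration only).
# Each maps a Private Use code point to the meaning its users intend.
agreement_a = {0xE000: "a novel symbol in one person's font"}
agreement_b = {0xE000: "a company logo in someone else's font"}


def clashes(a: dict[int, str], b: dict[int, str]) -> list[int]:
    """Return the code points that both agreements use with different meanings."""
    return sorted(cp for cp in a.keys() & b.keys() if a[cp] != b[cp])


print([f"U+{cp:04X}" for cp in clashes(agreement_a, agreement_b)])
# ['U+E000']
```

A document containing U+E000 is therefore only meaningful to a reader 
who knows which agreement applies and has the matching font.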

However, faced with a need for interoperability for my research 
project, I have found a solution making use of the Glyph Substitution 
capability of an OpenType font.

The solution is to invent my own encoding space. It sits on top of 
Unicode and could (perhaps?) be called markup, but it works!
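The general idea can be sketched in Python, under loudly stated 
assumptions: this message does not give the format of the encoding 
space, so the sequences and glyph names below are hypothetical, and the 
function only imitates, at the level of text, the kind of replacement 
that a ligature-style glyph substitution rule performs inside a font.

```python
# Hypothetical table (not the actual encoding space): each key is a
# sequence of ordinary Unicode characters, each value is the name of the
# glyph that a font with matching substitution rules would show instead.
SUBSTITUTIONS = {
    "!001": "glyph.symbol_one",
    "!002": "glyph.symbol_two",
}


def shape(text: str, table: dict[str, str]) -> list[str]:
    """Replace known sequences with glyph names; pass other characters through."""
    keys = sorted(table, key=len, reverse=True)  # try the longest sequence first
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No sequence starts here, so the character is shown as itself.
            out.append(text[i])
            i += 1
    return out


print(shape("a!001b", SUBSTITUTIONS))
# ['a', 'glyph.symbol_one', 'b']
```

The point of such a design is that the stored text consists only of 
regular Unicode characters, so it survives copying, searching and 
transmission. A reader without the font still sees the sequence itself.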

I am hoping that at some future time the results of my research will 
become encoded as an International Standard, and that my encoding space 
will then become integrated into Unicode, thus achieving fully 
standardized unique interoperable encoding as part of Unicode. Quite a 
dream, but the way to achieve such an encoding is, from a technological 
point of view, quite straightforward. There are details of this in the 
Accumulated Feedback on Public Review Issue #408.

https://www.unicode.org/review/pri408/

Yet having my encoding space in this manner is just something that I 
have done on my own initiative. Anybody can have his or her own encoding 
space if he or she so chooses. With a little care and consideration for 
others these encodings need not clash one with another and all could 
even coexist in one document.

Having my own encoding space has enabled me to make progress with my 
research project.

William Overington

Thursday 13 February 2020




