What should or should not be encoded in Unicode? (from Re: Egyptian Hieroglyph Man with a Laptop)

Fri Feb 14 16:52:25 CST 2020

>> The solution is to invent my own encoding space. This sits on top of 
>> Unicode, could be (perhaps?) called markup, but it works!

> It may be perilous, because some software may enforce the strict 
> official code point limits.

I have now realized that what I wrote before was ambiguous.

When I wrote "sits on top of Unicode" I did not mean code points above 
U+10FFFF in the Unicode map, though I accept that it could quite 
reasonably be read that way.

My encoding space sits on top of Unicode in the sense that it uses a 
sequence of regular Unicode characters for each code point in my 
encoding space.

For example

∫⑦⑧①

or

!781

or

a character sequence consisting of a base character, followed by a tag 
exclamation mark, three tag digits, and a cancel tag.

All three examples above have the same meaning.
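
As a minimal sketch of how one code number could be written in each of 
the three forms: the function names here are illustrative only, not part 
of any specification. The post does not say which base character the tag 
form uses, so it is passed in as a parameter. The use of U+24EA for a 
circled zero is also an assumption, since no example containing a zero 
is given.

```python
INTEGRAL = "\u222B"        # ∫ INTEGRAL, the introducer in the circled-digit form
CANCEL_TAG = "\U000E007F"  # CANCEL TAG


def circled_digit(d: str) -> str:
    """Map an ASCII digit to a circled digit (assumption: 0 maps to U+24EA)."""
    n = int(d)
    return "\u24EA" if n == 0 else chr(0x2460 + n - 1)


def to_tag(ch: str) -> str:
    """Map an ASCII character to the tag character at U+E0000 plus its code."""
    return chr(0xE0000 + ord(ch))


def circled_form(code: str) -> str:
    return INTEGRAL + "".join(circled_digit(d) for d in code)


def ascii_form(code: str) -> str:
    return "!" + code


def tag_form(code: str, base: str) -> str:
    # 'base' is a caller-supplied placeholder; the source names no specific one.
    return base + to_tag("!") + "".join(to_tag(d) for d in code) + CANCEL_TAG


print(circled_form("781"))  # ∫⑦⑧①
print(ascii_form("781"))    # !781
print([f"U+{ord(c):04X}" for c in tag_form("781", "\u25A1")])
```

The tag form is printed as a list of code points because tag characters 
are normally invisible when displayed.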

∫⑦⑧① is useful because it is less likely than !781 to occur in text with 
some other meaning, though !781 is easier to use and could be used in a 
GS1-128 barcode.

The tag sequence has the potential to be incorporated into Unicode, 
giving a universal standard for unambiguous interoperability. That is a 
long-term goal for me.

The example above uses a three-digit code number. My encoding space 
allows for various numbers of digits, with a minimum of three digits and 
a much larger theoretical maximum. The longest code number currently in 
use in my research project has six digits.
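
A receiving program could recover code numbers from text along the 
following lines. This is only a sketch under stated assumptions: the 
name extract_codes is hypothetical, a circled zero is assumed to be 
U+24EA, and no upper limit on the number of digits is enforced because 
the maximum is not stated here.

```python
import re

# Circled digits back to ASCII (assumption: circled zero is U+24EA).
CIRCLED = {"\u24EA": "0", **{chr(0x2460 + i): str(i + 1) for i in range(9)}}

# Three alternatives, each requiring at least three digits:
#   ∫ + circled digits | ! + ASCII digits | tag ! + tag digits + cancel tag
# The base character before the tag sequence is not part of the match.
PATTERN = re.compile(
    "\u222B([\u24EA\u2460-\u2468]{3,})"
    "|!([0-9]{3,})"
    "|\U000E0021([\U000E0030-\U000E0039]{3,})\U000E007F"
)


def extract_codes(text: str) -> list[str]:
    """Return every code number found in text as a string of ASCII digits."""
    codes = []
    for match in PATTERN.finditer(text):
        circled, ascii_digits, tag_digits = match.groups()
        if circled:
            codes.append("".join(CIRCLED[c] for c in circled))
        elif ascii_digits:
            codes.append(ascii_digits)
        else:
            codes.append("".join(chr(ord(c) - 0xE0000) for c in tag_digits))
    return codes


print(extract_codes("a ∫⑦⑧① b !781 c !12"))  # ['781', '781']
```

The final "!12" is ignored because it has fewer than three digits. Note 
that the "!" form will also match ordinary text such as "!100", which is 
the reason the ∫ form is the safer choice in running prose.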

William Overington

Friday 14 February 2020