Use of tag characters in a private encoding - is it valid please?

James Kass jameskass at code2001.com
Fri May 3 14:59:28 CDT 2024



On 2024-05-03 12:29 AM, Asmus Freytag via Unicode wrote:
> On 5/2/2024 4:25 PM, James Kass via Unicode wrote:
>> Wouldn’t this kind of private use agreement be considered a higher 
>> level protocol?
>
> No. You can agree to use a font that displays a certain glyph at a 
> certain PUA position. That's a private agreement, but not a "higher 
> level protocol". The way I like to think about it, PUA characters, in 
> contrast to images inserted into the flowed text, constitute plain text 
> (as long as you don't append the font selection instructions via some 
> private tag, e.g. <font pua="use-this.ttf">).

Maybe we're talking about different things.  Of course PUA characters 
are plain text by definition, even when people map all kinds of 
non-textual items to the PUA.  But I'm referring to the substitution of 
a glyph/image for a string of plain-text characters.  This sort of thing 
is very common in fonts.

Any private agreement is an alternate protocol regardless of its 
altitude.  I consider this kind of agreement (substitution of a text 
string with something different) to be "higher level" because it's 
over-and-above.


>>
>> [HTML]
>> Yadda yadda <img src="aardvark.jpg"> et cetera.
>>
>> [tags shown using encircled alphanumerics]
>> Yadda yadda 🆔Ⓠ④⑥②①② et cetera.
> The minute you agree to show different glyphs for non-PUA characters, 
> you are no longer simply conforming to Unicode.

Sorry for not understanding this.  Both examples above involve the 
computer system substituting an image/glyph for a string of text. Both 
examples should be considered conformant.  In either case, the 
underlying encoded text does not get changed.  The higher level protocol 
only affects how that text is displayed.
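
A minimal sketch of that point, under some assumptions: the tag 
characters U+E0020..U+E007E mirror printable ASCII at an offset of 
0xE0000, and the private convention (hypothetical, read off the example 
above) is U+1F194 followed by a run of tag characters.  The names 
to_tags, from_tags, render and the glyph table are made up for 
illustration only.

```python
import re

TAG_BASE = 0xE0000          # tag characters mirror ASCII at this offset
ID_SIGN = "\U0001F194"      # the "ID" sign used as a prefix in the example

def to_tags(ascii_text: str) -> str:
    """Map printable ASCII to the tag characters U+E0020..U+E007E."""
    if not all(0x20 <= ord(c) <= 0x7E for c in ascii_text):
        raise ValueError("only printable ASCII maps to tag characters")
    return "".join(chr(TAG_BASE + ord(c)) for c in ascii_text)

def from_tags(tagged: str) -> str:
    """Inverse of to_tags: recover the ASCII payload from a tag run."""
    return "".join(chr(ord(c) - TAG_BASE) for c in tagged)

# Hypothetical private convention: ID sign followed by one or more tags.
ID_SEQ = re.compile(ID_SIGN + "([\U000E0020-\U000E007E]+)")

def render(text: str, glyphs: dict) -> str:
    """Display-layer substitution: returns what is shown, never edits text."""
    return ID_SEQ.sub(
        lambda m: glyphs.get(from_tags(m.group(1)), m.group(0)), text
    )

stored = "Yadda yadda " + ID_SIGN + to_tags("Q46212") + " et cetera."
shown = render(stored, {"Q46212": "[aardvark image]"})
```

Here `stored` is the encoded text that gets saved or interchanged, and 
`shown` is only what a cooperating display layer presents.  A system 
that doesn't know the convention still receives the same code points.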

> If you create elaborate conventions for the use of tag
> characters you are creating a markup language. It's no
> different from re-using ASCII characters for syntax
> in addition to text.

It's also true when re-using any text characters, public or private, for 
the same purpose.


