Unicode fundamental character identity

Fri Jan 31 17:37:57 CST 2025


On 2025-01-31 11:15 PM, piotrunio-2004 at wp.pl via Unicode wrote:
>
> On 31 January 2025 at 23:45, James Kass <jameskass at code2001.com> 
> wrote:
>
>     (Hi Piotr, I sent this to the list about 45 minutes ago but it has not
>     come through yet so I'm sending it along to you directly.  Hope this
>     helps.  -James)
>
>     -------- Forwarded Message --------
>     Subject: Re: Odp: RE: Re: Re: Unicode fundamental character identity
>     Date: Fri, 31 Jan 2025 22:01:54 +0000
>     From: James Kass <jameskass at code2001.com>
>     To: unicode at corp.unicode.org
>
>
>     On 2025-01-31 9:28 PM, piotrunio-2004 at wp.pl via Unicode wrote:
>
>         The proposal L2/25-037 already shows a plain-text difference
>         between the HP 264x characters: 0x12 (2) connects below a
>         vertical or perpendicular diagonal, whereas 0x18 (8) connects
>         below a diagonal of the same direction. Those are different
>         types of connection, which is a plain-text distinction for
>         box drawings.
>
>     A "smart" font dedicated to these characters would provide appropriate
>     glyphs based on context.  This would result in a plain-text display
>     identical to the original display.
>
> That doesn't make sense because on a fundamental level, in a legacy 
> computing semigraphical environment, each character tile is drawn 
> independently, and only affects the area of the screen dedicated to 
> that character. Having a context dependent system would overcomplicate 
> the renderer beyond the scope of the original system. Furthermore, on 
> the HP 264x system, the two characters can exist in isolation (as 
> shown in <https://i.imgur.com/obGQ4Ie.png>), and the user can in fact 
> type the two characters differently, with the 2 and 8 keys, as shown 
> on page 31 of 204 of the Display Station Reference Manual, Nov 1978 
> (bitsavers.org) 
> <http://www.bitsavers.org/pdf/hp/terminal/264x/2645A/02645-90005_2641A_2645A_2645S_N_Display_Station_Reference_Manual_Nov1978.pdf>.
>
Sorry for the confusion.  I'm referring to a Unicode "smart" font 
working on a modern system displaying Unicode plain text.  Glyph 
selection is automatic and handled by the rendering system.  If a 
dedicated font is used to display the text, contextual glyph 
substitution would make the display indistinguishable from the original 
display on the legacy system.  Also, on a modern system any "dumb" font 
supporting the characters would still produce a *legible* display, even 
though it might not be as pretty.  And legibility in plain text is one 
of the factors driving encoding decisions.  (This might be why font 
selection was mentioned as a solution in the document referenced 
earlier.)

>         Data loss in round-tripping is implicitly evident from the
>         information provided in the proposal: if an HP 264x Large
>         Character set mode document has the characters 0x12 0x18, it
>         converts to Unicode as U+1CE2B U+1CE2B, which converted back
>         to HP 264x Large Character set mode is 0x12 0x12. This loses
>         the distinction between the two characters, so the result
>         will appear slightly different from the original document on
>         the HP 264x platform.
>
>     Yes, this is implicit in the proposal.  Any future proposal should
>     make it explicit while referring to the earlier proposal for
>     background.  Please keep in mind that the committee members must
>     wade through many different proposals covering all aspects of
>     character encoding.  Keep it short, straightforward, and as simple
>     as possible to ease their burden.
>
> The character has already been proposed. What would any future 
> proposal have to do with that?
>
If my understanding is correct, the character has already been proposed 
and rejected.  It's not uncommon for a subsequent proposal to be 
submitted which addresses concerns raised during the rejection of an 
earlier proposal.  (If my understanding is not correct, someone will 
probably set me straight.)
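
For anyone who wants to state the round-trip point explicitly, here is a 
minimal sketch of the loss described in the quoted text above.  The byte 
and code point values (0x12, 0x18, U+1CE2B) are the ones given in this 
thread.  The table and function names are hypothetical, and the two 
tables are not a real or complete HP 264x conversion table.

```python
# Illustrative only: the mappings below are the ones quoted in this thread,
# not an authoritative HP 264x Large Character set conversion table.
HP264X_TO_UNICODE = {
    0x12: 0x1CE2B,  # typed with the "2" key
    0x18: 0x1CE2B,  # typed with the "8" key; maps to the same code point
}

# The reverse direction can name only one byte per code point.
UNICODE_TO_HP264X = {
    0x1CE2B: 0x12,
}


def to_unicode(data: bytes) -> str:
    """Convert HP 264x bytes to a Unicode string using the table above."""
    return "".join(chr(HP264X_TO_UNICODE[b]) for b in data)


def to_hp264x(text: str) -> bytes:
    """Convert a Unicode string back to HP 264x bytes."""
    return bytes(UNICODE_TO_HP264X[ord(ch)] for ch in text)


original = bytes([0x12, 0x18])
as_unicode = to_unicode(original)
round_tripped = to_hp264x(as_unicode)

print(original.hex(" "))                       # 12 18
print([f"U+{ord(c):04X}" for c in as_unicode])  # ['U+1CE2B', 'U+1CE2B']
print(round_tripped.hex(" "))                  # 12 12
```

Because the forward table sends two different bytes to one code point, no 
reverse table can restore both of them; that is the whole of the data 
loss argument in one place.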