Odp: Re: Unicode fundamental character identity

piotrunio-2004 at wp.pl
Fri Jan 31 17:15:38 CST 2025


On 31 January 2025 at 23:45, James Kass <jameskass at code2001.com> wrote:

> (Hi Piotr, I sent this to the list about 45 minutes ago but it has not
> come through yet, so I'm sending it along to you directly. Hope this
> helps. -James)
>
> -------- Forwarded Message --------
> Subject: Re: Odp: RE: Re: Re: Unicode fundamental character identity
> Date: Fri, 31 Jan 2025 22:01:54 +0000
> From: James Kass <jameskass at code2001.com>
> To: unicode at corp.unicode.org
>
> On 2025-01-31 9:28 PM, piotrunio-2004 at wp.pl via Unicode wrote:
>
>> The proposal L2/25-037 already shows a difference in plain text of the
>> HP 264x characters, where 0x12 (2) connects below a vertical or
>> perpendicular diagonal, whereas 0x18 (8) connects below a diagonal of
>> the same direction. Those are different types of connection, which is
>> a plain-text distinction for box drawings.
>
> A "smart" font dedicated to these characters would provide appropriate
> glyphs based on context. This would result in a plain-text display
> identical to the original display.

That doesn't make sense, because on a fundamental level, in a legacy
computing semigraphical environment, each character tile is drawn
independently and only affects the area of the screen dedicated to that
character. A context-dependent system would complicate the renderer
beyond the scope of the original system. Furthermore, on the HP 264x
system the two characters can exist in isolation (as shown in the image
obGQ4Ie.png on i.imgur.com), and the user can in fact type the two
characters separately, with the 2 and 8 keys, as shown on page 31 of 204
of 02645-90005_2641A_2645A_2645S_N_Display_Station_Reference_Manual_Nov1978.pdf
on www.bitsavers.org.

>> Data loss in round-tripping is implicitly evident from the information
>> provided in the proposal: if an HP 264x Large Character set mode
>> document has the characters 0x12 0x18, it converts to Unicode as
>> U+1CE2B U+1CE2B, which converted back to HP 264x Large Character set
>> mode is 0x12 0x12. This loses the distinction between the two
>> characters, and the result will appear slightly different from the
>> original document on the HP 264x platform.
>
> Yes, this is implicit in the proposal. Any future proposal should make
> it explicit while referring to the earlier proposal for background.
>
> Please keep in mind that the committee members must wade through many
> different proposals covering all aspects of character encoding. Keep it
> short, straightforward, and as simple as possible to ease their burden.

The character has already been proposed. What would any future proposal
have to do with that?
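
For illustration, the round-trip loss described above can be sketched in a few lines of Python. This is only a sketch: the table and function names are made up for this example, and the only mappings used are the ones stated in this thread (0x12 and 0x18 both to U+1CE2B, and U+1CE2B back to 0x12). It is not a full HP 264x converter.

```python
# Hypothetical tables for illustration; only the two mappings quoted in
# this thread are included.
HP_TO_UNICODE = {0x12: 0x1CE2B, 0x18: 0x1CE2B}  # two bytes share one code point
UNICODE_TO_HP = {0x1CE2B: 0x12}                  # reverse can pick only one byte


def hp_to_unicode(data: bytes) -> list[int]:
    """Convert HP 264x Large Character set bytes to Unicode code points."""
    return [HP_TO_UNICODE[b] for b in data]


def unicode_to_hp(code_points: list[int]) -> bytes:
    """Convert Unicode code points back to HP 264x Large Character set bytes."""
    return bytes(UNICODE_TO_HP[cp] for cp in code_points)


original = bytes([0x12, 0x18])
as_unicode = hp_to_unicode(original)
round_tripped = unicode_to_hp(as_unicode)

print("original:     ", original.hex(" "))
print("as Unicode:   ", " ".join(f"U+{cp:04X}" for cp in as_unicode))
print("round-tripped:", round_tripped.hex(" "))
print("lossless:     ", round_tripped == original)
```

Because the forward table is not injective, no choice of reverse table can recover both 0x12 and 0x18 from U+1CE2B alone.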