From leob at mailcom.com Sat Aug 10 17:48:26 2024
From: leob at mailcom.com (Leo Broukhis)
Date: Sat, 10 Aug 2024 15:48:26 -0700
Subject: Pictographic zodiacal symbols
In-Reply-To: <1609502406.1138839.1721379602249@email.ionos.de>
References: <1609502406.1138839.1721379602249@email.ionos.de>
Message-ID:

What's the semantic difference between the two sets? Without it, it's just different fonts.

Leo

On Fri, Jul 19, 2024 at 2:00 AM A. Stötzner via Unicode <
unicode at corp.unicode.org> wrote:

>
> Besides the simple typographic set of 12 zodiac characters there is a
> tradition of another set, consisting of pictographic symbols of the 12
> zodiac signs, which also play a role in typography (~ 16th c. onwards)
> Has this set been proposed for encoding at any time in the past?
>
> greetings,
> Andreas Stötzner
>
> __________________________________________________________________
>
> *Andreas Stötzner*
> Gestaltung · Archivpflege · Fontentwicklung
> Klauflügelweg 21 · 88400 Biberach a.d. Riß
> 0176-86823396 · as at signographie.de
> post at andronfonts.com · Andronfonts.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Köhler 1710_x.jpg
Type: image/jpeg
Size: 37566 bytes
Desc: not available
URL:

From asmusf at ix.netcom.com Sun Aug 11 19:59:15 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 11 Aug 2024 17:59:15 -0700
Subject: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de>
Message-ID: <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com>

There's arguably a distinction between symbols and pictographs, even if both signify the same concept.

This is different from the case of two different sets of pictographs or two different sets of symbolic notation.
Although, even in those cases it is useful to consider the question: can one of them be substituted for the other with the reader experiencing the choice as stylistic?

A./

On 8/10/2024 3:48 PM, Leo Broukhis via Unicode wrote:
> What's the semantic difference between the two sets? Without it, it's
> just different fonts.
>
> Leo
>
> On Fri, Jul 19, 2024 at 2:00 AM A. Stötzner via Unicode
> wrote:
>
> Besides the simple typographic set of 12 zodiac characters there
> is a tradition of another set, consisting of pictographic symbols
> of the 12 zodiac signs, which also play a role in typography (~
> 16th c. onwards)
> Has this set been proposed for encoding at any time in the past?
> greetings,
> Andreas Stötzner
> __________________________________________________________________
> *Andreas Stötzner*
> Gestaltung · Archivpflege · Fontentwicklung
> Klauflügelweg 21 · 88400 Biberach a.d. Riß
> 0176-86823396 · as at signographie.de
> post at andronfonts.com · Andronfonts.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Köhler 1710_x.jpg
Type: image/jpeg
Size: 37566 bytes
Desc: not available
URL:

From marius.spix at web.de Mon Aug 12 02:55:09 2024
From: marius.spix at web.de (Marius Spix)
Date: Mon, 12 Aug 2024 09:55:09 +0200
Subject: Aw: Re: Pictographic zodiacal symbols
In-Reply-To: <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com>
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com>
Message-ID:

An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 37566 bytes
Desc: not available
URL:

From harjitmoe at outlook.com Mon Aug 12 04:01:59 2024
From: harjitmoe at outlook.com (Harriet Riddle)
Date: Mon, 12 Aug 2024 09:01:59 +0000
Subject: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com>
Message-ID:

Indeed, the zodiac pictographs that were previously missing were deliberately added in Unicode 8.0 (the zodiac symbols were already included).[1] (A pictograph for Ophiuchus remains absent, but the thirteen-sign zodiac is a bit obscure in practice anyway.)

Another interesting thing to note is that the au-by-KDDI emoji set actually initially used pictographic zodiac signs, later changing them to the zodiac symbols for better compatibility with other vendors' emoji sets.[2] These were initially unified to the Unicode zodiac symbols along with the other vendors' zodiac symbols (see e.g. `EmojiSources.txt`)[3], so the additions of emoji pictographs in Unicode 8.0 arguably count as disunifications. (Another outlying case is ?, which was one of the characters disunified in Unicode 8.0 having been unified with ? in Unicode 6.0, but where ? and ? already had separate code points in the GMail emoji private use area (U+FE1E3 and U+FE02E respectively), just not in any of the JCarrier vendors' private use areas.)

[1] https://www.unicode.org/reports/tr51/tr51-3-archive.html#Faces_Hands_Zodiac
[2] https://www.au.com/content/dam/au-com/mobile/service/emoji/pdf/taiohyo_03.pdf
[3] https://www.unicode.org/Public/UCD/latest/ucd/EmojiSources.txt

________________________________________
From: Unicode on behalf of Marius Spix via Unicode
Sent: 12 August 2024 08:55
To: unicode at corp.unicode.org ; as at signographie.de
Subject: Aw: Re: Pictographic zodiacal symbols

All these zodiacs can be represented with existing characters:
? ? ? ? ? ? ? ?? ? ? ? ???? ??/?? ?

Sent: Monday, 12
August 2024 at 02:59
From: "Asmus Freytag via Unicode"
To: unicode at corp.unicode.org
Subject: Re: Pictographic zodiacal symbols

There's arguably a distinction between symbols and pictographs, even if both signify the same concept.

This is different from the case of two different sets of pictographs or two different sets of symbolic notation.

Although, even in those cases it is useful to consider the question: can one of them be substituted for the other with the reader experiencing the choice as stylistic?

A./

On 8/10/2024 3:48 PM, Leo Broukhis via Unicode wrote:
What's the semantic difference between the two sets? Without it, it's just different fonts.

Leo

On Fri, Jul 19, 2024 at 2:00 AM A. Stötzner via Unicode wrote:

Besides the simple typographic set of 12 zodiac characters there is a tradition of another set, consisting of pictographic symbols of the 12 zodiac signs, which also play a role in typography (~ 16th c. onwards)
Has this set been proposed for encoding at any time in the past?

greetings,
Andreas Stötzner

__________________________________________________________________

Andreas Stötzner
Gestaltung · Archivpflege · Fontentwicklung
Klauflügelweg 21 · 88400 Biberach a.d. Riß
0176-86823396 · as at signographie.de
post at andronfonts.com · Andronfonts.com

From ecm.unicode at gmail.com Mon Aug 12 08:43:00 2024
From: ecm.unicode at gmail.com (Erik Carvalhal Miller)
Date: Mon, 12 Aug 2024 09:43:00 -0400
Subject: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com>
Message-ID:

On Mon, Aug 12, 2024 at 4:01 AM Marius Spix via Unicode wrote:
>
> All these zodiacs can be represented with existing characters:
> 🧜‍♂️

Aquarius is traditionally the Water Bearer, not some water dweller.

From mark at kli.org Mon Aug 12 18:57:24 2024
From: mark at kli.org (Mark E.
Shoulson)
Date: Mon, 12 Aug 2024 19:57:24 -0400
Subject: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com>
Message-ID: <37890960-87a4-46e9-b351-8de97069598b@kli.org>

?

(from what I can see in Hebrew poetry etc, aquarius is translated as דלי, "bucket".)

~mark

On 8/12/24 09:43, Erik Carvalhal Miller via Unicode wrote:
> On Mon, Aug 12, 2024 at 4:01 AM Marius Spix via Unicode
> wrote:
>> All these zodiacs can be represented with existing characters:
>> 🧜‍♂️
> Aquarius is traditionally the Water Bearer, not some water dweller.

From asmusf at ix.netcom.com Tue Aug 13 00:21:08 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Mon, 12 Aug 2024 22:21:08 -0700
Subject: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com>
Message-ID: <1065a560-ba66-4f2d-87c0-13e632e0ee13@ix.netcom.com>

Somewhere we dropped the list. Adding back on.

A "user-defined variation selector" makes no sense, because Unicode will not reserve code points with a predefined Default_Ignorable property for such a purpose. We've just had this discussion again and there's pretty strong consensus on that point.

Now, if you were to use a PUA character and treat it like a variation selector, that's up to you (and people who subscribe to your PUA assignments) but it doesn't behave like a regular VS, which is ignorable if you don't / can't process it. Might as well use regular PUA characters.

Which gets you back to the question whether these are/should be considered substitutable and/or whether there is significance in the choice, and if so, what it would be.

There's one other question that is typically asked, and that is whether there's a need for contrasting usage. The latter seems absent in this case.
We've learned a painful lesson that identifying symbols (even borderline pictorial ones) with emoji was a very, very, very bad idea and even adding variation selectors did not fix that very, very, very bad idea. But we're stuck with it and the TCs vow to never, ever, ever, repeat this mistake.

The question then is whether the distinction between a symbolic (schematic) representation and a full pictorial one is similar, and what the relation of the latter would / should be to emoji.

Those questions don't have obvious answers, which means, there's a benefit of raising them in a well-reasoned proposal (but one that should carefully address the issues I've laid out here).

This would give the TCs and WGs a chance to try to fine-tune the encoding principles in that area, as well as rule on the specific case.

A./

On 8/12/2024 7:40 AM, Leo Broukhis wrote:
> Then it looks like a perfect case for a (user-defined?) variation
> selector.
>
> Leo
>
> On Sun, Aug 11, 2024 at 5:59 PM Asmus Freytag via Unicode
> wrote:
>
> There's arguably a distinction between symbols and pictographs,
> even if both signify the same concept.
>
> This is different from the case of two different sets of pictographs
> or two different sets of symbolic notation.
>
> Although, even in those cases it is useful to consider the
> question: can one of them be substituted for the other with the
> reader experiencing the choice as stylistic?
>
> A./
>
> On 8/10/2024 3:48 PM, Leo Broukhis via Unicode wrote:
>> What's the semantic difference between the two sets? Without it,
>> it's just different fonts.
>>
>> Leo
>>
>> On Fri, Jul 19, 2024 at 2:00 AM A. Stötzner via Unicode
>> wrote:
>>
>> Besides the simple typographic set of 12 zodiac characters
>> there is a tradition of another set, consisting of
>> pictographic symbols of the 12 zodiac signs, which also play
>> a role in typography (~ 16th c. onwards)
>> Has this set been proposed for encoding at any time in the past?
>> greetings,
>> Andreas Stötzner
>> __________________________________________________________________
>>
>> *Andreas Stötzner*
>> Gestaltung · Archivpflege · Fontentwicklung
>> Klauflügelweg 21 · 88400 Biberach a.d. Riß
>> 0176-86823396 · as at signographie.de
>> post at andronfonts.com · Andronfonts.com
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Köhler 1710_x.jpg
Type: image/jpeg
Size: 37566 bytes
Desc: not available
URL:

From ecm.unicode at gmail.com Tue Aug 13 10:06:23 2024
From: ecm.unicode at gmail.com (Erik Carvalhal Miller)
Date: Tue, 13 Aug 2024 11:06:23 -0400
Subject: Pictographic zodiacal symbols
In-Reply-To: <37890960-87a4-46e9-b351-8de97069598b@kli.org>
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com> <37890960-87a4-46e9-b351-8de97069598b@kli.org>
Message-ID:

On Mon, Aug 12, 2024 at 10:54 PM Mark E. Shoulson via Unicode wrote:
>
> ?
>
> (from what I can see in Hebrew poetry etc, aquarius is translated as
> דלי, "bucket".)

Indeed, one of the pages Harriet Riddle linked to (https://www.unicode.org/reports/tr51/tr51-3-archive.html#Faces_Hands_Zodiac) contains a listing for U+1F3FA 🏺 AMPHORA, explicitly described as a representation of Aquarius; like a bucket, an amphora is a suitable container for water. (The font I’m seeing shows the amphora bearing a design that could be said to be a squarish depiction of waves of water; to evoke the astrological connection, the glyph would have been better served by replacing that design by something resembling U+2652 ♒ AQUARIUS - alas.)

Marius Spix’s merman may seem linguistically appropriate in German, where the word Wassermann (“water man”) can mean both “merman” and “Aquarius”. But the Latin word Aquarius means “water carrier” - however in English usually explained as “Water Bearer” -
and the classical depiction is much as in the image in Andreas Stötzner’s message: a human figure (generally male, in accordance with the Latin -us suffix and, I presume, with the gender roles of antiquity - though depictions need not be exclusively male) pouring water from a fair-sized vessel, such as an amphora or a sack, evidently onto the ground or into more water. The constellation Aquarius is said to depict the water bearer pouring the water into the river constellation Eridanus.

I suppose that for some, the amphora emoji works as a representation of Aquarius; but I feel that in excluding the human figure - the aquarius himself - and the action of pouring, the emoji misses the point. I suppose one could fake it with a sequence such as ?????; but even as a fallback for a ZWJ sequence, that seems pathetic.

Some of the other suggestions in Marius’ inspired list are also a mite suspicious. As the Zodiac’s Gemini are traditionally associated with the brothers Castor and Pollux, ??? would seem a better choice, though I suppose ??? and ??? could be serviceable alternates. ??? is a pretty good attempt at Sagittarius (also on the aforementioned page) - but it’s missing the archer (the very meaning of the Latin word sagittarius)! Traditionally that archer is a centaur; a ZWJ sequence of centaur + bow & arrow could work, but I see that a centaur emoji proposal was declined a couple of years ago. (I’m not aware that there’s been a proposal for a centaur Archer specifically…) And the WOMAN emoji, often manifested as a head shot, seems a little too generic for the nubile and sometimes winged female figure usually associated with Virgo. ????, perhaps?

But Aquarius seems to fare the worst when it comes to encoded pictographs.

From mark at kli.org Tue Aug 13 10:14:37 2024
From: mark at kli.org (Mark E.
Shoulson)
Date: Tue, 13 Aug 2024 11:14:37 -0400
Subject: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com> <37890960-87a4-46e9-b351-8de97069598b@kli.org>
Message-ID: <3d35c9b5-2a5c-4d2d-80e5-fde277d5b4d7@kli.org>

On 8/13/24 11:06, Erik Carvalhal Miller via Unicode wrote:
> But the Latin word Aquarius means “water carrier” - however in English
> usually explained as “Water Bearer” -

I've always wondered, then, why the "thirteenth zodiac sign" is Ophiuchus and not Serpentarius. All the other zodiac signs are named in Latin, why this one in Greek? I did once actually find, in a deck of cards showing the constellations, that constellation labeled "Ophiucus or Serpentarius", but really nowhere else.

~mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ecm.unicode at gmail.com Tue Aug 13 10:26:09 2024
From: ecm.unicode at gmail.com (Erik Carvalhal Miller)
Date: Tue, 13 Aug 2024 11:26:09 -0400
Subject: Pictographic zodiacal symbols
In-Reply-To: <3d35c9b5-2a5c-4d2d-80e5-fde277d5b4d7@kli.org>
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com> <37890960-87a4-46e9-b351-8de97069598b@kli.org> <3d35c9b5-2a5c-4d2d-80e5-fde277d5b4d7@kli.org>
Message-ID:

Ophiuchus is Latin - or Latinized, anyway; the Greek word (transliterated) ends in -os. But you’re right: If you’re going to go with Latin, why not do so whole hog (or whole snake, as it were)?

On Tue, Aug 13, 2024 at 11:20 AM Mark E. Shoulson via Unicode wrote:
>
> On 8/13/24 11:06, Erik Carvalhal Miller via Unicode wrote:
>
> But the Latin word Aquarius means “water carrier” - however in English usually explained as “Water Bearer” -
>
> I've always wondered, then, why the "thirteenth zodiac sign" is Ophiuchus and not Serpentarius. All the other zodiac signs are named in Latin, why this one in Greek?
> I did once actually find, in a deck of cards showing the constellations, that constellation labeled "Ophiucus or Serpentarius", but really nowhere else.
>
> ~mark

From avidseeker7 at protonmail.com Mon Aug 12 18:59:28 2024
From: avidseeker7 at protonmail.com (Avid Seeker)
Date: Mon, 12 Aug 2024 23:59:28 +0000
Subject: Re-evaluate directionality of Arabic Forms-B characters
Message-ID:

Greetings,

Reading the Unicode Bidirectional Algorithm (https://www.unicode.org/reports/tr9/), I was wondering about setting Arabic Presentation Forms-B as strong LTR characters instead of RTL characters.

Reasons:

1. According to https://www.unicode.org/versions/Unicode6.0.0/:

   The presentation forms are present only for compatibility with older standards, and are not currently needed for coding text.

2. Forms-B characters are not used by GUI applications since they have their own lettershaping capabilities.

3. The only use case that is important is the support for Arabic in tty, old terminals, and terminals with no Bidi/lettershaping support. Programs (e.g. Vim, fribidi) in these terminals use Arabic Forms-B for applying lettershaping within TUI. And when those programs use Forms-B they assume they have the same directionality as LTR characters. Please have a look at this discussion: https://gitlab.gnome.org/GNOME/vte/-/issues/2804#note_2194384

Regards,
- Avid Seeker
-------------- next part --------------
A non-text attachment was scrubbed...
Name: publickey - avidseeker7 at protonmail.com - 0x7EECE85C.asc
Type: application/pgp-keys
Size: 669 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 249 bytes
Desc: OpenPGP digital signature
URL:

From martin.vahi at softf1.com Tue Aug 13 15:04:28 2024
From: martin.vahi at softf1.com (Martin Vahi)
Date: Tue, 13 Aug 2024 23:04:28 +0300
Subject: Have Characters that Depict Electronic Components been Discussed?
Message-ID: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>

Dear readers of this list,

I know that there is literally a shit emoji character, but when I tried to find characters for electronic components like diodes, capacitors, resistors, radio lamps, etc. then I failed to find any. The same with XOR gate, OR gate, AND gate, MUX, DEMUX, etc.

Even ASCII had special characters for drawing the DOS era windows in console, so a wish for some characters that at least in some combined manner would allow to draw electrical schematics in console windows does not look too extreme to me. Some reference to some mail archive, where that topic has been discussed in the past, would be helpful.

As a side-note, some modern era Linux terminals allow to display graphics, even videos, in pixel analogues called sixels. I even have a YouTube demo video about that:

    ("2022 06 17 images and videos on WSL Linux Terminal", 2023_03_08)
    https://www.youtube.com/watch?v=SBLSa7X8dEY

Thank You for reading my letter and thank You for the answer(s).

Yours sincerely,
Martin.Vahi at softf1.com

From list+unicode at jdlh.com Tue Aug 13 16:09:31 2024
From: list+unicode at jdlh.com (Jim DeLaHunt)
Date: Tue, 13 Aug 2024 14:09:31 -0700
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID: <5b088d9e-7749-4aaa-86c5-f4145826fe47@jdlh.com>

Hello, Martin, and welcome to Unicode:

On 2024-08-13 13:04, Martin Vahi via Unicode wrote:
>
> Dear readers of this list,
>
> I know that there is literally a shit emoji character, but when I tried
> to find characters for electronic components like diodes, capacitors,
> resistors, radio lamps, etc. then I failed to find any. The same with
> XOR gate, OR gate, AND gate, MUX, DEMUX, etc.
> …a wish for some
> characters that at least in some combined
> manner would allow to draw electrical schematics in console windows
> does not look too extreme to me. Some reference to some mail archive,
> where that topic has been discussed in the past, would be helpful.

I am not aware of a discussion of encoding symbols for electrical schematics in Unicode. I am however aware of numerous proposals to encode various graphical symbols in general in Unicode. Those proposals, and the arguments against them, are so common that there are sections of The Unicode Standard and of the Emoji process which describe what gets encoded and what does not.

Consider (re-)reading the following:

* The Core Specification of The Unicode Standard, section 2.2 Unicode Design Principles. Consider especially the principles "Plain text" and "Characters, not glyphs".
* Guidelines for Submitting Unicode® Emoji Proposals, especially the "Selection Factors" section

Some questions I would ask of anyone proposing to encode electrical symbols in Unicode:

Are these symbols used in a plain text context? Do people want to write in an email, ?

Do you have evidence of people using such symbols in text outside of computer-based plain text? For instance, do you have examples of people hand-writing text with electrical symbols mingled in the text?

Do you have evidence of people trying to draw electrical schematics in console windows? Why are those people trying to use text drawing mechanisms instead of graphics mechanisms like SVG?

Overall, my response to

> …a wish for some characters that at least in some combined
> manner would allow to draw electrical schematics in console windows
> does not look too extreme to me.…

is that it does look quite extreme to me. Electrical schematics are a two-dimensional graphical representation of an electrical circuit. The right tool for the job is graphics, not text, it seems to me.
Also, in the context of discussions about Unicode, certain words are terms of art and their meanings matter. So,

> Even ASCII had special characters for drawing the DOS era windows in
> console….

The word "ASCII" refers to a particular character encoding standard, aka ANSI_X3.4-1968, aka ISO_646.irv:1991. It has just 128 code points. I am not aware of any which are for drawing DOS era windows. Maybe you are referring to IBM Code page 437, the character set of the original IBM PC. It included ASCII plus a set of line-drawing symbols and some icons. When talking about Unicode proposals, getting the terminology right for other encoding standards reduces confusion and speeds up the discussion.

> As a side-note, some modern era Linux terminals allow to display
> graphics, even videos, in pixel analogues called sixels.
> I even have a YouTube demo video about that:
>
>     ("2022 06 17 images and videos on WSL Linux Terminal", 2023_03_08)
> https://www.youtube.com/watch?v=SBLSa7X8dEY

I have not watched this video all the way through. But the Wikipedia article on Sixel seems to say that sixels are an encoding of image data, in 6-bit units, as ASCII characters. Terminals which display sixel-encoded image data switch into a "sixel mode", in which they interpret the data stream as an image rather than as text. I see no intention that the images represented as sixels be legible together with, and mixed together with, text content. Thus sixel encoding seems to be a higher-level protocol which re-purposes text data channels to transmit graphical content, and not a form of text content.

> Thank You for reading my letter and
> thank You for the answer(s).
>
> Yours sincerely,
> Martin.Vahi at softf1.com

Is this the sort of answer you were looking for?

Best regards,
     --Jim DeLaHunt

--
.
--Jim DeLaHunt, jdlh at jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant, Vancouver, B.C., Canada
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marius.spix at web.de Tue Aug 13 16:31:11 2024
From: marius.spix at web.de (Marius Spix)
Date: Tue, 13 Aug 2024 23:31:11 +0200
Subject: Aw: Re: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com> <37890960-87a4-46e9-b351-8de97069598b@kli.org>
Message-ID:

An HTML attachment was scrubbed...
URL:

From ecm.unicode at gmail.com Tue Aug 13 16:34:48 2024
From: ecm.unicode at gmail.com (Erik Carvalhal Miller)
Date: Tue, 13 Aug 2024 17:34:48 -0400
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <5b088d9e-7749-4aaa-86c5-f4145826fe47@jdlh.com>
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com> <5b088d9e-7749-4aaa-86c5-f4145826fe47@jdlh.com>
Message-ID:

On Tue, Aug 13, 2024 at 5:14 PM Jim DeLaHunt via Unicode wrote:
> I am not aware of a discussion of encoding symbols for electrical schematics in Unicode. I am however aware of numerous proposals to encode various graphical symbols in general in Unicode. Those proposals, and the arguments against them, are so common that there are sections of The Unicode Standard and of the Emoji process which describe what gets encoded and what does not.
>
> Consider (re-)reading the following:
>
> The Core Specification of The Unicode Standard, section 2.2 Unicode Design Principles. Consider especially the principles "Plain text" and "Characters, not glyphs".
> Guidelines for Submitting Unicode®
> Emoji Proposals, especially the "Selection Factors" section

Rather on point, regarding not so much the elaboration of those principles as their application, would be the following tidbit, from The Unicode Standard, chapter 22, §22.7 “Technical Symbols”, under “Miscellaneous Technical: U+2300–U+23FF” on pg. 884 (pg. 41 of https://www.unicode.org/versions/Unicode15.0.0/ch22.pdf):

    “This block encodes technical symbols, including keytop labels such as U+232B ERASE TO THE LEFT. Excluded from consideration were symbols that are not normally used in one-dimensional text but are intended for two-dimensional diagrammatic use, such as most symbols for electronic circuits.”

From marius.spix at web.de Tue Aug 13 16:43:07 2024
From: marius.spix at web.de (Marius Spix)
Date: Tue, 13 Aug 2024 23:43:07 +0200
Subject: Fw: Aw: Re: Pictographic zodiacal symbols
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com> <37890960-87a4-46e9-b351-8de97069598b@kli.org>
Message-ID:

An HTML attachment was scrubbed...
URL:

From beckiergb at gmail.com Tue Aug 13 16:47:30 2024
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Tue, 13 Aug 2024 14:47:30 -0700
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

The Sharp MZ-700, a Japanese 8-bit microcomputer, has a set of electronic schematic symbols, and they will be included in Unicode 16.0.

[image: Screenshot from 2024-08-13 13-43-58.png]

The rationale for including these though is that of compatibility, and not because of any determination that schematic symbols qualify for encoding in Unicode.

I was part of the group who proposed these symbols.
I vaguely recall when we were discussing these particular symbols that there was some verbiage about Unicode encoding text and "notational systems," and one of us brought up that it could be argued that schematic symbols are a "notational system." However, this was a very minor theoretical point; no one has attempted such an argument and that was not the argument we were making. As far as this goes in practice "notational systems" seems limited to things like phonetic transcription, chess notation, sheet music, and other things that are at least vaguely text-like, as opposed to a fully two-dimensional diagram like an electronic schematic.

-- Rebecca Bettencourt

On Tue, Aug 13, 2024 at 1:20 PM Martin Vahi via Unicode <
unicode at corp.unicode.org> wrote:

>
> Dear readers of this list,
>
> I know that there is literally a shit emoji character, but when I tried
> to find characters for electronic components like diodes, capacitors,
> resistors, radio lamps, etc. then I failed to find any. The same with
> XOR gate, OR gate, AND gate, MUX, DEMUX, etc.
>
> Even ASCII had special characters for drawing the DOS era windows in
> console, so a wish for some characters that at least in some combined
> manner would allow to draw electrical schematics in console windows
> does not look too extreme to me. Some reference to some mail archive,
> where that topic has been discussed in the past, would be helpful.
>
> As a side-note, some modern era Linux terminals allow to display
> graphics, even videos, in pixel analogues called sixels.
> I even have a YouTube demo video about that:
>
> ("2022 06 17 images and videos on WSL Linux Terminal", 2023_03_08)
> https://www.youtube.com/watch?v=SBLSa7X8dEY
>
>
> Thank You for reading my letter and
> thank You for the answer(s).
>
> Yours sincerely,
> Martin.Vahi at softf1.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot from 2024-08-13 13-43-58.png
Type: image/png
Size: 139430 bytes
Desc: not available
URL:

From harjitmoe at outlook.com Tue Aug 13 16:58:40 2024
From: harjitmoe at outlook.com (Harriet Riddle)
Date: Tue, 13 Aug 2024 22:58:40 +0100
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
Message-ID:

An HTML attachment was scrubbed...
URL:

From beckiergb at gmail.com Tue Aug 13 18:17:31 2024
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Tue, 13 Aug 2024 16:17:31 -0700
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References:
Message-ID:

On Tue, Aug 13, 2024 at 3:02 PM Harriet Riddle via Unicode <
unicode at corp.unicode.org> wrote:

> Novel pseudographic characters don't generally get added to Unicode, but
> pseudographic characters from pre-Unicode code pages often do.
>

I wouldn't say often. This has happened maybe three or four times in Unicode's over-30-year history, with increasing resistance (pardon the pun) each time.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ecm.unicode at gmail.com Tue Aug 13 18:31:40 2024
From: ecm.unicode at gmail.com (Erik Carvalhal Miller)
Date: Tue, 13 Aug 2024 19:31:40 -0400
Subject: Fw: Aw: Re: Pictographic zodiacal symbols
In-Reply-To:
References: <1609502406.1138839.1721379602249@email.ionos.de> <035585cd-77be-4224-9b7a-270b6ead6f50@ix.netcom.com> <37890960-87a4-46e9-b351-8de97069598b@kli.org>
Message-ID:

On Tue, Aug 13, 2024 at 5:47 PM Marius Spix via Unicode wrote:
> It really is a German folk etymology thing, that Aquarius is often depicted as a merman or trident in German-speaking publications, as both are translated as Wassermann. But the amphora is also often seen.

Interesting, thank you for the information. I am an American and an Aquarius, and I am not used to seeing a lone amphora or similar container representing that sign; ordinarily I would expect either the symbol ♒
or a pictograph as I described, with a human figure pouring water from some container. Since childhood I have often wondered at the name Water Bearer, since the depictions show a person better at spilling water than bearing it! The French name Verseau makes much more sense: etymologically verse-eau, “pours water”.

For kicks, I just did Google Images searches for “Aquarius zodiac”, “Wassermann Tierkreis”, and “Verseau zodiaque” and looked at several screens’ worth of results for each. All affirmed ♒ as common. All three showed the Water Spiller (ahem) imagery, but it was much less common in the German results and most common in the English results. Predictably, merfolk often showed up in the German results and hardly figured in the English and French results. Amphorae and similar vessels sans bearer were an unmistakably recurrent presence in the results for all three languages, more so than I would have expected before this discussion; however, in contrast with the 🏺 glyphs I have come across, the bearer-less images mimicked the bearer images in almost uniformly depicting water pouring or overflowing from the vessels. Not many tridents were seen; most of the tridents that did show up were in the German results, and I think all the tridents were in images that contained at least one of the other elements heretofore described. Of course, all three searches also yielded pictures of the constellation, sometimes in association with the other imagery.

On the basis of the search results, I would still suggest that Water Bearer imagery is a significant omission from the emoji/pictograph zodiac repertoire, though much more significant for English and French contexts than for German ones. The bearer-less amphora with water flowing out is also an issue; one approach is to update glyphs for U+1F3FA, and another is to juxtapose U+1F3FA with U+1F4A6 💦 SPLASHING SWEAT SYMBOL (which seems appropriate enough for water), possibly in a ZWJ sequence.
Merfolk emoji are already available (as already noted), as is U+1F531 🔱 TRIDENT EMBLEM.

After I wrote that Sagittarius is traditionally a centaur archer, I took another look at the image Andreas Stötzner provided; and I realized that its archer looks quite human, and not like any Unicode emoji. However, U+1F3F9 🏹 BOW AND ARROW still seems a good substitute, if not quite the same thing.

I, uh, did not repeat the Google Images experiment; my eyeballs need some rest.

From billposer at alum.mit.edu Tue Aug 13 18:42:43 2024
From: billposer at alum.mit.edu (Bill Poser)
Date: Tue, 13 Aug 2024 16:42:43 -0700
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References:
Message-ID:

Phonetic transcription (that is, IPA and IPA-like symbols) is not merely "text-like" but routinely appears within ordinary text in publications on linguistics and sometimes other fields such as anthropology. This contrasts with things like symbols for electronic schematics.

From doug at ewellic.org Tue Aug 13 18:55:20 2024
From: doug at ewellic.org (Doug Ewell)
Date: Tue, 13 Aug 2024 23:55:20 +0000
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References:
Message-ID:

Harriet Riddle wrote:

> Novel pseudographic characters don't generally get added to Unicode,
> but pseudographic characters from pre-Unicode code pages often do.

Our group, which was led by Rebecca Bettencourt and included several other contributors, was responsible for 214 legacy computing symbols added to Unicode 13.0 and another 736 to be added to Unicode 16.0. That’s a lot of characters, but I don’t think I would use the term “often” to describe two proposals across three years.

Furthermore, we are done; the group will not be proposing any additional sets of legacy computing symbols in the future. Not only do the two approved sets already cover every identifiable legacy platform that achieved non-trivial market share, but the group made a commitment to SEW and UTC not to come back with more, to allay fears that the set of such characters might be “never-ending.”

That’s not to say other legacy symbols might not turn up in the future, nor that they might not present a legitimate case for encoding; but they will have to be proposed separately, by different individuals or groups, and not considered as any sort of follow-on to L2/19-025 and L2/21-235.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From egmont at gmail.com Wed Aug 14 05:23:48 2024
From: egmont at gmail.com (Egmont Koblinger)
Date: Wed, 14 Aug 2024 12:23:48 +0200
Subject: Re-evaluate directionality of Arabic Forms-B characters
Message-ID:

Hello,

I'd like to speak up here too: I believe this proposed change is a bad one which should be rejected.

The directionality of Forms-B characters matters if you run UBA on a string containing such characters. I believe this shouldn't happen under normal circumstances; it signals some bigger underlying problem, like double-BiDi'ing a piece of text. If / whenever this happens, the root cause should be addressed, rather than mitigating the symptoms.

In (arguably broken) contexts where this happens anyway, the proposed fix would fix the layout of Arabic words, but would leave Hebrew text broken (reversed). It's unacceptable for any RTL-related fix to only address scripts that happen to have the (fundamentally unrelated) concept of shaping and not ones that don't.

The suggested change would make "shaping" do more than just shaping; it would also influence a far more fundamental thing: the order of the characters. "Shaping" should remain "shaping" only.

> 3. The only use case that is important is the support for Arabic in tty, old
> terminals, and terminals with no Bidi/lettershaping support.

This sentence is incorrect, for two vastly different reasons.

One is that there's absolutely no problem in old terminals, in terminals that don't know anything about BiDi. There the directionality of Forms-B characters is irrelevant and changing it wouldn't change anything. It's exactly the opposite: the problem occurs in terminals that _do_ perform BiDi-shuffling. Based on the conversation in the VTE bugtracker, I believe this was a simple oversight by OP, who wanted to mention the other category.

The second problem is that the claim that there's no other use case is not backed up at all. We can't know what other existing software such a backwards-incompatible change would break.

> And when those programs use Forms-B they assume they
> have the same directionality as LTR characters

Unicode is clear that Forms-B characters have RTL directionality (which I believe is a good thing, because this way the correct ordering of the letters remains orthogonal to shaping). If a piece of software assumes otherwise, then that piece of software is not Unicode/UBA-conformant. The solution is to adjust that software to match Unicode, not the other way around.

Please see my detailed arguments in the discussion already linked by OP (i.e. not just that particular linked comment but the entire discussion).
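
A minimal sketch of the directionality point above, assuming Python's standard unicodedata module and whatever version of the Unicode Character Database it bundles. The sample code points and the helper names bidi_class and is_rtl are chosen here only for illustration; they do not come from the thread.

```python
import unicodedata

# Sample characters, picked purely for illustration.
SAMPLES = {
    "ARABIC LETTER ALEF ISOLATED FORM (Forms-B)": "\uFE8D",
    "ARABIC LETTER ALEF": "\u0627",
    "HEBREW LETTER ALEF": "\u05D0",
    "LATIN CAPITAL LETTER A": "A",
}


def bidi_class(ch: str) -> str:
    """Return the Bidi_Class recorded in Python's bundled UCD."""
    return unicodedata.bidirectional(ch)


def is_rtl(ch: str) -> bool:
    """True for the two strong right-to-left classes, R and AL."""
    return bidi_class(ch) in ("R", "AL")


if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        print(f"U+{ord(ch):04X} {name}: {bidi_class(ch)} (rtl={is_rtl(ch)})")
```

The Forms-B sample reports the same strong class, AL, as the ordinary Arabic letter, which is why a UBA implementation reorders both in the same way.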
I am truly hoping that after a careful analysis of the situation you will conclude that Unicode's current behavior here is much better than the proposed one would be.

Thanks a lot,
Egmont Koblinger
(VTE and GNOME Terminal co-developer, author and VTE-implementer of the "BiDi in Terminal Emulators" proposal)

From martin.vahi at softf1.com Wed Aug 14 14:37:17 2024
From: martin.vahi at softf1.com (Martin Vahi)
Date: Wed, 14 Aug 2024 22:37:17 +0300
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

Thank You all for the answers, useful references and interesting history, but while reading Your nice answers that seem to revolve around the idea that Unicode is meant to be a standard for only the kind of text that people once wrote on paper, not for "2D drawing hacks" like ASCII art and diagrams, I devilishly stumbled upon the idea that plain text in computers has always been used for more than just classical literature. An example that nobody reads as a written form of a human language is an interactive progress bar that consists of dots like

|0%........ 100%|

or even a "teletype compatible" progress bar analogue like

0%
1%
2%
3%
4%
5%
(and so on till 100%)

Aren't such use cases LEGAL USE CASES for text in computers?

I know that I sound a bit like a troll by asking such questions, but really, if text is being used for a lot of novel "hacks" like ASCII art and progress bars in computers, and it would be quite cheap in terms of code points spent to define some small set of special characters for doing almost arbitrary 2D drawing, then what's the harm of defining that small set of such "sprite role" characters, especially if there are already so many characters defined in Unicode?
With such a solution there might not be a need for many new characters in the future, because they might be drawn as combinations of existing characters.

For example, if a monospace character area is divided into 16 pixel rows and 8 pixel columns and each character of that drawing character set fills exactly one of those pixels, then there would be exactly 16*8=128 such one-pixel-depicting characters. Those 128 characters could be visually used on top of each other just like accents are rendered on top of a letter. What's the harm to the Unicode standard in defining such special characters? It would also solve the issue with font files, because a set of installed font files will never be able to contain fonts for absolutely all modern Unicode code points without being constantly updated, but fonts for those 128 characters could be installed once, and then a "legacy computer" that has fonts for those 128 characters could still draw/render "email from the future" with "future characters" if the future email standard says that a sending email client can embed font file analogues for those "future characters" in the form of strings that consist of some combination of those 128 2D drawing characters. It's a 2D hack, but it can make plain-text-based software more future-proof, and ideologically the 2D drawing characters are in the same category as the interactive text console progress bar: not usable in paper books and probably hard to pronounce in any human language.

Thank You for Your answers and thank You for reading my letter(s).

Sincerely Yours,
Martin.Vahi at softf1.com

From abrahamgross at disroot.org Wed Aug 14 14:53:36 2024
From: abrahamgross at disroot.org (ag disroot)
Date: Wed, 14 Aug 2024 19:53:36 +0000 (UTC)
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

Characters like the box drawing characters would never be accepted these days.
As in, if they weren't in Unicode already and if old standards hadn't added them, they would never be accepted, because they are graphic by nature and you should use a higher-level protocol for that. These 128 chars follow the same logic, unfortunately.

From junicode at jcbradfield.org Wed Aug 14 14:55:34 2024
From: junicode at jcbradfield.org (Julian Bradfield)
Date: Wed, 14 Aug 2024 20:55:34 +0100 (BST)
Subject: Have Characters that Depict Electronic Components been Discussed?
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

On 2024-08-14, Martin Vahi via Unicode wrote:
> text that people once wrote on paper, not for "2D drawing hacks" like
> ASCII art and diagrams, I devilishly stumbled upon the idea that plain
> text in computers has always been used for more than just classical
> literature. An example that nobody reads as written form of a human
> language, is an interactive progress bar that consists of dots like
>
> |0%........ 100%|

That is a use of *existing* plain text to approximate the desired graphical impression.

> art and progress bar in computers and it would be quite cheap from spent
> code points amount point of view to define some small set of special
> characters for doing almost arbitrary 2D drawing, then what's the harm
> of defining that small set of such "sprite role" characters, specially

Unicode does not encode what *might* be used, it encodes what *has* been used.

> For example, if a monospace character area is divided to 16 pixel
> rows and 8 pixel columns and each character of that drawing character
> set fills exactly one of those pixels, then there would be exactly
> 16*8=128 such one pixel depicting characters. Those 128 characters
> could be visually used on top of each other just like accents are

You are William Overington and I claim my five pounds ... (British cultural reference, google Lobby Lud).
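
For readers following the 16*8=128 arithmetic quoted above, here is a rough sketch of how the proposed one-pixel characters would be counted. Everything in it is hypothetical: no such characters exist in Unicode, the row-major numbering is an assumption made here, and the names pixel_index and pixels_for_bitmap are invented for illustration.

```python
ROWS, COLS = 16, 8  # cell geometry taken from the quoted proposal


def pixel_index(row: int, col: int) -> int:
    """Map a pixel position to an index 0..127 (row-major order, an assumption)."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("pixel outside the 16x8 cell")
    return row * COLS + col


def pixels_for_bitmap(bitmap: list[str]) -> list[int]:
    """List the hypothetical pixel characters needed to draw one cell.

    bitmap holds ROWS strings of COLS characters; '#' marks a set pixel.
    """
    return [
        pixel_index(r, c)
        for r, line in enumerate(bitmap)
        for c, ch in enumerate(line)
        if ch == "#"
    ]


if __name__ == "__main__":
    # A one-pixel-wide vertical bar in column 3, spanning every row.
    bar = ["...#...." for _ in range(ROWS)]
    needed = pixels_for_bitmap(bar)
    print(f"{ROWS * COLS} pixel characters in the set; "
          f"{len(needed)} stacked in one cell for a single vertical bar")
```

Under these assumptions, a single vertical stroke already needs 16 stacked characters in one cell, and a fully filled cell would need all 128.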
From irgendeinbenutzername at gmail.com Wed Aug 14 15:05:00 2024
From: irgendeinbenutzername at gmail.com (Charlotte Eiffel Lilith Buff)
Date: Wed, 14 Aug 2024 22:05:00 +0200
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

I feel like these examples are less “valid” use cases for plain text and more the result of technical limitations that stopped being relevant long ago. Characters like the block elements or box drawing parts, or even simple “hacks” like building a progress bar out of common punctuation marks, exist not because text-based pseudo-graphics were a good idea, but because developers at the time had no other choice but to use them if they wanted to achieve certain effects. They *would* have used actual graphics if they had been able to, in which case the concept of abusing text characters to imitate primitive graphics (and wasting precious character slots that could have been assigned to actual letters from actual languages instead) would have seemed ludicrous to them.

Unicode acts as a time capsule for this era of computing as an inevitable consequence of its goal to be the universal, all-encompassing character set, but a time capsule is just that: a way to remember the past. Adding new pseudo-graphics with no history behind them to Unicode isn’t so much future-proofing as it is stubbornly clinging to days long gone.

We no longer live in a world where on-screen text has to be divided into fixed-width cells of a handful of pixels each, or where text characters have to be defined solely based on their precise shape on a particular display device without any underlying semantics, or where the act of drawing a single image to the screen takes up so much memory and processing power that you have to “write” your graphics instead if you want to do anything more complex than a shopping list, and we should be thankful for that.
Retro computing is cool and all, but I wouldn’t want to buy a PC with floppy disk drives nowadays.

From doug at ewellic.org Wed Aug 14 22:48:26 2024
From: doug at ewellic.org (Doug Ewell)
Date: Thu, 15 Aug 2024 03:48:26 +0000
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

Martin Vahi wrote:

> [...]
> if text is being used for a lot of novel "hacks" like the ASCII
> art and progress bar in computers and it would be quite cheap from
> spent code points amount point of view to define some small set of
> special characters for doing almost arbitrary 2D drawing, then what's
> the harm of defining that small set of such "sprite role" characters,
> specially if there are already so many characters defined in Unicode?

No proposal to encode any number of characters, one or a thousand, ever benefits from the argument that “there is plenty of space in Unicode.” Every proposal is accepted or rejected on its own merits. And that is as it should be.

I was certain there was an FAQ on the Unicode site about this, but I could not find one. If it has not yet been written, it should be; the topic does come up from time to time.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From cate at cateee.net Thu Aug 15 02:48:00 2024
From: cate at cateee.net (Giacomo Catenazzi)
Date: Thu, 15 Aug 2024 09:48:00 +0200
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID: <06e40875-c035-4907-96e0-927651171d20@cateee.net>

On 2024-08-15 5:48, Doug Ewell via Unicode wrote:
> No proposal to encode any number of characters, one or a thousand,
> ever benefits from the argument that “there is plenty of space in
> Unicode.” Every proposal is accepted or rejected on its own merits.
> And that is as it should be.

But I would also add: each character has a huge cost, which must be carried through to the final version of Unicode. A character is not just like a possibly nearly hidden page of Wikipedia with a low cost (just database space/backups): every program must scan the list (so it must be saved in the system, and possibly many programs have their own list, e.g. a browser may support a more recent database compared to other programs).
So each phone must have at least one copy, and so must each computer and virtual machine. It also adds complexity to reading the database.

There is also the font side. A character without a representation is not very useful (except for scholars, so ancient languages may be OK). And that has huge costs, even just to select what to model and what not.

On 2024-08-14 21:53, ag disroot via Unicode wrote:
> Characters like the box drawing characters would never be accepted these
> days. As in if it wouldn't be in unicode already and if old standards
> wouldntve added it, it would never be accepted bc it graphic by nature
> and you should use a higher level protocol for that

Not just that, but (again fonts): many fonts (I think also some default ones) which support the technical blocks of Unicode are not very usable. Alignment is not where it should be, etc. For me that is proof that such blocks are not really used, and so nobody cares (except just to have a glyph).

And I would reiterate: Unicode should require a font for each new glyph (and with a "free license", so it could be easier to derive characters). Not only would it show that people will invest in such a character, but, as I see in some discussions, I think it would reduce bugs in the Unicode Character Database (real examples help experts on the character find bugs in character properties).

giacomo

From jameskass at code2001.com Thu Aug 15 03:59:17 2024
From: jameskass at code2001.com (James Kass)
Date: Thu, 15 Aug 2024 08:59:17 +0000
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <06e40875-c035-4907-96e0-927651171d20@cateee.net>
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com> <06e40875-c035-4907-96e0-927651171d20@cateee.net>
Message-ID: <05aa794b-dcad-437a-937d-ca651dcbdb29@code2001.com>

On 2024-08-15 7:48 AM, Giacomo Catenazzi via Unicode wrote:
> Not just that, but (again fonts): many fonts (I think also some
> default one) which support the technical block of Unicode are not much
> useable. Alignment is not where it should, etc. For me that proof that
> such block is not really used, and so nobody care (but just to have a
> glyph).

Using a suitable font in an editor which allows the user to control the font selection should resolve any alignment issues. One such font is Kreative Square, available here:

https://www.kreativekorp.com/software/fonts/ksquare/

From hsivonen at mozilla.com Thu Aug 15 04:08:44 2024
From: hsivonen at mozilla.com (Henri Sivonen)
Date: Thu, 15 Aug 2024 10:08:44 +0100
Subject: Hanb in domain labels
Message-ID:

UTS #39 is commonly used as the baseline for detecting IDN spoofs, and UTS #39 explicitly allows combining Han and Bopomofo. Considering that ? looks confusable with ? and ? looks confusable with ?, I’m wondering if it’s appropriate to explicitly allow this combination in the spoof detection context.

Is combining Han and Bopomofo in one domain label something that occurs commonly enough in domains that aren’t intended to be spoofs that it is necessary not to treat the script combination as triggering spoof detection in the domain name context?

--
Henri Sivonen
hsivonen at mozilla.com

From gwidion at gmail.com Thu Aug 15 09:07:04 2024
From: gwidion at gmail.com (Joao S. O. Bueno)
Date: Thu, 15 Aug 2024 11:07:04 -0300
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

> Unicode does not encode what *might* be used, it encodes what *has* been used.

A blatant problem with this affirmation is that from the 2000s forward, everything text-related goes _through_ Unicode, including the characters used in the (allegedly more serious than 2D diagrams or art) mission of writing systems for spoken languages.

TL;DR: if this is true, then all innovation in writing is ultimately fated to come to an end as Unicode asymptotically encodes whatever it deems worthy from pre-1999, and then all human writing and characters will be frozen forever.

At some point this will obviously have to be reviewed.

From cate at cateee.net Thu Aug 15 09:31:09 2024
From: cate at cateee.net (Giacomo Catenazzi)
Date: Thu, 15 Aug 2024 16:31:09 +0200
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

On 2024-08-15 16:07, Joao S. O. Bueno via Unicode wrote:
>> Unicode does not encode what *might* be used, it encodes what *has*
>> been used.
>
> A blatant problem with this affirmation is that from 2000s-forward,
> everything text
> related goes _through_ unicode - including the characters used in the
> (allegedly more serious than 2D diagrams or art) mission of writing systems
> for spoken languages. TL;DR: if this is true, than all innovation is writing
> is ultimately fated to come to an end as unicode asymptotically encodes
> whatever it deems worthy from pre-1999, and then all human writing and
> characters should be frozen forever.

I think you are confusing some points. 2D diagrams or art are not part of a writing system, and neither are bold characters, subscripting, or cursive versus printed characters (Unicode makes no distinction between them). For a complete description of writing we may need an image format (and possibly a multi-layer one). But does reading Shakespeare in bold or in Helvetica change your enjoyment (or pain) of reading it? I do not think so. In fact I find it annoying to read old text with ſ (long s).

But Unicode had other goals: in order to be used, it had to allow round-trip conversion with existing encodings. In my opinion it is this feature which made Unicode a success. But it is a different goal, and a necessity in order to serve the first purpose.

As I wrote earlier: we see that technical figures in Unicode are poorly supported (aka not usable) in most systems.
So probably the Unicode encoding is mostly useless. Let's not add additional parts nobody will use.

And in a textbook, where you explain symbols, you need much more control over the symbols. It would be wrong to just add a "resistance symbol" and then describe it in the text as a zig-zag line, or an empty box, or ...? It is like using just Unicode to learn to write Latin script: you may describe a letter one way, but the font will use a different design (think a, g, or just cursive writing).

If you want to change my mind: find many sources where Unicode is used for the technical symbols (already included in Unicode). Real use. Otherwise we can wait until somebody finds them (and possibly adapt to real uses).

giacomo

From kenwhistler at sonic.net Thu Aug 15 09:49:22 2024
From: kenwhistler at sonic.net (Ken Whistler)
Date: Thu, 15 Aug 2024 07:49:22 -0700
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID: <3073306f-e48a-4f30-8667-35d762e416fe@sonic.net>

This statement is manifestly incorrect.

The Unicode Technical Committee regularly considers the encoding of scripts for writing systems created fairly recently. The criteria for encoding include evidence of regular use of the script for writing among a community of users, and evidence of a need for digital interchange of text written in the script.

Many scripts invented and promulgated only in the 20th century have already been encoded, including several invented in the last two decades of the century, such as Nyiakeng Puachue Hmong, Hanifi Rohingya, Adlam, Tangsa, and Gurung Khema.

The addition of scripts appropriate for encoding has even now extended to new scripts invented in the *21st* century, for example, Wancho (added in Unicode 12.0) and Toto (added in Unicode 14.0).

--Ken

On 8/15/2024 7:07 AM, Joao S. O.
Bueno via Unicode wrote: > if this is true, than all innovation is writing > is ultimately fated to come to an end as unicode asymptotically encodes > whatever it deems worthy from pre-1999, and then all human writing and > characters should be frozen forever. -------------- next part -------------- An HTML attachment was scrubbed... URL: From duerst at it.aoyama.ac.jp Fri Aug 16 09:29:31 2024 From: duerst at it.aoyama.ac.jp (=?UTF-8?Q?Martin_J=2E_D=C3=BCrst?=) Date: Fri, 16 Aug 2024 23:29:31 +0900 Subject: Hanb in domain labels In-Reply-To: References: Message-ID: Hello Henri, I don't know about Chinese and Bopomofo, but for Japanese, there surely are e.g. company names that contain both Kana and Kanji. And company names are one (although of course not the only) use case for domain names. I'm cc'ing Arnt, who is one of the authors of https://www.ietf.org/archive/id/draft-gulbrandsen-smtputf8-nice-addresses-00.html, which is about email addresses (quite a bit related to domain names) and discusses Chinese quite a bit (although it doesn't mention Bopomofo). Regards, Martin. P.S.: draft-gulbrandsen-smtputf8-nice-addresses-00.html is in my view still in a very early stage; I have read through it but still have to write up my comments. On 2024-08-15 18:08, Henri Sivonen via Unicode wrote: > UTS #39 is commonly used as the baseline for detecting IDN spoofs, and UTS > #39 explicitly allows combining Han and Bopomofo. Considering that ? looks > confusable with ? and ? looks confusable with ?, I?m wondering if it?s > appropriate to explicitly allow this combination in the spoof detection > context. Is combining Han and Bopomofo in one domain label something that > occurs commonly enough in domains that aren?t intended to be spoofs for it > being necessary not to treat the script combination as triggering spoof > detection in the domain name context? 
From billposer at alum.mit.edu Fri Aug 16 11:23:58 2024
From: billposer at alum.mit.edu (Bill Poser)
Date: Fri, 16 Aug 2024 09:23:58 -0700
Subject: Hanb in domain labels
In-Reply-To:
References:
Message-ID:

The use of bopomofo in Chinese is not parallel to the use of kana in Japanese. Whereas kana are routinely mixed with kanji in Japanese, with, e.g., a verb stem written in kanji and the suffixes written in kana, and Japanese can be written entirely in kana (e.g. by young children), bopomofo does not appear in ordinary Chinese text. It is an ancillary system, used, e.g., to give the pronunciation of Chinese characters and is a commonly available input method. That doesn't guarantee that it doesn't occur in email addresses, though I don't recall seeing it. I'm not sure if it is even permitted in the legal name of a company.

From jshin1987 at gmail.com Fri Aug 16 12:47:39 2024
From: jshin1987 at gmail.com (Jungshik SHIN (신정식))
Date: Fri, 16 Aug 2024 10:47:39 -0700
Subject: Hanb in domain labels
In-Reply-To:
References:
Message-ID:

I second Bill. The issue raised by Henri makes a lot of sense and we need to consider revising UTS 39 given the usage of Bopomofo (i.e. typically Han and Bopomofo wouldn't be mixed together in identifiers).

Jungshik

From martin.vahi at softf1.com Fri Aug 16 13:23:13 2024
From: martin.vahi at softf1.com (Martin Vahi)
Date: Fri, 16 Aug 2024 21:23:13 +0300
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

On 8/14/24 23:05, Charlotte Eiffel Lilith Buff wrote:
>...
> We no longer live in a world where on-screen text has to be divided
> into fixed-width cells of a handful of pixels each, or where text
> characters have to be defined solely based on their precise shape on a
> particular display device without any underlying semantics, or where
> the act of drawing a single image to the screen takes up so much memory
> and processing power that you have to "write" your graphics instead
> if you want to do anything more complex than a shopping list - and we
> should be thankful for that.
>
> Retro computing is cool and all, but I wouldn't want to buy a PC with
> floppy disk drives nowadays.
>...

Thank You for the answer. About the "retro computing" though, the "cloud AI" era works largely on Virtual Private Servers (VPS) or some analogues (bare hardware rental included), where the party that pays the rent for the computing resources logs in (or uses scripts/bots to log in) to the rented computing resources over SSH, which uses a text-based terminal user interface. In theory one could use some X11/VNC/RDP protocol and spend a considerable amount of the rented RAM on a fancy windowing environment that will not be used most of the time. The EFFICIENT solution: command line tools!

At one time in France there was even a community movement where VPSes were set up to give out about 100 SSH/login accounts to strangers, so that the strangers could use the computer/VPS as a meeting place and share thoughts with each other by placing files on that VPS. And indeed they can also use some retro tools:

https://www.man7.org/linux/man-pages/man1/talk.1p.html
archival copy: https://archive.ph/k6K4u

I'm not saying that people should ignore security issues like they do at those community-building VPSes, but I am saying that support by a wide variety of tools matters A LOT! For example, all modern general purpose programming languages support text. Many of them support Unicode text encoding, UTF8.
From a software development point of view, the possibility to do graphics by just printing characters adds a really useful capability TO ALL THOSE PROGRAMMING LANGUAGES THAT SUPPORT UTF8. In the "cloud AI" era many applications are command line applications. A hack to display a tree of computers/VPSes that run a distributed application can be implemented by creating folders nested inside each other and then calling the

https://linux.die.net/man/1/tree
archival copy: https://archive.ph/IPqoG

but if some SSH-accessible administration command line utility wants to display a graph that includes a cycle, then a lot of inventiveness might be required. That is to say, command line applications are not as retro as they might seem at first glance.

An alternative is to come up with a new character encoding scheme that uses Unicode as a subset, a lot like Unicode uses ASCII as its subset. The new character encoding will then need to have some UTF8 analogue that uses UTF8 as its subset. Decent mainstream terminal programs that are meant to be used in the "cloud AI" era must then adopt that new character encoding scheme, and then a program like

echo "$HAS_THE_VALUE_OF_SOME_FANCY_GRAPHICS_CHARACTER_OR_CHARACTERS"

will work in various Bash scripts. The problem with this approach is that the various programming language implementations won't adopt it quickly, but some eventually will. An interim solution might resemble the way Unicode is supported in ASCII-only C++ strings: special coding. Easier said than done, but I guess that it's theoretically possible.

Thank You for reading my letter.

Yours sincerely,
Martin.Vahi at softf1.com

From asmusf at ix.netcom.com Fri Aug 16 13:44:30 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Fri, 16 Aug 2024 11:44:30 -0700
Subject: Hanb in domain labels
In-Reply-To:
References:
Message-ID: <99c1b059-549c-41aa-869a-a037bc314910@ix.netcom.com>

FWIW, Bopomofo is not permitted as part of the DNS root zone.
You can't register a top level domain name with it, as you can with Han or Kana.

A./

From list+unicode at jdlh.com Fri Aug 16 13:56:35 2024
From: list+unicode at jdlh.com (Jim DeLaHunt)
Date: Fri, 16 Aug 2024 11:56:35 -0700
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID: <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>

On 2024-08-16 11:23, Martin Vahi via Unicode wrote:
> … the "cloud AI" era works largely on Virtual Private Servers (VPS) or
> some analogues…, where the party that pays the rent for
> the computing resources logs in (or uses scripts/bots to log in) to the
> rented computing resources over SSH, which uses text based terminal user
> interface. In theory one could use some X11/VNC/RDP protocol and spend
> a considerable amount of the rented RAM on fancy windowing environment
> that will not be used most of the time.…

Or, and here is a crazy idea, one might run an HTTP server on the VPS, and have it serve HTML pages, and have the clients connect using existing web browsers, which can display the mixed graphics, text, and interactivity of the HTML pages.

> …The EFFICIENT solution: command line tools!…

The thing to be careful about when appealing to efficiency is the question of which costs you include in your measure of efficiency, and which costs you externalise and thus ignore.
Among the costs of extending text constructs to include graphics are the cost of making fonts with more and more graphic "characters", the costs of figuring out how to make graphics out of the "characters", and the costs of text complexity for those other applications that don't need or want the graphic capability you are adding.

> … all modern general
> purpose programming languages support text. Many of them support Unicode
> text encoding, UTF8. From software development point of view the
> possibility to do graphics by just printing characters adds a really
> useful capability TO ALL THOSE PROGRAMMING LANGUAGES THAT SUPPORT UTF8.

All those modern languages also have support for generating HTML and implementing HTML-based applications.

> In "cloud AI" era many applications are command line applications.…

In a "cloud AI" era many applications are distributed systems which communicate via network-mediated interfaces. I would argue that the command line is among the less important interfaces of those applications. The network interfaces to talk to their distributed relatives are more important.

> … if some SSH-accessible administration command line utility wants to
> display a graph that includes a cycle, then a lot of inventiveness might
> be required.…

Or, the command line utility could spin up an HTTP server hosting an HTML page with an SVG illustration of the graph and its cycle, then print on stdout the URL of that page. That is a relatively low cost to implement, because there will probably be libraries available for the HTTP server, the HTML generation, and the SVG generation.

> An alternative is to come up with a new character encoding scheme that
> uses Unicode as a sub-set, a lot like the Unicode used ASCII as its
> subset. The new character encoding will need to have then some UTF8
> analogue that uses UTF8 as its subset. Decent mainstream terminal
> programs that are meant to be used at the "cloud AI" era must then adapt
> that new character encoding scheme…. The problem with this approach is
> that the various programming language implementations won't adopt it
> quickly, but some eventually will. An interim solution might be like
> Unicode is supported in ASCII-only C++ strings: special coding.
> Easier said than done, but I guess that it's theoretically possible.

Or, use the alternate, text plus graphics paradigms which already support this. The Unicode design principles say that Unicode is centred on exchange of universal plain text, and it leaves a lot of domains to "higher-level protocols". Why are you so reluctant to use the higher-level protocols?

--
. --Jim DeLaHunt, jdlh at jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant, Vancouver, B.C., Canada

From irgendeinbenutzername at gmail.com Fri Aug 16 14:03:12 2024
From: irgendeinbenutzername at gmail.com (Charlotte Eiffel Lilith Buff)
Date: Fri, 16 Aug 2024 21:03:12 +0200
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
Message-ID:

Text-only pseudo-graphical applications certainly still exist, I just don't see any value in catering specifically to their limitations in this day and age, especially considering how many options Unicode already offers.

If you want to draw borders, the classic Box Drawing block alone has half a dozen different stroke styles to choose from. If you want to draw a progress bar or a bar chart, block elements go down to one-eighth slices. If you want to divide the screen into pixel cells to construct larger images, Unicode gives you block quadrants, sextants, and (starting with 16.0) even octants to work with.
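
As a rough illustration of the progress-bar case mentioned above, here is a minimal Python sketch that renders a bar with one-eighth resolution using only the existing Block Elements characters (U+2588 to U+258F). The function name progress_bar and its width parameter are hypothetical, chosen for this example only; it assumes a terminal font that actually provides these glyphs.

```python
# Sketch only: a text progress bar with 1/8-cell resolution.
# progress_bar and its parameters are illustrative names, not an existing API.

FULL = "\u2588"  # FULL BLOCK
# Index i holds the glyph covering i/8 of a cell from the left (index 0 = nothing).
PARTIAL = ["", "\u258F", "\u258E", "\u258D", "\u258C", "\u258B", "\u258A", "\u2589"]


def progress_bar(fraction: float, width: int = 20) -> str:
    """Return a string of exactly `width` cells showing `fraction` (0.0 to 1.0)."""
    fraction = min(max(fraction, 0.0), 1.0)      # clamp out-of-range input
    eighths = round(fraction * width * 8)        # total filled eighths of a cell
    full_cells, remainder = divmod(eighths, 8)   # whole cells and leftover eighths
    bar = FULL * full_cells + PARTIAL[remainder]
    return bar.ljust(width)                      # pad the unfilled part with spaces


if __name__ == "__main__":
    for pct in (0, 12, 37, 50, 99, 100):
        print(f"[{progress_bar(pct / 100)}] {pct:3d}%")
```

Because the output is plain text, it travels over an SSH session like any other string; no higher-level graphics protocol is involved.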
I understand the drive for artistic expression, but anything more fancy than that honestly seems like a waste of resources for something that is primarily meant to convey debugging information to software engineers. Line graphs can reliably show you general data trends even at relatively low resolutions; if you needed precise values you'd just look at a data table instead. Nothing would really be gained in this regard by the ability to address individual pixels on screen, with every such monochrome pixel taking up 32 whole bits in memory.

Of course, if any application does turn out to need more flexible pseudo-graphics capabilities than Unicode can provide, the Private Use Area can be used to assign any glyph one desires to valid Unicode code points. Text like this is unlikely to escape the confines of its originating machine, let alone be transmitted to a device running a completely different operating system, anyway, so you wouldn't even need to make sure to include your own fonts with the data.

From list+unicode at jdlh.com Fri Aug 16 14:32:11 2024
From: list+unicode at jdlh.com (Jim DeLaHunt)
Date: Fri, 16 Aug 2024 12:32:11 -0700
Subject: Hanb in domain labels
In-Reply-To:
References:
Message-ID: <75f46670-7b85-4eb1-bcaa-ee47b881ac2e@jdlh.com>

On 2024-08-15 02:08, Henri Sivonen via Unicode wrote:
> UTS #39 is commonly used as the baseline for detecting IDN spoofs, and
> UTS #39 explicitly allows combining Han and Bopomofo. Considering that
> ? looks confusable with ? and ? looks confusable with ?, I'm wondering
> if it's appropriate to explicitly allow this combination in the spoof
> detection context.…

Are you asking about whether UTS #39 should allow this combination vs being changed to forbid this combination? Or are you asking about whether the rules of the Domain Name System should allow this combination?

I am involved with Universal Acceptance advocacy[1].
That means I have one foot in the DNS world, and the ICANN rules which govern it. I am not an expert, but I am aware of some principles there. My understanding is that the DNS world writes its own rules for detecting and preventing IDN spoofs. I have not heard that UTS #39 is a fundamental document for them.

> Is combining Han and Bopomofo in one domain label something that
> occurs commonly enough in domains…?

This sounds like a question about the DNS: what names are already registered, and what the rules are for registering further names. The former is backward-looking, the latter is forward-looking. Thus the answer has two parts.

For the forward-looking question, I have some awareness of the rules ICANN has put into place. Again, I am not an expert, but I have heard experts talk about some of the terminology and concepts.

The ICANN communities have put a lot of effort in recent years into "Label Generation Rules". ("Label" means the identifiers separated by periods in a domain name. In "example.com", "example" and "com" are Labels.) The LGRs are script-specific, so there are LGRs for scripts like Chinese, Bangla, Arabic, etc. The LGRs specifically try to prevent spoofs and confusion between labels. The LGRs define a repertoire of characters which may be used in a label. They define characters or strings which are variants of each other, which a human reader might consider to have the same meaning. There are rules under which the registration of one variant label requires that the other variant labels either be registered to the same entity, or be protected from registration.

There is a set of Label Generation Rules for the root zone[2] of the DNS. They include rules for Chinese script labels[3] in the root zone. In my simple-minded reading of those rules, Bopomofo characters are not included in the repertoire. I suspect that means that the rules prevent anyone from registering a .???? top-level domain, or a Chinese domain with Bopomofo inclusions.
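
To make the repertoire idea above concrete, here is a small Python sketch that flags a label mixing Han and Bopomofo characters. It is only an approximation under stated assumptions: it is neither the UTS #39 algorithm nor any ICANN Label Generation Rule set, and because the Python standard library does not expose the Unicode Script property, it guesses the script from the character name. The function names scripts_in_label and mixes_han_and_bopomofo are hypothetical.

```python
# Sketch only: detect Han + Bopomofo mixing in a single domain label.
# Script detection is approximated from Unicode character names; a real
# implementation would use the Script / Script_Extensions properties.
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Return rough script buckets for the non-ASCII characters of a label."""
    found: set[str] = set()
    for ch in label:
        if ch.isascii():
            continue  # ASCII letters, digits and hyphen are ignored in this sketch
        name = unicodedata.name(ch, "")
        if name.startswith("BOPOMOFO"):
            found.add("Bopomofo")
        elif name.startswith("CJK UNIFIED IDEOGRAPH"):
            found.add("Han")
        elif name.startswith(("HIRAGANA", "KATAKANA")):
            found.add("Kana")
        else:
            found.add("Other")
    return found


def mixes_han_and_bopomofo(label: str) -> bool:
    """True if the label contains at least one Han and one Bopomofo character."""
    return {"Han", "Bopomofo"} <= scripts_in_label(label)


if __name__ == "__main__":
    for label in ("\u4e2d\u6587", "\u4e2d\u3105", "\u3105\u3106"):
        print(repr(label), sorted(scripts_in_label(label)), mixes_han_and_bopomofo(label))
```

A registry or a spoof detector would decide separately what to do with a flagged label; the sketch only reports the combination.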
I understand that each top-level registry sets the rules for second-level labels they will accept, though there is pressure from ICANN communities to adopt standard LGRs. There is a set of suggested Label Generation Rules for second-level labels[4]. As I read those rules, at a superficial level they also seem to rule out Bopomofo characters within Chinese language labels or Bopomofo-only labels.

If you really want to understand what rules govern domain names, don't rely on my simple-minded understanding. Get in touch with ICANN communities[5] who specialise in those rules. The Generic Names Supporting Organisation might be a good place to start.

For the backward-looking question, about what names are already registered in various top-level domains, I don't have specific information. I have the impression that a lot of domain names were registered before the current LGRs were developed. I won't be surprised to hear that some of them don't comply with the LGRs. For instance, the .com and .org domains might have registered some labels with Bopomofo characters. Again, the ICANN communities[5] would be a place to ask.

All of that seems to say that (if my understanding is correct), "combining Han and Bopomofo in one domain label" is not "something that occurs commonly… in domains" registered under the LGRs, but that might have occurred with legacy labels registered in the past.

Does this help answer your questions?

--Jim DeLaHunt

[1] [2] [3] [4] [5]

--
. --Jim DeLaHunt, jdlh at jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant, Vancouver, B.C., Canada

From beckiergb at gmail.com Sat Aug 17 03:49:06 2024
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Sat, 17 Aug 2024 01:49:06 -0700
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com> <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>
Message-ID:

How did this go from schematic symbols (something that is, however *improbable*, at least *possible* to argue for encoding) to arbitrary bitmaps (something that is never going to become part of Unicode, ever)?

Look up sixel graphics and ReGIS. Spend energy on getting terminal emulators and command line applications to support these already-existing higher-level protocols instead of trying to half-bake a new one into a text encoding standard that will never accept it.

-- Rebecca Bettencourt

From martin.vahi at softf1.com Sun Aug 18 15:54:45 2024
From: martin.vahi at softf1.com (Martin Vahi)
Date: Sun, 18 Aug 2024 23:54:45 +0300
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com> <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>
Message-ID:

First of all, thank You to everybody for their helpful answers. I think that I'll settle on what R.B. suggests in the following quote:

On 8/17/24 11:49, Rebecca Bettencourt via Unicode wrote:
>...
> Look up sixel graphics and ReGIS. Spend energy on getting
> terminal emulators and command line applications to support
> these already-existing higher-level protocols instead of
> trying to half-bake a new one into a text encoding standard
> that will never accept it.
>...

As regards the J.D. statement in the following quote

On 8/16/24 21:56, Jim DeLaHunt via Unicode wrote:
>...
> Or, use the alternate, text plus graphics paradigms which already support this. The Unicode design principles say that Unicode is centred on exchange of universal plain text, and it leaves a lot of domains to "higher-level protocols". Why are you so reluctant to use the higher-level protocols?
>...

then, again, thank You, it's a helpful read, but my answer to the question about my reluctance to use those higher-level protocols is that from a software developer's perspective I just want to WRITE ONCE WITHOUT NEEDING TO REWRITE OLD SOFTWARE unless the end user requirements for the software change. For example, in my view, if some application uses Adobe Flash or Java Applets or Microsoft Silverlight or VRML that at some point are no longer supported by mainstream web browsers or their default-installed plugins, then from my perspective the need to swap out the code parts that depend on or are implemented in Java Applets and the like is NOT AN END USER REQUIREMENT CHANGE but just a nuisance due to technology trend changes. End users could not care less whether the 3D thing that they move with their mouse is in some Java Applet or WebGL or whatever else, as long as it just works for them without fiddling with the computer. Text based things TEND TO WORK, but those higher-level protocols tend to NOT BE RELIABLY AVAILABLE.

An example: anything with a URL that starts with https. The moment some crypto algorithms change, or maybe some signing servers in the chain of domain authentication change, there is suddenly a need for those changes to be reflected on the client computer, be it some "trusted certificates" or recompilation of a newer version of the openssl library (in the Linux world). That's not exactly a kind of technology that one can use on a laptop that is shipped with industrial equipment for changing the settings of that equipment about 10 years after the sale, unless there is some direct cable going from the industrial equipment to the laptop that was shipped with it, provided that the laptop will even boot after its Flash memory based SSD has lost its data over time. With magnetic disks there's actually hope that the computer will boot, if the disks have been kept cool enough. As of 2024_08 the best workaround that I'm aware of for the data loss problem is to ship MDisc DVDs for reinstalling everything from scratch or to use MDisc based "live" disks. With the MDisc DVD or MDisc BluRay based solution one can use whatever fancy software of a given era, with the caveat that electrolytic capacitors on motherboards have an approximate life span of 20 years even in storage. According to some statements on the Wild-Wild-Web the new supercapacitors do not last longer than the old style electrolytic capacitors. Basically, the best bet from a hardware point of view is to try to craft software so that it would work with future computers, which might lead one to think about the use of emulators: just install everything in a virtual appliance and the virtual appliance will run on future computers.

The virtual appliance based approach is what is being tried for running old industrial equipment, and there even seem to be virtual appliance storage device image formats that are agnostic of the software that runs the virtual appliance, like the OVF

    https://www.dmtf.org/standards/ovf
    archival copy: https://archive.ph/rZJdJ

but, again, will that be the "new Java Applet"/"new Microsoft Silverlight"/"new Adobe Flash"? Software that runs virtual appliances is not exactly a small piece of software that somebody could easily maintain oneself as a small side project, although with the future RISC-V CPUs there might be hope that all future CPUs will support the basic RISC-V instruction set, and then there might be a chance to write the emulators once so that their core would work on all CPUs and the software that runs virtual appliances becomes maintainable by a small team of people, who maintain it as a small side project.

If Intel and AMD intentionally avoid supporting the RISC-V instruction set, then that might be solved through some general emulator layer that Linux will have anyway, so no problem even with those megacorporations, unless they start making efforts to NOT run Linux, in which case they, Intel and AMD, would probably be eliminated from the server market. On the other hand, Intel and AMD might not mind taking the same path business wise that a huge market leader named Nokia took...

That is to say, as of 2024_08 I'm just too stupid to know what to use other than text for creating any kind of reliably future proof user interfaces. I guess old style HTML with images and open source "retro-browsers" like Dillo or RetroZilla

    https://rn10950.github.io/RetroZillaWeb/

might actually work too. If somebody can maintain/create/develop the RetroZilla web browser nowadays (2024), then there's hope that some analogue of it will be available in the future too. The problem, although relatively minor, is that there needs to be some connection between the "RetroZilla" and the command line application, and that connection uses an operating system API. Opening a port for HTTP is an operating system matter. Maybe future operating systems will need some extra fiddling to get "legacy IPv4 port support" working. On top of that, Microsoft Windows-like operating systems might just plain block anything that wants to open a port, a lot like Windows 10 has those dialogues that ask for firewall hole permissions whenever a newly installed application wants to connect to the Wild-Wild-Web. Obviously companies like Microsoft and Apple could not care less whether some small freelancer like me can ship software that my remote clients are actually able to start using without looking for some local IT-support guy to click the "right OK buttons". Or, in industrial cases, Microsoft and Apple will not pay for the working hours and travel costs of industrial equipment manufacturing technicians who need to travel to the other side of the globe to do some mundane setup work. At the same time industrial electronics manufacturers tend to provide software libraries and drivers only for Windows, so Linux is less of an option in industrial equipment that contains those industrial electronics. (Factory owners refuse to make their factory robots remotely accessible due to a fear of being blackmailed by ransomware creators.)

Plain text seems to stand the test of time the best, despite the awful pre-Unicode encodings era, where maybe only the Japanese got it right from the very start with their Tron encoding.

    http://justsolve.archiveteam.org/wiki/TRON_code
    archival copy: https://archive.ph/KZpC8

I'm just yearning for a possibility to write software once without it ceasing to work due to 3rd party activity. Text based user interfaces seem to be the most robust ones that I'm currently, as of 2024_08, aware of. That explains why I'm reluctant to use higher-level protocols.

Thank You for reading my letter and
thank You for Your answers.

From doug at ewellic.org  Sun Aug 18 18:13:02 2024
From: doug at ewellic.org (Doug Ewell)
Date: Sun, 18 Aug 2024 23:13:02 +0000
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com> <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>
Message-ID:

I have to concede that I no longer have any idea what Martin Vahi is proposing to encode in plain text. At first I thought he was talking about static, monochrome, two-dimensional graphics, but his most recent message suggests something very different.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From cate at cateee.net  Mon Aug 19 02:10:19 2024
From: cate at cateee.net (Giacomo Catenazzi)
Date: Mon, 19 Aug 2024 09:10:19 +0200
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com> <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>
Message-ID: <0209df00-873c-403b-a5ff-5e611ca923a3@cateee.net>

You are sidetracking the discussion.

Write once: do it in HTML. We can still write and use very old pages. For that reason a markup language for technical design may be appropriate. In Unicode it will be ugly and not very useful. Again: you see hardly any real usage of the technical design characters we already have in Unicode. It is just the wrong approach and nobody likes it.

Terminals have been able to display images (photos) and graphics for many years. I know this, and I seldom use it: occasionally with HarfBuzz (some helper/debugging tools in there), and for photos only a few times, just to check capabilities. That is the general experience (most people have probably never tried it). Why? It may seem a good idea (so terminal maintainers implemented it), but it has no practical use (we do not use it). HTML views are much better and support many more languages; otherwise we keep a simple text-only terminal with few features. Anything in between is just a bad trade-off, which is what your proposal also seems to be.

With a proper markup language we can design a simple diagram and so show people how an oscillator or an amplifier works, with annotations for R1, R2, C1, etc., and maybe also the resistance written in ohms. With your proposal it will be ugly or impossible, and it will work only for a few languages.

Terminals suck at Unicode: Unicode provides single- and double-width properties for characters to help terminals, but that is still far from a terminal display that is really usable. And Unicode should not implement full maths notation (it would be ugly and not user-friendly: we need a markup language), so your proposal (on terminals) would make no sense: we could place some electronic components, but we could not draw simple circuits, annotate them, and write the equations.

Also, terminals (usually) have no magnifying function, and most cannot hold very many "pixels" (at least by default), so we would need to handle paging and the rest (so not just plain text, but a rendering engine), and so we would be redoing HTML, but on a "budget" and with many more technical limitations. The worst of both worlds.

Also, terminals are not "write once". I have been using Linux since 1996 and I have experienced that. At the beginning we had to guess the best VTxxx and set it. Much later "linux" was added, and "xterm", etc. And then came all the problems with colours: the terminal identifier was not a precise identifier of capabilities, so many programs had different ideas. Unicode arrived later, and initially support was just for "ISO 10646 Level 1" (think of Unicode as "Level 3"; note that levels are obsolete). Telnet also changed (assumptions about characters). SSH is evolving and we cannot use old protocols.

You listed HTTP versus HTTPS as a difference, but who cares? For content authors (which seems to be the context of this proposal) it doesn't matter. Only the system administrator sets up the server. But you need such a person anyway: updates are needed, because security bugs and other bugs need to be corrected.
As for adding your symbols: it requires a lot of work and updates to many components before they can be displayed, plus the hope that most systems will install a default font for your symbols (as you can see, default fonts lack many characters, so forget about it). HTML allows you to use "webfonts", which are normal fonts but loaded on demand and provided by the server. And because a reference implementation could be done in JavaScript, transforming the language into SVG, it may work without any work or update on the user side.

You really underestimate how much work would have to be done on each computer to have terminals display technical text, on both sides (user and content), and not for all languages. And ugly.

We may (unfortunately) be used to terminals and monospace fonts, but let's face it: they are ugly and slower to read, and if you add Greek characters (as is done in electrical diagrams) to terminals, our eyes have difficulty reading quickly. Also, colours are not standardized. Bold is inconsistent. And it requires a good terminal client and good fonts installed on the user side.

Or show us examples of what you mean, so that we can test them on our terminals and get an idea of whether someone will spend time creating content and using the extension. Otherwise we should just wait until we have such a good example and then re-evaluate. For now it is not useful to continue the discussion. Show your work!

giacomo

From hsivonen at mozilla.com  Mon Aug 19 05:33:48 2024
From: hsivonen at mozilla.com (Henri Sivonen)
Date: Mon, 19 Aug 2024 13:33:48 +0300
Subject: Hanb in domain labels
In-Reply-To: <75f46670-7b85-4eb1-bcaa-ee47b881ac2e@jdlh.com>
References: <75f46670-7b85-4eb1-bcaa-ee47b881ac2e@jdlh.com>
Message-ID:

On Fri, Aug 16, 2024 at 10:32 PM Jim DeLaHunt wrote:

> On 2024-08-15 02:08, Henri Sivonen via Unicode wrote:
>
> > UTS #39 is commonly used as the baseline for detecting IDN spoofs, and UTS #39 explicitly allows combining Han and Bopomofo. Considering that ? looks confusable with ? and ? looks confusable with ?, I'm wondering if it's appropriate to explicitly allow this combination in the spoof detection context.…
>
> Are you asking about whether UTS #39 should allow this combination vs being changed to forbid this combination? Or are you asking about whether the rules of the Domain Name System should allow this combination?

Foremost I'm asking if it's appropriate that browsers that in general refuse to render mixed-script domain labels in the Unicode form in the user interface (in the URL bar in particular) make an exception, due to UTS #39 making an exception, for the combination of Han and Bopomofo.

Alternative possible behaviors would be treating Han and Bopomofo in one label the way e.g. mixing Greek and Cyrillic in one label is treated: Refusing to render the label in the Unicode form. A more complex possibility would be to check if a label that contains both Han and Bopomofo contains specific confusable characters and refuse to render Hanb labels in the Unicode form if specific confusable characters (Han or Bopomofo) are present.

If the conclusion is that either of the alternative behaviors above would be more appropriate than special-casing Han+Bopomofo as a permitted mixed-script combination, the next question is whether UTS #39 should change accordingly.

> I am involved with Universal Acceptance advocacy[1]. That means I have one foot in the DNS world, and the ICANN rules which govern it. I am not an expert, but I am aware of some principles there. My understanding is that the DNS world writes its own rules for detecting and preventing IDN spoofs. I have not heard that UTS #39 is a fundamental document for them.

UTS #39 is a fundamental document for Firefox and Chrome (could be for Safari, too, but I don't know) as the baseline of IDN spoof detection. (More checks are layered on top, though.)

> > Is combining Han and Bopomofo in one domain label something that occurs commonly enough in domains?…
>
> This sounds like a question about what the DNS, what names are already registered, and what are the rules for registering further names. The former is backward-looking, the latter is forward-looking. Thus the answer has two parts.

Indeed. Though the backward-looking history is long enough that future demand can probably be inferred from the backward-looking part.

> For the backward-looking question, I have some awareness of the rules ICANN has put into place. Again, I am not an expert, but I have heard experts talk about some of the terminology and concepts.
>
> The ICANN communities have put a lot of effort in recent years into "Label Generation Rules". ("Label" means the identifiers separated by periods in a domain name. In "example.com", "example" and "com" are Labels.) The LGRs are script-specific, so there are LGRs for scripts like Chinese, Bangla, Arabic, etc. The LGRs specifically try to prevent spoofs and confusion between labels. The LGRs define a repertoire of characters which may be used in a label. They define characters or strings which are variants of each other, which a human reader might consider to have the same meaning. There are rules about the registration of one variant label requires that the other variant labels either be registered to the same entity, or be protected from registration.
>
> There are a set of Label Generation Rules for the root zone[2] of the DNS. They include rules for Chinese script labels[3] in the root zone. In my simple-minded reading of those rules, Bopomofo characters are not included in the repertoire. I suspect that means that the rules prevent anyone from registering a .???? top-level domain, or a Chinese domain with Bopomofo inclusions.

It indeed looks like the root LGRs currently don't allow Bopomofo, but it appears that they also don't allow Cyrillic TLDs, which do exist, so it seems that root LGRs are enough in a work-in-progress state not to draw definite conclusions from.

> I understand that each top-level registry sets the rules for second-level labels they will accept, though there is pressure from ICANN communities to adopt standard LGRs. There are a set of suggested Label Generation Rules for second-level labels[4]. As I read those rules, at a superficial level they also seem to rule out Bopomofo characters within Chinese language labels or Bopomofo-only labels.

That particular rule set also excludes Hiragana and Katakana, so it's not clear that LGRs for Hani existing means the exclusion of Hanb, Jpan, and Kore. (I didn't ask about Jpan in my initial post, despite Han ? and Katakana ? existing, because of the different role of Hiragana and Katakana compared to the role of Bopomofo. I didn't ask about Kore, because I'm not aware of a confusability issue even if I have doubts about demand for Han + Hangul domain labels. I am curious, though, how users and domain holders deal with the ? vs. ? issue. Is the glyph size distinction consistent and obvious enough?)

> All of that seems to say that (if my understanding is correct), "combining Han and Bopomofo in one domain label" is not "something that occurs commonly… in domains" registered under the LGRs, but that might have occurred with legacy labels registered in the past.

Thanks.

--
Henri Sivonen
hsivonen at mozilla.com

From list+unicode at jdlh.com  Mon Aug 19 14:19:35 2024
From: list+unicode at jdlh.com (Jim DeLaHunt)
Date: Mon, 19 Aug 2024 12:19:35 -0700
Subject: Hanb in domain labels
In-Reply-To:
References: <75f46670-7b85-4eb1-bcaa-ee47b881ac2e@jdlh.com>
Message-ID:

On 2024-08-19 03:33, Henri Sivonen wrote:

> On Fri, Aug 16, 2024 at 10:32 PM Jim DeLaHunt wrote:
>
> > On 2024-08-15 02:08, Henri Sivonen via Unicode wrote:
> >
> > > UTS #39 is commonly used as the baseline for detecting IDN spoofs, and UTS #39 explicitly allows combining Han and Bopomofo. Considering that ? looks confusable with ? and ? looks confusable with ?, I'm wondering if it's appropriate to explicitly allow this combination in the spoof detection context.…
> >
> > Are you asking about whether UTS #39 should allow this combination vs being changed to forbid this combination? Or are you asking about whether the rules of the Domain Name System should allow this combination?
>
> Foremost I'm asking if it's appropriate that browsers that in general refuse to render mixed-script domain labels in the Unicode form in the user interface (in the URL bar in particular) make an exception… for the combination of Han and Bopomofo.…

Ah. I did not interpret "allow this combination" as referring to browser location bar behaviour, nor to it meaning "display in Unicode (U-Label) form instead of encoded ASCII (A-Label) form". So you are asking whether browsers should indicate to users that a domain name which combines Han and Bopomofo is untrustworthy?

Also,

> > …There are a set of Label Generation Rules for the root zone[2] of the DNS. They include rules for Chinese script labels[3] in the root zone. In my simple-minded reading of those rules, Bopomofo characters are not included in the repertoire. I suspect that means that the rules prevent anyone from registering a .???? top-level domain, or a Chinese domain with Bopomofo inclusions.
> > …
> > [2]
> > [3]
>
> It indeed looks like the root LGRs currently don't allow Bopomofo, but it appears that they also don't allow Cyrillic TLDs, which do exist, so it seems that root LGRs are enough in a work-in-progress state not to draw definite conclusions from.

I overlooked something important in [2]: the ICANNWiki content is not ICANN content, it is a separate org documenting ICANN. And it turns out that their Root Zone Label Generation Rules page at [2] has stale content. ICANN's own page on Root Zone Label Generation Rules [6] describes version 5 of the root zone LGRs, which include entries for Cyrillic, Japanese, and Korean scripts in addition to Chinese. (I am making a note to update the ICANNWiki Root Zone LGRs page, [2], if that is how their wiki works.)

[6] (Content dates from 2022. No, I don't know why they have a 2015 date in their URL.)

> > I understand that each top-level registry sets the rules for second-level labels they will accept, though there is pressure from ICANN communities to adopt standard LGRs. There are a set of suggested Label Generation Rules for second-level labels[4]. As I read those rules, at a superficial level they also seem to rule out Bopomofo characters within Chinese language labels or Bopomofo-only labels.
>
> That particular rule set also excludes Hiragana and Katakana, so it's not clear that LGRs for Hani existing means the exclusion of Hanb, Jpan, and Kore.…

Have a look at the version 5 LGRs [6]. There may also be second-level LGRs for other scripts like Japanese, Korean, and Cyrillic. I have not checked. Does that clarify?

> …(I didn't ask about Jpan in my initial post, despite Han ? and Katakana ? existing, because of the different role of Hiragana and Katakana compared to the role of Bopomofo. I didn't ask about Kore, because I'm not aware of a confusability issue even if I have doubts about demand for Han + Hangul domain labels. I am curious, though, how users and domain holders deal with the ? vs. ? issue. Is the glyph size distinction consistent and obvious enough?)

You are not the first person to ask this question. Answers at Japanese Stack Exchange[7], Reddit[8], WaniKani[9]. Summary: readers differentiate them based on context, and sometimes when the context is ambiguous people interpret the written kanji to be the kana. The best summary: "Context is always the key in Japanese." Those replies also point out other visually similar kana and kanji pairs.

[7]
[8]
[9]

I hope this is helpful. Cheers!
     --Jim DeLaHunt

--
.   --Jim DeLaHunt, jdlh at jdlh.com     http://blog.jdlh.com/ (http://jdlh.com/)
      multilingual websites consultant, Vancouver, B.C., Canada

From cate at cateee.net  Tue Aug 20 02:31:35 2024
From: cate at cateee.net (Giacomo Catenazzi)
Date: Tue, 20 Aug 2024 09:31:35 +0200
Subject: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To: <3FEF281D-0781-463A-B298-151CB27A8F89@nixmagic.com>
References: <0209df00-873c-403b-a5ff-5e611ca923a3@cateee.net> <3FEF281D-0781-463A-B298-151CB27A8F89@nixmagic.com>
Message-ID:

[I was writing this privately, but because the original email didn't pass the mailing list filters I decided to write to all of you. (Did you use the correct From address?)]

Hello Michael,

Yes, we do things differently, but we do them first, and then we discuss and change. Not the other way around. I want to see a good use of it, and check whether the selected layer is the best way to implement it (I doubt it). Writing email is cheap. And we do not need to be programmers: look at the history of many very useful programs, written by non-programmers (working in other fields) with a specific need.

I see the point of the proposal, but I really think it will not be used.
And one of my point it is that the proposal suck because it is very limited on capability, but also not compatible with existing tools and terminals. The proposal will not solve the problem (limited size, alignment, simple and complex circuits, annotation, inadequate to write maths for that symbol, huge language issues), and bitmap can be already drawn in terminals, also on old terminals, so either one go down with CSI (escape sequences of terminal), or up to HTML and SVG (where we can animate current flow, to help understanding AC circuits). We are just proposing like the images in terminals: a nice feature on paper and many people was thinking it will be used. But no: not enough good to replace HTML, but complex so not implemented in simple terminal GUI (or CLI). And so? Now it is difficult to create new terminals because of bloats, so we cannot do something lightweight, or something which can be used on other contexts (or inside an app). Upper layers are much better: one can choose what to implement, and if it is done in a good way, it could be backward compatible vew decades (JavaScript library, or small bitmap together with vector implementation, or...). As I iterated, Unicode supports a lot of technical design symbols and figures, but implementation sucks (default font of operating system). In a terminal how do you solve it? If we cannot do thing that are defined long time ago, why do you think new block will solve the problem (without having a good compelling example, so people can test and force to have a good implementation). In any case, telnet cannot be used anymore. ssh changed (same problem with algorithms as TLS: unable to access old machine, or the inverse). Terminal behaviour is not well defined (see colours). But both are different layers, like TLS. We can port CERN html and view in modern browser (luckly they prefered explicit escape of accented characterd), no need to look of the different layer. 
Note that we now write HTML in UTF-8 (so Unicode), but HTML is not tied
to Unicode: that gives us freedom. And updating layers (like TLS) is not
difficult; it is changing semantics that is bad, and TLS doesn't change
them. A Raspberry Pi can run a proxy web server (proxies have been
supported for a long time), so you can use a non-TLS web browser on the
modern web (but there is still new HTML, JavaScript, CSS, etc.). You can
do that automatically.

With this proposal (in terminals) you cannot. You cannot reformat pages
(e.g. for new languages) automatically, unless you already wrote them in
a higher layer. So just do this proposal in an upper layer, and convert
to SVG for terminals: you get all the advantages. This proposal will
make old terminals even more incompatible.

Or tell me how you would support the proposal in terminals. Do we need
a completely new interface to select a font for each Unicode block? How
do you add a font which supports this proposal? Do you think terminals
will magically support Unicode blocks just because they are listed on an
obscure page of the Unicode Standard? We cannot implement it in a good
way. No way. (An upper layer is much better.) And we are ignoring most
of the world again (terminals are good enough for English; with other
languages the suckiness increases, up to "non-usability").

giacomo

[And I keep (just) the original mail, because it has not yet passed the
Unicode list filters]

On 2024-08-20 6:37, Michael De Roover wrote:
>> Write once: do it in HTML: we can write and use very old pages. For
>> this reason a markup language for technical design may be
>> appropriate. In Unicode: it will be ugly and not so useful. Again:
>> you see maybe no real usage of the technical design characters we
>> already have in Unicode. It is just the wrong approach and nobody
>> likes it.
>
> I think this is actually a nice approach for Martin to look into
> further. He's probably right in that TLS is "an enemy" in very old web
> browsers, albeit not expressed all that well IMO.
> Going by personal experience with an old PSP of mine, its CA
> certificate store is way out of date. Nothing except /maybe/ Google
> (haven't tried, perhaps I should've) is going to be accepted by its
> web browser if it uses TLS. Usually I see people working around that
> by either using TheOldNet, or a forward proxy that terminates TLS. I
> only use reverse TLS terminating proxies myself, but NGINX does that
> for me and I have no reason to doubt it being able to do that in
> forward mode as well.
>
> Granted, even after overcoming the TLS hurdle, rendering would be
> next. These old web browsers are a nightmare when it comes to their
> HTML rendering capabilities. My own website has a local counterpart
> (web.lan), which does not use TLS. Just like my public website, it
> uses no JavaScript but has a persistent bar at the bottom. The PSP's
> web browser cannot render that bar correctly (bar is rendered at the
> bottom of the document, not fixed), otherwise it was all fine.
> Combined text and images meanwhile should be no problem, even for
> earlier browsers still. The only concern I'd have with e.g. early
> 2000s flip phones, is that they may run out of memory with high
> resolution images. They may also not support all image formats,
> especially PNG or WEBP. My first choice for that would likely be BMP,
> if the format and viewers support monochrome and resolution is
> sufficiently low. Being uncompressed, output size should be easy to
> calculate in advance, and easy to render by a flip phone's
> microcontroller.
>
> Either way, in retro applications, it's important to target the lowest
> common denominator. Granted, Martin's questions do not strike me as a
> retro computing subject, but rather one of terminal emulators' (or
> even plain TTYs') rendering capabilities. So perhaps focus should be
> given to CLI web browsers like elinks, lynx, w3m, ... instead. I think
> that some of those can display images, by inline converting them into
> sixel graphics.
> From what I've seen on Wikipedia about sixel graphics, it even seems
> quite close to the block character proposal mentioned by Martin
> before. The only appreciable difference to me is that it's 1x6px for
> each character instead. And if that doesn't cut it either, perhaps
> implementing it in the terminal emulator using the private-use space
> in Unicode could be an option instead? Wouldn't be the first time that
> the Linux community does things "our own way", and it certainly
> wouldn't be the last either. At least with private-use, it would not
> cross into Unicode's public space, and perhaps provide a (more)
> compelling case for inclusion in the future. Still a long shot, but at
> least better than it is now I guess.
>
> N.B.: Along with an email I sent to Martin and the mailing list
> yesterday, this appears to be my first submission to the list. Could
> one of the list moderators please look into what may be my account
> still being on moderation? Thank you!
>
> Kind regards,
> Michael De Roover
>
> Mail: unicode at nixmagic.com (on list)
> Web: michael.de.roover.eu.org
>
> Mail: michaelderoover at gmail.com
>
>> On 19 Aug 2024, at 09:19, Giacomo Catenazzi via Unicode
>> wrote:
(...)

From bortzmeyer at nic.fr  Tue Aug 20 04:01:42 2024
From: bortzmeyer at nic.fr (Stephane Bortzmeyer)
Date: Tue, 20 Aug 2024 11:01:42 +0200
Subject: Plain text is forever (Was: Have Characters that Depict Electronic Components been Discussed?
In-Reply-To:
References: <96a84680-9125-4608-8d7a-8828c582ea63@softf1.com>
 <0236ce1e-4ee3-4813-92ec-8a01b99588a8@jdlh.com>
Message-ID:

On Sun, Aug 18, 2024 at 11:54:45PM +0300,
 Martin Vahi via Unicode wrote
 a message of 139 lines which said:

> my reluctance to using those higher-level protocols is
> that from software developer's perspective I just want to WRITE ONCE
> WITHOUT NEEDING TO REWRITE OLD SOFTWARE

Indeed, who remembers "Adobe Flash or Java Applets or Microsoft
Silverlight or VRML"?

"Plain text is great for long-term archiving and what it means for
Unicode" could be a good talk at the WAC (Web Archiving Conference).
The call for proposals is open until 11 September; do not hesitate to
suggest a talk along the lines of what you wrote in your message.

https://netpreserve.org/ga2025/
https://netpreserve.org/ga2025/cfp/

From gwidion at gmail.com  Tue Aug 20 07:56:11 2024
From: gwidion at gmail.com (Joao S. O. Bueno)
Date: Tue, 20 Aug 2024 09:56:11 -0300
Subject: FYI: Practical uses of OpenType fontes as another higher level
 protocol
Message-ID:

Hi -

I got this news yesterday, and I believe it might be of interest to
some of the participants on this list:
people have successfully used mechanisms in OpenType fonts
to colorize text according to context. The practical use
is simple syntax highlighting.

https://blog.glyphdrawing.club/font-with-built-in-syntax-highlighting/?utm_source=tldrnewsletter

This is a protocol sitting between Unicode and markup languages,
at the cost of having to be properly configured in any displaying app
(however, any app that uses proper libraries for rendering fonts, and
enables font selection and parametrization, would work "out of the box").

This can also serve as one more item to list among the "proper ways to
do it in higher protocols" when people involved or interested in Unicode
are asked about including mechanisms such as color attributes.

Regards,

  Joao

From mark at kli.org  Wed Aug 21 12:09:34 2024
From: mark at kli.org (Mark E. Shoulson)
Date: Wed, 21 Aug 2024 13:09:34 -0400
Subject: FYI: Practical uses of OpenType fontes as another higher level
 protocol
In-Reply-To:
References:
Message-ID: <55c5446c-c5bc-4a74-a85a-6d7b125cd65f@kli.org>

Be careful. OpenType fonts are so powerful these days they are
essentially Turing-complete and can become an arbitrarily complex
"protocol". See https://fuglede.github.io/llama.ttf/ which is a
complete Large Language Model AI stuffed into a font, renderable by
(almost) ordinary tools. It's an extreme case, and meant to be such,
but still.

~mark

On 8/20/24 08:56, Joao S. O. Bueno via Unicode wrote:
> Hi -
>
> Just got this news yesterday which I believe might be of interest for
> some of the participants on this list -
> People had successfully made use of mechanisms in OpenType fonts
> to be able to colorize text according to context. The practical use
> is for simple syntax highlighting.
>
> https://blog.glyphdrawing.club/font-with-built-in-syntax-highlighting/?utm_source=tldrnewsletter
>
> This is a protocol sitting between unicode and markup languages,
> at the cost of having to be properly configured in any displaying app -
> (however, any app using proper libraries for rendering the fonts, and
> enabling font selection and parametrization would work "out of the box")
>
>
> This can also work as one more item to list in the "proper ways to do it
> in higher protocols" when
> people involved or interested in Unicode are queried about
> including mechanisms such as enabling color attributes.
>
> Regards,
>
> Joao

From gwidion at gmail.com  Wed Aug 21 12:19:08 2024
From: gwidion at gmail.com (Joao S. O.
Bueno)
Date: Wed, 21 Aug 2024 14:19:08 -0300
Subject: FYI: Practical uses of OpenType fontes as another higher level
 protocol
In-Reply-To: <55c5446c-c5bc-4a74-a85a-6d7b125cd65f@kli.org>
References: <55c5446c-c5bc-4a74-a85a-6d7b125cd65f@kli.org>
Message-ID:

On Wed, Aug 21, 2024 at 2:13 PM Mark E. Shoulson via Unicode wrote:
>
> Be careful. OpenType fonts are so powerful these days they are
> essentially Turing-complete and can become an arbitrarily complex
> "protocol". See https://fuglede.github.io/llama.ttf/ which is a
> complete Large Language Model AI stuffed into a font, renderable by
> (almost) ordinary tools. It's an extreme case, and meant to be such,
> but still.

Sure. Anyway, a lot of the "recent" (~30 years old) font file formats
would be Turing-complete themselves, maybe lacking the ability to
perform any I/O beyond glyph parameters. But "Type 1" fonts were
full PostScript programs, weren't they?

> ~mark
>
> On 8/20/24 08:56, Joao S. O. Bueno via Unicode wrote:
> > Hi -
> >
> > Just got this news yesterday which I believe might be of interest for
> > some of the participants on this list -
> > People had successfully made use of mechanisms in OpenType fonts
> > to be able to colorize text according to context. The practical use
> > is for simple syntax highlighting.
> >
> > https://blog.glyphdrawing.club/font-with-built-in-syntax-highlighting/?utm_source=tldrnewsletter
> >
> > This is a protocol sitting between unicode and markup languages,
> > at the cost of having to be properly configured in any displaying app -
> > (however, any app using proper libraries for rendering the fonts, and
> > enabling font selection and parametrization would work "out of the box")
> >
> >
> > This can also work as one more item to list in the "proper ways to do it
> > in higher protocols" when
> > people involved or interested in Unicode are queried about
> > including mechanisms such as enabling color attributes.
> >
> > Regards,
> >
> > Joao

From arthur at reutenauer.eu  Wed Aug 21 15:51:14 2024
From: arthur at reutenauer.eu (Arthur Reutenauer)
Date: Wed, 21 Aug 2024 22:51:14 +0200
Subject: FYI: Practical uses of OpenType fontes as another higher level
 protocol
In-Reply-To:
References: <55c5446c-c5bc-4a74-a85a-6d7b125cd65f@kli.org>
Message-ID:

On Wed, Aug 21, 2024 at 02:19:08PM -0300, Joao S. O. Bueno via Unicode wrote:
> but "type 1" fontes were
> full postscript programs, weren't they?

  No. Type 3 fonts were, though.

        Arthur