From aprilop at fn.de Sat Mar 4 07:26:34 2023
From: aprilop at fn.de (Andreas Prilop)
Date: Sat, 04 Mar 2023 13:26:34 +0000
Subject: Hebrew presentation forms U+FB43 and U+FB49
Message-ID: <4CFFDB8C-45EE-4A4A-ADBB-B3541F564AAE@fn.de>

Two Hebrew letters (presentation forms) puzzle me:

U+FB43 final pe with dagesh
U+FB49 shin with dagesh, without dot

Final [p] is written with normal pe U+05E4.
Shin with dagesh (in classical Hebrew) must have either shin dot or sin dot, too.

Therefore letters U+FB43 and U+FB49 do not exist.

From jr at qsm.co.il Sat Mar 4 15:09:19 2023
From: jr at qsm.co.il (Jonathan Rosenne)
Date: Sat, 4 Mar 2023 21:09:19 +0000
Subject: Hebrew presentation forms U+FB43 and U+FB49
In-Reply-To: <4CFFDB8C-45EE-4A4A-ADBB-B3541F564AAE@fn.de>
References: <4CFFDB8C-45EE-4A4A-ADBB-B3541F564AAE@fn.de>
Message-ID:

Correct. But all the Hebrew presentation forms should be deprecated.

Best Regards,

Jonathan Rosenne

-----Original Message-----
From: Unicode On Behalf Of Andreas Prilop via Unicode
Sent: Saturday, March 4, 2023 3:27 PM
To: unicode at corp.unicode.org
Subject: Hebrew presentation forms U+FB43 and U+FB49

Two Hebrew letters (presentation forms) puzzle me:

U+FB43 final pe with dagesh
U+FB49 shin with dagesh, without dot

Final [p] is written with normal pe U+05E4.
Shin with dagesh (in classical Hebrew) must have either shin dot or sin dot, too.

Therefore letters U+FB43 and U+FB49 do not exist.

From mark at kli.org Sat Mar 4 18:46:42 2023
From: mark at kli.org (Mark E. Shoulson)
Date: Sat, 4 Mar 2023 19:46:42 -0500
Subject: Hebrew presentation forms U+FB43 and U+FB49
In-Reply-To: <4CFFDB8C-45EE-4A4A-ADBB-B3541F564AAE@fn.de>
References: <4CFFDB8C-45EE-4A4A-ADBB-B3541F564AAE@fn.de>
Message-ID:

Final pe with dagesh is an oddity, as final -p being written as bent-pe is a modern convention, but -p DOES occur once in the Hebrew Bible, in Proverbs 30:6:

אַל־תּוֹסְףְּ עַל־דְּבָרָיו פֶּן־יוֹכִיחַ בְּךָ וְנִכְזָבְתָּ׃
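
The two code points under discussion can be inspected directly. What follows is only a minimal sketch, assuming Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and does not come from the thread. It reports each character's name, its canonical decomposition, and what NFC normalization produces.

```python
import unicodedata

def describe(cp):
    """Return (name, decomposition, NFC form as hex) for one code point.

    `describe` is a hypothetical helper written for this sketch.
    """
    ch = chr(cp)
    nfc = unicodedata.normalize("NFC", ch)
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        " ".join(f"{ord(c):04X}" for c in nfc),
    )

# U+FB43 and U+FB49 are the two presentation forms named in the thread.
for cp in (0xFB43, 0xFB49):
    name, decomp, nfc = describe(cp)
    print(f"U+{cp:04X} {name}: decomposes to {decomp}; NFC gives {nfc}")
```

Both characters should show a canonical decomposition into a base letter plus U+05BC (dagesh), and NFC should return that two-code-point sequence rather than the presentation form, because the Hebrew presentation forms are excluded from composition.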
The shin-with-dagesh-but-no-dot feels like someone was just trying to cover all the permutations.

~mark

On 3/4/23 08:26, Andreas Prilop via Unicode wrote:
> Two Hebrew letters (presentation forms) puzzle me:
>
> U+FB43 final pe with dagesh
>
> U+FB49 shin with dagesh, without dot
>
> Final [p] is written with normal pe U+05E4.
> Shin with dagesh (in classical Hebrew) must have either shin dot or sin dot, too.
>
> Therefore letters U+FB43 and U+FB49 do not exist.

From as at signographie.de Tue Mar 7 06:35:49 2023
From: as at signographie.de (A. Stötzner)
Date: Tue, 7 Mar 2023 13:35:49 +0100 (CET)
Subject: Missing Latin superscript lowercase letters
Message-ID: <1317264669.2762677.1678192549210@email.ionos.de>

An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Bildschirmfoto 2023-03-07 um 13.28.18.png
Type: image/png
Size: 597702 bytes
Desc: not available
URL:

From aprilop at fn.de Tue Mar 7 10:27:35 2023
From: aprilop at fn.de (Andreas Prilop)
Date: Tue, 07 Mar 2023 16:27:35 +0000
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <1317264669.2762677.1678192549210@email.ionos.de>
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID: <4B947007-0C7A-4F24-B30F-442F74AF76E5@fn.de>

On 7 March 2023, A. Stötzner wrote:

> Only i n q are missing as superscript modifiers.
https://www.google.com/search?q=%22superscript+latin+small+letter+i%22
https://www.google.com/search?q=%22superscript+latin+small+letter+n%22
https://www.google.com/search?q=%22modifier+letter+small+q%22

And have a look at
https://corp.unicode.org/pipermail/unicode/2023-March/010468.html

From qsjn4ukr at gmail.com Tue Mar 7 11:12:33 2023
From: qsjn4ukr at gmail.com (QSJN 4 UKR)
Date: Tue, 7 Mar 2023 19:12:33 +0200
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <4B947007-0C7A-4F24-B30F-442F74AF76E5@fn.de>
References: <1317264669.2762677.1678192549210@email.ionos.de> <4B947007-0C7A-4F24-B30F-442F74AF76E5@fn.de>
Message-ID:

> https://www.google.com/search?q=%22modifier+letter+small+q%22

U+107a5 v14.0
Wow! Thank you, Google, who would have thought that it could be found somewhere there.

From doug at ewellic.org Tue Mar 7 11:15:57 2023
From: doug at ewellic.org (Doug Ewell)
Date: Tue, 7 Mar 2023 17:15:57 +0000
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <1317264669.2762677.1678192549210@email.ionos.de>
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID:

A. Stötzner via Unicode wrote:

> I suspect this has been discussed several times before.

Enough so that it’s an FAQ:
https://www.unicode.org/faq/ligature_digraph.html#Pf8

> apart from that we have ª and º another time in the 00xx range.

The ordinal indicators don’t count in this discussion, as they often appear with an underscore below the “a” or “o”.

> Just one example where the notorious Opentype makeshift would not be
> a preferable solution and where hard-coding will be needed:

If these superscript letters weren’t already encoded (as others have pointed out), an image of mathematical or engineering equations wouldn’t exactly be the best supporting evidence for encoding them in plain text.
-- 
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From asmusf at ix.netcom.com Tue Mar 7 15:15:44 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Tue, 7 Mar 2023 13:15:44 -0800
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <1317264669.2762677.1678192549210@email.ionos.de>
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID:

The example given, from a typeset mathematical formula, is inadmissible.

In mathematical typesetting what is superscripted is not the individual letter, but the expression. In principle, the superscripted expression is arbitrarily complex and thus the superscript is fully recursive.

This is precisely the kind of situation where hardcoding anything is not helpful.

A./

On 3/7/2023 4:35 AM, A. Stötzner via Unicode wrote:
> I suspect this has been discussed several times before.
> We have almost the entire a–z encoded, although smashed in two places:
> in the 02xx block we have ? ? ? ? ? ? ? ?
> in the 1Dxx block we have ? ? ? ? ? ? ? ? ? ? ? ? ? ?
> apart from that we have ª and º another time in the 00xx range.
> Only *i n q* are missing as superscript modifiers. Wouldn’t it be
> sensible to fill that gap at last?
> Just one example where the notorious Opentype makeshift would not be a
> preferable solution and where hard-coding will be needed:
> greetings to all,
> A. Stötzner

From pgcon6 at msn.com Tue Mar 14 19:30:55 2023
From: pgcon6 at msn.com (Peter Constable)
Date: Wed, 15 Mar 2023 00:30:55 +0000
Subject: Zero-Width Joiner U+200D
In-Reply-To:
References: <23a469db-fc5e-5037-5da0-01e9e0c13c6a@ix.netcom.com> <6f4f18b5-b880-56ee-48a2-99e3d910621c@ix.netcom.com>
Message-ID:

From Unicode 15, section 9.2 (p. 375):

“The Non-joiner and the Joiner. The Unicode Standard provides two user-selectable formatting codes: U+200C zero width non-joiner and U+200D zero width joiner. The use of a joiner adjacent to a suitable letter permits that letter to form a cursive connection without a visible neighbor. This provides a simple way to encode some special cases, such as exhibiting a connecting form in isolation, as shown in Figure 9-2.”

Later in that section (p. 383), ZWJ is listed in the Join_Causing set of Arabic joining types.

It seems to me the text is describing the original intent as Asmus described.

Peter

From: Unicode On Behalf Of Jukka K. Korpela via Unicode
Sent: Tuesday, February 21, 2023 4:56 AM
To: Asmus Freytag
Cc: unicode at corp.unicode.org
Subject: Re: Zero-Width Joiner U+200D

Asmus Freytag via Unicode (unicode at corp.unicode.org) wrote:

I think we need to look at whether the language accurately reflects what we were trying to say. I do know that it was revised at one point, when the use of ZWJ was generalized beyond cursive connection.

It seems that this took place as early as in Unicode 2.

The interpretation you suggest may be an inadvertent result of that change, or someone had found out why the usage that I always understood as intended is for some reason problematic. In that case, it should be excluded more explicitly, in my view.

In fact, reading chapter 23 onwards, I now see the use of ZWJ’s around a character to ask for isolated form. It was just so far from the place that described ZWJ and ZWNJ between adjacent characters, giving the impression that this is their only use. Perhaps it would help to remove the word “adjacent” from “U+200D zero width joiner is intended to produce a more connected rendering of adjacent characters than would otherwise be the case, if possible.”

The text describes the use of ZWJ for isolated form and shows this in example 23-1. Sorry for the confusion I caused.

So the answer to Andreas’ question is “yes, it should”, with the value of “should” roughly as “is intended to, according to the Unicode standard, but a program that renders Unicode characters is not required to obey, or even understand, such rendering suggestions”

Jukka

From asmusf at ix.netcom.com Tue Mar 14 20:57:45 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Tue, 14 Mar 2023 18:57:45 -0700
Subject: Zero-Width Joiner U+200D
In-Reply-To:
References: <23a469db-fc5e-5037-5da0-01e9e0c13c6a@ix.netcom.com> <6f4f18b5-b880-56ee-48a2-99e3d910621c@ix.netcom.com>
Message-ID:

The language that gave rise to the original confusion

“U+200D zero width joiner is intended to produce a more connected rendering of adjacent characters than would otherwise be the case, if possible....”

is not in contradiction (as originally claimed), because "adjacent" clearly refers to characters "adjacent" to the ZWJ, not "character that would be adjacent across a ZWJ". There's nothing in the language that supports that (mis-)reading.

However, simply changing the language to

“U+200D zero width joiner is intended to produce a more connected rendering of characters *adjacent to it* than would otherwise be the case, if possible...

would prevent that reading. Yet with that change, the sentence becomes completely impossible to scan.

The problem stems partially from the desire to make the text on ZWJ and ZWNJ appear (anti-)symmetric. However, this ignores the fact that they behave very differently when placed near spaces and start/end of line or text.

I would suggest a slight change:

Joiner. U+200D zero width joiner *requests* a more connected rendering of adjacent characters than would otherwise be the case.

where "requests" replaced the curious "intends to produce". And we can delete the "if possible" because if not possible, it's only a request and no request can be satisfied in situations where that is not possible.
The remaining text below the bullets already covers that case, should there be any doubts.

However, I would suggest we add a bullet:

* A typical use of ZWJ is to show the connected form of a character without a visible neighbor.

On 3/14/2023 5:30 PM, Peter Constable via Unicode wrote:
>
> From Unicode 15, section 9.2 (p. 375):
>
> “The Non-joiner and the Joiner. The Unicode Standard provides two
> user-selectable formatting codes: U+200C zero width non-joiner and
> U+200D zero width joiner. The use of a joiner adjacent to a suitable
> letter permits that letter to form a cursive connection without a
> visible neighbor. This provides a simple way to encode some special
> cases, such as exhibiting a connecting form in isolation, as shown in
> Figure 9-2.”
>
> Later in that section (p. 383), ZWJ is listed in the Join_Causing set
> of Arabic joining types
>
> It seems to me the text is describing the original intent as Asmus
> described.
>
> Peter
>
> *From:* Unicode *On Behalf Of* Jukka K. Korpela via Unicode
> *Sent:* Tuesday, February 21, 2023 4:56 AM
> *To:* Asmus Freytag
> *Cc:* unicode at corp.unicode.org
> *Subject:* Re: Zero-Width Joiner U+200D
>
> Asmus Freytag via Unicode (unicode at corp.unicode.org) wrote:
>
> I think we need to look at whether the language accurately
> reflects what we were trying to say. I do know that it was revised
> at one point, when the use of ZWJ was generalized beyond cursive
> connection.
>
> It seems that this took place as early as in Unicode 2.
>
> The interpretation you suggest may be an inadvertent result of
> that change, or someone had found out why the usage that I always
> understood as intended is for some reason problematic. In that
> case, it should be excluded more explicitly, in my view.
>
> In fact, reading chapter 23 onwards, I now see the use of ZWJ’s around
> a character to ask for isolated form. It was just so far from the
> place that described ZWJ and ZWNJ between adjacent characters, giving
> the impression that this is their only use. Perhaps it would help to
> remove the word “adjacent” from “U+200D zero width joiner is intended
> to produce a more connected rendering of adjacent characters than
> would otherwise be the case, if possible.
>
> The text describes the use of ZWJ for isolated form and shows this in
> example 23-1. Sorry for the confusion I caused.
>
> So the answer to Andreas’ question is “yes, it should”, with the value
> of “should” roughly as “is intended to, according to the Unicode
> standard, but a program that renders Unicode characters is not
> required to obey, or even understand, such rendering suggestions”
>
> Jukka
>

From pgcon6 at msn.com Tue Mar 14 19:41:47 2023
From: pgcon6 at msn.com (Peter Constable)
Date: Wed, 15 Mar 2023 00:41:47 +0000
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <1317264669.2762677.1678192549210@email.ionos.de>
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID:

Doug Ewell responded:

> an image of mathematical or engineering equations wouldn’t exactly be the best supporting evidence for encoding them in plain text.

Not only would it not be the best supporting evidence, it wouldn’t be considered supporting evidence _at all_ since math formula layout is not plain text. As Asmus responded,

> what is superscripted is not the individual letter, but the expression. In principle, the superscripted expression is arbitrarily complex and thus the superscript is fully recursive.

Andreas says below,

> Just one example where the notorious Opentype makeshift would not be a preferable solution and where hard-coding will be needed:

OpenType makeshift?
Perhaps you’re not familiar with the MATH table in OpenType, which allows handling all of the typographic nuance in math layout examples such as were shown below.

MATH - The mathematical typesetting table (OpenType 1.9) - Typography | Microsoft Learn

Peter Constable

From: Unicode On Behalf Of A. Stötzner via Unicode
Sent: Tuesday, March 7, 2023 4:36 AM
To: Don Peterson via Unicode
Subject: Missing Latin superscript lowercase letters

I suspect this has been discussed several times before.
We have almost the entire a–z encoded, although smashed in two places:
in the 02xx block we have ? ? ? ? ? ? ? ?
in the 1Dxx block we have ? ? ? ? ? ? ? ? ? ? ? ? ? ?
apart from that we have ª and º another time in the 00xx range.
Only i n q are missing as superscript modifiers. Wouldn’t it be sensible to fill that gap at last?
Just one example where the notorious Opentype makeshift would not be a preferable solution and where hard-coding will be needed:

[cid:image001.png at 01D9569B.E55A72D0]

greetings to all,
A. Stötzner

From aprilop at fn.de Thu Mar 16 13:48:12 2023
From: aprilop at fn.de (Andreas Prilop)
Date: Thu, 16 Mar 2023 18:48:12 +0000
Subject: Missing Latin superscript lowercase letters
In-Reply-To:
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID:

On 15 March 2023, Peter Constable wrote:

> Andreas says below,
>
>> Just one example where the notorious Opentype makeshift would not
>> be a preferable solution and where hard-coding will be needed:

What??

The person who wrote this is A. Stötzner, not Andreas [Prilop].
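
The claim earlier in this thread that i, n and q are already encoded can be checked locally. The following is a rough sketch, assuming Python's standard `unicodedata` module. Only U+107A5 is cited in the thread; U+2071 and U+207F are the code points assumed here for the "superscript latin small letter" names in the search links, and the `lookup` helper is hypothetical.

```python
import unicodedata

# U+107A5 is cited in the thread; U+2071 and U+207F are this sketch's assumption.
CANDIDATES = {"i": 0x2071, "n": 0x207F, "q": 0x107A5}

def lookup(cp):
    """Return (name, decomposition); the name falls back to a placeholder
    when the local Unicode data predates the character (hypothetical helper)."""
    ch = chr(cp)
    name = unicodedata.name(ch, "<not in this Python's Unicode data>")
    return name, unicodedata.decomposition(ch)

for letter, cp in CANDIDATES.items():
    name, decomp = lookup(cp)
    print(f"{letter}: U+{cp:04X} {name} [{decomp}]")

print("Unicode data version:", unicodedata.unidata_version)
```

U+107A5 was added in Unicode 14.0, so an interpreter built against older character data will print the placeholder for q; the version line at the end shows which data is in use.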
From doug at ewellic.org Thu Mar 16 14:16:12 2023
From: doug at ewellic.org (Doug Ewell)
Date: Thu, 16 Mar 2023 19:16:12 +0000
Subject: Missing Latin superscript lowercase letters
In-Reply-To:
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID:

Peter Constable wrote:

> Doug Ewell responded:
>
>> an image of mathematical or engineering equations wouldn’t exactly be
>> the best supporting evidence for encoding them in plain text.
>
> Not only would it not be the best supporting evidence, it wouldn’t be
> considered supporting evidence _at all_ since math formula layout is
> not plain text.

There was an implied tone of voice and rolling of eyes in my post. I didn’t include an emoji to convey that.

Andreas Prilop replied:

>> Andreas says below,
>>
>>> Just one example where the notorious Opentype makeshift would not be
>>> a preferable solution and where hard-coding will be needed:
>
> What??
>
> The person who wrote this is A. Stötzner, not Andreas [Prilop].

“A. Stötzner” is also named Andreas:

http://luc.devroye.org/fonts-43707.html

-- 
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From 142857 at mail.de Sun Mar 19 04:18:02 2023
From: 142857 at mail.de (142857 at mail.de)
Date: Sun, 19 Mar 2023 09:18:02 +0000
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
Message-ID: <1679216117381.3713198999.2468275785@mail.de>

Dear Unicode Consortium Team,

We would like to report a missing emoji in the current Unicode standard, which we believe would be of great value to the mathematical and scientific community. The missing emoji is the "Heavy equal sign-Emoji" (=). (?, ?, ?, ?)

We propose that this emoji be added to the Unicode standard as soon as possible, and we suggest that it be named "Heavy EQUAL SIGN" to accurately reflect its function.

Thank you for your attention to this matter.
Sincerely,

LX
"Collaboratively created by ChatGPT and LX"

Hi,
I am Lx(Alex, from Germany - now without chatGPT;-) and new to that mailing-list System/whole unicode-plattform.
I am here because someone has to say it:
I miss that = in equal size to the operators(functions, signs) since years - now I can tell it to you.
Thank you all for good work, I'll read more about the project and mailinglist.
I Hope the "shape" is okay - not to long/short.
Have a good time

From beckiergb at gmail.com Sun Mar 19 13:38:58 2023
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Sun, 19 Mar 2023 11:38:58 -0700
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <1679216117381.3713198999.2468275785@mail.de>
References: <1679216117381.3713198999.2468275785@mail.de>
Message-ID:

It already exists:

http://www.kreativekorp.com/charset/unicode/char/1F7F0/

-- Rebecca Bettencourt

On Sun, Mar 19, 2023 at 8:19 AM Lx via Unicode wrote:

> Dear Unicode Consortium Team,
>
> We would like to report a missing emoji in the current Unicode standard,
> which we believe would be of great value to the mathematical and scientific
> community. The missing emoji is the "Heavy equal sign-Emoji" (=). (?, ?, ?,
> ?)
>
> We propose that this emoji be added to the Unicode standard as soon as
> possible, and we suggest that it be named "Heavy EQUAL SIGN" to accurately
> reflect its function.
>
> Thank you for your attention to this matter.
>
> Sincerely,
>
> LX
> "Collaboratively created by ChatGPT and LX"
>
> Hi,
> I am Lx(Alex, from Germany - now without chatGPT;-) and new to that
> mailing-list System/whole unicode-plattform.
> I am here because someone has to say it:
> I miss that = in equal size to the operators(functions, signs) since years
> - now I can tell it to you.
> Thank you all for good work, I'll read more about the project and
> mailinglist.
> I Hope the "shape" is okay - not to long/short.
> Have a good time
>

From tom at bluesky.org Sun Mar 19 13:53:53 2023
From: tom at bluesky.org (Tom Gewecke)
Date: Sun, 19 Mar 2023 11:53:53 -0700
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To:
References: <1679216117381.3713198999.2468275785@mail.de>
Message-ID: <18524A68-FF5C-4AA1-B501-920B62122024@bluesky.org>

1f7f0 is included in Apple’s Color Emoji font: 🟰

> On Mar 19, 2023, at 11:38 AM, Rebecca Bettencourt via Unicode wrote:
>
> It already exists:
>
> http://www.kreativekorp.com/charset/unicode/char/1F7F0/
>
> -- Rebecca Bettencourt
>
>
> On Sun, Mar 19, 2023 at 8:19 AM Lx via Unicode wrote:
> Dear Unicode Consortium Team,
> We would like to report a missing emoji in the current Unicode standard, which we believe would be of great value to the mathematical and scientific community. The missing emoji is the "Heavy equal sign-Emoji" (=). (?, ?, ?, ?)
> We propose that this emoji be added to the Unicode standard as soon as possible, and we suggest that it be named "Heavy EQUAL SIGN" to accurately reflect its function.
> Thank you for your attention to this matter.
> Sincerely,

From asmusf at ix.netcom.com Sun Mar 19 16:58:04 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 19 Mar 2023 14:58:04 -0700
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To:
References: <1679216117381.3713198999.2468275785@mail.de>
Message-ID: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com>

Well, what can you expect from a message written by a chatbot?
A./

On 3/19/2023 11:38 AM, Rebecca Bettencourt via Unicode wrote:
> It already exists:
>
> http://www.kreativekorp.com/charset/unicode/char/1F7F0/
>
> -- Rebecca Bettencourt
>
>
> On Sun, Mar 19, 2023 at 8:19 AM Lx via Unicode wrote:
>
> Dear Unicode Consortium Team,
>
> We would like to report a missing emoji in the current Unicode
> standard, which we believe would be of great value to the
> mathematical and scientific community. The missing emoji is the
> "Heavy equal sign-Emoji" (=). (?, ?, ?, ?)
>
> We propose that this emoji be added to the Unicode standard as
> soon as possible, and we suggest that it be named "Heavy EQUAL
> SIGN" to accurately reflect its function.
>
> Thank you for your attention to this matter.
>
> Sincerely,
>
>
> LX
> "Collaboratively created by ChatGPT and LX"
> Hi,
> I am Lx(Alex, from Germany - now without chatGPT;-) and new to
> that mailing-list System/whole unicode-plattform.
> I am here because someone has to say it:
> I miss that = in equal size to the operators(functions, signs)
> since years - now I can tell it to you.
> Thank you all for good work, I'll read more about the project and
> mailinglist.
> I Hope the "shape" is okay - not to long/short.
> Have a good time
>

From sosipiuk at gmail.com Sun Mar 19 18:01:44 2023
From: sosipiuk at gmail.com (Sławomir Osipiuk)
Date: Sun, 19 Mar 2023 23:01:44 +0000
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com>
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com>
Message-ID: <1679266719312.2071283061.1008587296@gmail.com>

Quite a lot, actually, but at its heart it's still a computer and hence more likely to do what you tell it than what you want.

I assume "LX" told ChatGPT to "write a proposal for a Heavy Equals Sign Emoji" rather than asking "Is there a Heavy Equals Sign Emoji defined in Unicode?"
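
The question "is it already defined?" can also be put to a local copy of the character database before anything is proposed. This is only a sketch, assuming Python's standard `unicodedata` module; U+1F7F0 is the code point from the link in this thread, U+2795 and U+2796 are added here for comparison, and `sign_name` is a made-up helper.

```python
import unicodedata

# U+1F7F0 comes from the thread; U+2795 and U+2796 are included for comparison.
HEAVY_SIGNS = (0x2795, 0x2796, 0x1F7F0)

def sign_name(cp):
    """Return the character name, or None if this Unicode data has no such
    character (hypothetical helper for this sketch)."""
    return unicodedata.name(chr(cp), None)

for cp in HEAVY_SIGNS:
    name = sign_name(cp)
    status = name if name else "no name in this Unicode data version"
    print(f"U+{cp:04X}: {status}")
```

A missing name only shows that the interpreter's bundled data does not know the character, so the result is worth cross-checking against the current code charts.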
On Sunday, 19 March 2023, 17:58:04 (-04:00), Asmus Freytag via Unicode wrote:

Well, what can you expect from a message written by a chatbot?

A./

From doug at ewellic.org Sun Mar 19 22:10:38 2023
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 20 Mar 2023 03:10:38 +0000
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <1679266719312.2071283061.1008587296@gmail.com>
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com>
Message-ID:

Sławomir Osipiuk replied to Asmus Freytag:

>> Well, what can you expect from a message written by a chatbot?
>
> Quite a lot, actually, but at its heart it's still a computer and
> hence more likely to do what you tell it than what you want.
>
> I assume "LX" told ChatGPT to "write a proposal for a Heavy Equals
> Sign Emoji" rather than asking "Is there a Heavy Equals Sign Emoji
> defined in Unicode?"

But it didn’t even do that; it merely posted a request to the list. Even emoji require much more in the way of a “proposal” than that.

-- 
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From jonathan.coxhead at gmail.com Sun Mar 19 23:04:02 2023
From: jonathan.coxhead at gmail.com (Jonathan Coxhead)
Date: Sun, 19 Mar 2023 21:04:02 -0700
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <18524A68-FF5C-4AA1-B501-920B62122024@bluesky.org>
References: <18524A68-FF5C-4AA1-B501-920B62122024@bluesky.org>
Message-ID: <31BCA19B-84F6-41F8-A5EE-F3DF942EE26A@gmail.com>

An HTML attachment was scrubbed...
URL:

From asmusf at ix.netcom.com Sun Mar 19 23:56:06 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 19 Mar 2023 21:56:06 -0700
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <31BCA19B-84F6-41F8-A5EE-F3DF942EE26A@gmail.com>
References: <18524A68-FF5C-4AA1-B501-920B62122024@bluesky.org> <31BCA19B-84F6-41F8-A5EE-F3DF942EE26A@gmail.com>
Message-ID:

On 3/19/2023 9:04 PM, Jonathan Coxhead via Unicode wrote:
> On Mar 19, 2023, at 11:56 AM, Tom Gewecke via Unicode wrote:
>>
>> 1f7f0 is included in Apple’s Color Emoji font: 🟰
>
> This is odd, though: HEAVY EQUALS SIGN, at least as of Unicode
> 15.1.0, does not have text/emoji variant selectors. Other signs that
> do, include ? ? ? ??, but ? is missing from this set. Nevertheless,
> my iPhone does offer it as an emoji for text messages. Does Unicode
> need to move this way in order to match “existing practice”?
>

I'm speculating a bit here, but 1F7F0 may be emoji only. That is, you'd use "=" as the "text" version. (Or some other existing character).

I know, I should dig around a bit instead of speculating, but don't have the time right now.

A./

>
>>
>>> On Mar 19, 2023, at 11:38 AM, Rebecca Bettencourt via Unicode wrote:
>>>
>>> It already exists:
>>>
>>> http://www.kreativekorp.com/charset/unicode/char/1F7F0/
>>>
>>> -- Rebecca Bettencourt
>>>
>>>
>>> On Sun, Mar 19, 2023 at 8:19 AM Lx via Unicode wrote:
>>>
>>> Dear Unicode Consortium Team,
>>> We would like to report a missing emoji in the current Unicode
>>> standard, which we believe would be of great value to the
>>> mathematical and scientific community. The missing emoji is the
>>> "Heavy equal sign-Emoji" (=). (?, ?, ?, ?)
>>> We propose that this emoji be added to the Unicode standard as
>>> soon as possible, and we suggest that it be named "Heavy EQUAL
>>> SIGN" to accurately reflect its function.
>>> Thank you for your attention to this matter.
>>> Sincerely,
>>>
>>

From asmusf at ix.netcom.com Sun Mar 19 23:59:25 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 19 Mar 2023 21:59:25 -0700
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <1679266719312.2071283061.1008587296@gmail.com>
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com>
Message-ID: <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>

On 3/19/2023 4:01 PM, Sławomir Osipiuk via Unicode wrote:
> Quite a lot, actually, but at its heart it's still a computer and
> hence more likely to do what you tell it than what you want.
> I assume "LX" told ChatGPT to "write a proposal for a Heavy Equals
> Sign Emoji" rather than asking "Is there a Heavy Equals Sign Emoji
> defined in Unicode?"

Right, and if I had asked you to do it, you would have noticed it's there already and asked me why I was so daft to propose something as missing that is already present. Also, if you had concluded that a new character would be required, you would have argued why the seemingly existing one was not in fact the one that you thought is needed. That being a requirement of a good proposal (explaining why seeming alternatives are not valid).

So, if the role of the chatbot just consists of creating plausible sounding language, we should ban their use in submissions and communications with the Consortium. Because they may make it sound like a proposal or suggestion is well researched or thought out, when in fact it isn't, wasting everybody's time.

A./

>
> On Sunday, 19 March 2023, 17:58:04 (-04:00), Asmus Freytag via Unicode
> wrote:
>
> Well, what can you expect from a message written by a chatbot?
>
> A./
>

From jameskass at code2001.com Mon Mar 20 00:56:38 2023
From: jameskass at code2001.com (James Kass)
Date: Mon, 20 Mar 2023 05:56:38 +0000
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com> <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
Message-ID: <920f4bc6-4195-aa32-00b1-7ecb8ffce32b@code2001.com>

On 2023-03-20 4:59 AM, Asmus Freytag via Unicode wrote:
> So, if the role of the chatbot just consists of creating plausible
> sounding language, we should ban their use in submissions and
> communications with the Consortium. Because they may make it sound
> like a proposal or suggestion is well researched or thought out, when
> in fact it isn't, wasting everybody's time.

The chat bot wrote of a belief that the character would be advantageous to the mathematical and scientific communities, which seemed a bit off kilter. As these bots become more sophisticated, it will be harder to spot them.

From marius.spix at web.de Mon Mar 20 06:59:41 2023
From: marius.spix at web.de (Marius Spix)
Date: Mon, 20 Mar 2023 12:59:41 +0100
Subject: Aw: Re: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To: <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com> <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From gtbot2007 at gmail.com Mon Mar 20 13:22:02 2023
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Mon, 20 Mar 2023 14:22:02 -0400
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To:
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com> <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
Message-ID:

Yall missing the joke. It was made by ChatGPT.
It was probably just told to make a proposal.

On Mon, Mar 20, 2023 at 8:02 AM Marius Spix via Unicode <unicode at corp.unicode.org> wrote:

> I already suggested this in 2019. Here is the conversation.
>
>
> *Sent:* Wednesday, 18 December 2019 at 14:43
> *From:* "Joao S. O. Bueno"
> *To:* "Marius Spix"
> *Cc:* unicode at unicode.org
> *Subject:* Re: HEAVY EQUALS SIGN
> I think that as your object is emoji drawing, not mathematics, this
> request can't
> be justified that way.
>
> Maybe it would make more sense to try and check whether modification
> combining
> characters to shift the change the combined character into other
> weight/decoration/color and/or other
> character effects could be built, that could be used not only along emoji,
> but with all other characters.
>
> Currently those transforms require the use of another text protocol, like
> HTML, or ANSI sequences
> for terminal, or even proprietary and add-hoc text file structures like
> Microsoft's .doc and .rtf (and other
> not that proprietary, but equally dependant on specific software to be
> proper rendered, like .ooxml and .odf).
>
> Since modificator characters for color and others have been tried and
> tested in Unicode land for
> some emojis, the ball to have in-unicode proper character transforms could
> start to roll -
>
> Does anyone know if there is already an initiative like that? I'd like to
> know more about it.
>
> (as for the O.P.: I think the way out for you now is to use an
> out-of-unicode markup
> to select a heavier-looking font for the `+` and `=` characters)
>
> js
> -><-
>
>
> On Wed, 18 Dec 2019 at 09:42, Marius Spix via Unicode wrote:
>
> Unicode has a HEAVY PLUS SIGN (U+2795) and a HEAVY MINUS SIGN (U+2796).
> I wonder, if a HEAVY EQUALS SIGN could complete that character set.
> This would allow emoji phrases like ? ??= ??. (man plus cat equals
> love) looking typographically better, when you replace the equals sign
> with a new HEAVY EQUALS SIGN character.
Thoughts? > > Marius > > > > *Gesendet:* Montag, 20. M?rz 2023 um 05:59 Uhr > *Von:* "Asmus Freytag via Unicode" > *An:* unicode at corp.unicode.org > *Betreff:* Re: Missing "(Heavy)EQUAL SIGN-Emoji" > On 3/19/2023 4:01 PM, S?awomir Osipiuk via Unicode wrote: > > Quite a lot, actually, but at its heart it's still a computer and hence > more likely to do what you tell it than what you want. > I assume "LX" told ChatGPT to "write a proposal for a Heavy Equals Sign > Emoji" rather than asking "Is there a Heavy Equals Sign Emoji defined in > Unicode?" > > Right, and I had asked you do to it, you would have noticed it's there > already and asked me why I was so daft to propose something as missing that > is already present. Also, if you had concluded that a new character would > be required, you would have argued why the seemingly existing one was not > in fact the one that you thought is needed. That being a a requirement of a > good proposal (explaining why seeming alternatives are not valid). > > So, if the role of the chatbot just consists of creating plausible > sounding language, we should ban their use in submissions and > communications with the Consortium. Because they may make it sound like a > proposal or suggestion is well researched or thought out, when if fact it > isn't, wasting everybody's time. > > A./ > > > On Sunday, 19 March 2023, 17:58:04 (-04:00), Asmus Freytag via Unicode > wrote: > > > Well, what can you expect from a message written by a chatbot? > > A./ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at ewellic.org Mon Mar 20 13:26:58 2023 From: doug at ewellic.org (Doug Ewell) Date: Mon, 20 Mar 2023 18:26:58 +0000 Subject: Missing "(Heavy)EQUAL SIGN-Emoji" In-Reply-To: References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com> <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com> Message-ID: Gabriel Tellez wrote: > Yall missing the joke. 
> It was made by ChatGPT. It was probably just
> told to make a proposal.

... for a character that already exists.

Neither artificial nor natural intelligence was in play here.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From gtbot2007 at gmail.com Mon Mar 20 13:28:59 2023
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Mon, 20 Mar 2023 14:28:59 -0400
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To:
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com> <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
Message-ID:

You think ChatGPT is smart enough to know it already exists?

On Mon, Mar 20, 2023 at 2:27 PM Doug Ewell wrote:

> Gabriel Tellez wrote:
>
> > Y'all missing the joke. It was made by ChatGPT. It was probably just
> > told to make a proposal.
>
> ... for a character that already exists.
>
> Neither artificial nor natural intelligence was in play here.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asmusf at ix.netcom.com Mon Mar 20 13:46:22 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Mon, 20 Mar 2023 11:46:22 -0700
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To:
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com> <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
Message-ID:

That's the point. The data is accessible, but the tool is unable to use that information. (Or "willingly" sets it aside because of the wording of the request).

Imagine, instead of a comment on a "water cooler conversation" list like this, a hypothetical case of someone actually submitting a full proposal for a script or an extension of one for which real world data is hard to come by.
With some generative algorithms able to create images, you can make a proposal for a completely fictitious script, with completely fictitious samples of supposed text passages with nobody being the wiser. Of course, this could be done manually as well, but the amount of effort required has been sufficient deterrence. That equation has now shifted, and we need to figure out whether our processes are up to this.

Or imagine someone uses ChatGPT to compose a proposal that, while it conforms to reality as far as the characters are concerned, is full of statements on how the script is supposedly written, its supposed origins, and supposed application, that are not likewise based in actual expertise. With an automated tool, it takes no effort to create a very plausible sounding document that is optimized towards approval, while not restrained in any way by actual facts (or at least not facts that are easily verifiable).

We see this occasionally in manually created documents, where some submitters exaggerate the spread of a script or the degree to which it's in actual use, but this process makes that so much easier.

A./

On 3/20/2023 11:28 AM, Gabriel Tellez wrote:
> You think ChatGPT is smart enough to know it already exists?
>
> On Mon, Mar 20, 2023 at 2:27 PM Doug Ewell wrote:
>
> Gabriel Tellez wrote:
>
> > Y'all missing the joke. It was made by ChatGPT. It was probably just
> > told to make a proposal.
>
> ... for a character that already exists.
>
> Neither artificial nor natural intelligence was in play here.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From doug at ewellic.org Mon Mar 20 13:47:45 2023
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 20 Mar 2023 18:47:45 +0000
Subject: Missing "(Heavy)EQUAL SIGN-Emoji"
In-Reply-To:
References: <0e8fe317-34ef-28fe-c080-550c9e1e95bc@ix.netcom.com> <1679266719312.2071283061.1008587296@gmail.com> <8edcaef3-b19a-1bb4-91e3-3db22307a216@ix.netcom.com>
Message-ID:

Gabriel Tellez wrote:

> You think ChatGPT is smart enough to know it already exists?

Proposing to add something that already exists, of which there can only be one, is never an intelligent thing to do.

Looking in the Unicode Character Database would be a good start. If it doesn't know what or where that is, it might ask follow-up questions. Looking for HEAVY EQUAL SIGN and finding only HEAVY EQUALS SIGN might also trigger some sort of near-match cross-check.

But wait, there's more: it also didn't write a proposal. Instead, it posted to a mailing list asking for something to be done. There is an actual proposal process that goes well beyond simply posting to a mailing list.

I suppose creating an email and phrasing the request the way it did shows some measure of artificial intelligence. I still stand by the second part.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | http://ewellic.org

From kent.b.karlsson at bahnhof.se Tue Mar 21 23:36:17 2023
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Wed, 22 Mar 2023 05:36:17 +0100
Subject: Missing Latin superscript lowercase letters
In-Reply-To:
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID: <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se>

Peter Constable wrote:
> Doug Ewell responded:
>
> > an image of mathematical or engineering equations wouldn't
> > exactly be the best supporting evidence for encoding them in plain text.
>
> Not only would it not be the best supporting evidence, it wouldn't be
> considered supporting evidence _at all_ since math formula layout is not plain text.

Says who?
There is no law of nature (or of computing) that says that math expressions must be non-plain text. Just because all of neqn/eqn, (La)TeX, MathML, OMML, and indeed UnicodeMath are representations of math expressions that are *not* plain text does not mean that math expressions must be expressed by a higher level protocol. I.e. it could very well be a text level protocol (where the "math controls" are not expressed as printable text, but as control codes).

Further, if some symbol/letter for some reason only ever occurred in superscript position in math expressions, such examples would still be supporting evidence for that symbol/letter. The closest practical example I can think of is the degree sign, which in origin is a superscript 0.

Asmus Freytag wrote:
> I[n] mathematical typesetting what is superscripted is not the individual
> letter, but the expression. In principle, the superscripted expression
> is arbitrarily complex and thus the superscript is fully recursive.
>
> This is precisely the kind of situation where hardcoding anything is not
> helpful.

I would go even further than that, and say that with very few exceptions, characters that have a compatibility decomposition have no business in a math expression.

In my little project "math anywhere" (ok, I just thought of that name, and no, of course I cannot implement it everywhere) I'm proposing a plain text format for math expressions. Plus a version that is compatible with HTML and SVG. And also a version that can best be described as a "mark-down" version that is (relatively) easy to input via a keyboard; all equivalent in what can be expressed. See https://github.com/kent-karlsson/control/blob/main/math-layout-controls-2023-A.pdf .

The plain text format for math expressions can well represent math expressions in an otherwise plain text context. Whether you want to see the math expressions themselves as plain text is very much in the eye of the beholder.
The HTML/SVG compatible version does not have "clay feet". The price for that is that additional parsing is needed. Unusual in that that parsing must work on the DOM, but otherwise nothing strange and basically the same parsing as for the "plain text" version (where the parsing of course works on the text).

This (or these, considering the three variants) is also the only format for math expression representation that can handle RTL math expressions reasonably.

/Kent K

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asmusf at ix.netcom.com Wed Mar 22 02:53:25 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Wed, 22 Mar 2023 00:53:25 -0700
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se>
References: <1317264669.2762677.1678192549210@email.ionos.de> <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se>
Message-ID: <046e15c0-8015-10d6-8720-b1ee3cc431e7@ix.netcom.com>

On 3/21/2023 9:36 PM, Kent Karlsson via Unicode wrote:
> There is no law of nature (or of computing) that says that math
> expressions must be non-plain text. Just because all of neqn/eqn,
> (La)TeX, MathML, OMML, and indeed UnicodeMath are representations of
> math expressions that are *not* plain text does not mean that math
> expressions must be expressed by a higher level protocol. I.e. it
> could very well be a text level protocol (where the "math controls"
> are not expressed as printable text, but as control codes).

Using control sequences or codes for your markup does not make your content plain text.

The fact remains that mathematical notation is fundamentally recursive when it comes to super/subscript: it's not individual letters, but entire expressions that are super/subscripted (and at least in theory, they cover the full range of mathematical expressions) and they are recursive: they can contain nested super/subscripted expressions.
Again, in theory, this recursion is not limited, except that for reasons of practicality such recursion has to be realized in ways that the overall expression remains legible.

Therefore, if your goal is mathematical notation, you want an operator that super/subscripts an expression and not code points for single characters. The key takeaway is the natural scoping: super/subscripting is applied on the level of a whole expression. That means that your markup needs to be scoped and that definitely makes it rich text.

The existing single characters are (almost) all encoded for use in phonetic notation, which is not recursive and doesn't super/subscript entire expressions. Instead it uses super/subscripting to indicate modification. Hence "modifier letters".

> Further, if some symbol/letter for some reason only ever occurred in
> superscript position in math expressions, such examples would still
> be supporting evidence for that symbol/letter. The closest practical
> example I can think of is the degree sign, which in origin is a
> superscript 0.

The degree sign is either the exception that proves the rule, or something else: a symbol that occurs frequently in contexts that are not full mathematical expressions, as it is typical for unit symbols. When used with temperature, it's interesting to note that not all temperature scales use it consistently. You don't see it with Fahrenheit very often, for example, reflecting differences in traditional keyboard layouts.

Note that many unit symbols have one-off encodings that Unicode had to support via compatibility characters or even canonical duplicates (think micro and Ohm vs. their Greek letter counterparts). Without the need to support a transition from pre-existing character sets, these duplicates would not exist. But they do and so does the degree sign.

Neither of them, however, form precedents for non-compatibility characters.

A./

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marius.spix at web.de Thu Mar 23 09:55:24 2023
From: marius.spix at web.de (Marius Spix)
Date: Thu, 23 Mar 2023 15:55:24 +0100
Subject: Aw: Re: Missing Latin superscript lowercase letters
In-Reply-To: <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se>
References: <1317264669.2762677.1678192549210@email.ionos.de> <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se>
Message-ID:

An HTML attachment was scrubbed...
URL:

From cate at cateee.net Thu Mar 23 10:40:20 2023
From: cate at cateee.net (Giacomo Catenazzi)
Date: Thu, 23 Mar 2023 16:40:20 +0100
Subject: Aw: Re: Missing Latin superscript lowercase letters
In-Reply-To:
References: <1317264669.2762677.1678192549210@email.ionos.de> <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se>
Message-ID: <2c43b2b5-8281-de1e-0948-73e419244f52@cateee.net>

On 23 Mar 2023 15:55, Marius Spix via Unicode wrote:
> In TeX and MathML, U+005E CIRCUMFLEX ACCENT ^ is used for superscript
> and U+005F LOW LINE _ for subscript. This also allows power towers like
> 2^(2^2), which are not possible with the existing Unicode characters.
> This notation is recognized by mathematicians, physicists and chemists
> and widely accepted.

> In some programming languages, e. g. Java or C++, ^ is used for the XOR
> operation and _ for digit grouping, but that does not matter here,
> because the context is always the decisive factor.

> Many modern fonts have support for auto-alignment of digits in
> combination with U+2044 FRACTION SLASH ⁄ like in that example: 13⁄37. So
> it may be possible to design a font with special handling for ^ and _.

Let's be honest: most people who are asking for superscript letters are not interested in displaying mathematics.

In any case, /FRACTION SLASH/ can display some fractions, but not all fonts have good support for it (e.g. in your mail it looks nice to me, but in the quoted text above it looks very ugly). And such formatting depends not only on the font, but also on user preferences (and settings).
Like tabular numbers (also often included in fonts), it may not be enabled by default, or it may have to be explicitly enabled (e.g. in tables). Unicode does not support such presentation settings. So FRACTION SLASH is just a solution for one specific case. Note: there is no delimiter, so fonts usually support it only for numbers, and maybe only for a limited number of digits. Maths is more than numbers. Another reason to have maths in a markup language.

It is also very annoying to type Unicode symbols that are not among the restricted number of keys on normal keyboards. We learned this from computer languages, too. In the earliest days, computer languages had many symbols because every operation "needed" its own symbol. Guess what? Now every modern computer language uses practically only ASCII characters. Practicability is better than an ideal system that few people use.

In any case, to display true maths, we need a specialised engine (and fonts). Current shaping engines (and fonts) are far from displaying maths in a nice way. (Personally, I would prefer that developers of shaping engines work on improving the existing engines and fonts for human languages before going into such a specialised field, for which we already have good tools.)

Superscript letters can be done with current fonts, current shaping engines and many markup languages, so any discussion (and new characters) is a distraction that does not lead us towards true Unicode mathematical typesetting (which is not a goal, like musical notation). And it will make things worse: search engines will have to interpret everything. Speech synthesis will become much more complex (it will have to understand whether it is dealing with maths, chemistry or units, which need to be spoken differently). And there are probably many other unforeseen problems.
ciao
cate

From gtbot2007 at gmail.com Fri Mar 24 08:23:55 2023
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Fri, 24 Mar 2023 09:23:55 -0400
Subject: Aw: Re: Missing Latin superscript lowercase letters
In-Reply-To: <2c43b2b5-8281-de1e-0948-73e419244f52@cateee.net>
References: <1317264669.2762677.1678192549210@email.ionos.de> <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se> <2c43b2b5-8281-de1e-0948-73e419244f52@cateee.net>
Message-ID:

Mathematical typesetting and proper musical notation are not in the scope of Unicode. (Just another reason I'm making my own character set.)

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kent.b.karlsson at bahnhof.se Sat Mar 25 12:29:13 2023
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Sat, 25 Mar 2023 18:29:13 +0100
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <046e15c0-8015-10d6-8720-b1ee3cc431e7@ix.netcom.com>
References: <1317264669.2762677.1678192549210@email.ionos.de> <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se> <046e15c0-8015-10d6-8720-b1ee3cc431e7@ix.netcom.com>
Message-ID:

> On 22 March 2023 at 08:53, Asmus Freytag via Unicode wrote:

>On 3/21/2023 9:36 PM, Kent Karlsson via Unicode wrote:
>>There is no law of nature (or of [c]omputing) that says that math expressions
>>must be non-plain text. Just because all of neqn/eqn, (La)TeX, MathML, OMML, and indeed
>>UnicodeMath are representations of math expressions that are *not* plain text does not
>>mean that math expressions must be expressed by a higher level protocol. I.e. it could
>>very well be a text level protocol (where the "math controls" are not expressed as
>>printable text, but as control codes).

>Using control sequences or codes for your markup does not make your content plain text.

>The fact remains that mathematical notation is fundamentally recursive when it comes to
>super/subscript: it's not individual letters, but entire expressions that are super/subscripted
>(and at least in theory, they cover the full range of mathematical expressions) and they are
>recursive: they can contain nested super/subscripted expressions. Again, in theory, this recursion
>is not limited, except that for reasons of practicality such recursion has to be realized in ways that
>the overall expression remains legible.

True, but…

>Therefore, if your goal is mathematical notation, you want an operator that super/subscripts an
>expression and not code points for single characters. The key takeaway is the natural scoping:
>super/subscripting is applied on the level of a whole expression.
>That means that your markup
>needs to be scoped and that definitely makes it rich text.

Two cases:

Bidi algorithm. Whether it is based solely on the characters' bidi properties or bidi controls are also used, the bidi handling is intrinsically scoped. I think you still call that plain text. Further, in HTML the bidi control characters aren't used; instead there are attributes on most tags that control the bidi handling. Even though markup is used, I think you still think of it as plain text?

Indeed, a math expression is more plain text than a bidi text. From the appearance of the math expression you can derive the structure (ignoring "phantom" expressions, which I included only because MathML has that, and they seem to sometimes be practical; and with some reservation for stretch, which should only have an effect on some symbols). On the other hand, for a bidi processed text, you cannot guarantee the recovery of the given structure from the displayed text; indeed I think that in general it is impossible; not so plain?

Combining characters. Before Unicode, ECMA-48 defined CSI 1 SP _ ... CSI 2 SP _ for "combining" the characters in between into a single displayed character (for certain implementation-defined values of the enclosed characters). Unicode "replaced" that (this is probably not what happened historically, but technically it can be seen that way) by instead having combining characters. Without that invention, we would in hypothetical-HTML have a tag for doing such combinations (since HTML does not like C0/C1 control codes…). And… these combining characters do not work on a single character, but on the combining sequence (a "scope") that precedes them; and they can indeed be seen as a special kind of control characters. You still consider these scoped controls to be plain text.

So the fact that the controls have a "scope" that is more than a single character (as in both of the cases above) or are recursive (as in both of the cases above) does apparently not exclude a feature from being regarded as plain text.
So I maintain that what is plain text or not is much in the eye of the beholder (regardless of internal representation), not only w.r.t. scopeness and recursiveness, but also possibility to correctly derive the structure of a text (which in general is impossible for bidi).

>The existing single characters are (almost) all encoded for use in phonetic notation, which is not
>recursive and doesn't super/subscript entire expressions. Instead it uses super/subscripting to
>indicate modification. Hence "modifier letters".

True. And I have said nothing against that point. Indeed, I said that those characters do *not* belong in a math expression, and should "stay" with (mostly) phonetic notation.

>>Further, if some symbol/letter for some reason only ever occurred in superscript
>>position in math expressions, such examples would still be supporting evidence for
>>that symbol/letter. The closest practical example I can think of is the degree sign, which
>>in origin is a superscript 0.

>The degree sign is either the exception that proves the rule, or something else: a symbol
>that occurs frequently in contexts that are not full mathematical expressions, as it is typical

True, but I was arguing against Peter Constable's postulation that something that (for whatever reason) occurs only in a superscript position in a math expression could not have its encoding supported by an example where it occurred in a superscript position in a math expression. THAT postulation is false. (And the closest example I could think of was the degree sign; there MAY be examples of yet unencoded characters that only occur in superscript position in math expressions.)

>for unit symbols. When used with temperature, it's interesting to note that not all temperature
>scales use it consistently. You don't see it with Fahrenheit very often, for example, reflecting
>differences in traditional keyboard layouts.

Ok, let's digress a bit…
I do see that too, in news articles (in web apps) from USA and British news companies and see also "C" when degrees Celsius is meant. But writing farad (F) or coulomb (C) when referring to temperature is just horrible, and only embarrassing for the journalist who wrote that. (Another related horror is "kph", and there you cannot even blame keyboard layouts.)

>Note that many unit symbols have one-off encodings that Unicode had to support via compatibility
>characters or even canonical duplicates (think micro and Ohm vs. their Greek letter counterparts).
>Without the need to support a transition from pre-existing character sets, these duplicates would
>not exist. But they do

Yes. (But not relevant to this discussion.)

>and so does the degree sign.

The degree sign is not a compatibility character. It "divorced" from superscript 0 looong before computers…

>Neither of them, however, form precedents for non-compatibility characters.

Not sure what that sentence means, since the premise is skewed.

>A./

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asmusf at ix.netcom.com Sat Mar 25 16:34:42 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 25 Mar 2023 14:34:42 -0700
Subject: Missing Latin superscript lowercase letters
In-Reply-To:
References: <1317264669.2762677.1678192549210@email.ionos.de> <40E26F97-2412-4BB6-8056-A628D7E5200E@bahnhof.se> <046e15c0-8015-10d6-8720-b1ee3cc431e7@ix.netcom.com>
Message-ID: <36980084-ad59-d8f2-378d-35116bb0d770@ix.netcom.com>

Kent,

I'm not able to match your beautifully color-coded reply chain, but here goes.

On 3/25/2023 10:29 AM, Kent Karlsson via Unicode wrote:
> >>Further, if some symbol/letter for some reason only ever occurred in superscript
> >>position in math expressions, such examples would still be supporting evidence for
> >>that symbol/letter. The closest practical example I can think of is the degree sign, which
> >>in origin is a superscript 0.
> >The degree sign is either the exception that proves the rule, or something else: a symbol
> >that occurs frequently in contexts that are not full mathematical expressions, as it is typical
> True, but I was arguing against Peter Constable's postulation that
> something that (for whatever reason) occurs only in a superscript
> position in a math expression /could not/ have its encoding supported
> by an example where it occurred in a superscript position in a math
> expression. THAT postulation is false. (And the closest example I
> could think of was the degree sign; there MAY be examples of yet
> unencoded characters that only occur in superscript position in math
> expressions.)

This is an argument best explored when there's an actual test case.

In essence, modifier letters in phonetics fall into this category, because ordinarily you don't expect to style phonetic notation other than globally (e.g. font choice). They therefore can be argued to have an identity that is different from simply superscripting the same letter form. The latter looks the same, but we assert (via encoding) that they are not the same thing. That fits the conception of phonetic notation that every character individually stands for something specific.

Whereas in a mathematical expression, the identity of a letter doesn't change, whether it's superscripted or not. It's clearly just a different use of the same letter, which is underlined by the fact that superscripting can be nested.

So, we would have to have a test case, not yet encoded, where there's a different identity for the superscripted shape than if the same shape were to be rendered normally.

The degree sign is a bad example, in a way, as it's clearly not a superscript 'o' or '0' (letter/digit) but is correctly implemented as a pure circle. That puts it in the category of symbols for which the size, spacing and placement of the "ink" matters more than the resemblance of that "ink" to other symbols.
It is also not considered a "superscript circle" (no compat decomp).

It is a good example in a different way, since it's clearly a character for which the "ink" is always in a position and size as would be appropriate for a superscript. I'm sure that if we encounter some other character for which it would be inappropriate to give a compat decomp that we would consider whether it should be encoded. At that juncture, we would look at the context in which it is to be used.

> >for unit symbols. When used with temperature, it's interesting to note that not all temperature
> >scales use it consistently. You don't see it with Fahrenheit very often, for example, reflecting
> >differences in traditional keyboard layouts.
> Ok, let's digress a bit… I do see that too, in news articles (in web
> apps) from USA and British news companies and see also "C" when
> degrees Celsius is meant. But writing farad (F) or coulomb (C) when
> referring to temperature is just horrible, and only embarrassing for
> the journalist who wrote that. (Another related horror is "kph", and
> there you cannot even blame keyboard layouts.)

I think it goes a bit too far to assume that any and all unit abbreviations have to be in the SI notation always. I'm sure there are places where there are regulations that define the use of specific abbreviations and in any contexts where they apply to SI, you would be free to read "k" as kilo and "kph" as kilo-ph (and then reject that as undefined). The same is not true for ordinary everyday usage in places where SI units aren't customary. Likewise, the "ph" suffix to mean "per hour" is well established in places, while "/h" is not. That said, given that usage, I'd personally prefer kmph over kph.

For example, in the weather forecast, 80F never refers to capacity, is understood by the audience, and therefore there's no objection to that usage on grounds of confusion with SI units.
However, usage is not consistent: you see it both with and without the degree sign, and without naming names, websites by academic institutions are just as likely to leave it off as popular websites are likely to add it.

As you can see, actual usage is all over the place and as Unicode is not prescriptive, we simply deal with what's out there.

> > Note that many unit symbols have one-off encodings that Unicode had to support via compatibility
> > characters or even canonical duplicates (think micro and Ohm vs. their Greek letter counterparts).
> > Without the need to support a transition from pre-existing character sets, these duplicates would
> > not exist. But they do

> Yes. (But not relevant to this discussion.)

> > and so does the degree sign.

> The degree sign is not a compatibility character. It “divorced” from superscript 0 looong before
> computers…

> > Neither of them, however, form precedents for non-compatibility characters.

> Not sure what that sentence means, since the premise is skewed.

The argument is that the existence of some characters that are used in ways that justify direct encoding (whether for compatibility or whatever) does not serve as a blanket justification to extend that treatment to others.

A./

From dpk at nonceword.org Mon Mar 27 06:34:25 2023
From: dpk at nonceword.org (Daphne Preston-Kendal)
Date: Mon, 27 Mar 2023 13:34:25 +0200
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <1317264669.2762677.1678192549210@email.ionos.de>
References: <1317264669.2762677.1678192549210@email.ionos.de>
Message-ID: <8B4133B5-203C-4E95-A05E-83EE2C426A46@nonceword.org>

> Only i n q are missing as superscript modifiers. Wouldn’t it be sensible to fill that gap at last?
In the pronunciation alphabet devised by James Murray for the first edition of the Oxford English Dictionary, superscript i was used for the glide-off of the English diphthong written in the IPA as /eɪ/.
https://archive.org/details/oed01arch/page/n21/mode/2up?view=theater

In the digital transcription of the first edition, the superscripting was done with markup tags (as was the distinction between italic and roman letters used in the Murray phonemic–phonetic transcription). Nonetheless, given the significance of the OED and the otherwise comprehensive treatment of phonetic characters in Unicode, even non-/pre-IPA ones like this, I think there’s a strong case for encoding this.

Conceivably superscript n might be used in the IPA to denote a nasal consonant with “alveolarization”. The fact it isn’t encoded yet makes me think this is rare to nonexistent.

The corresponding process for superscript q would be uvularization, but I don’t know that using the symbol for the uvular plosive would ever be applicable here.

Daphne

From kenwhistler at sonic.net Mon Mar 27 22:00:51 2023
From: kenwhistler at sonic.net (Ken Whistler)
Date: Mon, 27 Mar 2023 20:00:51 -0700
Subject: Missing Latin superscript lowercase letters
In-Reply-To: <8B4133B5-203C-4E95-A05E-83EE2C426A46@nonceword.org>
References: <1317264669.2762677.1678192549210@email.ionos.de> <8B4133B5-203C-4E95-A05E-83EE2C426A46@nonceword.org>
Message-ID: <933b37e9-007d-05e7-b027-fda15c3696a2@sonic.net>

Whatever the use case might be for each of these, the quoted premise is simply incorrect.

2071          ; Super # Lm       SUPERSCRIPT LATIN SMALL LETTER I
207F          ; Super # Lm       SUPERSCRIPT LATIN SMALL LETTER N
107A5         ; Super # Lm       MODIFIER LETTER SMALL Q

That the names of 2071 and 207F depart from the usual pattern is simply an historical accident, based on the original sources for the encoding. These are them, they *are* encoded.
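
The three entries listed above can be checked locally. What follows is only a minimal sketch, assuming Python's standard `unicodedata` module; the helper name `describe` is made up for illustration. The output depends on the Unicode version bundled with the interpreter, and since U+107A5 is a recent addition, an older Python may not know it, so the sketch falls back to a placeholder rather than failing.

```python
import unicodedata

# Code points taken from the listing above.
CODE_POINTS = [0x2071, 0x207F, 0x107A5]


def describe(cp):
    """Return (label, general category, name, decomposition) for one code point.

    `describe` is a hypothetical helper for this sketch, not an existing API.
    """
    ch = chr(cp)
    # Older Unicode databases may lack a recent character; use a placeholder
    # name instead of letting unicodedata.name() raise ValueError.
    name = unicodedata.name(ch, "<not in this Python's Unicode database>")
    return (f"U+{cp:04X}", unicodedata.category(ch), name,
            unicodedata.decomposition(ch))


for row in (describe(cp) for cp in CODE_POINTS):
    print(*row, sep=" ; ")
```

A `<super>` tag at the start of the decomposition field corresponds to the "Super" column in the listing.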

BTW, 2071 and 207F have been there from the very *first* version of Unicode. The modifier letter q only got in very recently, based on evidence that tracked down an appropriate example in linguistic usage.

--Ken

On 3/27/2023 4:34 AM, Daphne Preston-Kendal via Unicode wrote:
>> Only i n q are missing as superscript modifiers. Wouldn’t it be sensible to fill that gap at last?
> In the pronunciation alphabet devised by James Murray for the first edition of the Oxford English Dictionary, superscript i was used for the glide-off of the English diphthong written in the IPA as /eɪ/.
> https://archive.org/details/oed01arch/page/n21/mode/2up?view=theater
>
> In the digital transcription of the first edition, the superscripting was done with markup tags (as was the distinction between italic and roman letters used in the Murray phonemic–phonetic transcription). Nonetheless given the significance of the OED and the otherwise comprehensive treatment of phonetic characters in Unicode, even non-/pre-IPA ones like this, I think there’s a strong case for encoding this.
>
> Conceivably superscript n might be used in the IPA to denote a nasal consonant with “alveolarization”. The fact it isn’t encoded yet makes me think this is rare to nonexistent.
>
> The corresponding process for superscript q would be uvularization, but I don’t know that using the symbol for the uvular plosive would ever be applicable here.
>
> Daphne

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From as at signographie.de Tue Mar 28 02:17:38 2023 From: as at signographie.de (=?UTF-8?Q?A=2E_St=C3=B6tzner?=) Date: Tue, 28 Mar 2023 09:17:38 +0200 (CEST) Subject: Missing Latin superscript lowercase letters In-Reply-To: <933b37e9-007d-05e7-b027-fda15c3696a2@sonic.net> References: <1317264669.2762677.1678192549210@email.ionos.de> <8B4133B5-203C-4E95-A05E-83EE2C426A46@nonceword.org> <933b37e9-007d-05e7-b027-fda15c3696a2@sonic.net> Message-ID: <1153574413.745214.1679987858280@email.ionos.de> An HTML attachment was scrubbed... URL: From prosfilaes at gmail.com Thu Mar 30 11:54:10 2023 From: prosfilaes at gmail.com (David Starner) Date: Thu, 30 Mar 2023 11:54:10 -0500 Subject: Inverted asterism Message-ID: There doesn't seem to be an inverted asterism in Unicode. Is there a good reason there's not? https://en.wikisource.org/wiki/Page:Monthly_scrap_book,_for_February.pdf/24 shows the example I have at hand, from an 1832 English-language periodical from Scotland. -- The standard is written in English . If you have trouble understanding a particular section, read it again and again and again . . . Sit up straight. Eat your vegetables. Do not mumble. -- _Pascal_, ISO 7185 (1991) From jameskass at code2001.com Thu Mar 30 13:10:10 2023 From: jameskass at code2001.com (James Kass) Date: Thu, 30 Mar 2023 18:10:10 +0000 Subject: Inverted asterism In-Reply-To: References: Message-ID: On 2023-03-30 4:54 PM, David Starner via Unicode wrote: > There doesn't seem to be an inverted asterism in Unicode. Is there a > good reason there's not? > https://en.wikisource.org/wiki/Page:Monthly_scrap_book,_for_February.pdf/24 > shows the example I have at hand, from an 1832 English-language > periodical from Scotland. > Looking at it in the browser, the two 'stacked asterisks' match. Copy/pasting the line into a plain-text editor (which uses a different font) shows one asterisk above and two below. ???? ? The above short Hints were submitted to the ... 
Would this be considered a glyph variant, or a separate character? Are the two forms ever used contrastively in the same source? From marius.spix at web.de Thu Mar 30 13:12:40 2023 From: marius.spix at web.de (Marius Spix) Date: Thu, 30 Mar 2023 20:12:40 +0200 Subject: Fw: Inverted asterism References: Message-ID: An HTML attachment was scrubbed... URL: From doug at ewellic.org Thu Mar 30 13:41:24 2023 From: doug at ewellic.org (Doug Ewell) Date: Thu, 30 Mar 2023 18:41:24 +0000 Subject: Inverted asterism In-Reply-To: References: Message-ID: David Starner wrote: > There doesn't seem to be an inverted asterism in Unicode. Is there a > good reason there's not? Probably the same reason as always: it wasn?t in a known character set 30 years ago, and nobody has successfully proposed it since then. -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From marius.spix at web.de Thu Mar 30 13:52:37 2023 From: marius.spix at web.de (Marius Spix) Date: Thu, 30 Mar 2023 20:52:37 +0200 Subject: Aw: Re: Inverted asterism In-Reply-To: References: Message-ID: This is not font-specific. They use the rotate() css function. It seems that the typesetter also used three separate glyphs. That ?reversed asterism? is 3 en wide and does not overlap like the reference glyph for the asterism. > Gesendet: Donnerstag, den 30.03.2023 um 20:10 Uhr > Von: "James Kass via Unicode" > An: unicode at corp.unicode.org > Betreff: Re: Inverted asterism > > > > On 2023-03-30 4:54 PM, David Starner via Unicode wrote: > > There doesn't seem to be an inverted asterism in Unicode. Is there a > > good reason there's not? > > https://en.wikisource.org/wiki/Page:Monthly_scrap_book,_for_February.pdf/24 > > shows the example I have at hand, from an 1832 English-language > > periodical from Scotland. > > > Looking at it in the browser, the two 'stacked asterisks' match. > Copy/pasting the line into a plain-text editor (which uses a different > font) shows one asterisk above and two below. > > ???? ? 
The above short Hints were submitted to the ... > > Would this be considered a glyph variant, or a separate character? Are > the two forms ever used contrastively in the same source? > From jameskass at code2001.com Thu Mar 30 14:10:23 2023 From: jameskass at code2001.com (James Kass) Date: Thu, 30 Mar 2023 19:10:23 +0000 Subject: Aw: Re: Inverted asterism In-Reply-To: References: Message-ID: On 2023-03-30 6:52 PM, Marius Spix via Unicode wrote: > This is not font-specific. They use the rotate() css function. It seems that the typesetter also used three separate glyphs. That ?reversed asterism? is 3 en wide and does not overlap like the reference glyph for the asterism. Thank you for clarifying that.? So the reproduced text uses rich text features to match the source.? Much like the reproduced text uses rich text features to match the italics in the source. From doug at ewellic.org Thu Mar 30 15:31:47 2023 From: doug at ewellic.org (Doug Ewell) Date: Thu, 30 Mar 2023 20:31:47 +0000 Subject: Aw: Re: Inverted asterism In-Reply-To: References: Message-ID: James Kass replied to Marius Spix: >> This is not font-specific. They use the rotate() css function. It >> seems that the typesetter also used three separate glyphs. That >> ?reversed asterism? is 3 en wide and does not overlap like the >> reference glyph for the asterism. > > Thank you for clarifying that. So the reproduced text uses rich text > features to match the source. Much like the reproduced text uses rich > text features to match the italics in the source. I think it?s reasonable to allow that the inverted asterism might have some claim to being a legitimate plain-text character, unlike italicized text, which has a long-established history of not being considered plain text (NOT a thread we should be rehashing here). 
--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From jameskass at code2001.com Thu Mar 30 15:49:23 2023
From: jameskass at code2001.com (James Kass)
Date: Thu, 30 Mar 2023 20:49:23 +0000
Subject: Inverted asterism
In-Reply-To:
References:
Message-ID: <2440aae6-030e-adcb-a049-2749959fe9b7@code2001.com>

On 2023-03-30 8:31 PM, Doug Ewell via Unicode wrote:
> I think it’s reasonable to allow that the inverted asterism might have some claim to being a legitimate plain-text character ...

Agreed. But as Marius Spix pointed out, the glyph in question can already be represented in plain text as "*⁎*". We've seen arguments in the past against encoding which assert that since the text already cannot be reproduced without rich text there's no need for direct encoding.

I was interested in the side-by-side comparison of the repro with the source, wondering how closely the authors of the web page want to match the source. Some stylistic differences are captured in the repro, others are not. For example: the font size change between the top and the bottom of the page is matched, but the font style (serif) is not matched. The italics in the source are matched, but the paragraph indentations aren't. &c.

From asmusf at ix.netcom.com Thu Mar 30 16:47:25 2023
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Thu, 30 Mar 2023 14:47:25 -0700
Subject: Aw: Re: Inverted asterism
In-Reply-To:
References:
Message-ID: <34d61524-b8cc-cc73-50e9-28c6ba019045@ix.netcom.com>

I'm a bit confused here.

Inspecting the page I don't see the use of "rotate()", and when I look at the source text as well as cut&paste, I see “*⁎*”, composed from two six-pointed and one five-pointed asterisk.

That seems a crude representation of what is in print, in that the print original seems to have three identical asterisks.

A./

On 3/30/2023 11:52 AM, Marius Spix via Unicode wrote:
> This is not font-specific. They use the rotate() css function. It seems that the typesetter also used three separate glyphs.
That ?reversed asterism? is 3 en wide and does not overlap like the reference glyph for the asterism. > >> Gesendet: Donnerstag, den 30.03.2023 um 20:10 Uhr >> Von: "James Kass via Unicode" >> An:unicode at corp.unicode.org >> Betreff: Re: Inverted asterism >> >> >> >> On 2023-03-30 4:54 PM, David Starner via Unicode wrote: >>> There doesn't seem to be an inverted asterism in Unicode. Is there a >>> good reason there's not? >>> https://en.wikisource.org/wiki/Page:Monthly_scrap_book,_for_February.pdf/24 >>> shows the example I have at hand, from an 1832 English-language >>> periodical from Scotland. >>> >> Looking at it in the browser, the two 'stacked asterisks' match. >> Copy/pasting the line into a plain-text editor (which uses a different >> font) shows one asterisk above and two below. >> >> ???? ? The above short Hints were submitted to the ... >> >> Would this be considered a glyph variant, or a separate character? Are >> the two forms ever used contrastively in the same source? >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmusf at ix.netcom.com Thu Mar 30 17:03:35 2023 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Thu, 30 Mar 2023 15:03:35 -0700 Subject: Inverted asterism In-Reply-To: References: Message-ID: <64c2fdac-9a43-fc19-2e84-896720c7f166@ix.netcom.com> On 3/30/2023 9:54 AM, David Starner via Unicode wrote: > There doesn't seem to be an inverted asterism in Unicode. Is there a > good reason there's not? > https://en.wikisource.org/wiki/Page:Monthly_scrap_book,_for_February.pdf/24 > shows the example I have at hand, from an 1832 English-language > periodical from Scotland. > The primary reason would seem to be that no successful proposal has been submitted. A successful proposal would establish that this cannot be rendered with a simple text sequence and also that this usage isn't a one-off. 

As rendered on my browser, the transcription shows a text sequence, but with the defect of being composed using a five-pointed asterisk in the lower position. (I don't see any use of CSS.)

I note that the original lacks overlap, which makes it impossible to be certain whether the typesetter used a single slug or three.

In making an encoding decision, several determinations would have to be made.

(1) does the attested usage rise to the level where encoding is warranted (or is this limited to a single document or otherwise not worth preserving in plain text)?

(2) does the example represent a single glyph or a sequence?

(3) if a sequence, is every element encoded?

(4) if a single glyph, is it sufficient if it can be represented using some rich text (italics, rotation, etc.)?

We don't really have an algorithm yet for deriving these determinations unambiguously from the input data; it would be best if we had a proposal on record so we can have a disposition on record. Whether positive or negative, that would help settle future requests.

At this point, there's a question whether the proposal should request a lower, six-pointed asterisk or the inverted asterism, and whether it is possible to adduce enough data to help in making that decision.

What we need for cases like this would be a place for proposals that are in a public "pending" state, so that people other than the proposer can adduce additional evidence over time without the need to immediately come down one way or the other.

A./

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From asmusf at ix.netcom.com Thu Mar 30 18:07:01 2023 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Thu, 30 Mar 2023 16:07:01 -0700 Subject: Inverted asterism In-Reply-To: <3F54D1A4-7D6A-4F10-BB89-2D1E3F588B82@gmail.com> References: <64c2fdac-9a43-fc19-2e84-896720c7f166@ix.netcom.com> <3F54D1A4-7D6A-4F10-BB89-2D1E3F588B82@gmail.com> Message-ID: <83003dfd-2346-de9d-6384-443448511491@ix.netcom.com> Thanks for clearing that up. A./ On 3/30/2023 3:13 PM, Steven R. Loomis wrote: > The page has changed since we?ve started discussing it. > - > https://en.wikisource.org/w/index.php?title=Page%3AMonthly_scrap_book%2C_for_February.pdf%2F24&diff=13108267&oldid=13107428 > > > It?s now ?*?*? > > the original template is at > https://en.wikisource.org/wiki/Template:Inverted_asterism > > -- > Steven R. Loomis > Code Hive Tx, LLC > https://codehivetx.us > > > >> On Mar 30, 2023, at 5:03 PM, Asmus Freytag via Unicode >> wrote: >> >> On 3/30/2023 9:54 AM, David Starner via Unicode wrote: >>> There doesn't seem to be an inverted asterism in Unicode. Is there a >>> good reason there's not? >>> https://en.wikisource.org/wiki/Page:Monthly_scrap_book,_for_February.pdf/24 >>> shows the example I have at hand, from an 1832 English-language >>> periodical from Scotland. >>> >> The primary reason would seem to be that no successful proposal has >> been submitted. >> >> A successful proposal would establish that this cannot be rendered >> with a simple text sequence and also that this usage isn't a one-off. >> >> As rendered on my browser, the transcription shows a text sequence, >> but with the defect of being composed using a five-pointed asterisk >> in the lower position. (I don't see any use of CSS). >> >> I note that the original lacks overlap which makes it impossible to >> be certain whether the typesetter used a single slug or three. >> >> In making an encoding decision, several determinations would have to >> be made. 
>> >> (1) does the attested usage rise to the level where encoding is >> warranted (or is this limited to a single document or otherwise not >> worth preserving in plain text)? >> >> (2) does the example represent a single glyph or a sequence? >> >> (3) if a sequence, is every element encoded? >> >> (4) if a single glyph, is it sufficient if it can be represented >> using some rich text? (italics, rotation, etc). >> >> We don't really have an algorithm yet for deriving these >> determinations unambiguously from the input data; it would be best if >> we had a proposal on record so we can have a disposition on record. >> Whether positive or negative, that would help settle future requests. >> >> At this point, there's a question whether the proposal should request >> a lower, six-pointed asterisk or a the inverted asterism, and whether >> it is possible to adduce enough data to help in making that decision. >> >> What we need for cases like this would be a place for proposals that >> are in a public "pending" state, so that people other than the >> proposer can adduce additional evidence over time without the need to >> immediately come down one way or the other. >> >> A./ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sosipiuk at gmail.com Thu Mar 30 19:09:22 2023 From: sosipiuk at gmail.com (=?UTF-8?Q?S=C5=82awomir_Osipiuk?=) Date: Fri, 31 Mar 2023 00:09:22 +0000 Subject: Inverted asterism In-Reply-To: References: Message-ID: <1680221165365.603965960.2270730226@gmail.com> On Thursday, 30 March 2023, 14:10:10 (-04:00), James Kass via Unicode wrote: > > Would this be considered a glyph variant, or a separate character? Are the two forms ever used contrastively in the same source? > This is my question as well. Is there a semantic difference between the "regular" and inverted asterism? At first glance I would say this should just be considered a stylistic variant. 
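
For reference, the encoded asterism (U+2042) and the three-character sequence proposed in this thread (U+002A, U+204E, U+002A) can be inspected side by side. This is a minimal sketch assuming Python's standard `unicodedata` module; the helper name `names` is invented for illustration. It only shows that the character database treats the two as unrelated strings and says nothing about whether the inverted form is a glyph variant.

```python
import unicodedata

ASTERISM = "\u2042"              # the encoded asterism
SEQUENCE = "\u002A\u204E\u002A"  # asterisk, low asterisk, asterisk


def names(text):
    """List 'U+XXXX NAME' for each character (hypothetical helper for this sketch)."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text]


print(names(ASTERISM))
print(names(SEQUENCE))

# Neither string has a decomposition, so compatibility normalization leaves
# both unchanged and does not map one onto the other.
print(unicodedata.normalize("NFKC", SEQUENCE) == SEQUENCE)
print(unicodedata.normalize("NFKC", ASTERISM) == ASTERISM)
```

Any equivalence between the two forms would therefore have to come from a font, from markup, or from a new encoding decision, not from existing normalization.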

From cate at cateee.net Fri Mar 31 02:24:02 2023
From: cate at cateee.net (Giacomo Catenazzi)
Date: Fri, 31 Mar 2023 09:24:02 +0200
Subject: Fw: Inverted asterism
In-Reply-To:
References:
Message-ID: <93ca7716-d356-b02e-9d9e-fd8d080e126b@cateee.net>

On 30 Mar 2023 20:12, Marius Spix via Unicode wrote:
> You can already represent this with U+002A + U+204E + U+002A: *⁎*
> There are many examples, where asterisks are used as ornaments.

I do not think that in this case the character is used as an ornament. As in many scientific books (catalogue-like), they use characters to split different parts (e.g. to distinguish description, sources/bibliography, coding information, personal notes). Maybe like we have in many modern ICT books, with "information", "warning", "attention" icons before a paragraph.

So semantically it is part of "presentation", not of "pure text", and further: at paragraph level (unlike em dashes).

Note: often such characters may be recycled (one way or the other) with some meaning in text, and I would not be surprised if it is also used in some text. If I have time, I'll look at some old encyclopaedias or floras.

ciao
cate

From marius.spix at web.de Fri Mar 31 02:59:21 2023
From: marius.spix at web.de (Marius Spix)
Date: Fri, 31 Mar 2023 09:59:21 +0200
Subject: Aw: Re: Fw: Inverted asterism
In-Reply-To: <122039fc-ad73-4ca9-e2ce-9440281b2090@ix.netcom.com>
References: <122039fc-ad73-4ca9-e2ce-9440281b2090@ix.netcom.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From alex.plantema at xs4all.nl Fri Mar 31 16:51:32 2023
From: alex.plantema at xs4all.nl (Alex Plantema)
Date: Fri, 31 Mar 2023 23:51:32 +0200
Subject: Aw: Re: Inverted asterism
In-Reply-To: <34d61524-b8cc-cc73-50e9-28c6ba019045@ix.netcom.com>
References: <34d61524-b8cc-cc73-50e9-28c6ba019045@ix.netcom.com>
Message-ID:

On Thu 30-03-2023 at 23:47, Asmus Freytag via Unicode wrote:
> I'm a bit confused here.
> > Inspecting the page I don't see the use of "rotate()" and when I look at source text as well as cut&paste, I see ?*?*? composed from two six pointed and one five pointed asterisk. > > That seems a crude representation of what is in print, in that the print original seems to have three identical asterisks. > > The page source contains ?The asterism is one glyph. Whether there are 5 or 6 points depends upon the font. -- Alex. -------------- next part -------------- An HTML attachment was scrubbed... URL: From boldewyn at gmail.com Fri Mar 31 17:46:03 2023 From: boldewyn at gmail.com (Manuel Strehl) Date: Sat, 1 Apr 2023 00:46:03 +0200 Subject: How do U+2571..U+2573 connect? Message-ID: Hi, if you look at the Box Drawing block, e.g., https://codepoints.net/box_drawing, every character goes through the middle of the edges of an imagined rectangle around the glyph. That is, apart from U+2571, U+2572 and U+2573, the diagonal lines. Those touch exclusively the corners of said rectangle. I fail to imagine how these three characters could ever attach to any of the other characters in this block. Are they not meant to do that or am I missing a trick here? Thanks for any pointers! Cheers, Manuel PS: This question was triggered by this reddit post: https://www.reddit.com/r/Unicode/comments/127y7dn/looking_for_box_drawing_characters/ From asmusf at ix.netcom.com Fri Mar 31 18:06:04 2023 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 31 Mar 2023 16:06:04 -0700 Subject: How do U+2571..U+2573 connect? In-Reply-To: References: Message-ID: <03d4f71d-9d08-01bc-4227-8cf684197226@ix.netcom.com> The easy answer is that these do not consist of a single set. For example, the single-, double-line regular-stroke symbols and their combinations, form a subset that is supported by the DOS code page 437. Another common DOS code page (850) has only the single-line ones. Neither set contains any element that terminates in the middle of the cell. 
Those, as well as the heavy stroke or mixed weight combinations are presumably supported somewhere else, as are the curved corners. I don't know off hand what the character sets are from which these were derived, but again, I would not be surprised if they supported only a subset. The diagonals, therefore, are not necessarily from any of those subsets, and therefore likely never intended to be used to provide diagonal connections. A./ On 3/31/2023 3:46 PM, Manuel Strehl via Unicode wrote: > Hi, > > if you look at the Box Drawing block, e.g., > https://codepoints.net/box_drawing, every character goes through the > middle of the edges of an imagined rectangle around the glyph. That > is, apart from U+2571, U+2572 and U+2573, the diagonal lines. Those > touch exclusively the corners of said rectangle. > > I fail to imagine how these three characters could ever attach to any > of the other characters in this block. Are they not meant to do that > or am I missing a trick here? > > Thanks for any pointers! > > Cheers, > Manuel > > PS: This question was triggered by this reddit post: > https://www.reddit.com/r/Unicode/comments/127y7dn/looking_for_box_drawing_characters/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From boldewyn at gmail.com Fri Mar 31 18:23:08 2023 From: boldewyn at gmail.com (Manuel Strehl) Date: Sat, 1 Apr 2023 01:23:08 +0200 Subject: How do U+2571..U+2573 connect? In-Reply-To: <03d4f71d-9d08-01bc-4227-8cf684197226@ix.netcom.com> References: <03d4f71d-9d08-01bc-4227-8cf684197226@ix.netcom.com> Message-ID: Thanks for the answer! I almost thought that this was what?s going on. From looking at the Wikipedia page, https://en.wikipedia.org/wiki/Box-drawing_character, it seems that those three were not part of any larger legacy encoding. (The curved corners come from Acorn Computers by the way, acording to that article.) This leaves me wondering, where those three characters come from at all. 
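
As a quick check on the point about the DOS code pages, the three diagonals can be tested against them directly. This is a minimal sketch assuming Python's built-in `cp437` and `cp850` codecs and the standard `unicodedata` module; the helper name `in_code_page` is made up for illustration. It says nothing about the other legacy sets discussed in this thread.

```python
import unicodedata

DIAGONALS = "\u2571\u2572\u2573"


def in_code_page(ch, codec):
    """Return True if `ch` can be encoded in `codec` (hypothetical helper)."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for ch in DIAGONALS:
    support = {codec: in_code_page(ch, codec) for codec in ("cp437", "cp850")}
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), support)

# For comparison, an ordinary horizontal line from the same block.
print("U+2500", {codec: in_code_page("\u2500", codec) for codec in ("cp437", "cp850")})
```

The diagonals are absent from both code pages while U+2500 is present in both, which fits the view that they came from a different source than the DOS line-drawing sets.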
Manuel Am 01.04.23 um 01:06 schrieb Asmus Freytag via Unicode: > The easy answer is that these do not consist of a single set. For > example, the single-, double-line regular-stroke symbols and their > combinations, form a subset that is supported by the DOS code page > 437. Another common DOS code page (850) has only the single-line ones. > > Neither set contains any element that terminates in the middle of the > cell. > > Those, as well as the heavy stroke or mixed weight combinations are > presumably supported somewhere else, as are the curved corners. I > don't know off hand what the character sets are from which these were > derived, but again, I would not be surprised if they supported only a > subset. > > The diagonals, therefore, are not necessarily from any of those > subsets, and therefore likely never intended to be used to provide > diagonal connections. > > A./ > > > On 3/31/2023 3:46 PM, Manuel Strehl via Unicode wrote: >> Hi, >> >> if you look at the Box Drawing block, e.g., >> https://codepoints.net/box_drawing, every character goes through the >> middle of the edges of an imagined rectangle around the glyph. That >> is, apart from U+2571, U+2572 and U+2573, the diagonal lines. Those >> touch exclusively the corners of said rectangle. >> >> I fail to imagine how these three characters could ever attach to any >> of the other characters in this block. Are they not meant to do that >> or am I missing a trick here? >> >> Thanks for any pointers! >> >> Cheers, >> Manuel >> >> PS: This question was triggered by this reddit post: >> https://www.reddit.com/r/Unicode/comments/127y7dn/looking_for_box_drawing_characters/ > > From doug at ewellic.org Fri Mar 31 18:26:51 2023 From: doug at ewellic.org (Doug Ewell) Date: Fri, 31 Mar 2023 23:26:51 +0000 Subject: How do U+2571..U+2573 connect? In-Reply-To: References: <03d4f71d-9d08-01bc-4227-8cf684197226@ix.netcom.com> Message-ID: They are in at least the T.101-G2 set, used for teletext. 
?Doug Sent via the Samsung Galaxy S22 Ultra 5G, an AT&T 5G smartphone Get Outlook for Android ________________________________ From: Unicode on behalf of Manuel Strehl via Unicode Sent: Friday, March 31, 2023 5:23:08 PM To: unicode at corp.unicode.org Subject: Re: How do U+2571..U+2573 connect? Thanks for the answer! I almost thought that this was what?s going on. From looking at the Wikipedia page, https://en.wikipedia.org/wiki/Box-drawing_character, it seems that those three were not part of any larger legacy encoding. (The curved corners come from Acorn Computers by the way, acording to that article.) This leaves me wondering, where those three characters come from at all. Manuel Am 01.04.23 um 01:06 schrieb Asmus Freytag via Unicode: > The easy answer is that these do not consist of a single set. For > example, the single-, double-line regular-stroke symbols and their > combinations, form a subset that is supported by the DOS code page > 437. Another common DOS code page (850) has only the single-line ones. > > Neither set contains any element that terminates in the middle of the > cell. > > Those, as well as the heavy stroke or mixed weight combinations are > presumably supported somewhere else, as are the curved corners. I > don't know off hand what the character sets are from which these were > derived, but again, I would not be surprised if they supported only a > subset. > > The diagonals, therefore, are not necessarily from any of those > subsets, and therefore likely never intended to be used to provide > diagonal connections. > > A./ > > > On 3/31/2023 3:46 PM, Manuel Strehl via Unicode wrote: >> Hi, >> >> if you look at the Box Drawing block, e.g., >> https://codepoints.net/box_drawing, every character goes through the >> middle of the edges of an imagined rectangle around the glyph. That >> is, apart from U+2571, U+2572 and U+2573, the diagonal lines. Those >> touch exclusively the corners of said rectangle. 
>>
>> I fail to imagine how these three characters could ever attach to any
>> of the other characters in this block. Are they not meant to do that
>> or am I missing a trick here?
>>
>> Thanks for any pointers!
>>
>> Cheers,
>> Manuel
>>
>> PS: This question was triggered by this reddit post:
>> https://www.reddit.com/r/Unicode/comments/127y7dn/looking_for_box_drawing_characters/

From kent.b.karlsson at bahnhof.se Fri Mar 31 18:44:07 2023
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Sat, 1 Apr 2023 01:44:07 +0200
Subject: How do U+2571..U+2573 connect?
In-Reply-To:
References: <03d4f71d-9d08-01bc-4227-8cf684197226@ix.netcom.com>
Message-ID:

I don't see them in any of the G2 character sets in https://www.etsi.org/deliver/etsi_en/300700_300799/300706/01.02.01_60/en_300706v010201p.pdf (ETSI EN 300 706 V1.2.1 (2003-04) European Standard (Telecommunications series) Enhanced Teletext specification).

/Kent K

> On 1 Apr 2023, at 01:26, Doug Ewell via Unicode wrote:
>
> They are in at least the T.101-G2 set, used for teletext.
>
> --Doug

From beckiergb at gmail.com Fri Mar 31 19:39:10 2023
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Fri, 31 Mar 2023 17:39:10 -0700
Subject: How do U+2571..U+2573 connect?
In-Reply-To:
References: <03d4f71d-9d08-01bc-4227-8cf684197226@ix.netcom.com>
Message-ID:

These three box drawing diagonals appear in at least:

- Amstrad CPC
- Mattel Aquarius
- Atari 8-bit
- MSX
- PETSCII
- Kaypro
- Sharp MZ
- Ohio Scientific
- Robotron

See page 11 of:
https://www.unicode.org/L2/L2019/19025-aux-LegacyComputingSources.pdf

See page 5 of:
https://www.unicode.org/L2/L2021/21235-terminals-supplement-sources.pdf

They don't appear in Teletext.

As to how they came to be in Unicode originally, I don't know. Probably some IBM or DEC character set.

-- Rebecca Bettencourt
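As a rough illustration of the geometry discussed in this thread, the following Python sketch records where a few Box Drawing glyphs touch the boundary of their cell and checks whether two horizontally adjacent cells would meet. The touch-point table and the helper name connects_horizontally are assumptions made up for this example; Unicode does not define them. Only the character names come from the standard unicodedata module.

```python
import unicodedata

# Hypothetical model (not part of Unicode): the points at which each glyph
# touches the boundary of its cell. "N", "E", "S", "W" are edge midpoints;
# "NE", "SE", "SW", "NW" are corners.
TOUCH_POINTS = {
    "\u2500": {"W", "E"},                # light horizontal
    "\u2502": {"N", "S"},                # light vertical
    "\u253C": {"N", "E", "S", "W"},      # light vertical and horizontal
    "\u2571": {"NE", "SW"},              # diagonal upper right to lower left
    "\u2572": {"NW", "SE"},              # diagonal upper left to lower right
    "\u2573": {"NE", "SE", "SW", "NW"},  # diagonal cross
}

# A point on the right edge of the left cell coincides with the matching
# point on the left edge of the right cell.
HORIZONTAL_PAIRS = [("E", "W"), ("NE", "NW"), ("SE", "SW")]


def connects_horizontally(left, right):
    """Return True if `left` and `right` share a boundary point when
    `left` is placed immediately to the left of `right`."""
    return any(
        a in TOUCH_POINTS[left] and b in TOUCH_POINTS[right]
        for a, b in HORIZONTAL_PAIRS
    )


if __name__ == "__main__":
    for ch in ("\u2571", "\u2572", "\u2573"):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    # Under this model a horizontal line never meets a diagonal,
    # while two opposite diagonals meet at a shared corner.
    print(connects_horizontally("\u2500", "\u2571"))
    print(connects_horizontally("\u2571", "\u2572"))
```

Under these assumptions the diagonals only ever meet other corner-touching glyphs, which matches the observation that they cannot attach to the midpoint-based lines in the block.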