From aprilop at fn.de Fri Jul 1 07:15:16 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Fri, 01 Jul 2022 12:15:16 +0000
Subject: Different Bidirectional Character Types
Message-ID: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>

Reference:
https://unicode.org/reports/tr9/#Bidirectional_Character_Types

Why do Hebrew letters and Arabic letters have different
bidirectional character types?
Some effects can be seen using this HTML code:

רוקפורד 555-2368
روكفورد 555-2368
או 3−2=1
أو 3−2=1

Why do Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
have different bidirectional character types?
Some effects can be seen using this HTML code:

١٩٩٩ ١٢ ٣١ ١٩٩٩-١٢-٣١
۱۹۹۹ ۱۲ ۳۱ ۱۹۹۹-۱۲-۳۱
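The bidirectional character types asked about here can be looked up directly. What follows is a minimal sketch, assuming Python's standard-library unicodedata module (whose data follows the Unicode version bundled with the interpreter); the names SAMPLES, bidi_classes and CLASSES are illustrative only and do not come from the thread.

```python
import unicodedata

# One representative character per group named in the question.
# SAMPLES is an illustrative name, not something from the thread.
SAMPLES = {
    "Hebrew letter resh": "\u05e8",
    "Arabic letter reh": "\u0631",
    "ASCII digit zero": "0",
    "Arabic-Indic digit zero": "\u0660",
    "Persian (Extended Arabic-Indic) digit zero": "\u06f0",
}


def bidi_classes(samples):
    """Map each label to the short Bidi_Class name of its character."""
    return {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}


CLASSES = bidi_classes(SAMPLES)
for label, cls in CLASSES.items():
    print(f"{label}: {cls}")
```

The lookup only reports the property values (R, AL, EN, AN); it does not by itself explain why the values were assigned that way, which is the actual question.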

From aprilop at fn.de Fri Jul 1 07:36:40 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Fri, 01 Jul 2022 12:36:40 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
Message-ID: <5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>

I wrote:

> Why do Hebrew letters and Arabic letters have different
> bidirectional character types?
> Some effects can be seen using this HTML code:

Or visit
https://corp.unicode.org/pipermail/unicode/2022-July/010191.html

From asmusf at ix.netcom.com Fri Jul 1 11:05:46 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Fri, 1 Jul 2022 09:05:46 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>
Message-ID:

An HTML attachment was scrubbed...
URL:

From aprilop at fn.de Fri Jul 1 12:02:35 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Fri, 01 Jul 2022 17:02:35 +0000
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>
Message-ID:

On 1 July 2022, Asmus Freytag wrote:

>> Why do Hebrew letters and Arabic letters have different
>> bidirectional character types?
>> https://corp.unicode.org/pipermail/unicode/2022-July/010191.html
>
> If this is not explained in the text of UAX#9 can you point out
> where the explanation would need to be improved?

I cannot find an explanation *why* Hebrew and Arabic letters
should behave differently.
Why "555-2368" after Hebrew letters
but "2368-555" after Arabic letters?
Why "31-12-1999" with Arabic-Indic digits
but "1999-12-31" with Persian digits?
Why?
From haberg-1 at telia.com Sat Jul 2 04:01:00 2022
From: haberg-1 at telia.com (Hans Åberg)
Date: Sat, 2 Jul 2022 11:01:00 +0200
Subject: Different Bidirectional Character Types
In-Reply-To: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
Message-ID:

> On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode wrote:
>
> Reference:
> https://unicode.org/reports/tr9/#Bidirectional_Character_Types
>
> Why do Hebrew letters and Arabic letters have different
> bidirectional character types?

I cannot parse this, but in Hebrew, Arabic, and Persian, text is written RTL, but numbers LTR. For example, trying A123 in a translator supporting those scripts, I get:
?123 ? ??? ? ???

From richard.wordingham at ntlworld.com Sat Jul 2 04:54:46 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sat, 2 Jul 2022 10:54:46 +0100
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
Message-ID: <20220702105446.033065ab@JRWUBU2>

On Sat, 2 Jul 2022 11:01:00 +0200
Hans Åberg via Unicode wrote:

> > On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
> > wrote:
> >
> > Reference:
> > https://unicode.org/reports/tr9/#Bidirectional_Character_Types
> >
> > Why do Hebrew letters and Arabic letters have different
> > bidirectional character types?
>
> I cannot parse this, but in Hebrew, Arabic, and Persian, text is
> written RTL, but numbers LTR. For example, trying A123 in a
> translator supporting those scripts, I get: ?123 ? ???
> ? ???

For numbers, using natural language, you don't mean LTR, but 'with the most significant digit on the left'. It is a convention that, when encoding 'four and twenty' using digits, the most significant digit is stored first. N'ko decimal numbers have the most significant digit on the right, with the result that N'ko digits have bidi class Right_To_Left, as do N'ko letters.
As to parsing the question, at the literal level Hebrew letters have bidi class Right_To_Left (R) while Arabic letters have bidi class Arabic_Letter (AL); Moroccan decimal digits (e.g. U+0030) have bidi class European_Number (EN), Egyptian decimal digits have bidi class Arabic_Number (AN), Urdu decimal digits have bidi class European_Number (EN) and Hindi decimal digits (e.g. U+0966) have bidi class Left_To_Right (L). When one throws dollar signs, which have bidi class European_Terminator (ET), into the mix, these differences matter to the bidi algorithm.

Richard.

From eliz at gnu.org Sat Jul 2 05:13:53 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sat, 02 Jul 2022 13:13:53 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: <20220702105446.033065ab@JRWUBU2> (message from Richard Wordingham via Unicode on Sat, 2 Jul 2022 10:54:46 +0100)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2>
Message-ID: <835ykfddpa.fsf@gnu.org>

> Date: Sat, 2 Jul 2022 10:54:46 +0100
> From: Richard Wordingham via Unicode
>
> On Sat, 2 Jul 2022 11:01:00 +0200
> Hans Åberg via Unicode wrote:
>
> > > On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
> > > wrote:
> > >
> > > Reference:
> > > https://unicode.org/reports/tr9/#Bidirectional_Character_Types
> > >
> > > Why do Hebrew letters and Arabic letters have different
> > > bidirectional character types?
> >
> > I cannot parse this, but in Hebrew, Arabic, and Persian, text is
> > written RTL, but numbers LTR. For example, trying A123 in a
> > translator supporting those scripts, I get: ?123 ? ???
> > ? ???
>
> For numbers, using natural language, you don't mean LTR, but 'with the
> most significant digit on the left'. It is a convention that, when
> encoding 'four and twenty' using digits, the most significant digit is
> stored first.
> N'ko decimal numbers have the most significant digit on
> the right, with the result that N'ko digits have bidi class
> Right_To_Left, as do N'ko letters.
>
> As to parsing the question, at the literal level Hebrew letters have
> bidi class Right_To_Left (R) while Arabic letters have bidi class
> Arabic_Letter (AL); Moroccan decimal digits (e.g. U+0030) have bidi
> class European_Number (EN), Egyptian decimal digits have bidi class
> Arabic_Number (AN), Urdu decimal digits have bidi class European_Number
> (EN) and Hindi decimal digits (e.g. U+0966) have bidi class
> Left_To_Right (L). When one throws dollar signs, which have bidi
> class European_Terminator (ET), into the mix, these differences matter to
> the bidi algorithm.

I think a simpler answer is that Arabic letters (bidi class AL) in some cases make European Numbers (EN) behave like Arabic Numbers (AN); see rule W2 of UAX#9. And Arabic Numbers then affect how other "weak" characters are reordered, see W6. IOW, these distinctions are needed to produce the expected display order in each case.

From aprilop at fn.de Sat Jul 2 06:22:09 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Sat, 02 Jul 2022 11:22:09 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <835ykfddpa.fsf@gnu.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
Message-ID:

On 2 July 2022, Eli Zaretskii wrote:

> I think a simpler answer is that Arabic letters (bidi class AL) in
> some cases make European Numbers (EN) behave like Arabic Numbers (AN);
> see rule W2 of UAX#9. And Arabic Numbers then affect how other "weak"
> characters are reordered, see W6.

My question was: Why?

http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
displays the number "555-2368".

http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
displays the number "2368-555".

Why this difference?
And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
treated differently?

From eliz at gnu.org Sat Jul 2 06:56:29 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sat, 02 Jul 2022 14:56:29 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: (message from Andreas Prilop via Unicode on Sat, 02 Jul 2022 11:22:09 +0000)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
Message-ID: <83v8sfbudu.fsf@gnu.org>

> Date: Sat, 02 Jul 2022 11:22:09 +0000
> From: Andreas Prilop via Unicode
>
> On 2 July 2022, Eli Zaretskii wrote:
>
> > I think a simpler answer is that Arabic letters (bidi class AL) in
> > some cases make European Numbers (EN) behave like Arabic Numbers (AN);
> > see rule W2 of UAX#9. And Arabic Numbers then affect how other "weak"
> > characters are reordered, see W6.
>
> My question was: Why?
>
> http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
> displays the number "555-2368".
>
> http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
> displays the number "2368-555".
>
> Why this difference?
>
> And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
> treated differently?

Because the expected order on display is different. The expected order differs because the ways in which different scripts are written differ; the reasons are largely historical and cultural, AFAIK. IOW, the reasons for these differences are instrumental, not theoretical: we need the characters to behave differently when reordered.
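The mechanics behind rules W2 and W6 mentioned above can be traced with a small sketch. It is not a conforming UAX#9 implementation: it assumes a right-to-left paragraph containing only right-to-left letters, digits, ES/CS separators and neutrals (no left-to-right letters, combining marks, currency signs, explicit formatting characters or mirroring), and the function name visual_order_rtl and the constants are made up for illustration. Under those assumptions every leftover separator and neutral ends up as R, so the W3/W6/N1/N2/I2 steps collapse into one line.

```python
import unicodedata


def visual_order_rtl(text):
    """Return the characters of `text` in left-to-right display order,
    assuming an RTL paragraph that contains no left-to-right letters."""
    classes = [unicodedata.bidirectional(ch) for ch in text]
    if not set(classes) <= {"R", "AL", "EN", "AN", "ES", "CS", "ON", "WS"}:
        raise ValueError("sketch only handles RTL letters, digits and separators")

    # W2: an EN whose nearest preceding strong character is AL becomes AN.
    last_strong = "R"  # start-of-text type for an RTL paragraph
    for i, cls in enumerate(classes):
        if cls in ("R", "AL"):
            last_strong = cls
        elif cls == "EN" and last_strong == "AL":
            classes[i] = "AN"

    # W4: a single ES between two EN becomes EN; a single CS between two
    # numbers of the same type becomes that type.
    for i in range(1, len(classes) - 1):
        before, after = classes[i - 1], classes[i + 1]
        if classes[i] == "ES" and before == after == "EN":
            classes[i] = "EN"
        elif classes[i] == "CS" and before == after and before in ("EN", "AN"):
            classes[i] = before

    # W3, W6, N1/N2 and I2 combined: whatever is still a number sits at
    # level 2; letters, leftover separators and neutrals resolve to R (level 1).
    levels = [2 if cls in ("EN", "AN") else 1 for cls in classes]

    # L2: reverse each run at level 2, then reverse the whole line.
    chars = list(text)
    i = 0
    while i < len(chars):
        if levels[i] == 2:
            j = i
            while j < len(chars) and levels[j] == 2:
                j += 1
            chars[i:j] = reversed(chars[i:j])
            i = j
        else:
            i += 1
    return "".join(reversed(chars))


HEBREW = "\u05e8\u05d5\u05e7\u05e4\u05d5\u05e8\u05d3"  # the Hebrew word from the first message
ARABIC = "\u0631\u0648\u0643\u0641\u0648\u0631\u062f"  # the Arabic word from the first message

print(ascii(visual_order_rtl(HEBREW + " 555-2368")))
print(ascii(visual_order_rtl(ARABIC + " 555-2368")))
```

After the Hebrew word the hyphen sits between two EN runs, becomes EN itself (W4), and the whole number stays together as "555-2368". After the Arabic word the digits become AN (W2), the hyphen is left over as a neutral, and the two digit groups are laid out right to left, giving "2368-555".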
From haberg-1 at telia.com Sat Jul 2 14:46:52 2022
From: haberg-1 at telia.com (Hans Åberg)
Date: Sat, 2 Jul 2022 21:46:52 +0200
Subject: Different Bidirectional Character Types
In-Reply-To: <20220702105446.033065ab@JRWUBU2>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2>
Message-ID:

> On 2 Jul 2022, at 11:54, Richard Wordingham via Unicode wrote:
>
> On Sat, 2 Jul 2022 11:01:00 +0200
> Hans Åberg via Unicode wrote:
>
>>> On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
>>> wrote:
>>>
>>> Reference:
>>> https://unicode.org/reports/tr9/#Bidirectional_Character_Types
>>>
>>> Why do Hebrew letters and Arabic letters have different
>>> bidirectional character types?
>>
>> I cannot parse this, but in Hebrew, Arabic, and Persian, text is
>> written RTL, but numbers LTR. For example, trying A123 in a
>> translator supporting those scripts, I get: ?123 ? ???
>> ? ???
>
> For numbers, using natural language, you don't mean LTR, but 'with the
> most significant digit on the left'.

I asked some Arabic speakers how they think about it when writing numbers, and they said they do think of it as writing LTR, and not as RTL with changed endianness. In a file with RTL/LTR markers, the digits therefore get the same order. I assumed this is how Unicode represents it, but a clarification would be nice.

From textexin at xencraft.com Sat Jul 2 16:02:31 2022
From: textexin at xencraft.com (Tex)
Date: Sat, 2 Jul 2022 14:02:31 -0700
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
Message-ID: <001e01d88e57$0be510d0$23af3270$@xencraft.com>

On my Windows system, using either Chrome or Firefox, both links display the same for me. What setup are you using, Andreas?
tex

-----Original Message-----
From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Andreas Prilop via Unicode
Sent: Saturday, July 2, 2022 4:22 AM
To: unicode at corp.unicode.org
Subject: Re: Different Bidirectional Character Types

On 2 July 2022, Eli Zaretskii wrote:

> I think a simpler answer is that Arabic letters (bidi class AL) in
> some cases make European Numbers (EN) behave like Arabic Numbers (AN);
> see rule W2 of UAX#9. And Arabic Numbers then affect how other "weak"
> characters are reordered, see W6.

My question was: Why?

http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
displays the number "555-2368".

http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
displays the number "2368-555".

Why this difference?

And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
treated differently?

From eliz at gnu.org Sun Jul 3 00:04:17 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sun, 03 Jul 2022 08:04:17 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: (message from Hans Åberg via Unicode on Sat, 2 Jul 2022 21:46:52 +0200)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2>
Message-ID: <8335fibxda.fsf@gnu.org>

> Date: Sat, 2 Jul 2022 21:46:52 +0200
> Cc: unicode at corp.unicode.org
> From: Hans Åberg via Unicode
>
> > For numbers, using natural language, you don't mean LTR, but 'with the
> > most significant digit on the left'.
>
> I asked some Arab speaking how they think about it when writing numbers, and they said they indeed think about it as writing LTR, and not RTL with changed endianness. In a file with RTL/LTR markers, by this, the digits get the same order. I assumed this is how Unicode represents it, but it would be nice with clarification.

I think UAX#9 clarifies it perfectly: numbers are displayed in LTR order.
From aprilop at fn.de Sun Jul 3 00:51:36 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Sun, 03 Jul 2022 05:51:36 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <001e01d88e57$0be510d0$23af3270$@xencraft.com>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org> <001e01d88e57$0be510d0$23af3270$@xencraft.com>
Message-ID: <067FEF07-4060-42B8-AED7-8D71C32D627F@fn.de>

On 2 July 2022, Tex wrote:

>> http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
>> displays the number "555-2368".
>>
>> http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
>> displays the number "2368-555".
>
> On my windows system, using either chrome or firefox, both links display the same for me.

Sorry for the confusion. Not the link itself, but the results, the found pages.
I search for "555-2368".
With Hebrew letters, the display is "555-2368".
With Arabic letters, the display is "2368-555".
Look at the results.

And my other question

>> And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
>> treated differently?

I write "1999-12-31".
The display is "1999-12-31" with Persian digits.
The display is "31-12-1999" with Arabic-Indic digits.
https://corp.unicode.org/pipermail/unicode/2022-July/010191.html From richard.wordingham at ntlworld.com Sun Jul 3 04:13:08 2022 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Sun, 3 Jul 2022 10:13:08 +0100 Subject: Different Bidirectional Character Types In-Reply-To: <8335fibxda.fsf@gnu.org> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> Message-ID: <20220703101308.55a36bf0@JRWUBU2> On Sun, 03 Jul 2022 08:04:17 +0300 Eli Zaretskii via Unicode wrote: > > Date: Sat, 2 Jul 2022 21:46:52 +0200 > > Cc: unicode at corp.unicode.org > > From: Hans ?berg via Unicode > > > > > For numbers, using natural language, you don't mean LTR, but > > > 'with the most significant digit on the left'. > > > > I asked some Arab speaking how they think about it when writing > > numbers, and they said they indeed think about it as writing LTR, > > and not RTL with changed endianness. In a file with RTL/LTR > > markers, by this, the digits get the same order. I assumed this is > > how Unicode represents it, but it would be nice with clarification. > > > > I thin UAX#9 clarifies it perfectly: numbers are displayed in LTR > order. But Hans is forwarding an answer as to which digit comes first when divorced from computers. The order of writing can in general be quite variable. For example, although the ordering vowel then tone is widely taught in Thailand, at least for vertical stacks, I've seen evidence of people trying to write the marks in a Tai Tham stack . (The marks were TONE-1, MAI KANG and then SIGN OA BELOW. Many people want to write the last of these , seventy years ago they would have said the consonant first; nowadays, they usually say the preposed vowel first. The fine details of the old scheme seem to be lost. Richard. 
From aprilop at fn.de Sun Jul 3 04:20:07 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Sun, 03 Jul 2022 09:20:07 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <8335fibxda.fsf@gnu.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org>
Message-ID: <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>

On 3 July 2022, Eli Zaretskii wrote:

> I think UAX#9 clarifies it perfectly: numbers are displayed in LTR order.

This is undisputed.
I ask about the differences

"555-2368" vs. "2368-555"

"1=3−2" vs. "1=2−3"

"1999-12-31" vs. "31-12-1999"

The Bidirectional Algorithm is responsible for these differences. But why?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From eliz at gnu.org Sun Jul 3 04:42:39 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sun, 03 Jul 2022 12:42:39 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> (message from Andreas Prilop via Unicode on Sun, 03 Jul 2022 09:20:07 +0000)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
Message-ID: <83tu7ya5ww.fsf@gnu.org>

> Date: Sun, 03 Jul 2022 09:20:07 +0000
> From: Andreas Prilop via Unicode
>
> I ask about the differences
>
> "555-2368" vs. "2368-555"
>
> "1=3−2" vs. "1=2−3"
>
> "1999-12-31" vs. "31-12-1999"
>
> The Bidirectional Algorithm is responsible for these differences. But why?

Because that's how the users of each script want the text to be displayed in these cases. The UBA was specified as it is to satisfy the expectations of the users of the respective scripts. Those expectations have to do with history, traditions, and culture.

And please note that your cases are no longer just numbers: they involve the dash ('-'), which is a "weak" character, and its reordering for display depends on the surrounding text.
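The point about the dash being a "weak" character can be checked against the character properties. A minimal sketch, again assuming Python's standard-library unicodedata module; the helper name classify is illustrative. It lists the Bidi_Class of the separators that occur in the examples of this thread (hyphen-minus, minus sign, equals sign, space) plus two common date and time separators.

```python
import unicodedata


def classify(text):
    """Return (character name, Bidi_Class) pairs for every character in text."""
    return [(unicodedata.name(ch), unicodedata.bidirectional(ch)) for ch in text]


# Hyphen-minus, minus sign, equals sign, solidus, colon, space.
for name, cls in classify("-\u2212=/: "):
    print(f"{cls:>3}  {name}")
```

ES (European separator) and CS (common separator) are weak types whose final direction depends on the numbers around them, while ON and WS are neutrals that take their direction from the surrounding text.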
From textexin at xencraft.com Sun Jul 3 15:36:27 2022
From: textexin at xencraft.com (Tex)
Date: Sun, 3 Jul 2022 13:36:27 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <067FEF07-4060-42B8-AED7-8D71C32D627F@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org> <001e01d88e57$0be510d0$23af3270$@xencraft.com> <067FEF07-4060-42B8-AED7-8D71C32D627F@fn.de>
Message-ID: <003901d88f1c$925bdde0$b71399a0$@xencraft.com>

I understood you meant the results. Perhaps we are seeing different results. Mine consistently show "555-2368". If you want, I can send you screenshots.

Ah, OK: I moved the number after the Arabic text in the search string, and then the search flips the numbers around the hyphen. With Hebrew ahead of the number it does not.

tex

-----Original Message-----
From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Andreas Prilop via Unicode
Sent: Saturday, July 2, 2022 10:52 PM
To: unicode at corp.unicode.org
Subject: Re: Different Bidirectional Character Types

On 2 July 2022, Tex wrote:

>> http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
>> displays the number "555-2368".
>>
>> http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
>> displays the number "2368-555".
>
> On my windows system, using either chrome or firefox, both links display the same for me.

Sorry for the confusion. Not the link itself, but the results, the found pages.
I search for "555-2368".
With Hebrew letters, the display is "555-2368".
With Arabic letters, the display is "2368-555".
Look at the results.

And my other question

>> And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
>> treated differently?

I write "1999-12-31".
The display is "1999-12-31" with Persian digits.
The display is "31-12-1999" with Arabic-Indic digits.
https://corp.unicode.org/pipermail/unicode/2022-July/010191.html From asmusf at ix.netcom.com Tue Jul 5 20:44:54 2022 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Tue, 5 Jul 2022 18:44:54 -0700 Subject: Different Bidirectional Character Types In-Reply-To: <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> Message-ID: <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> An HTML attachment was scrubbed... URL: From ishida at w3.org Mon Jul 11 05:39:27 2022 From: ishida at w3.org (r12a) Date: Mon, 11 Jul 2022 11:39:27 +0100 Subject: Different Bidirectional Character Types In-Reply-To: <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> Message-ID: Does this help clarify the original question? Modern Standard Arabic: Expressions & sequences https://r12a.github.io/scripts/arabic/arb.html#expressions ?see also https://r12a.github.io/scripts/arabic/block.html#ar061C ri Asmus Freytag via Unicode wrote on 06/07/2022 02:44: > On 7/3/2022 2:20 AM, Andreas Prilop via Unicode wrote: >> On 3 July 2022, Eli Zaretskii wrote: >> >>> I thin UAX#9 clarifies it perfectly: numbers are displayed in LTR order. >> This is undisputed. >> I ask about the differences >> >> ?555-2368? vs. ?2368-555? >> >> ?1=3?2? vs. ?1=2?3? >> >> ?1999-12-31? vs. ?31-12-1999? >> >> The Bidirectional Algorithm is responsible for these differences. But why? > > The real answer is that this matches differences in displaying lists > of numbers (!) not order of digits, in Hebrew vs. Arabic. 
> > The Bidi algorithm uses the classes AL and AN (and rules that resolve > them) to implement these inherent differences in the way the various > scripts handle such cases (multiple groups of digits separated by punct). > > As I mentioned, I raised a public review issue to make sure that UAX#9 > either *specifically and explicitly* cites or, alternatively, > incorporates language that explains scripts have different preferences > in resolving groups of numbers (not: digits) and points in a high > level to where in the spec these preferences are addressed. > > I agree, it's not enough to reverse engineer the algorithm and > conclude that it behaves as specd. It should be a simple matter to > understand why it was designed the way it was. > > A./ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmusf at ix.netcom.com Mon Jul 11 12:34:49 2022 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Mon, 11 Jul 2022 10:34:49 -0700 Subject: Different Bidirectional Character Types In-Reply-To: References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> Message-ID: <9b381ef2-6e43-5488-ba26-eeae1f6ad7aa@ix.netcom.com> An HTML attachment was scrubbed... 
URL:

From aprilop at fn.de Mon Jul 11 14:07:19 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Mon, 11 Jul 2022 19:07:19 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <9b381ef2-6e43-5488-ba26-eeae1f6ad7aa@ix.netcom.com>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <9b381ef2-6e43-5488-ba26-eeae1f6ad7aa@ix.netcom.com>
Message-ID: <7C607CF2-7DBD-4D05-B47A-59A31301A29F@fn.de>

On 11 July 2022, Asmus Freytag wrote:

>> https://r12a.github.io/scripts/arabic/arb.html#expressions
>> https://r12a.github.io/scripts/arabic/block.html#ar061C
>
> I think these are excellent summaries and we should make sure
> we include a high-level version of this in the intro to UAX#9
> so that readers at least know what types of issues the algorithm
> tries to address.

I agree. Thank you very much for these links!
They are indeed very helpful.

From richard.wordingham at ntlworld.com Mon Jul 11 20:39:47 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Tue, 12 Jul 2022 02:39:47 +0100
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
Message-ID: <20220712023947.157285c9@JRWUBU2>

On Mon, 11 Jul 2022 11:39:27 +0100
r12a via Unicode wrote:

> Does this help clarify the original question?
>
> Modern Standard Arabic: Expressions & sequences
> https://r12a.github.io/scripts/arabic/arb.html#expressions

It gives an inkling. However, I don't understand, "The underlying order of characters, and the typing order remain the same." The text of Figures 4 and 5 has to differ by more than the language tagging.
Is this a corruption of an example which had (Near Eastern) Arabic numerals for Figure 4 and Eastern Arabic numerals for Figure 5?

Supporting quotations would help, as this example looks weird. Are Persian number ranges calqued from European languages?

Richard.

From ishida at w3.org Tue Jul 12 01:27:59 2022
From: ishida at w3.org (r12a)
Date: Tue, 12 Jul 2022 07:27:59 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <20220712023947.157285c9@JRWUBU2>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2>
Message-ID: <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>

Richard Wordingham via Unicode wrote on 12/07/2022 02:39:

> On Mon, 11 Jul 2022 11:39:27 +0100
> r12a via Unicode wrote:
>
>> Does this help clarify the original question?
>>
>> Modern Standard Arabic: Expressions & sequences
>> https://r12a.github.io/scripts/arabic/arb.html#expressions
> It gives an inkling. However, I don't understand, "The
> underlying order of characters, and the typing order remain the same."
> The text of Figures 4 and 5 has to differ by more than the language
> tagging. Is this a corruption of an example which had (Near Eastern)
> Arabic numerals for Figure 4 and Eastern Arabic numerals for Figure 5?
>
> Supporting quotations would help, as this example looks weird. Are
> Persian number ranges calqued from European languages?

To make it clearer that this is just about the order of the displayed text, I changed the sentence you mentioned to "The underlying order of the digits...". The difference is actually produced in this case by the addition of an LRM to the Persian, and not by the language setting. If you click on the image you'll see the characters that make up each example.

ri
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From aprilop at fn.de Tue Jul 12 10:48:31 2022 From: aprilop at fn.de (Andreas Prilop) Date: Tue, 12 Jul 2022 15:48:31 +0000 Subject: Different Bidirectional Character Types In-Reply-To: <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> Message-ID: On 12 July 2022, r12a wrote: >>> https://r12a.github.io/scripts/arabic/arb.html#expressions > > The difference is actually produced in this case by the addition > of an LRM to the Persian, and not by the language setting. It is still Arabic. The Arabic word ?? needs to be translated to ??. And the month should be spelled ????. From ishida at w3.org Tue Jul 12 11:08:48 2022 From: ishida at w3.org (r12a) Date: Tue, 12 Jul 2022 17:08:48 +0100 Subject: Different Bidirectional Character Types In-Reply-To: References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> Message-ID: <1364573f-44f6-c393-8366-8cedb0e695d4@w3.org> Erk. Thanks for pointing that out. Should be fixed now. ri Andreas Prilop via Unicode wrote on 12/07/2022 16:48: > On 12 July 2022, r12a wrote: > >>>> https://r12a.github.io/scripts/arabic/arb.html#expressions >> The difference is actually produced in this case by the addition >> of an LRM to the Persian, and not by the language setting. > It is still Arabic. The Arabic word ?? needs to be translated to ??. > And the month should be spelled ????. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aprilop at fn.de Tue Jul 12 11:32:32 2022 From: aprilop at fn.de (Andreas Prilop) Date: Tue, 12 Jul 2022 16:32:32 +0000 Subject: Different Bidirectional Character Types In-Reply-To: References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> Message-ID: On 12 July 2022, I wrote: > And the month should be spelled ????. This applies to both Arabic and Persian. From asmusf at ix.netcom.com Tue Jul 12 20:49:08 2022 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Tue, 12 Jul 2022 18:49:08 -0700 Subject: Different Bidirectional Character Types In-Reply-To: <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> Message-ID: <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> An HTML attachment was scrubbed... URL: From ishida at w3.org Wed Jul 13 04:58:22 2022 From: ishida at w3.org (r12a) Date: Wed, 13 Jul 2022 10:58:22 +0100 Subject: Different Bidirectional Character Types In-Reply-To: <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> Message-ID: <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org> The approach differs not only by script and which digits are used, but also by language.? 
Arabic and Persian both use the Arabic script, but do things differently when it comes to ordering components of a range or expression.

Also, we should probably mention that some scripts don't display simple numbers like Arabic/Hebrew, either. For example, in Adlam & N'Ko and various historical scripts numbers have the most-significant digit on the right.

ri

Asmus Freytag via Unicode wrote on 13/07/2022 02:49:

>
> I suggest we add something like the following to the Bidi FAQ:
>
> Q: Do modern bidirectional scripts all behave the same?
>
> While Arabic and Hebrew agree on the same ordering of digits, with the
> most-significant digit on the left, the layout of entire numbers in
> context, including groups of numbers or use of number separators,
> numerical and other punctuation differs both by script and, in the
> case of Arabic, by which set of digits is used. No matter how the
> layout is resolved the order of characters in memory essentially
> follows the order they are typed.
>
> Here are some papers that explore this in-depth with examples:
> https://r12a.github.io/scripts/arabic/arb.html#expressions
> https://r12a.github.io/scripts/arabic/block.html#ar061C
>
-------------- next part --------------
An HTML attachment was scrubbed...
From asmusf at ix.netcom.com Wed Jul 13 09:43:02 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Wed, 13 Jul 2022 07:43:02 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
Message-ID:

On 7/13/2022 2:58 AM, r12a wrote:
> The approach differs not only by script and which digits are used, but
> also by language. Arabic and Persian both use the Arabic script, but
> do things differently when it comes to ordering components of a range
> or expression.

Isn't that difference handled by having two different sets of digits? As opposed to relying on a language tag.

A./

>
> Also, we should probably mention that some scripts don't display
> simple numbers like Arabic/Hebrew, either. For example, in Adlam &
> N'Ko and various historical scripts numbers have the most-significant
> digit on the right.
>
> ri
>
> Asmus Freytag via Unicode wrote on 13/07/2022 02:49:
>>
>> I suggest we add something like the following to the Bidi FAQ:
>>
>> Q: Do modern bidirectional scripts all behave the same?
>>
>> While Arabic and Hebrew agree on the same ordering of digits, with
>> the most-significant digit on the left, the layout of entire numbers
>> in context, including groups of numbers or use of number separators,
>> numerical and other punctuation differs both by script and, in the
>> case of Arabic, by which set of digits is used. No matter how the
>> layout is resolved the order of characters in memory essentially
>> follows the order they are typed.
>> >> Here are some papers that explore this in-depth with examples: >> https://r12a.github.io/scripts/arabic/arb.html#expressions >> https://r12a.github.io/scripts/arabic/block.html#ar061C >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ishida at w3.org Wed Jul 13 09:51:29 2022 From: ishida at w3.org (r12a) Date: Wed, 13 Jul 2022 15:51:29 +0100 Subject: Different Bidirectional Character Types In-Reply-To: References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org> Message-ID: Asmus Freytag wrote on 13/07/2022 15:43: > On 7/13/2022 2:58 AM, r12a wrote: >> The approach differs not only by script and which digits are used, >> but also by language. Arabic and Persian both use the Arabic script, >> but do things differently when it comes to ordering components of a >> range or expression. > > Isn't that difference handled by having two different sets of digits? > As opposed to relying on a language tag. > See the cases in figs. 3 and 4 at https://r12a.github.io/scripts/arabic/arb.html#expressions.? Same digits, different expectations about directionality. I wasn't talking about language tags or behaviour arising from character properties (indeed the language tag doesn't make a difference) ? i was talking about the user expectations differing from language to language about the order in which digits appear in the text. hth ri -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From asmusf at ix.netcom.com Wed Jul 13 10:03:07 2022 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Wed, 13 Jul 2022 08:03:07 -0700 Subject: Different Bidirectional Character Types In-Reply-To: References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org> Message-ID: An HTML attachment was scrubbed... URL: From richard.wordingham at ntlworld.com Wed Jul 13 14:44:46 2022 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Wed, 13 Jul 2022 20:44:46 +0100 Subject: Different Bidirectional Character Types In-Reply-To: References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org> Message-ID: <20220713204446.29499a36@JRWUBU2> On Wed, 13 Jul 2022 08:03:07 -0700 Asmus Freytag via Unicode wrote: > On 7/13/2022 7:51 AM, r12a via Unicode wrote: > Asmus Freytag wrote on 13/07/2022 15:43: > > On 7/13/2022 2:58 AM, r12a wrote: > >> The approach differs not only by script and which digits are used, > >> but also by language.? Arabic and Persian both use the Arabic > >> script, but do things differently when it comes to ordering > >> components of a range or expression. > >>> > >> Isn't that difference handled by having two different sets of > >> digits? As opposed to relying on a language tag. > >> > > See the cases in figs. 3 and 4 at > > https://r12a.github.io/scripts/arabic/arb.html#expressions.? 
Same
> > digits, different expectations about directionality.
> >
> > I wasn't talking about language tags or behaviour arising from
> > character properties (indeed the language tag doesn't make a
> > difference) - i was talking about the user expectations differing
> > from language to language about the order in which digits appear in
> > the text.
> >
> If I understand correctly, this would be a case that's not handled by
> the UBA, then. Would that be worth calling out, you think?

And to answer the original question, it would be good to start with the user expectations, and then explain how the UBA reduces (does it?) the jiggery pokery required of the typist to get the desired outcome. In particular, we seem to be exploiting a difference in glyph styles and promoting it to a character difference to get a left-to-right ordering.

Richard.

From richard.wordingham at ntlworld.com Wed Jul 13 14:51:08 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Wed, 13 Jul 2022 20:51:08 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
Message-ID: <20220713205108.11c29995@JRWUBU2>

On Wed, 13 Jul 2022 10:58:22 +0100 r12a via Unicode wrote:
> The approach differs not only by script and which digits are used,
> but also by language. Arabic and Persian both use the Arabic script,
> but do things differently when it comes to ordering components of a
> range or expression.
>
> Also, we should probably mention that some scripts don't display
> simple numbers like Arabic/Hebrew, either.
For example, in Adlam & > N'Ko and various historical scripts numbers have the most-significant > digit on the right. What are these historical scripts with the most significant digit on the right? Richard. From ishida at w3.org Thu Jul 14 00:49:06 2022 From: ishida at w3.org (r12a) Date: Thu, 14 Jul 2022 06:49:06 +0100 Subject: Different Bidirectional Character Types In-Reply-To: <20220713205108.11c29995@JRWUBU2> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org> <20220713205108.11c29995@JRWUBU2> Message-ID: <87bfec58-acd0-3247-28a7-0ebda3573577@w3.org> Richard Wordingham via Unicode wrote on 13/07/2022 20:51: > What are these historical scripts with the most significant digit on > the right? Go to https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=[:Bidi_Class=Right_To_Left:] and find sections that contain "Numbers". The list includes Imperial Aramaic, Palmyrene, Nabataean, etc... ri -------------- next part -------------- An HTML attachment was scrubbed... 
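
The set r12a points to can also be inspected locally. The following is a minimal sketch, assuming Python's standard `unicodedata` module with data recent enough to include these scripts. The three code points in `candidates` are illustrative picks made here (one number character from each of the three scripts named above) and do not come from the thread; `describe` is likewise a hypothetical helper.

```python
import unicodedata

# Illustrative picks (not from the thread): one number character from
# Imperial Aramaic, Palmyrene and Nabataean respectively.
candidates = [0x10858, 0x10879, 0x108A7]


def describe(cp: int) -> tuple:
    """Return (name, General_Category, Bidi_Class) for a code point."""
    ch = chr(cp)
    return (
        unicodedata.name(ch, "<no name in this Unicode version>"),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


for cp in candidates:
    name, category, bidi = describe(cp)
    print(f"U+{cp:04X} {name}: General_Category={category}, Bidi_Class={bidi}")
```

Printing General_Category next to Bidi_Class shows at a glance whether a right-to-left number character is classified as a decimal digit (Nd) or as another kind of number.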
URL: From richard.wordingham at ntlworld.com Thu Jul 14 15:59:08 2022 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Thu, 14 Jul 2022 21:59:08 +0100 Subject: Different Bidirectional Character Types In-Reply-To: <87bfec58-acd0-3247-28a7-0ebda3573577@w3.org> References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de> <20220702105446.033065ab@JRWUBU2> <8335fibxda.fsf@gnu.org> <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com> <20220712023947.157285c9@JRWUBU2> <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org> <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com> <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org> <20220713205108.11c29995@JRWUBU2> <87bfec58-acd0-3247-28a7-0ebda3573577@w3.org> Message-ID: <20220714215908.5e920b8e@JRWUBU2> On Thu, 14 Jul 2022 06:49:06 +0100 r12a via Unicode wrote: > Richard Wordingham via Unicode wrote on 13/07/2022 20:51: > > What are these historical scripts with the most significant digit on > > the right? > > Go to > https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=[:Bidi_Class=Right_To_Left:] > and find sections that contain "Numbers". > > The list includes Imperial Aramaic, Palmyrene, Nabataean, etc... But I see no ancient digits, and very much nothing that TUS calls a digit. The Mende Kikakui 'digits' would qualify, except that they're a 20th century system. Richard. From markus.icu at gmail.com Mon Jul 18 14:18:31 2022 From: markus.icu at gmail.com (Markus Scherer) Date: Mon, 18 Jul 2022 12:18:31 -0700 Subject: Unqualified vs. minimally-qualified emoji In-Reply-To: <08b3fded-7ae3-81a8-c223-2a878d53d929@gmx.de> References: <08b3fded-7ae3-81a8-c223-2a878d53d929@gmx.de> Message-ID: Dear Matthias, On Wed, Apr 6, 2022 at 10:02 PM Matthias Reitinger via Unicode < unicode at corp.unicode.org> wrote: > ... 
> > With this definitions I would expect the code point sequence > > 1F441 FE0F 200D 1F5E8 > (EYE, VARIATION SELECTOR-16, ZERO WIDTH JOINER, LEFT SPEECH BUBBLE) > > to be a minimally-qualified emoji: > > ... > > However, emoji-test.txt [2] lists this sequence as "unqualified". > > Can someone please explain why? Did I misinterpret the definitions, or is > this > an error in the emoji-test.txt file? > Did you get an answer to your question? If not, then you could try to submit a bug report: https://www.unicode.org/reporting.html "Report Error in Publication/Data" Best regards, markus -------------- next part -------------- An HTML attachment was scrubbed... URL: From gtbot2007 at gmail.com Sat Jul 23 07:57:07 2022 From: gtbot2007 at gmail.com (Gabriel Tellez) Date: Sat, 23 Jul 2022 08:57:07 -0400 Subject: Hoefler Text Ornaments Message-ID: I don't understand why Wingdings/Webdings and Zapf Dingbats get to be in Unicode but not Hoefler Text Ornaments. (Not going to ask about Apple Symbols because that's a icon font not a dingbat font) -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.wordingham at ntlworld.com Sat Jul 23 11:12:44 2022 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Sat, 23 Jul 2022 17:12:44 +0100 Subject: Tai Tham Text Encoding Message-ID: <20220723171244.7fb392af@JRWUBU2> Most characters for writing words in the Tai Tham script in normal texts have been encoded, though there are a few exceptions, of which TAI THAM LETTER LAO LOW HA is the most prominent exception. (This is mostly handled by repurposing TAI THAM LETTER LOW HA, which is not used in Lao. Their relationship is like U+11034 BRAHMI LETTER LLA and U+11075 BRAHMI LETTER OLD LETTER LLA.) On close reading of the TUS, perhaps we also need to disunify U+1A58 TAI THAM SIGN MAI KANG LAI depending on how it may be positioned relative to a following syllable with a preposed vowel. 
(It was originally proposed as two separate characters, distinguished by shape rather than positioning.) We may need some monstrosities such as 'INVISIBLE MAI SAM' (though I'd rather use CGJ).

However, I am having a hard time persuading people that there is a defined encoding for combinations of characters that rendering engines should respect. What I regard as the basic definition of the encoding of text is contained in the approved proposals, rather than in TUS or any emanation thereof. What should I call the specification of the encoding of text, as opposed to the encoding of characters? Would it be suitable to refer to it as 'text encoding'?

I am trying to work out what in the way of Tai Tham text encoding is laid down by the TUS and its emanations, such as the Unicode Character Database. It is significant that the Indic syllabic category is informative and by policy does not reflect sequencing requirements. What I am left with is the general properties of marks, the principle of canonical equivalence (which is still widely flouted) and the specific text in the Tai Tham section.

Now, extracting specifications is a bit tricky. For example, consider "*Tone Marks*. Tai Tham has two combining tone marks, U+1A75 tai tham sign tone-1 and U+1A76 tai tham sign tone-2, which are used in Tai Lue and in Northern Thai. These are rendered above the vowel over the base consonant." In modern Tai Khuen, what I take to be TONE-1 is rendered to the right of the larger vowels over the base consonant, such as VOWEL SIGN I. Should I therefore conclude that what I have taken to be TONE-1 is something else? That would be ridiculous. We also have the statement in TUS Section 2.11 that "all sequences of character codes are permitted".

I think I can extract some meaning from the text in the same section: "Tone marks are represented in logical order following the vowel over the base consonant or consonant stack.
If there is no vowel over a base consonant, then the tone is rendered directly over the consonant; this is the same way tones are treated in the Thai script." Consider the word ?????? in a typical Northern Thai style. The central stack, from top to bottom, is TONE-1, SIGN I, HIGH KA, SIGN OA BELOW. If there were 'no vowel over the base consonant', then TONE-1 would be rendered directly over the base consonant, which is not how it is written. Therefore the term 'vowel' refers to a vowel character rather than a complete phonetic vowel. Therefore the logical order of the marks above and below is either , as in the proposals, or . The USE insists on ! (The USE order could be corrected by its override method.) By contrast, there is some useful text on the position of U+1A7B TAI THAM SIGN MAI SAM in character code sequences. In summary, my main two questions are: Is 'encoding of text' the correct phrase for the definition of the correct arrangement? Is it appropriate to submit a proposal for the standardisation of Tai Tham text encoding? Richard. From beckiergb at gmail.com Sat Jul 23 14:04:20 2022 From: beckiergb at gmail.com (Rebecca Bettencourt) Date: Sat, 23 Jul 2022 12:04:20 -0700 Subject: Hoefler Text Ornaments In-Reply-To: References: Message-ID: Because Apple has more sense than Microsoft and decided their dingbat fonts don't need to be in Unicode. Someone back in 2011 collected all the glyphs from Apple's dingbat fonts: http://unicode.org/wg2/docs/n4127.pdf And Apple provided a response: http://unicode.org/L2/L2011/11309-apple-resp-n4127.pdf -- Rebecca Bettencourt On Sat, Jul 23, 2022 at 6:06 AM Gabriel Tellez via Unicode < unicode at corp.unicode.org> wrote: > I don't understand why Wingdings/Webdings and Zapf Dingbats get to be in > Unicode but not Hoefler Text Ornaments. (Not going to ask about Apple > Symbols because that's a icon font not a dingbat font) > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jameskass at code2001.com Sat Jul 23 17:07:17 2022 From: jameskass at code2001.com (James Kass) Date: Sat, 23 Jul 2022 22:07:17 +0000 Subject: Hoefler Text Ornaments In-Reply-To: References: Message-ID: In 11309-apple-resp-n4127, John H. Jenkins wrote, "Apple feels that, absent evidence of widespread use, dingbats and similar glyphs are not suitable for general-purpose encoding." and "Apple feels that, in general, characters should be encoded in the Universal Character Set only on the basis of demonstrated need for general text interchange." In N4127, Karl Pentzlin noted that no effort was made to determine unification with existing characters, even in cases where unification was obvious.? For example, Hoefler Glyph 57 "ORN-FLEURDELIS" is shown in N4127 with a pointer to U+269C (?).? So some of the Hoefler ornaments are already exchangeable in Unicode. Apple didn't forbid future encoding of Hoefler ornaments, but rather keeps the existing bar of demonstrable usage in place. Any proposal to complete the Hoefler repertoire in Unicode would need to carefully examine unification and then show that plain-text interchange is necessary. On 2022-07-23 7:04 PM, Rebecca Bettencourt via Unicode wrote: > Because Apple has more sense than Microsoft and decided their dingbat fonts > don't need to be in Unicode. > > Someone back in 2011 collected all the glyphs from Apple's dingbat fonts: > http://unicode.org/wg2/docs/n4127.pdf > > And Apple provided a response: > http://unicode.org/L2/L2011/11309-apple-resp-n4127.pdf > > -- Rebecca Bettencourt > > > On Sat, Jul 23, 2022 at 6:06 AM Gabriel Tellez via Unicode < > unicode at corp.unicode.org> wrote: > >> I don't understand why Wingdings/Webdings and Zapf Dingbats get to be in >> Unicode but not Hoefler Text Ornaments. 
(Not going to ask about Apple >> Symbols because that's a icon font not a dingbat font) >> From ivanpan3 at gmail.com Sat Jul 23 18:07:32 2022 From: ivanpan3 at gmail.com (Ivan Panchenko) Date: Sun, 24 Jul 2022 01:07:32 +0200 Subject: =?UTF-8?B?Q2hhbmdlIOKAnFJlbGF0aW9u4oCdIHRvIOKAnExvZ2ljYWwgb3BlcmF0b3LigJ06IFUrMg==?= =?UTF-8?B?MjYzIChTVFJJQ1RMWSBFUVVJVkFMRU5UIFRPKQ==?= Message-ID: The character U+2263 (? STRICTLY EQUIVALENT TO) is found under the subhead ?Relations?. I think it would be more appropriate to put it under ?Logical operator? (for comparison: U+2227) because it stands for a connective in modal logic: ? is strictly equivalent to ? if ? necessarily implies ? and ? necessarily implies ?. Source: Fitch (1952, p. 77). https://books.google.com/books?id=a3wIAQAAIAAJ&q=%22strictly+equivalent%22 One might object that ?is strictly equivalent to? (as opposed to ?necessarily if and only if?) is used in metalanguage for a relation between logical formulas (? use?mention distinction). However, this is not what the symbol ??? itself actually means, it is just that an alternative to saying ?if and only if? is to say ?is equivalent to? and mention (rather than use) the linked logical formulas. Likewise, one might read ?? ? ?? either as ?if ? then ?? or as ?? (materially) implies ??. This does not change the fact that ??? and ??? are symbols of the logical OBJECT language. (As a side note, usage of the triple bar ??? and of ?identity? in mathematics is convoluted: In ordinary language, two distinct things might be said to be ?equal? when they are equal in a certain respect (e.g., ?sexual equality?). In mathematics, ?equals? (=) is simply used in the sense of strict identity rather than for equivalence relations or congruence relations in general, though convention has it that the equals sign is more often read as ?equals? or ?is equal to? than ?is identical to?, and ?(solving an) equation? is used while ?identity? occurs in ?identity function? 
and what is expressed by a statement of equality can be called an identity (e.g., ?Euler?s identity?). As described so far, there is no actual difference between ?is equal to? and ?is identical to? at all, however, it seems that because we are only justified in proclaiming that an identity holds if the statement is generally valid, this usage of ?identity? got CORRUPTED into saying things like ?This equation is an identity? (meaning that the equation holds for all values) and ?is identically equal to? (?); you can even find a few Google hits for ?identically less?/?identically greater?. ? Besides, ??? is used for equivalence relations and for the logical equivalence connective. When John Conway (in ?On Numbers and Games?) used ??? for identity (expressing that two objects are one and the same object) and ?=? for equality in a weaker sense than described above for mathematics, he might have been influenced by the fact that ??? is sometimes read as ?is identical(ly equal) to?, even though this so-called ?identity? is something different from Conway?s identity altogether. Donald Knuth used the symbols the other way round, which I like better.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From harjitmoe at outlook.com Sun Jul 24 04:04:27 2022 From: harjitmoe at outlook.com (Harriet Riddle) Date: Sun, 24 Jul 2022 10:04:27 +0100 Subject: Hoefler Text Ornaments In-Reply-To: References: Message-ID: For further reference: the Nishiki-teki PUA scheme includes several ornaments from the Hoefler, Bodoni and Caslon sets (). 
These are not all created equal: for example, among the Caslon ornaments encoded there, one can see the English Rose, Scottish Thistle and Irish Harp embelms (PUA+FEF95, PUA+FEF96 and PUA+FEF97) for example, which are emblematic characters with clear identity and traditional meanings of their own (not incomparable with the aforementioned French fleur-de-lis), but one can also see a large number of nondescript and largely fungible arabesques for which distinct semantic usages are highly improbable. --Har. James Kass via Unicode wrote: > > In 11309-apple-resp-n4127, John H. Jenkins wrote, > "Apple feels that, absent evidence of widespread use, dingbats and > similar glyphs are not suitable for general-purpose encoding." > > and > > "Apple feels that, in general, characters should be encoded in the > Universal > Character Set only on the basis of demonstrated need for general text > interchange." > > In N4127, Karl Pentzlin noted that no effort was made to determine > unification with existing characters, even in cases where unification > was obvious.? For example, Hoefler Glyph 57 "ORN-FLEURDELIS" is shown > in N4127 with a pointer to U+269C (?). So some of the Hoefler > ornaments are already exchangeable in Unicode. > > Apple didn't forbid future encoding of Hoefler ornaments, but rather > keeps the existing bar of demonstrable usage in place. > > Any proposal to complete the Hoefler repertoire in Unicode would need > to carefully examine unification and then show that plain-text > interchange is necessary. > > > On 2022-07-23 7:04 PM, Rebecca Bettencourt via Unicode wrote: >> Because Apple has more sense than Microsoft and decided their dingbat >> fonts >> don't need to be in Unicode. 
>> >> Someone back in 2011 collected all the glyphs from Apple's dingbat >> fonts: >> http://unicode.org/wg2/docs/n4127.pdf >> >> And Apple provided a response: >> http://unicode.org/L2/L2011/11309-apple-resp-n4127.pdf >> >> -- Rebecca Bettencourt >> >> >> On Sat, Jul 23, 2022 at 6:06 AM Gabriel Tellez via Unicode < >> unicode at corp.unicode.org> wrote: >> >>> I don't understand why Wingdings/Webdings and Zapf Dingbats get to >>> be in >>> Unicode but not Hoefler Text Ornaments. (Not going to ask about Apple >>> Symbols because that's a icon font not a dingbat font) >>> > From karl-pentzlin at acssoft.de Sun Jul 24 15:49:48 2022 From: karl-pentzlin at acssoft.de (Karl Pentzlin) Date: Sun, 24 Jul 2022 22:49:48 +0200 Subject: Hoefler Text Ornaments In-Reply-To: References: Message-ID: <943300971.20220724224948@acssoft.de> Am Sonntag, 24. Juli 2022 um 00:07 schrieb James Kass via Unicode: JKvU> In N4127, Karl Pentzlin noted that no effort was made to determine unification with existing characters, even in cases where unification was obvious. The title of N4127 (L2/11-276) from 2011-07-15 was "Apple Symbol Fonts: A Quick Survey", simply listing the (then) current use on the PUA by Apple. It was definitively not a proposal (alone by the fact that it listed PUA code points), and it was explicitly stated as subject of that document: ?The characters found are listed here without any further interpretation ? Especially, no names ? or properties are given, and it is not examined whether they can unified with existing Unicode characters, even for cases where this is obvious.? This document was intended as a starting point for discussions which of these symbols deserve an encoding or unification in Unicode (after the Wingdings/Webdings discussion which resulted in encodings or unifications for almost all of them), but as apparently there was no interest in such discussions, no subsequent documents besides the Apple comment L2/11-309 (especially no proposals) had followed. 
- Karl Pentzlin From markus.icu at gmail.com Sun Jul 24 18:42:44 2022 From: markus.icu at gmail.com (Markus Scherer) Date: Sun, 24 Jul 2022 16:42:44 -0700 Subject: Tai Tham Text Encoding In-Reply-To: <20220723171244.7fb392af@JRWUBU2> References: <20220723171244.7fb392af@JRWUBU2> Message-ID: On Sat, Jul 23, 2022 at 9:16 AM Richard Wordingham via Unicode < unicode at corp.unicode.org> wrote: > In summary, my main two questions are: > > Is 'encoding of text' the correct phrase for the definition of the > correct arrangement? It sounds reasonable, but will be easily confused with what are otherwise called "charsets" and "code pages" etc. It seems like we have a term for what you are after, but I can't put my finger on it right now :-) Is it appropriate to submit a proposal for the > standardisation of Tai Tham text encoding? > I think so. Proposals are best if they are specific, that is, which text is to be added or changed where, and to what. Changes to the core spec (the "book")? A new Unicode Technical Note? Best regards, markus -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.wordingham at ntlworld.com Sun Jul 24 19:21:13 2022 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Mon, 25 Jul 2022 01:21:13 +0100 Subject: Tai Tham Text Encoding In-Reply-To: References: <20220723171244.7fb392af@JRWUBU2> Message-ID: <20220725012113.04378cf6@JRWUBU2> On Sun, 24 Jul 2022 16:42:44 -0700 Markus Scherer via Unicode wrote: > On Sat, Jul 23, 2022 at 9:16 AM Richard Wordingham via Unicode < > unicode at corp.unicode.org> wrote: > > > In summary, my main two questions are: > > > > Is 'encoding of text' the correct phrase for the definition of the > > correct arrangement? > > > It sounds reasonable, but will be easily confused with what are > otherwise called "charsets" and "code pages" etc. 
> It seems like we have a term for what you are after, but I can't put
> my finger on it right now :-)
>
> Is it appropriate to submit a proposal for the
> > standardisation of Tai Tham text encoding?

Perhaps "standardisation of Tai Tham string encoding"? I'm not entirely sure, because 'string' implies that one already has a linear arrangement, but I am talking of how to select that string (more precisely a trace, because of canonical equivalence), and the question is the amount of zigzagging. "String selection" might be technically correct, but could be taken as meaning 'choice of words'. Perhaps "standardisation of Tai Tham character sequencing", but that suggests visual orthographic rules.

Richard.

From gtbot2007 at gmail.com Mon Jul 25 06:30:08 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Mon, 25 Jul 2022 07:30:08 -0400
Subject: Hoefler Text Ornaments
In-Reply-To: <943300971.20220724224948@acssoft.de>
References: <943300971.20220724224948@acssoft.de>
Message-ID:

Turns out there is also Bodoni Ornaments (a font that I somehow missed) and Type Embellishments One (a font that isn't on my computer but sounds like it should be by default?).

On Sun, Jul 24, 2022 at 4:52 PM Karl Pentzlin via Unicode <unicode at corp.unicode.org> wrote:

> Am Sonntag, 24. Juli 2022 um 00:07 schrieb James Kass via Unicode:
>
> JKvU> In N4127, Karl Pentzlin noted that no effort was made to determine
> unification with existing characters, even in cases where unification was
> obvious.
>
> The title of N4127 (L2/11-276) from 2011-07-15 was "Apple Symbol Fonts: A
> Quick Survey", simply listing the (then) current use on the PUA by Apple.
> It was definitively not a proposal (alone by the fact that it listed PUA
> code points), and it was explicitly stated as subject of that document:
> "The characters found are listed here without any further interpretation ?
> Especially, no names ?
or properties are given, and it is not examined
> whether they can unified with existing Unicode characters, even for cases
> where this is obvious."
>
> This document was intended as a starting point for discussions which of
> these symbols deserve an encoding or unification in Unicode (after the
> Wingdings/Webdings discussion which resulted in encodings or unifications
> for almost all of them), but as apparently there was no interest in such
> discussions, no subsequent documents besides the Apple comment L2/11-309
> (especially no proposals) had followed.
>
> - Karl Pentzlin
>
>

From marius.spix at web.de Mon Jul 25 18:25:48 2022
From: marius.spix at web.de (Marius Spix)
Date: Tue, 26 Jul 2022 01:25:48 +0200
Subject: Hoefler Text Ornaments
In-Reply-To:
References: <943300971.20220724224948@acssoft.de>
Message-ID: <20220726012548.51a1ebb5@spixxi>

There is also the font "MS Outlook". OUTLOOK.ttf was part of Outlook 97 and had been in circulation for a long time. Maybe it could be considered as well.

I tried to map the glyphs.

U+F041 = U+1F56D RINGING BELL
U+F042 = U+1F511 KEY
U+F043 = U+1F5D8 CLOCKWISE RIGHT AND LEFT SEMICIRCLE ARROWS
U+F044 = new_codepoint CLOCKWISE RIGHT AND LEFT SEMICIRCLE ARROWS WITH SOLIDUS
U+F045 = new_codepoint PEOPLE FACING RIGHT
U+F046 = new_codepoint MEETING ROOM (table with three silhouettes)
U+F047 = U+1F4CE PAPERCLIP
U+F049 = U+1F382 BIRTHDAY CAKE
U+F04A = new_codepoint WAX SEAL (???)
U+F04D = new_codepoint ?????? (glyph has two variants: octagon with two arrows pointing in the middle or two crossed pencils)
U+F04E ? U+1F4EC OPEN MAILBOX WITH RAISED FLAG (???)

--

Marius Spix

On Mon, 25 Jul 2022 07:30:08 -0400 Gabriel Tellez via Unicode wrote:

> Turns out there is also Bodoni Ornaments (a font that I somehow missed)
> and Type Embellishments One (a font that isn't on my computer but
> sounds like it should be by default?).
> > On Sun, Jul 24, 2022 at 4:52 PM Karl Pentzlin via Unicode < > unicode at corp.unicode.org> wrote: > > > Am Sonntag, 24. Juli 2022 um 00:07 schrieb James Kass via Unicode: > > > > JKvU> In N4127, Karl Pentzlin noted that no effort was made to > > JKvU> determine > > unification with existing characters, even in cases where > > unification was obvious. > > > > The title of N4127 (L2/11-276) from 2011-07-15 was "Apple Symbol > > Fonts: A Quick Survey", simply listing the (then) current use on > > the PUA by Apple. It was definitively not a proposal (alone by the > > fact that it listed PUA code points), and it was explicitly stated > > as subject of that document: ?The characters found are listed here > > without any further interpretation ? Especially, no names ? or > > properties are given, and it is not examined whether they can > > unified with existing Unicode characters, even for cases where this > > is obvious.? > > > > This document was intended as a starting point for discussions > > which of these symbols deserve an encoding or unification in > > Unicode (after the Wingdings/Webdings discussion which resulted in > > encodings or unifications for almost all of them), but as > > apparently there was no interest in such discussions, no subsequent > > documents besides the Apple comment L2/11-309 (especially no > > proposals) had followed. > > > > - Karl Pentzlin > > > > From gtbot2007 at gmail.com Mon Jul 25 18:51:43 2022 From: gtbot2007 at gmail.com (Gabriel Tellez) Date: Mon, 25 Jul 2022 19:51:43 -0400 Subject: Hoefler Text Ornaments In-Reply-To: <20220726012548.51a1ebb5@spixxi> References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> Message-ID: OUTLOOK.ttf is questionable as its an icon font and not a dingbat one (though you can say the same with webdings), but since it's such a small font I think it could pass On Mon, Jul 25, 2022 at 7:26 PM Marius Spix wrote: > There is also the font "MS Outlook". 
From jameskass at code2001.com Mon Jul 25 19:24:57 2022
From: jameskass at code2001.com (James Kass)
Date: Tue, 26 Jul 2022 00:24:57 +0000
Subject: Hoefler Text Ornaments
In-Reply-To:
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi>
Message-ID: <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>

As a visual aid, the MS Outlook glyphs are provided in the attached graphic file. Some of the glyphs noted by Marius Spix appear to have been removed from the font by the time XP arrived; the graphic shows the font version included with Windows XP.

Having established that certain glyphs exist, the next question is whether people are exchanging them in plain text. If not, then could it be demonstrated that users would benefit from the ability to do so? If not, then there is no path towards their encoding in the Standard.

On 2022-07-25 11:51 PM, Gabriel Tellez via Unicode wrote:
> OUTLOOK.ttf is questionable as it's an icon font and not a dingbat one
> (though you can say the same with Webdings), but since it's such a small
> font I think it could pass
>
> On Mon, Jul 25, 2022 at 7:26 PM Marius Spix wrote:
>
>> There is also the font "MS Outlook".
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OutlookGlyphs.PNG
Type: image/png
Size: 4085 bytes
Desc: not available
URL:

From beckiergb at gmail.com Mon Jul 25 22:08:23 2022
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Mon, 25 Jul 2022 20:08:23 -0700
Subject: Hoefler Text Ornaments
In-Reply-To: <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:

Despite my first response to this thread taking a dig at Microsoft, my actual understanding is they didn't get Wingdings and Webdings into Unicode for no reason; they were able to demonstrate that there are a considerable number of web pages, emails, and documents using those fonts. They simply enjoy a level of popularity that none of the other fonts mentioned in this thread do. Very few people are using Hoefler Text Ornaments, Type Embellishments One, etc. in their documents, and the ones who are seem to get by just fine using private use code points. Compare the many people confused by the stray J appearing in old emails stripped of their formatting (in which the specification of Wingdings for that character would display it as a smiley face).
If you feel there is enough of a case for Hoefler Text Ornaments, you can certainly create a proposal. But you'll have to at the very least provide some statistics as to how many people actually use them. Also consider that whatever statistics Apple may have had, it certainly wasn't enough to convince them they needed encoding.
From gtbot2007 at gmail.com Tue Jul 26 09:03:33 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Tue, 26 Jul 2022 10:03:33 -0400
Subject: Hoefler Text Ornaments
In-Reply-To:
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:

Do normal people (who don't know what a Unicode is) even use Webdings/Wingdings with the Unicode code points? Because if they don't, then it's no different than people using the PUA for these fonts.
From beckiergb at gmail.com Tue Jul 26 09:30:43 2022
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Tue, 26 Jul 2022 07:30:43 -0700
Subject: Hoefler Text Ornaments
In-Reply-To:
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:

On Tue, Jul 26, 2022 at 7:03 AM Gabriel Tellez wrote:
> Do normal people (who don't know what a Unicode is) even use
> Webdings/Wingdings with the Unicode code points? Because if they don't then
> it's no different than people using the PUA for these fonts.

Sure. Usually from the Insert Symbol function in Microsoft Word.
From sdowney at gmail.com Tue Jul 26 10:49:49 2022
From: sdowney at gmail.com (Steve Downey)
Date: Tue, 26 Jul 2022 11:49:49 -0400
Subject: Hoefler Text Ornaments
In-Reply-To:
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:

Yes, because helpful programmers, like me, transcode their marked-up encoding into Unicode.

In any case, the cat is out of the bag and the horses have left the barn, and Wingdings and Webdings really were incredibly popular before Unicode standardization, for largely the same reasons that emoji are today. For another decorative set to be encoded, I think there would need to be evidence of a body of text using those symbols for which there is a desire to re-encode today, such that without encoding the symbols' meaning would be lost. It's a deliberately high bar.

On Tue, Jul 26, 2022 at 10:08 AM Gabriel Tellez via Unicode <unicode at corp.unicode.org> wrote:
> Do normal people (who don't know what a Unicode is) even use
> Webdings/Wingdings with the Unicode code points? Because if they don't then
> it's no different than people using the PUA for these fonts.
From jameskass at code2001.com Tue Jul 26 17:38:07 2022
From: jameskass at code2001.com (James Kass)
Date: Tue, 26 Jul 2022 22:38:07 +0000
Subject: Hoefler Text Ornaments
In-Reply-To:
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID: <87be2316-9562-a6fc-489f-b1f0f4f1aebe@code2001.com>

On 2022-07-26 3:49 PM, Steve Downey via Unicode wrote:
> For another decorative set to be encoded, I
> think there would need to be evidence of a body of text using those symbols
> for which there is a desire to re-encode today, such that without encoding
> the symbols meaning would be lost. It's a deliberately high bar.

There were several steps along the way to getting Webdings/Wingdings encoded in The Standard. Here's a link to an updated proposal from 2011:

https://www.unicode.org/L2/L2011/11344-wingdings.pdf

The introductory text delves into the rationale for encoding and might be of interest to anyone contemplating submitting proposals for similar additions.
From michel at suignard.com Tue Jul 26 22:25:46 2022
From: michel at suignard.com (Michel Suignard)
Date: Wed, 27 Jul 2022 03:25:46 +0000
Subject: Hoefler Text Ornaments
In-Reply-To: <87be2316-9562-a6fc-489f-b1f0f4f1aebe@code2001.com>
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com> <87be2316-9562-a6fc-489f-b1f0f4f1aebe@code2001.com>
Message-ID:

Yes, it was not a simple and quick process.

Michel

From wjgo_10009 at btinternet.com Tue Jul 26 16:14:29 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Tue, 26 Jul 2022 22:14:29 +0100 (BST)
Subject: Hoefler Text Ornaments
In-Reply-To:
References: <943300971.20220724224948@acssoft.de> <20220726012548.51a1ebb5@spixxi> <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID: <754517ce.2e335.1823c5c78ed.Webtop.88@btinternet.com>

Steve Downey wrote:
> It's a deliberately high bar.

Indeed. I appreciate that there are reasons for that very high bar.
As I have symbols that I have devised that I wish to express in plain text yet conserve the meaning, I have devised a technique that goes some way to achieving that result for me. Perhaps a similar technique could be applied for encoding Hoefler Text Ornaments.

Please find attached a graphic showing nine symbols, which stand for yes, indefinite yes, somewhat yes, needing more information, not knowing, refusing to answer, somewhat no, indefinite no, and no.

I encode these as plain text using a four-character sequence for each symbol. I use %791 for yes, through to %799 for no.

Display is by using an OpenType colour font that I produced myself.

There is no guarantee that the encoding will be unique, yet it is, in my opinion, better than using a Private Use Area encoding as it is better for conserving meaning, as, in the absence of a suitable font, there is a graceful fallback to the encoding sequence for each symbol.

William Overington

Tuesday 26 July 2022

-------------- next part --------------
A non-text attachment was scrubbed...
Name: answers.png
Type: image/png
Size: 2487 bytes
Desc: not available
URL:
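The four-character scheme described in the message above could be read back by software roughly as follows. This is a minimal Python sketch, not the author's implementation: the names `MEANINGS`, `decode_answers` and `annotate` are hypothetical, and the assignment of %792 to %798 is an assumption that the sequences follow the order in which the nine meanings are listed (the message only states %791 for yes and %799 for no). Actual display in the message relies on an OpenType colour font, which this sketch does not attempt.

```python
import re

# Assumed assignment: the nine meanings in the order the message lists them.
# Only %791 (yes) and %799 (no) are stated explicitly in the message.
MEANINGS = {
    "%791": "yes",
    "%792": "indefinite yes",
    "%793": "somewhat yes",
    "%794": "needing more information",
    "%795": "not knowing",
    "%796": "refusing to answer",
    "%797": "somewhat no",
    "%798": "indefinite no",
    "%799": "no",
}

# Matches exactly the nine four-character sequences %791 .. %799.
_SEQUENCE = re.compile(r"%79[1-9]")


def decode_answers(text):
    """Return the meanings of all sequences found in text, in order."""
    return [MEANINGS[seq] for seq in _SEQUENCE.findall(text)]


def annotate(text):
    """Replace each sequence with its meaning in brackets; keep other text."""
    return _SEQUENCE.sub(lambda m: "[" + MEANINGS[m.group(0)] + "]", text)
```

Without such a decoder or the font, the reader still sees the literal sequence, which is the graceful fallback the message describes. As the message notes, nothing prevents the same four characters from occurring in ordinary text with another meaning.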