From aprilop at fn.de Fri Jul 1 07:15:16 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Fri, 01 Jul 2022 12:15:16 +0000
Subject: Different Bidirectional Character Types
Message-ID: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
Reference:
https://unicode.org/reports/tr9/#Bidirectional_Character_Types
Why do Hebrew letters and Arabic letters have different
bidirectional character types?
Some effects can be seen using this HTML code:
רוקפורד 555-2368
روكفورد 555-2368
או 3−2=1
أو 3−2=1
Why do Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
have different bidirectional character types?
Some effects can be seen using this HTML code:
١٩٩٩ ١٢ ٣١
١٩٩٩-١٢-٣١
۱۹۹۹ ۱۲ ۳۱
۱۹۹۹-۱۲-۳۱
From aprilop at fn.de Fri Jul 1 07:36:40 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Fri, 01 Jul 2022 12:36:40 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
Message-ID: <5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>
I wrote:
> Why do Hebrew letters and Arabic letters have different
> bidirectional character types?
> Some effects can be seen using this HTML code:
Or visit
https://corp.unicode.org/pipermail/unicode/2022-July/010191.html
From asmusf at ix.netcom.com Fri Jul 1 11:05:46 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Fri, 1 Jul 2022 09:05:46 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>
Message-ID:
An HTML attachment was scrubbed...
URL:
From aprilop at fn.de Fri Jul 1 12:02:35 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Fri, 01 Jul 2022 17:02:35 +0000
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<5462E34A-9826-4FD8-91EA-E73BD97643B0@fn.de>
Message-ID:
On 1 July 2022, Asmus Freytag wrote:
>> Why do Hebrew letters and Arabic letters have different
>> bidirectional character types?
>> https://corp.unicode.org/pipermail/unicode/2022-July/010191.html
>
> If this is not explained in the text of UAX#9 can you point out
> where the explanation would need to be improved?
I cannot find an explanation *why* Hebrew and Arabic letters
should behave differently.
Why "555-2368" after Hebrew letters
but "2368-555" after Arabic letters?
Why "31-12-1999" with Arabic-Indic digits
but "1999-12-31" with Persian digits?
Why?
From haberg-1 at telia.com Sat Jul 2 04:01:00 2022
From: haberg-1 at telia.com (=?utf-8?Q?Hans_=C3=85berg?=)
Date: Sat, 2 Jul 2022 11:01:00 +0200
Subject: Different Bidirectional Character Types
In-Reply-To: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
Message-ID:
> On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode wrote:
>
> Reference:
> https://unicode.org/reports/tr9/#Bidirectional_Character_Types
>
> Why do Hebrew letters and Arabic letters have different
> bidirectional character types?
I cannot parse this, but in Hebrew, Arabic, and Persian, text is written RTL, but numbers LTR. For example, trying A123 in a translator supporting those scripts, I get:
?123
? ???
? ???
From richard.wordingham at ntlworld.com Sat Jul 2 04:54:46 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sat, 2 Jul 2022 10:54:46 +0100
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
Message-ID: <20220702105446.033065ab@JRWUBU2>
On Sat, 2 Jul 2022 11:01:00 +0200
Hans Åberg via Unicode wrote:
> > On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
> > wrote:
> >
> > Reference:
> > https://unicode.org/reports/tr9/#Bidirectional_Character_Types
> >
> > Why do Hebrew letters and Arabic letters have different
> > bidirectional character types?
>
> I cannot parse this, but in Hebrew, Arabic, and Persian, text is
> written RTL, but numbers LTR. For example, trying A123 in a
> translator supporting those scripts, I get: ?123 ? ???
> ? ???
>
>
For numbers, using natural language, you don't mean LTR, but 'with the
most significant digit on the left'. It is a convention that when
encoding 'four and twenty' using digits, the most significant digit is
stored first. N'ko decimal numbers have the most significant digit on
the right, with the result that N'ko digits have bidi class
Right_To_Left, as do N'ko letters.
As to parsing the question, at the literal level Hebrew letters have
bidi class Right_To_Left (R) while Arabic letters have bidi class
Arabic_Letter (AL); Moroccan decimal digits (e.g. U+0030) have bidi
class European_Number (EN), Egyptian decimal digits have bidi class
Arabic_Number (AN), Urdu decimal digits have bidi class European_Number
(EN) and Hindi decimal digits (e.g. U+0966) have bidi class
Left_to_Right (L). When one throws dollar signs, which have bidi
class European_Terminator (ET) into the mix, these differences matter to
the bidi algorithm.
Richard.
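The bidi classes listed above can be checked directly. A minimal sketch
follows, assuming Python's standard unicodedata module, which reports the
classes of whatever Unicode version ships with the interpreter. The
helper name bidi_class_table and the choice of sample characters are
illustrative only and do not come from the thread.

```python
import unicodedata

# Sample characters for the classes discussed in this thread.
SAMPLES = [
    "\u05D0",  # Hebrew letter
    "\u0627",  # Arabic letter
    "0",       # European digit
    "\u0660",  # Arabic-Indic digit
    "\u06F0",  # Extended Arabic-Indic (Persian/Urdu) digit
    "\u0966",  # Devanagari digit
    "\u07C0",  # N'Ko digit
    "$",       # dollar sign
    "-",       # hyphen-minus
    "\u2212",  # minus sign
]

def bidi_class_table(chars):
    """Return (code point, bidi class, character name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), unicodedata.name(ch))
        for ch in chars
    ]

if __name__ == "__main__":
    for code_point, bidi_class, name in bidi_class_table(SAMPLES):
        print(f"{code_point}  {bidi_class:<2}  {name}")
```

The Hebrew and Arabic letters differ (R vs. AL), and so do the two sets
of Arabic-script digits (AN vs. EN), which is what the original question
is about.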
From eliz at gnu.org Sat Jul 2 05:13:53 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sat, 02 Jul 2022 13:13:53 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: <20220702105446.033065ab@JRWUBU2> (message from Richard
Wordingham via Unicode on Sat, 2 Jul 2022 10:54:46 +0100)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
Message-ID: <835ykfddpa.fsf@gnu.org>
> Date: Sat, 2 Jul 2022 10:54:46 +0100
> From: Richard Wordingham via Unicode
>
> On Sat, 2 Jul 2022 11:01:00 +0200
> Hans Åberg via Unicode wrote:
>
> > > On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
> > > wrote:
> > >
> > > Reference:
> > > https://unicode.org/reports/tr9/#Bidirectional_Character_Types
> > >
> > > Why do Hebrew letters and Arabic letters have different
> > > bidirectional character types?
> >
> > I cannot parse this, but in Hebrew, Arabic, and Persian, text is
> > written RTL, but numbers LTR. For example, trying A123 in a
> > translator supporting those scripts, I get: ?123 ? ???
> > ? ???
> >
> >
>
> For numbers, using natural language, you don't mean LTR, but 'with the
> most significant digit on the left'. It is a convention that when
> encoding 'four and twenty' using digits, the most significant digit is
> stored first. N'ko decimal numbers have the most significant digit on
> the right, with the result that N'ko digits have bidi class
> Right_To_Left, as do N'ko letters.
>
> As to parsing the question, at the literal level Hebrew letters have
> bidi class Right_To_Left (R) while Arabic letters have bidi class
> Arabic_Letter (AL); Moroccan decimal digits (e.g. U+0030) have bidi
> class European_Number (EN), Egyptian decimal digits have bidi class
> Arabic_Number (AN), Urdu decimal digits have bidi class European_Number
> (EN) and Hindi decimal digits (e.g. U+0966) have bidi class
> Left_to_Right (L). When one throws dollar signs, which have bidi
> class European_Terminator (ET) into the mix, these differences matter to
> the bidi algorithm.
I think a simpler answer is that Arabic letters (bidi class AL) in
some cases make European Numbers (EN) behave like Arabic Numbers (AN);
see rule W2 of UAX#9. And Arabic Numbers then affect how other "weak"
characters are reordered, see W6.
IOW, these distinctions are needed to produce the expected reordered
order in each case.
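To make the effect of W2 concrete, here is a rough sketch of the weak-type
resolution for the phone-number example. It is a deliberately simplified
reading of rules W2, W4, and W6 only, not a conforming UAX#9
implementation: W1, W3, W5, W7, explicit embeddings, and isolating run
boundaries are all ignored, so input containing terminators such as "$"
is not handled faithfully. The function name resolve_weak is made up for
this example.

```python
import unicodedata

def resolve_weak(text):
    """Apply simplified versions of UAX#9 rules W2, W4 and W6 to text."""
    classes = [unicodedata.bidirectional(ch) for ch in text]

    # W2: an EN whose nearest preceding strong character is AL becomes AN.
    last_strong = None
    for i, cls in enumerate(classes):
        if cls in ("L", "R", "AL"):
            last_strong = cls
        elif cls == "EN" and last_strong == "AL":
            classes[i] = "AN"

    # W4: a single ES between two EN becomes EN; a single CS between
    # two numbers of the same type becomes that type.
    for i in range(1, len(classes) - 1):
        prev, cur, nxt = classes[i - 1], classes[i], classes[i + 1]
        if cur == "ES" and prev == "EN" and nxt == "EN":
            classes[i] = "EN"
        elif cur == "CS" and prev == nxt and prev in ("EN", "AN"):
            classes[i] = prev

    # W6: separators and terminators that are still left become ON.
    # (W5 is skipped in this sketch, see the note above.)
    return ["ON" if cls in ("ES", "CS", "ET") else cls for cls in classes]

if __name__ == "__main__":
    # Hebrew resh vs. Arabic reh, each followed by the same ASCII number.
    for label, text in (("Hebrew", "\u05E8 555-2368"), ("Arabic", "\u0631 555-2368")):
        print(label, resolve_weak(text))
```

After the Hebrew letter the hyphen is absorbed into one European number,
which stays together as "555-2368". After the Arabic letter the digits
become AN, the hyphen falls through to ON, and the two digit groups are
later ordered as separate runs.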
From aprilop at fn.de Sat Jul 2 06:22:09 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Sat, 02 Jul 2022 11:22:09 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <835ykfddpa.fsf@gnu.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
Message-ID:
On 2 July 2022, Eli Zaretskii wrote:
> I think a simpler answer is that Arabic letters (bidi class AL) in
> some cases make European Numbers (EN) behave like Arabic Numbers (AN);
> see rule W2 of UAX#9. And Arabic Numbers then affect how other "weak"
> characters are reordered, see W6.
My question was: Why?
http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
displays the number "555-2368".
http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
displays the number "2368-555".
Why this difference?
And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
treated differently?
From eliz at gnu.org Sat Jul 2 06:56:29 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sat, 02 Jul 2022 14:56:29 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: (message from
Andreas Prilop via Unicode on Sat, 02 Jul 2022 11:22:09 +0000)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
Message-ID: <83v8sfbudu.fsf@gnu.org>
> Date: Sat, 02 Jul 2022 11:22:09 +0000
> From: Andreas Prilop via Unicode
>
> On 2 July 2022, Eli Zaretskii wrote:
>
> > I think a simpler answer is that Arabic letters (bidi class AL) in
> > some cases make European Numbers (EN) behave like Arabic Numbers (AN);
> > see rule W2 of UAX#9. And Arabic Numbers then affect how other "weak"
> > characters are reordered, see W6.
>
> My question was: Why?
>
> http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
> displays the number "555-2368".
>
> http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
> displays the number "2368-555".
>
> Why this difference?
>
> And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
> treated differently?
Because the expected order on display is different.
The expected order differs because the ways different scripts are
written differ; the reasons are largely historical and cultural,
AFAIK.
IOW, the reasons for these differences are instrumental, not
theoretical: we need the characters to behave differently when
reordered.
From haberg-1 at telia.com Sat Jul 2 14:46:52 2022
From: haberg-1 at telia.com (=?utf-8?Q?Hans_=C3=85berg?=)
Date: Sat, 2 Jul 2022 21:46:52 +0200
Subject: Different Bidirectional Character Types
In-Reply-To: <20220702105446.033065ab@JRWUBU2>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
Message-ID:
> On 2 Jul 2022, at 11:54, Richard Wordingham via Unicode wrote:
>
> On Sat, 2 Jul 2022 11:01:00 +0200
> Hans Åberg via Unicode wrote:
>
>>> On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
>>> wrote:
>>>
>>> Reference:
>>> https://unicode.org/reports/tr9/#Bidirectional_Character_Types
>>>
>>> Why do Hebrew letters and Arabic letters have different
>>> bidirectional character types?
>>
>> I cannot parse this, but in Hebrew, Arabic, and Persian, text is
>> written RTL, but numbers LTR. For example, trying A123 in a
>> translator supporting those scripts, I get: ?123 ? ???
>> ? ???
>
> For numbers, using natural language, you don't mean LTR, but 'with the
> most significant digit on the left'.
I asked some Arabic speakers how they think about it when writing numbers, and they said they do think of it as writing LTR, not RTL with changed endianness. In a file with RTL/LTR markers, the digits would then get the same order. I assumed this is how Unicode represents it, but a clarification would be nice.
From textexin at xencraft.com Sat Jul 2 16:02:31 2022
From: textexin at xencraft.com (Tex)
Date: Sat, 2 Jul 2022 14:02:31 -0700
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
Message-ID: <001e01d88e57$0be510d0$23af3270$@xencraft.com>
On my Windows system, using either Chrome or Firefox, both links display the same for me.
What setup are you using, Andreas?
tex
From eliz at gnu.org Sun Jul 3 00:04:17 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sun, 03 Jul 2022 08:04:17 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: (message from
Hans =?utf-8?Q?=C3=85berg?= via Unicode on Sat, 2 Jul 2022 21:46:52 +0200)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
Message-ID: <8335fibxda.fsf@gnu.org>
> Date: Sat, 2 Jul 2022 21:46:52 +0200
> Cc: unicode at corp.unicode.org
> From: Hans Åberg via Unicode
>
> > For numbers, using natural language, you don't mean LTR, but 'with the
> > most significant digit on the left'.
>
> I asked some Arab speaking how they think about it when writing numbers, and they said they indeed think about it as writing LTR, and not RTL with changed endianness. In a file with RTL/LTR markers, by this, the digits get the same order. I assumed this is how Unicode represents it, but it would be nice with clarification.
I think UAX#9 clarifies it perfectly: numbers are displayed in LTR
order.
From aprilop at fn.de Sun Jul 3 00:51:36 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Sun, 03 Jul 2022 05:51:36 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <001e01d88e57$0be510d0$23af3270$@xencraft.com>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
<001e01d88e57$0be510d0$23af3270$@xencraft.com>
Message-ID: <067FEF07-4060-42B8-AED7-8D71C32D627F@fn.de>
On 2 July 2022, Tex wrote:
>> http://google.com/search?q=555-2368+%22%D7%A8%D7%95%D7%A7%D7%A4%D7%95%D7%A8%D7%93%22&filter=0
>> displays the number "555-2368".
>>
>> http://google.com/search?q=555-2368+%22%D8%B1%D9%88%D9%83%D9%81%D9%88%D8%B1%D8%AF%22&filter=0
>> displays the number "2368-555".
>
> On my windows system, using either chrome or firefox, both links display the same for me.
Sorry for the confusion.
Not the link itself, but the results, the found pages.
I search for "555-2368".
With Hebrew letters, the display is "555-2368".
With Arabic letters, the display is "2368-555".
Look at the results.
And my other question
>> And why are Arabic-Indic digits (U+0660 ٠) and Persian digits (U+06F0 ۰)
>> treated differently?
I write "1999-12-31".
The display is "1999-12-31" with Persian digits.
The display is "31-12-1999" with Arabic-Indic digits.
https://corp.unicode.org/pipermail/unicode/2022-July/010191.html
From richard.wordingham at ntlworld.com Sun Jul 3 04:13:08 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sun, 3 Jul 2022 10:13:08 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <8335fibxda.fsf@gnu.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
Message-ID: <20220703101308.55a36bf0@JRWUBU2>
On Sun, 03 Jul 2022 08:04:17 +0300
Eli Zaretskii via Unicode wrote:
> > Date: Sat, 2 Jul 2022 21:46:52 +0200
> > Cc: unicode at corp.unicode.org
> > From: Hans Åberg via Unicode
> >
> > > For numbers, using natural language, you don't mean LTR, but
> > > 'with the most significant digit on the left'.
> >
> > I asked some Arab speaking how they think about it when writing
> > numbers, and they said they indeed think about it as writing LTR,
> > and not RTL with changed endianness. In a file with RTL/LTR
> > markers, by this, the digits get the same order. I assumed this is
> > how Unicode represents it, but it would be nice with clarification.
> >
>
> I think UAX#9 clarifies it perfectly: numbers are displayed in LTR
> order.
But Hans is forwarding an answer as to which digit comes first when
divorced from computers.
The order of writing can in general be quite variable. For example,
although the ordering vowel then tone is widely taught in Thailand, at
least for vertical stacks, I've seen evidence of people trying to write
the marks in a Tai Tham stack . (The marks were TONE-1, MAI KANG
and then SIGN OA BELOW. Many people want to write the last of these
,
seventy years ago they would have said the consonant first; nowadays,
they usually say the preposed vowel first. The fine details of the
old scheme seem to be lost.
Richard.
From aprilop at fn.de Sun Jul 3 04:20:07 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Sun, 03 Jul 2022 09:20:07 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <8335fibxda.fsf@gnu.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
Message-ID: <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
On 3 July 2022, Eli Zaretskii wrote:
> I think UAX#9 clarifies it perfectly: numbers are displayed in LTR order.
This is undisputed.
I ask about the differences
"555-2368" vs. "2368-555"
"1=3−2" vs. "1=2−3"
"1999-12-31" vs. "31-12-1999"
The Bidirectional Algorithm is responsible for these differences. But why?
From eliz at gnu.org Sun Jul 3 04:42:39 2022
From: eliz at gnu.org (Eli Zaretskii)
Date: Sun, 03 Jul 2022 12:42:39 +0300
Subject: Different Bidirectional Character Types
In-Reply-To: <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de> (message from
Andreas Prilop via Unicode on Sun, 03 Jul 2022 09:20:07 +0000)
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
Message-ID: <83tu7ya5ww.fsf@gnu.org>
> Date: Sun, 03 Jul 2022 09:20:07 +0000
> From: Andreas Prilop via Unicode
>
> I ask about the differences
>
> "555-2368" vs. "2368-555"
>
> "1=3−2" vs. "1=2−3"
>
> "1999-12-31" vs. "31-12-1999"
>
> The Bidirectional Algorithm is responsible for these differences. But why?
Because that's how the users of each script want the text to be
displayed in these cases. The UBA was specified as it is to satisfy
the expectations of the users of the respective scripts. Those
expectations have to do with history, traditions, and culture.
And please note that your cases are no longer just numbers, they
involve the dash ('-'), which is a "weak" character, and its
reordering for display depends on surrounding text.
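The visible outcome for hyphen-separated digit groups can be modelled
with a toy function. This is only an illustration of the behaviour
reported in this thread, not the UBA itself: it assumes the text
consists of digit groups of a single bidi class joined by "-", and the
name visual_groups and the flag after_arabic_letter are invented for the
example.

```python
import unicodedata

def visual_groups(logical, after_arabic_letter=False):
    """Return the digit groups of `logical` in left-to-right display order.

    Toy model: groups of EN digits joined by "-" stay in typed order,
    while groups that are (or are treated as) AN appear in reverse.
    """
    groups = logical.split("-")
    classes = {unicodedata.bidirectional(ch) for ch in logical if ch != "-"}

    # Rule W2: European digits after an Arabic letter are treated as AN.
    if classes == {"EN"} and after_arabic_letter:
        classes = {"AN"}

    if classes == {"AN"}:
        # The hyphen stays neutral and resolves right-to-left, so the
        # groups are laid out from right to left; the digits inside each
        # group keep their order.
        return groups[::-1]
    return groups

if __name__ == "__main__":
    print(visual_groups("555-2368"))
    print(visual_groups("555-2368", after_arabic_letter=True))
    print(visual_groups("\u0661\u0669\u0669\u0669-\u0661\u0662-\u0663\u0661"))
    print(visual_groups("\u06F1\u06F9\u06F9\u06F9-\u06F1\u06F2-\u06F3\u06F1"))
```

The function returns a list rather than a joined string so that a
bidi-aware terminal does not reorder the result a second time.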
From textexin at xencraft.com Sun Jul 3 15:36:27 2022
From: textexin at xencraft.com (Tex)
Date: Sun, 3 Jul 2022 13:36:27 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <067FEF07-4060-42B8-AED7-8D71C32D627F@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2> <835ykfddpa.fsf@gnu.org>
<001e01d88e57$0be510d0$23af3270$@xencraft.com>
<067FEF07-4060-42B8-AED7-8D71C32D627F@fn.de>
Message-ID: <003901d88f1c$925bdde0$b71399a0$@xencraft.com>
I understood you meant the results. Perhaps we are seeing different results. Mine are consistently "555-2368".
If you want I can send you screen shots.
Ah ok, I moved the number after the Arabic text in the search string and then the search flips the numbers around the hyphen.
For Hebrew ahead of the number it does not.
tex
From asmusf at ix.netcom.com Tue Jul 5 20:44:54 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Tue, 5 Jul 2022 18:44:54 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
Message-ID: <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
An HTML attachment was scrubbed...
URL:
From ishida at w3.org Mon Jul 11 05:39:27 2022
From: ishida at w3.org (r12a)
Date: Mon, 11 Jul 2022 11:39:27 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
Message-ID:
Does this help clarify the original question?
Modern Standard Arabic: Expressions & sequences
https://r12a.github.io/scripts/arabic/arb.html#expressions
See also
https://r12a.github.io/scripts/arabic/block.html#ar061C
ri
Asmus Freytag via Unicode wrote on 06/07/2022 02:44:
> On 7/3/2022 2:20 AM, Andreas Prilop via Unicode wrote:
>> On 3 July 2022, Eli Zaretskii wrote:
>>
>>> I think UAX#9 clarifies it perfectly: numbers are displayed in LTR order.
>> This is undisputed.
>> I ask about the differences
>>
>> "555-2368" vs. "2368-555"
>>
>> "1=3−2" vs. "1=2−3"
>>
>> "1999-12-31" vs. "31-12-1999"
>>
>> The Bidirectional Algorithm is responsible for these differences. But why?
>
> The real answer is that this matches differences in displaying lists
> of numbers (!) not order of digits, in Hebrew vs. Arabic.
>
> The Bidi algorithm uses the classes AL and AN (and rules that resolve
> them) to implement these inherent differences in the way the various
> scripts handle such cases (multiple groups of digits separated by punct).
>
> As I mentioned, I raised a public review issue to make sure that UAX#9
> either *specifically and explicitly* cites or, alternatively,
> incorporates language that explains scripts have different preferences
> in resolving groups of numbers (not: digits) and points in a high
> level to where in the spec these preferences are addressed.
>
> I agree, it's not enough to reverse engineer the algorithm and
> conclude that it behaves as specd. It should be a simple matter to
> understand why it was designed the way it was.
>
> A./
>
From asmusf at ix.netcom.com Mon Jul 11 12:34:49 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Mon, 11 Jul 2022 10:34:49 -0700
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
Message-ID: <9b381ef2-6e43-5488-ba26-eeae1f6ad7aa@ix.netcom.com>
An HTML attachment was scrubbed...
URL:
From aprilop at fn.de Mon Jul 11 14:07:19 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Mon, 11 Jul 2022 19:07:19 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <9b381ef2-6e43-5488-ba26-eeae1f6ad7aa@ix.netcom.com>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<9b381ef2-6e43-5488-ba26-eeae1f6ad7aa@ix.netcom.com>
Message-ID: <7C607CF2-7DBD-4D05-B47A-59A31301A29F@fn.de>
On 11 July 2022, Asmus Freytag wrote:
>> https://r12a.github.io/scripts/arabic/arb.html#expressions
>> https://r12a.github.io/scripts/arabic/block.html#ar061C
>
> I think these are excellent summaries and we should make sure
> we include a high-level version of this in the intro to UAX#9
> so that readers at least know what types of issues the algorithm
> tries to address.
I agree. Thank you very much for these links!
They are indeed very helpful.
From richard.wordingham at ntlworld.com Mon Jul 11 20:39:47 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Tue, 12 Jul 2022 02:39:47 +0100
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
Message-ID: <20220712023947.157285c9@JRWUBU2>
On Mon, 11 Jul 2022 11:39:27 +0100
r12a via Unicode wrote:
> Does this help clarify the original question?
>
> Modern Standard Arabic: Expressions & sequences
> https://r12a.github.io/scripts/arabic/arb.html#expressions
It gives an inkling. However, I don't understand, "The
underlying order of characters, and the typing order remain the same."
The text of Figures 4 and 5 has to differ by more than the language
tagging. Is this a corruption of an example which had (Near Eastern)
Arabic numerals for Figure 4 and Eastern Arabic numerals for Figure 5?
Supporting quotations would help, as this example looks weird. Are
Persian number ranges calqued from European languages?
Richard.
From ishida at w3.org Tue Jul 12 01:27:59 2022
From: ishida at w3.org (r12a)
Date: Tue, 12 Jul 2022 07:27:59 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <20220712023947.157285c9@JRWUBU2>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
Message-ID: <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
Richard Wordingham via Unicode wrote on 12/07/2022 02:39:
> On Mon, 11 Jul 2022 11:39:27 +0100
> r12a via Unicode wrote:
>
>> Does this help clarify the original question?
>>
>> Modern Standard Arabic: Expressions & sequences
>> https://r12a.github.io/scripts/arabic/arb.html#expressions
> It gives an inkling. However, I don't understand, "The
> underlying order of characters, and the typing order remain the same."
> The text of Figures 4 and 5 has to differ by more than the language
> tagging. Is this a corruption of an example which had (Near Eastern)
> Arabic numerals for Figure 4 and Eastern Arabic numerals for Figure 5?
>
> Supporting quotations would help, as this example looks weird. Are
> Persian number ranges calqued from European languages?
To make it clearer that this is just about the order of the displayed
text, I changed the sentence you mentioned to "The underlying order of
the digits...". The difference is actually produced in this case by the
addition of an LRM to the Persian, and not by the language setting. If
you click on the image you'll see the characters that make up each example.
ri
From aprilop at fn.de Tue Jul 12 10:48:31 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Tue, 12 Jul 2022 15:48:31 +0000
Subject: Different Bidirectional Character Types
In-Reply-To: <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
Message-ID:
On 12 July 2022, r12a wrote:
>>> https://r12a.github.io/scripts/arabic/arb.html#expressions
>
> The difference is actually produced in this case by the addition
> of an LRM to the Persian, and not by the language setting.
It is still Arabic. The Arabic word ?? needs to be translated to ??.
And the month should be spelled ????.
From ishida at w3.org Tue Jul 12 11:08:48 2022
From: ishida at w3.org (r12a)
Date: Tue, 12 Jul 2022 17:08:48 +0100
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
Message-ID: <1364573f-44f6-c393-8366-8cedb0e695d4@w3.org>
Erk. Thanks for pointing that out. Should be fixed now.
ri
Andreas Prilop via Unicode wrote on 12/07/2022 16:48:
> On 12 July 2022, r12a wrote:
>
>>>> https://r12a.github.io/scripts/arabic/arb.html#expressions
>> The difference is actually produced in this case by the addition
>> of an LRM to the Persian, and not by the language setting.
> It is still Arabic. The Arabic word ?? needs to be translated to ??.
> And the month should be spelled ????.
>
From aprilop at fn.de Tue Jul 12 11:32:32 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Tue, 12 Jul 2022 16:32:32 +0000
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
Message-ID:
On 12 July 2022, I wrote:
> And the month should be spelled ????.
This applies to both Arabic and Persian.
From asmusf at ix.netcom.com Tue Jul 12 20:49:08 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Tue, 12 Jul 2022 18:49:08 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
Message-ID: <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
An HTML attachment was scrubbed...
URL:
From ishida at w3.org Wed Jul 13 04:58:22 2022
From: ishida at w3.org (r12a)
Date: Wed, 13 Jul 2022 10:58:22 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
Message-ID: <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
The approach differs not only by script and which digits are used, but
also by language. Arabic and Persian both use the Arabic script, but do
things differently when it comes to ordering components of a range or
expression.
Also, we should probably mention that some scripts don't display simple
numbers like Arabic/Hebrew, either. For example, in Adlam & N'Ko and
various historical scripts numbers have the most-significant digit on
the right.
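The character properties behind these differences can be inspected directly. The following is a minimal Python sketch, assuming the standard `unicodedata` module; the sample characters are chosen here for illustration, and the values reported depend on the Unicode version bundled with the Python build.

```python
import unicodedata

# Sample characters chosen for illustration (an assumption, not a list from the thread).
SAMPLES = {
    "HEBREW LETTER ALEF": "\u05D0",
    "ARABIC LETTER ALEF": "\u0627",
    "ARABIC-INDIC DIGIT ZERO": "\u0660",
    "EXTENDED ARABIC-INDIC DIGIT ZERO": "\u06F0",
    "NKO DIGIT ZERO": "\u07C0",
    "ADLAM DIGIT ZERO": "\U0001E950",
}

def bidi_classes(samples):
    """Map each label to the Bidi_Class short name reported by unicodedata."""
    return {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

for label, bidi_class in bidi_classes(SAMPLES).items():
    print(f"{label}: {bidi_class}")
```

Hebrew letters report R and Arabic letters AL, while the two Arabic digit sets report AN and EN respectively. The N'Ko and Adlam digits report R, which is why those numbers run right to left.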
ri
Asmus Freytag via Unicode wrote on 13/07/2022 02:49:
>
> I suggest we add something like the following to the Bidi FAQ:
>
> Q: Do modern bidirectional scripts all behave the same?
>
> While Arabic and Hebrew agree on the same ordering of digits, with the
> most-significant digit on the left, the layout of entire numbers in
> context, including groups of numbers or use of number separators,
> numerical and other punctuation differs both by script and, in the
> case of Arabic, by which set of digits is used. No matter how the
> layout is resolved the order of characters in memory essentially
> follows the order they are typed.
>
> Here are some papers that explore this in-depth with examples:
> https://r12a.github.io/scripts/arabic/arb.html#expressions
> https://r12a.github.io/scripts/arabic/block.html#ar061C
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From asmusf at ix.netcom.com Wed Jul 13 09:43:02 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Wed, 13 Jul 2022 07:43:02 -0700
Subject: Different Bidirectional Character Types
In-Reply-To: <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
<84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
Message-ID:
On 7/13/2022 2:58 AM, r12a wrote:
> The approach differs not only by script and which digits are used, but
> also by language. Arabic and Persian both use the Arabic script, but
> do things differently when it comes to ordering components of a range
> or expression.
Isn't that difference handled by having two different sets of digits? As
opposed to relying on a language tag.
A./
>
> Also, we should probably mention that some scripts don't display
> simple numbers like Arabic/Hebrew, either. For example, in Adlam &
> N'Ko and various historical scripts numbers have the most-significant
> digit on the right.
>
> ri
>
> Asmus Freytag via Unicode wrote on 13/07/2022 02:49:
>>
>> I suggest we add something like the following to the Bidi FAQ:
>>
>> Q: Do modern bidirectional scripts all behave the same?
>>
>> While Arabic and Hebrew agree on the same ordering of digits, with
>> the most-significant digit on the left, the layout of entire numbers
>> in context, including groups of numbers or use of number separators,
>> numerical and other punctuation differs both by script and, in the
>> case of Arabic, by which set of digits is used. No matter how the
>> layout is resolved the order of characters in memory essentially
>> follows the order they are typed.
>>
>> Here are some papers that explore this in-depth with examples:
>> https://r12a.github.io/scripts/arabic/arb.html#expressions
>> https://r12a.github.io/scripts/arabic/block.html#ar061C
>>
>
From ishida at w3.org Wed Jul 13 09:51:29 2022
From: ishida at w3.org (r12a)
Date: Wed, 13 Jul 2022 15:51:29 +0100
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
<84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
Message-ID:
Asmus Freytag wrote on 13/07/2022 15:43:
> On 7/13/2022 2:58 AM, r12a wrote:
>> The approach differs not only by script and which digits are used,
>> but also by language. Arabic and Persian both use the Arabic script,
>> but do things differently when it comes to ordering components of a
>> range or expression.
>
> Isn't that difference handled by having two different sets of digits?
> As opposed to relying on a language tag.
>
See the cases in figs. 3 and 4 at
https://r12a.github.io/scripts/arabic/arb.html#expressions. Same
digits, different expectations about directionality.
I wasn't talking about language tags or behaviour arising from character
properties (indeed the language tag doesn't make a difference) - I was
talking about the user expectations differing from language to language
about the order in which digits appear in the text.
hth
ri
From asmusf at ix.netcom.com Wed Jul 13 10:03:07 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Wed, 13 Jul 2022 08:03:07 -0700
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
<84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
Message-ID:
An HTML attachment was scrubbed...
URL:
From richard.wordingham at ntlworld.com Wed Jul 13 14:44:46 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Wed, 13 Jul 2022 20:44:46 +0100
Subject: Different Bidirectional Character Types
In-Reply-To:
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
<84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
Message-ID: <20220713204446.29499a36@JRWUBU2>
On Wed, 13 Jul 2022 08:03:07 -0700
Asmus Freytag via Unicode wrote:
> On 7/13/2022 7:51 AM, r12a via Unicode wrote:
> Asmus Freytag wrote on 13/07/2022 15:43:
> > On 7/13/2022 2:58 AM, r12a wrote:
> >> The approach differs not only by script and which digits are used,
> >> but also by language. Arabic and Persian both use the Arabic
> >> script, but do things differently when it comes to ordering
> >> components of a range or expression.
> >>>
> >> Isn't that difference handled by having two different sets of
> >> digits? As opposed to relying on a language tag.
> >>
> > See the cases in figs. 3 and 4 at
> > https://r12a.github.io/scripts/arabic/arb.html#expressions. Same
> > digits, different expectations about directionality.
> >
> > I wasn't talking about language tags or behaviour arising from
> > character properties (indeed the language tag doesn't make a
> > difference) - I was talking about the user expectations differing
> > from language to language about the order in which digits appear in
> > the text.
> >
> If I understand correctly, this would be a case that's not handled by
> the UBA, then. Would that be worth calling out, you think?
And to answer the original question, it would be good to start with the
user expectations, and then explain how the UBA reduces (does it?) the
jiggery pokery required of the typist to get the desired outcome. In
particular, we seem to be exploiting a difference in glyph styles and
promoting it to a character difference to get a left-to-right ordering.
Richard.
From richard.wordingham at ntlworld.com Wed Jul 13 14:51:08 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Wed, 13 Jul 2022 20:51:08 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
<84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
Message-ID: <20220713205108.11c29995@JRWUBU2>
On Wed, 13 Jul 2022 10:58:22 +0100
r12a via Unicode wrote:
> The approach differs not only by script and which digits are used,
> but also by language. Arabic and Persian both use the Arabic script,
> but do things differently when it comes to ordering components of a
> range or expression.
>
> Also, we should probably mention that some scripts don't display
> simple numbers like Arabic/Hebrew, either. For example, in Adlam &
> N'Ko and various historical scripts numbers have the most-significant
> digit on the right.
What are these historical scripts with the most significant digit on
the right?
Richard.
From ishida at w3.org Thu Jul 14 00:49:06 2022
From: ishida at w3.org (r12a)
Date: Thu, 14 Jul 2022 06:49:06 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <20220713205108.11c29995@JRWUBU2>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
<84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
<20220713205108.11c29995@JRWUBU2>
Message-ID: <87bfec58-acd0-3247-28a7-0ebda3573577@w3.org>
Richard Wordingham via Unicode wrote on 13/07/2022 20:51:
> What are these historical scripts with the most significant digit on
> the right?
Go to
https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=[:Bidi_Class=Right_To_Left:]
and find sections that contain "Numbers".
The list includes Imperial Aramaic, Palmyrene, Nabataean, etc...
ri
From richard.wordingham at ntlworld.com Thu Jul 14 15:59:08 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Thu, 14 Jul 2022 21:59:08 +0100
Subject: Different Bidirectional Character Types
In-Reply-To: <87bfec58-acd0-3247-28a7-0ebda3573577@w3.org>
References: <118F5820-ABBF-47E1-9A90-C313459AD7A0@fn.de>
<20220702105446.033065ab@JRWUBU2>
<8335fibxda.fsf@gnu.org>
<7B6DAF42-3ABF-4C45-BD11-29EBD6B512BB@fn.de>
<551dd7e2-0a05-d640-28ff-9c621351125a@ix.netcom.com>
<20220712023947.157285c9@JRWUBU2>
<5b777fca-a9b1-a06e-18e8-af7172635621@w3.org>
<79c1ed70-ac31-33c2-e8d3-a7212645e44a@ix.netcom.com>
<84c1ae3b-c142-dfcf-6a61-4321d32e8b2c@w3.org>
<20220713205108.11c29995@JRWUBU2>
<87bfec58-acd0-3247-28a7-0ebda3573577@w3.org>
Message-ID: <20220714215908.5e920b8e@JRWUBU2>
On Thu, 14 Jul 2022 06:49:06 +0100
r12a via Unicode wrote:
> Richard Wordingham via Unicode wrote on 13/07/2022 20:51:
> > What are these historical scripts with the most significant digit on
> > the right?
>
> Go to
> https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=[:Bidi_Class=Right_To_Left:]
> and find sections that contain "Numbers".
>
> The list includes Imperial Aramaic, Palmyrene, Nabataean, etc...
But I see no ancient digits, and certainly nothing that TUS calls a
digit. The Mende Kikakui 'digits' would qualify, except that they're a
20th century system.
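The distinction drawn here, between decimal digits (general category Nd) and other numeric characters (Nl, No), can be checked mechanically. The following is a minimal Python sketch, assuming the standard `unicodedata` module; the function name is hypothetical and the counts depend on the Unicode version bundled with the Python build.

```python
import sys
import unicodedata
from collections import Counter

def rtl_numeric_categories():
    """Count characters with Bidi_Class R by numeric general category (Nd, Nl, No)."""
    counts = Counter()
    for code_point in range(sys.maxunicode + 1):
        ch = chr(code_point)
        if unicodedata.bidirectional(ch) == "R":
            category = unicodedata.category(ch)
            if category.startswith("N"):
                counts[category] += 1
    return counts

counts = rtl_numeric_categories()
for category in sorted(counts):
    print(category, counts[category])

# MENDE KIKAKUI DIGIT ONE and IMPERIAL ARAMAIC NUMBER ONE, for comparison.
for sample in ("\U0001E8C7", "\U00010858"):
    print(unicodedata.name(sample), unicodedata.category(sample))
```

In the character database the historic number characters, like the Mende Kikakui ones, fall under No rather than Nd. The right-to-left Nd characters include the N'Ko and Adlam digits.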
Richard.
From markus.icu at gmail.com Mon Jul 18 14:18:31 2022
From: markus.icu at gmail.com (Markus Scherer)
Date: Mon, 18 Jul 2022 12:18:31 -0700
Subject: Unqualified vs. minimally-qualified emoji
In-Reply-To: <08b3fded-7ae3-81a8-c223-2a878d53d929@gmx.de>
References: <08b3fded-7ae3-81a8-c223-2a878d53d929@gmx.de>
Message-ID:
Dear Matthias,
On Wed, Apr 6, 2022 at 10:02 PM Matthias Reitinger via Unicode <
unicode at corp.unicode.org> wrote:
> ...
>
> With these definitions I would expect the code point sequence
>
> 1F441 FE0F 200D 1F5E8
> (EYE, VARIATION SELECTOR-16, ZERO WIDTH JOINER, LEFT SPEECH BUBBLE)
>
> to be a minimally-qualified emoji:
>
> ...
>
> However, emoji-test.txt [2] lists this sequence as "unqualified".
>
> Can someone please explain why? Did I misinterpret the definitions, or is
> this
> an error in the emoji-test.txt file?
>
Did you get an answer to your question?
If not, then you could try to submit a bug report:
https://www.unicode.org/reporting.html "Report Error in Publication/Data"
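For anyone wanting to reproduce the sequence in question before filing a report, a minimal Python sketch (assuming only the standard `unicodedata` module; the helper name `describe` is hypothetical) can turn the hex notation into text and list the character names. It makes no claim about how emoji-test.txt classifies the sequence.

```python
import unicodedata

# The sequence quoted in the message, as space-separated hex code points.
SEQUENCE = "1F441 FE0F 200D 1F5E8"

def describe(hex_sequence):
    """Return (text, [(code point label, character name), ...]) for a hex sequence."""
    chars = [chr(int(h, 16)) for h in hex_sequence.split()]
    rows = [(f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in chars]
    return "".join(chars), rows

text, rows = describe(SEQUENCE)
for label, name in rows:
    print(label, name)
```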
Best regards,
markus
From gtbot2007 at gmail.com Sat Jul 23 07:57:07 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Sat, 23 Jul 2022 08:57:07 -0400
Subject: Hoefler Text Ornaments
Message-ID:
I don't understand why Wingdings/Webdings and Zapf Dingbats get to be in
Unicode but not Hoefler Text Ornaments. (Not going to ask about Apple
Symbols because that's an icon font not a dingbat font)
From richard.wordingham at ntlworld.com Sat Jul 23 11:12:44 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sat, 23 Jul 2022 17:12:44 +0100
Subject: Tai Tham Text Encoding
Message-ID: <20220723171244.7fb392af@JRWUBU2>
Most characters for writing words in the Tai Tham script in normal
texts have been encoded, though there are a few exceptions, of which
TAI THAM LETTER LAO LOW HA is the most prominent. (This is
mostly handled by repurposing TAI THAM LETTER LOW HA, which is not used
in Lao. Their relationship is like U+11034 BRAHMI LETTER LLA and
U+11075 BRAHMI LETTER OLD TAMIL LLA.) On close reading of the TUS,
perhaps we also need to disunify U+1A58 TAI THAM SIGN MAI KANG LAI
depending on how it may be positioned relative to a following syllable
with a preposed vowel. (It was originally proposed as two separate
characters, distinguished by shape rather than positioning.) We may
need some monstrosities such as 'INVISIBLE MAI SAM' (though I'd rather
use CGJ).
However, I am having a hard time persuading people that there is a
defined encoding for combinations of characters that rendering engines
should respect. What I regard as the basic definition of the encoding
of text is contained in the approved proposals, rather than in TUS or
any emanation thereof.
What should I call the specification of the encoding of text, as
opposed to the encoding of characters? Would it be suitable to refer
to it as 'text encoding'?
I am trying to work out what in the way of Tai Tham text encoding is
laid down by the TUS and its emanations, such as the Unicode Character
Database. It is significant that the Indic syllabic category is
informative and by policy does not reflect sequencing requirements.
What I am left with is the general properties of marks, the principle
of canonical equivalence (which is still widely flouted) and the
specific text in the Tai Tham section.
Now, extracting specifications is a bit tricky. For example, consider
"*Tone Marks*. Tai Tham has two combining tone marks, U+1A75 tai tham
sign tone-1 and U+1A76 tai tham sign tone-2, which are used in Tai Lue
and in Northern Thai. These are rendered above the vowel over the base
consonant." In modern Tai Khuen, what I take to be TONE-1 is rendered
to the right of the larger vowels over the base consonant, such as
VOWEL SIGN I. Should I therefore conclude that what I have taken to be
TONE-1 is something else? That would be ridiculous. We also have the
statement in TUS Section 2.11 that "all sequences of character codes
are permitted".
I think I can extract some meaning from the text in the same section:
"Tone marks are represented in logical order following the vowel over
the base consonant or consonant stack. If there
is no vowel over a base consonant, then the tone is rendered directly
over the consonant; this is the same way tones are treated in the Thai
script."
Consider the word ?????? in a typical Northern Thai style. The central stack, from top
to bottom, is TONE-1, SIGN I, HIGH KA, SIGN OA BELOW. If there were 'no
vowel over the base consonant', then TONE-1 would be rendered directly
over the base consonant, which is not how it is written. Therefore the
term 'vowel' refers to a vowel character rather than a complete
phonetic vowel. Therefore the logical order of the marks above and
below is either , as in the
proposals, or . The USE insists on ! (The USE order could be corrected by its override
method.)
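Whether the choice between such orderings matters for canonical equivalence can be tested. The following is a minimal Python sketch, assuming the standard `unicodedata` module. The three orderings are hypothetical, made up for illustration; they are not a reconstruction of the sequences that were lost from the paragraph above.

```python
import unicodedata

# Characters named in the message, looked up by their Unicode names.
HIGH_KA = unicodedata.lookup("TAI THAM LETTER HIGH KA")
SIGN_I = unicodedata.lookup("TAI THAM VOWEL SIGN I")
OA_BELOW = unicodedata.lookup("TAI THAM VOWEL SIGN OA BELOW")
TONE_1 = unicodedata.lookup("TAI THAM SIGN TONE-1")

def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Hypothetical orderings of the three marks, for illustration only.
orderings = {
    "OA BELOW, I, TONE-1": HIGH_KA + OA_BELOW + SIGN_I + TONE_1,
    "I, TONE-1, OA BELOW": HIGH_KA + SIGN_I + TONE_1 + OA_BELOW,
    "I, OA BELOW, TONE-1": HIGH_KA + SIGN_I + OA_BELOW + TONE_1,
}

# Canonical combining classes decide whether normalization may reorder marks.
for mark in (SIGN_I, OA_BELOW, TONE_1):
    print(unicodedata.name(mark), "ccc =", unicodedata.combining(mark))

labels = list(orderings)
for i, first in enumerate(labels):
    for second in labels[i + 1:]:
        same = canonically_equivalent(orderings[first], orderings[second])
        print(f"<{first}> vs <{second}>: equivalent = {same}")
```

Marks with combining class 0 are never reordered by normalization. Where the vowel signs report class 0, these orderings therefore stay distinct strings, and a rendering engine or a written encoding convention has to pick one.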
By contrast, there is some useful text on the position of U+1A7B TAI
THAM SIGN MAI SAM in character code sequences.
In summary, my main two questions are:
Is 'encoding of text' the correct phrase for the definition of the
correct arrangement? Is it appropriate to submit a proposal for the
standardisation of Tai Tham text encoding?
Richard.
From beckiergb at gmail.com Sat Jul 23 14:04:20 2022
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Sat, 23 Jul 2022 12:04:20 -0700
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
Message-ID:
Because Apple has more sense than Microsoft and decided their dingbat fonts
don't need to be in Unicode.
Someone back in 2011 collected all the glyphs from Apple's dingbat fonts:
http://unicode.org/wg2/docs/n4127.pdf
And Apple provided a response:
http://unicode.org/L2/L2011/11309-apple-resp-n4127.pdf
-- Rebecca Bettencourt
On Sat, Jul 23, 2022 at 6:06 AM Gabriel Tellez via Unicode <
unicode at corp.unicode.org> wrote:
> I don't understand why Wingdings/Webdings and Zapf Dingbats get to be in
> Unicode but not Hoefler Text Ornaments. (Not going to ask about Apple
> Symbols because that's an icon font not a dingbat font)
>
From jameskass at code2001.com Sat Jul 23 17:07:17 2022
From: jameskass at code2001.com (James Kass)
Date: Sat, 23 Jul 2022 22:07:17 +0000
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
Message-ID:
In 11309-apple-resp-n4127, John H. Jenkins wrote,
"Apple feels that, absent evidence of widespread use, dingbats and
similar glyphs are not suitable for general-purpose encoding."
and
"Apple feels that, in general, characters should be encoded in the Universal
Character Set only on the basis of demonstrated need for general text
interchange."
In N4127, Karl Pentzlin noted that no effort was made to determine
unification with existing characters, even in cases where unification
was obvious. For example, Hoefler Glyph 57 "ORN-FLEURDELIS" is shown in
N4127 with a pointer to U+269C (⚜). So some of the Hoefler ornaments
are already exchangeable in Unicode.
Apple didn't forbid future encoding of Hoefler ornaments, but rather
keeps the existing bar of demonstrable usage in place.
Any proposal to complete the Hoefler repertoire in Unicode would need to
carefully examine unification and then show that plain-text interchange
is necessary.
On 2022-07-23 7:04 PM, Rebecca Bettencourt via Unicode wrote:
> Because Apple has more sense than Microsoft and decided their dingbat fonts
> don't need to be in Unicode.
>
> Someone back in 2011 collected all the glyphs from Apple's dingbat fonts:
> http://unicode.org/wg2/docs/n4127.pdf
>
> And Apple provided a response:
> http://unicode.org/L2/L2011/11309-apple-resp-n4127.pdf
>
> -- Rebecca Bettencourt
>
>
> On Sat, Jul 23, 2022 at 6:06 AM Gabriel Tellez via Unicode <
> unicode at corp.unicode.org> wrote:
>
>> I don't understand why Wingdings/Webdings and Zapf Dingbats get to be in
>> Unicode but not Hoefler Text Ornaments. (Not going to ask about Apple
>> Symbols because that's an icon font not a dingbat font)
>>
From ivanpan3 at gmail.com Sat Jul 23 18:07:32 2022
From: ivanpan3 at gmail.com (Ivan Panchenko)
Date: Sun, 24 Jul 2022 01:07:32 +0200
Subject: Change “Relation” to “Logical operator”: U+2263 (STRICTLY EQUIVALENT TO)
Message-ID:
The character U+2263 (≣ STRICTLY EQUIVALENT TO) is found under the subhead
“Relations”. I think it would be more appropriate to put it under “Logical
operator” (for comparison: U+2227) because it stands for a connective in
modal logic: ? is strictly equivalent to ? if ? necessarily implies ?
and ? necessarily implies ?. Source: Fitch (1952, p. 77).
https://books.google.com/books?id=a3wIAQAAIAAJ&q=%22strictly+equivalent%22
One might object that “is strictly equivalent to” (as opposed to
“necessarily if and only if”) is used in metalanguage for a relation
between logical formulas (? use-mention distinction). However, this is not
what the symbol “≣” itself actually means, it is just that an alternative
to saying “if and only if” is to say “is equivalent to” and mention (rather
than use) the linked logical formulas. Likewise, one might read “? ? ?”
either as “if ? then ?” or as “? (materially) implies ?”. This does not
change the fact that “≣” and “?” are symbols of the logical OBJECT language.
(As a side note, usage of the triple bar “≡” and of “identity” in
mathematics is convoluted: In ordinary language, two distinct things might
be said to be “equal” when they are equal in a certain respect (e.g.,
“sexual equality”). In mathematics, “equals” (=) is simply used in the
sense of strict identity rather than for equivalence relations or
congruence relations in general, though convention has it that the equals
sign is more often read as “equals” or “is equal to” than “is identical
to”, and “(solving an) equation” is used while “identity” occurs in
“identity function” and what is expressed by a statement of equality can be
called an identity (e.g., “Euler’s identity”). As described so far, there
is no actual difference between “is equal to” and “is identical to” at all,
however, it seems that because we are only justified in proclaiming that an
identity holds if the statement is generally valid, this usage of
“identity” got CORRUPTED into saying things like “This equation is an
identity” (meaning that the equation holds for all values) and “is
identically equal to” (≡); you can even find a few Google hits for
“identically less”/“identically greater”. Besides, “≡” is used for
equivalence relations and for the logical equivalence connective.
When John Conway (in “On Numbers and Games”) used “≡” for identity
(expressing that two objects are one and the same object) and “=” for
equality in a weaker sense than described above for mathematics, he might
have been influenced by the fact that “≡” is sometimes read as “is
identical(ly equal) to”, even though this so-called “identity” is something
different from Conway’s identity altogether. Donald Knuth used the symbols
the other way round, which I like better.)
From harjitmoe at outlook.com Sun Jul 24 04:04:27 2022
From: harjitmoe at outlook.com (Harriet Riddle)
Date: Sun, 24 Jul 2022 10:04:27 +0100
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
Message-ID:
For further reference: the Nishiki-teki PUA scheme includes several
ornaments from the Hoefler, Bodoni and Caslon sets
(). These are not all
created equal: among the Caslon ornaments encoded there, for example,
one can see the English Rose, Scottish Thistle and Irish Harp emblems
(PUA+FEF95, PUA+FEF96 and PUA+FEF97), which are emblematic
characters with clear identity and traditional meanings of their own
(not incomparable with the aforementioned French fleur-de-lis), but one
can also see a large number of nondescript and largely fungible
arabesques for which distinct semantic usages are highly improbable.
--Har.
James Kass via Unicode wrote:
>
> In 11309-apple-resp-n4127, John H. Jenkins wrote,
> "Apple feels that, absent evidence of widespread use, dingbats and
> similar glyphs are not suitable for general-purpose encoding."
>
> and
>
> "Apple feels that, in general, characters should be encoded in the
> Universal
> Character Set only on the basis of demonstrated need for general text
> interchange."
>
> In N4127, Karl Pentzlin noted that no effort was made to determine
> unification with existing characters, even in cases where unification
> was obvious. For example, Hoefler Glyph 57 "ORN-FLEURDELIS" is shown
> in N4127 with a pointer to U+269C (⚜). So some of the Hoefler
> ornaments are already exchangeable in Unicode.
>
> Apple didn't forbid future encoding of Hoefler ornaments, but rather
> keeps the existing bar of demonstrable usage in place.
>
> Any proposal to complete the Hoefler repertoire in Unicode would need
> to carefully examine unification and then show that plain-text
> interchange is necessary.
>
>
> On 2022-07-23 7:04 PM, Rebecca Bettencourt via Unicode wrote:
>> Because Apple has more sense than Microsoft and decided their dingbat
>> fonts
>> don't need to be in Unicode.
>>
>> Someone back in 2011 collected all the glyphs from Apple's dingbat
>> fonts:
>> http://unicode.org/wg2/docs/n4127.pdf
>>
>> And Apple provided a response:
>> http://unicode.org/L2/L2011/11309-apple-resp-n4127.pdf
>>
>> -- Rebecca Bettencourt
>>
>>
>> On Sat, Jul 23, 2022 at 6:06 AM Gabriel Tellez via Unicode <
>> unicode at corp.unicode.org> wrote:
>>
>>> I don't understand why Wingdings/Webdings and Zapf Dingbats get to
>>> be in
>>> Unicode but not Hoefler Text Ornaments. (Not going to ask about Apple
>>> Symbols because that's an icon font not a dingbat font)
>>>
>
From karl-pentzlin at acssoft.de Sun Jul 24 15:49:48 2022
From: karl-pentzlin at acssoft.de (Karl Pentzlin)
Date: Sun, 24 Jul 2022 22:49:48 +0200
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
Message-ID: <943300971.20220724224948@acssoft.de>
Am Sonntag, 24. Juli 2022 um 00:07 schrieb James Kass via Unicode:
JKvU> In N4127, Karl Pentzlin noted that no effort was made to determine unification with existing characters, even in cases where unification was obvious.
The title of N4127 (L2/11-276) from 2011-07-15 was "Apple Symbol Fonts: A Quick Survey", simply listing the (then) current use of the PUA by Apple. It was definitely not a proposal (if only because it listed PUA code points), and the document explicitly stated its scope:
“The characters found are listed here without any further interpretation ? Especially, no names ? or properties are given, and it is not examined whether they can unified with existing Unicode characters, even for cases where this is obvious.”
This document was intended as a starting point for discussions of which of these symbols deserve an encoding or unification in Unicode (after the Wingdings/Webdings discussion, which resulted in encodings or unifications for almost all of them), but as there was apparently no interest in such discussions, no subsequent documents besides the Apple comment L2/11-309 (and in particular no proposals) followed.
- Karl Pentzlin
From markus.icu at gmail.com Sun Jul 24 18:42:44 2022
From: markus.icu at gmail.com (Markus Scherer)
Date: Sun, 24 Jul 2022 16:42:44 -0700
Subject: Tai Tham Text Encoding
In-Reply-To: <20220723171244.7fb392af@JRWUBU2>
References: <20220723171244.7fb392af@JRWUBU2>
Message-ID:
On Sat, Jul 23, 2022 at 9:16 AM Richard Wordingham via Unicode <
unicode at corp.unicode.org> wrote:
> In summary, my main two questions are:
>
> Is 'encoding of text' the correct phrase for the definition of the
> correct arrangement?
It sounds reasonable, but will be easily confused with what are otherwise
called "charsets" and "code pages" etc.
It seems like we have a term for what you are after, but I can't put my
finger on it right now :-)
Is it appropriate to submit a proposal for the
> standardisation of Tai Tham text encoding?
>
I think so. Proposals are best if they are specific, that is, which text is
to be added or changed where, and to what. Changes to the core spec (the
"book")? A new Unicode Technical Note?
Best regards,
markus
From richard.wordingham at ntlworld.com Sun Jul 24 19:21:13 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Mon, 25 Jul 2022 01:21:13 +0100
Subject: Tai Tham Text Encoding
In-Reply-To:
References: <20220723171244.7fb392af@JRWUBU2>
Message-ID: <20220725012113.04378cf6@JRWUBU2>
On Sun, 24 Jul 2022 16:42:44 -0700
Markus Scherer via Unicode wrote:
> On Sat, Jul 23, 2022 at 9:16 AM Richard Wordingham via Unicode <
> unicode at corp.unicode.org> wrote:
>
> > In summary, my main two questions are:
> >
> > Is 'encoding of text' the correct phrase for the definition of the
> > correct arrangement?
>
>
> It sounds reasonable, but will be easily confused with what are
> otherwise called "charsets" and "code pages" etc.
> It seems like we have a term for what you are after, but I can't put
> my finger on it right now :-)
>
> Is it appropriate to submit a proposal for the
> > standardisation of Tai Tham text encoding?
Perhaps "standardisation of Tai Tham string encoding"? I'm not
entirely sure, because 'string' implies that one already has a linear
arrangement, but I am talking of how to select that string (more
precisely a trace, because of canonical equivalence), and the question
is the amount of zigzagging. "String selection" might be technically
correct, but could be taken as meaning 'choice of words'. Perhaps
"standardisation of Tai Tham character sequencing", but that suggests
visual orthographic rules.
Richard.
From gtbot2007 at gmail.com Mon Jul 25 06:30:08 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Mon, 25 Jul 2022 07:30:08 -0400
Subject: Hoefler Text Ornaments
In-Reply-To: <943300971.20220724224948@acssoft.de>
References:
<943300971.20220724224948@acssoft.de>
Message-ID:
Turns out there is also Bodoni Ornaments (a font that I somehow missed)
and Type Embellishments One (a font that isn't on my computer but sounds
like it should be by default?).
On Sun, Jul 24, 2022 at 4:52 PM Karl Pentzlin via Unicode <
unicode at corp.unicode.org> wrote:
> Am Sonntag, 24. Juli 2022 um 00:07 schrieb James Kass via Unicode:
>
> JKvU> In N4127, Karl Pentzlin noted that no effort was made to determine
> unification with existing characters, even in cases where unification was
> obvious.
>
> The title of N4127 (L2/11-276) from 2011-07-15 was "Apple Symbol Fonts: A
> Quick Survey", simply listing the (then) current use on the PUA by Apple.
> It was definitively not a proposal (alone by the fact that it listed PUA
> code points), and it was explicitly stated as subject of that document:
> ?The characters found are listed here without any further interpretation ?
> Especially, no names ? or properties are given, and it is not examined
> whether they can unified with existing Unicode characters, even for cases
> where this is obvious.?
>
> This document was intended as a starting point for discussions which of
> these symbols deserve an encoding or unification in Unicode (after the
> Wingdings/Webdings discussion which resulted in encodings or unifications
> for almost all of them), but as apparently there was no interest in such
> discussions, no subsequent documents besides the Apple comment L2/11-309
> (especially no proposals) had followed.
>
> - Karl Pentzlin
>
>
From marius.spix at web.de Mon Jul 25 18:25:48 2022
From: marius.spix at web.de (Marius Spix)
Date: Tue, 26 Jul 2022 01:25:48 +0200
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
<943300971.20220724224948@acssoft.de>
Message-ID: <20220726012548.51a1ebb5@spixxi>
There is also the font "MS Outlook". OUTLOOK.ttf was part of Outlook
97 and had been in circulation for a long time. Maybe it could be
considered as well.
I tried to map the glyphs.
U+F041 = U+1F56D RINGING BELL
U+F042 = U+1F511 KEY
U+F043 = U+1F5D8 CLOCKWISE RIGHT AND LEFT SEMICIRCLE ARROWS
U+F044 = new_codepoint CLOCKWISE RIGHT AND LEFT SEMICIRCLE ARROWS WITH
SOLIDUS
U+F045 = new_codepoint PEOPLE FACING RIGHT
U+F046 = new_codepoint MEETING ROOM (table with three silhouettes)
U+F047 = U+1F4CE PAPERCLIP
U+F049 = U+1F382 BIRTHDAY CAKE
U+F04A = new_codepoint WAX SEAL (???)
U+F04D = new_codepoint ?????? (glyph has two variants: octagon with two
arrows pointing in the middle or two crossed pencils)
U+F04E ≈ U+1F4EC OPEN MAILBOX WITH RAISED FLAG (???)
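For anyone who wants to try the mapping above on real text, here is a minimal sketch in Python. It only includes the entries from the list that already have a Unicode code point; the names OUTLOOK_PUA_TO_UNICODE and transcode_outlook_pua are made up for illustration, and the mapping itself is tentative.

```python
# Tentative PUA -> Unicode mapping for the "MS Outlook" font, copied from
# the list above. Entries marked "new_codepoint" or "(???)" are left out,
# so those characters pass through unchanged.
OUTLOOK_PUA_TO_UNICODE = {
    0xF041: 0x1F56D,  # RINGING BELL
    0xF042: 0x1F511,  # KEY
    0xF043: 0x1F5D8,  # CLOCKWISE RIGHT AND LEFT SEMICIRCLE ARROWS
    0xF047: 0x1F4CE,  # PAPERCLIP
    0xF049: 0x1F382,  # BIRTHDAY CAKE
}


def transcode_outlook_pua(text: str) -> str:
    """Replace mapped private-use characters with their Unicode equivalents."""
    # str.translate accepts a dict of code point -> code point.
    return text.translate(OUTLOOK_PUA_TO_UNICODE)
```

Characters without an agreed equivalent (U+F044, U+F045, and so on) stay in the PUA, so nothing is lost by running this more than once.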
--
Marius Spix
From gtbot2007 at gmail.com Mon Jul 25 18:51:43 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Mon, 25 Jul 2022 19:51:43 -0400
Subject: Hoefler Text Ornaments
In-Reply-To: <20220726012548.51a1ebb5@spixxi>
References:
<943300971.20220724224948@acssoft.de>
<20220726012548.51a1ebb5@spixxi>
Message-ID:
OUTLOOK.ttf is questionable as it's an icon font and not a dingbat one
(though you can say the same of Webdings), but since it's such a small
font I think it could pass.
From jameskass at code2001.com Mon Jul 25 19:24:57 2022
From: jameskass at code2001.com (James Kass)
Date: Tue, 26 Jul 2022 00:24:57 +0000
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
<943300971.20220724224948@acssoft.de>
<20220726012548.51a1ebb5@spixxi>
Message-ID: <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
As a visual aid, the MS Outlook glyphs are provided in the attached
graphic file. Some of the glyphs noted by Marius Spix appear to have
been removed from the font by the time XP arrived; the graphic shows the
font version included with Windows XP.
Having established that certain glyphs exist, the next question is
whether people are exchanging them in plain text. If not, then could it
be demonstrated that users would benefit from the ability to do so? If
not, then there is no path towards their encoding in the Standard.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OutlookGlyphs.PNG
Type: image/png
Size: 4085 bytes
Desc: not available
URL:
From beckiergb at gmail.com Mon Jul 25 22:08:23 2022
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Mon, 25 Jul 2022 20:08:23 -0700
Subject: Hoefler Text Ornaments
In-Reply-To: <53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
References:
<943300971.20220724224948@acssoft.de>
<20220726012548.51a1ebb5@spixxi>
<53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:
Despite my first response to this thread taking a dig at Microsoft, my
actual understanding is they didn't get Wingdings and Webdings into Unicode
for no reason; they were able to demonstrate that there are a considerable
number of web pages, emails, and documents using those fonts. They simply
enjoy a level of popularity that none of the other fonts mentioned in this
thread do. Very few people are using Hoefler Text Ornaments, Type
Embellishments One, etc. in their documents, and the ones who are seem to
get by just fine using private use code points. Compare the many people
confused by the stray J appearing in old emails stripped of their
formatting (in which the specification of Wingdings for that character
would display it as a smiley face).
If you feel there is enough of a case for Hoefler Text Ornaments, you can
certainly create a proposal. But you'll have to at the very least provide
some statistics as to how many people actually use them. Also consider that
whatever statistics Apple may have had, it certainly wasn't enough to
convince them they needed encoding.
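One rough way to start gathering such statistics is to count private-use code points in a body of text. The sketch below is only an assumption about how one might do it; the function name count_private_use is hypothetical, and a count of PUA characters says nothing by itself about which font they were meant for.

```python
import unicodedata
from collections import Counter


def count_private_use(text: str) -> Counter:
    """Count characters whose General_Category is Co (private use)."""
    return Counter(
        ord(ch) for ch in text if unicodedata.category(ch) == "Co"
    )


# Made-up sample text containing three private-use characters.
sample = "Agenda \uf041 and attachments \uf047\uf047"
for code_point, count in sorted(count_private_use(sample).items()):
    print(f"U+{code_point:04X}: {count}")
```

Any real survey would also need the font information from the surrounding markup, since the same PUA code point means different things in different fonts.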
From gtbot2007 at gmail.com Tue Jul 26 09:03:33 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Tue, 26 Jul 2022 10:03:33 -0400
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
<943300971.20220724224948@acssoft.de>
<20220726012548.51a1ebb5@spixxi>
<53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:
Do normal people (who don't know what a Unicode is) even use
Webdings/Wingdings with the Unicode code points? Because if they don't then
it's no different than people using the PUA for these fonts.
From beckiergb at gmail.com Tue Jul 26 09:30:43 2022
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Tue, 26 Jul 2022 07:30:43 -0700
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
<943300971.20220724224948@acssoft.de>
<20220726012548.51a1ebb5@spixxi>
<53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:
On Tue, Jul 26, 2022 at 7:03 AM Gabriel Tellez wrote:
> Do normal people (who don't know what a Unicode is) even use
> Webdings/Wingdings with the Unicode code points? Because if they don't then
> it's no different than people using the PUA for these fonts.
>
Sure. Usually from the Insert Symbol function in Microsoft Word.
From sdowney at gmail.com Tue Jul 26 10:49:49 2022
From: sdowney at gmail.com (Steve Downey)
Date: Tue, 26 Jul 2022 11:49:49 -0400
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
<943300971.20220724224948@acssoft.de>
<20220726012548.51a1ebb5@spixxi>
<53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID:
Yes, because helpful programmers, like me, transcode their marked-up
encoding into Unicode. In any case, the cat is out of the bag and the
horses have left the barn, and Wingdings and Webdings really were
incredibly popular before Unicode standardization, for largely the same
reasons that emoji are today. For another decorative set to be encoded, I
think there would need to be evidence of a body of text using those symbols
for which there is a desire to re-encode today, such that without encoding
the symbols' meaning would be lost. It's a deliberately high bar.
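To make the transcoding concrete, here is a small sketch, assuming the marked-up text has already been split into (font name, text) runs. The run format and the name flatten_runs are hypothetical, and the table covers only two Wingdings letters as an illustration.

```python
from typing import Iterable, Tuple

# Illustrative subset of a Wingdings -> Unicode table; a real converter
# would need the full table for the font.
WINGDINGS_TO_UNICODE = {
    ord("J"): 0x263A,  # WHITE SMILING FACE
    ord("L"): 0x2639,  # WHITE FROWNING FACE
}


def flatten_runs(runs: Iterable[Tuple[str, str]]) -> str:
    """Turn (font, text) runs into plain text, transcoding Wingdings runs."""
    parts = []
    for font, text in runs:
        if font.lower() == "wingdings":
            # Only Wingdings runs are remapped; other fonts pass through.
            text = text.translate(WINGDINGS_TO_UNICODE)
        parts.append(text)
    return "".join(parts)
```

With this, a run list such as [("Arial", "Thanks "), ("Wingdings", "J")] comes out as "Thanks " followed by U+263A rather than a stray J.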
From jameskass at code2001.com Tue Jul 26 17:38:07 2022
From: jameskass at code2001.com (James Kass)
Date: Tue, 26 Jul 2022 22:38:07 +0000
Subject: Hoefler Text Ornaments
In-Reply-To:
References:
<943300971.20220724224948@acssoft.de>
<20220726012548.51a1ebb5@spixxi>
<53b6cf0d-259c-3a82-c5e9-939d16788f71@code2001.com>
Message-ID: <87be2316-9562-a6fc-489f-b1f0f4f1aebe@code2001.com>
On 2022-07-26 3:49 PM, Steve Downey via Unicode wrote:
> For another decorative set to be encoded, I
> think there would need to be evidence of a body of text using those symbols
> for which there is a desire to re-encode today, such that without encoding
> the symbols meaning would be lost. It's a deliberately high bar.
There were several steps along the way to getting Webdings/Wingdings
encoded in The Standard. Here's a link to an updated proposal from 2011:
https://www.unicode.org/L2/L2011/11344-wingdings.pdf
The introductory text delves into the rationale for encoding and might
be of interest to anyone contemplating submitting proposals for similar
additions.
From michel at suignard.com Tue Jul 26 22:25:46 2022
From: michel at suignard.com (Michel Suignard)
Date: Wed, 27 Jul 2022 03:25:46 +0000
Subject: Hoefler Text Ornaments
In-Reply-To: <87be2316-9562-a6fc-489f-b1f0f4f1aebe@code2001.com>
References: