From hsivonen at mozilla.com  Fri Feb  2 07:41:10 2024
From: hsivonen at mozilla.com (Henri Sivonen)
Date: Fri, 2 Feb 2024 15:41:10 +0200
Subject: Use case documentation for UTS 46 parameters
Message-ID: <CAJHk+8SdeeVozf4c+nA6qfe=P3=ND4WP3dad5m7VFH9FJKRzjw@mail.gmail.com>

Hi,

The Processing steps in UTS 46 take various boolean flags. Are the use
cases for each one documented somewhere? That is, when and why would one
want to set each flag to true or false?

-- 
Henri Sivonen
hsivonen at mozilla.com

From wjgo_10009 at btinternet.com  Fri Feb  2 16:33:51 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 2 Feb 2024 22:33:51 +0000 (GMT)
Subject: Private Use Area characters and the eudcedit program
Message-ID: <78ea0e0b.225a.18d6bf4f134.Webtop.92@btinternet.com>

Regarding the issue raised in the thread

https://forum.affinity.serif.com/index.php?/topic/197938-private-characters-created-with-microsoft-eudceditexe

can anyone explain what is happening please?

William Overington

Friday 2 February 2024


From sosipiuk at gmail.com  Fri Feb  2 16:56:24 2024
From: sosipiuk at gmail.com (=?UTF-8?Q?S=C5=82awomir_Osipiuk?=)
Date: Fri, 02 Feb 2024 22:56:24 +0000
Subject: Private Use Area characters and the eudcedit program
In-Reply-To: <78ea0e0b.225a.18d6bf4f134.Webtop.92@btinternet.com>
References: <78ea0e0b.225a.18d6bf4f134.Webtop.92@btinternet.com>
Message-ID: <1706914274948.1201958261.3436189612@gmail.com>


Taking an educated guess: If the software in question uses Windows' font 
rendering API, it will render the private characters as intended. If the 
software uses its own rendering functions or libraries, it will not render 
the custom private characters because it has no awareness of them.

On Friday, 02 February 2024, 17:33:51 (-05:00), William_J_G Overington via 
Unicode wrote:

 > Regarding the issue raised in the thread
 >
 > https://forum.affinity.serif.com/index.php?/topic/197938-private-characters-created-with-microsoft-eudceditexe
 >
 > can anyone explain what is happening please?
 >
 > William Overington
 >
 > Friday 2 February 2024


From list+unicode at jdlh.com  Fri Feb  2 18:04:40 2024
From: list+unicode at jdlh.com (Jim DeLaHunt)
Date: Fri, 2 Feb 2024 16:04:40 -0800
Subject: Private Use Area characters and the eudcedit program
In-Reply-To: <78ea0e0b.225a.18d6bf4f134.Webtop.92@btinternet.com>
References: <78ea0e0b.225a.18d6bf4f134.Webtop.92@btinternet.com>
Message-ID: <af7af89d-9f1d-4f75-8932-3b96e5d61b91@jdlh.com>

On 2024-02-02 14:33, William_J_G Overington via Unicode wrote:

> Regarding the issue raised in the thread
>
> https://forum.affinity.serif.com/index.php?/topic/197938-private-characters-created-with-microsoft-eudceditexe 
>
The issue appears to be (copying text from that thread to this):
> I have created two "private" characters using the Windows built-in 
> eudcedit utility. The first one I have saved to a specific font and 
> the second one I have saved to all fonts.
>
> I can locate and copy both characters in Character Map and paste them 
> successfully into Notepad and into my CAD programs, but not Affinity 
> Publisher (v.1)? Is there a special procedure in Publisher that will 
> overcome this, or is the programe not yet equipped to deal with 
> private characters?
>

On 2024-02-02 14:33, William_J_G Overington via Unicode wrote:

> can anyone explain what is happening please?

I can perhaps shed some light, if not explain definitively.

Anyone using EUDCedit would be well advised to learn what Windows has to 
say about what EUDC is and how it works in Windows. A web search finds:

*End-User-Defined and Private Use Area Characters* (2021) 
<https://learn.microsoft.com/en-us/windows/win32/intl/end-user-defined-characters>

> End-user-defined characters (EUDC) in double-byte character sets 
> <https://learn.microsoft.com/en-us/windows/win32/intl/double-byte-character-sets> 
> (DBCSs) and private use area (PUA) characters in Unicode 
> <https://learn.microsoft.com/en-us/windows/win32/intl/unicode> are 
> custom characters. They can be defined and implemented either by an 
> end user or by another party.... Their use enables users to form names 
> and other words using characters that are not available in standard 
> screen and printer fonts.
>
> The EUDC and PUA characters can be assigned differently, or not 
> assigned at all, on different computers. Some code pages have 
> extensions that reuse the EUDC range, ... a manufacturer might provide a 
> custom set of characters in one of these ranges, ... user groups can 
> attempt to provide additional characters in the PUA. Different 
> combinations of these cases can cause conflict. When creating 
> applications that rely on EUDC or PUA characters, you should keep in 
> mind the conflicting interpretations of an individual code point....
>

*Character Sets and Fonts* (2021) 
<https://learn.microsoft.com/en-us/windows/win32/intl/character-sets-and-fonts>

> To create an EUDC or PUA character, the user chooses a character value 
> that is within the specified range and adds the glyph 
> <https://learn.microsoft.com/en-us/windows/win32/intl/uniscribe-glossary> 
> to the font in the entry that corresponds to that character value. The 
> user creates the glyph using an EUDC editor or using a font package 
> purchased from a font vendor. Any DBCS font can contain EUDCs, and any 
> Unicode font can contain PUA characters. The font is called a 
> "separate" EUDC/PUA font if it contains only EUDCs. The font is an 
> "integrated" EUDC/PUA font if it contains standard characters as well 
> as EUDCs....
>
> TrueType fonts can be installed either as .ttf files or as .tte files. 
> Since the operating system hides .tte files, applications cannot 
> enumerate or otherwise examine the installed fonts using GDI API 
> functions. On many operating systems, the system default EUDC/PUA font 
> and separate EUDC/PUA fonts are installed as .tte files. Applications 
> such as EUDC editors and the Control Panel must use registry entries 
> to add, modify, and delete such fonts.?


The backstory is that end-user-defined character handling is a text 
requirement originating from ideographic scripts, especially in Japan, 
in an era when the glyph complement of Japanese fonts was small (c. 5,000 
glyphs) compared to the range of ideographic characters listed in 
dictionaries and fair game to use in text (c. 70,000 characters). Authors 
wanting to use such "outside characters" (known as "gaiji" in Japanese) 
in their publications had to resort to special measures like EUDCs. OS 
and application vendors who wanted to sell to serious publishers in the 
Japanese market needed to provide EUDC tools.

The need for special measures like EUDC has receded greatly with the 
arrival of ideographic script fonts with very large glyph repertoires. 
Use of EUDC tools in ideographic script documents is likely now a niche. 
Use of EUDC outside of ideographic script context is even more of a niche.

The original question was, "I can [not] locate and copy [my EUDC] 
characters in... Affinity Publisher (v.1)? Is ... the programe not yet 
equipped to deal with private characters?"

It seems pretty likely to me that the program is not equipped to deal 
with EUDC characters. The feature list for Affinity Publisher 
<https://affinity.serif.com/en-us/publisher/full-feature-list/> does not 
mention Japanese or Chinese typography support. If they do not have 
ideographic typography as a major feature, they are even more unlikely 
to have Windows-specific EUDC support.

-- 
.   --Jim DeLaHunt,jdlh at jdlh.com      http://blog.jdlh.com/  (http://jdlh.com/)
       multilingual websites consultant, Vancouver, B.C., Canada

From pgcon6 at msn.com  Sat Feb  3 12:13:42 2024
From: pgcon6 at msn.com (Peter Constable)
Date: Sat, 3 Feb 2024 18:13:42 +0000
Subject: Private Use Area characters and the eudcedit program
In-Reply-To: <af7af89d-9f1d-4f75-8932-3b96e5d61b91@jdlh.com>
References: <78ea0e0b.225a.18d6bf4f134.Webtop.92@btinternet.com>
 <af7af89d-9f1d-4f75-8932-3b96e5d61b91@jdlh.com>
Message-ID: <DS0PR12MB7535EAA4936464806B8ED63386412@DS0PR12MB7535.namprd12.prod.outlook.com>

Adding to Jim's excellent answer: the EUDC mechanism that has been in Windows since the 90s is integrated into platform font fallback mechanisms (more specifically, font linking). A privately defined character is "linked" to (associated with) particular fonts or all fonts installed in the system, and will display anywhere that makes use of Win32 (GDI, User...), GDI+, Uniscribe or DWrite _unless_ lower-level APIs for drawing text are used that bypass the platform font fallback mechanisms. Affinity products use DWrite, but probably use only those lower-level APIs.


Peter


From freek at macfreek.nl  Fri Feb 16 05:27:20 2024
From: freek at macfreek.nl (Freek Dijkstra)
Date: Fri, 16 Feb 2024 12:27:20 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
Message-ID: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>

Hi,

I've long been annoyed that there is no Unicode symbol for the flourish 
of approval ("krul" or "krulletje"), which is a common symbol used in 
the Netherlands, mostly in elementary schools, but rarely outside the 
Netherlands.

 1. What is the process for assigning a code point to a symbol
    currently missing from the Unicode tables?
 2. Has this character (see references below) been proposed before?

References (found by a simple web search):

  * https://en.wikipedia.org/wiki/Flourish_of_approval
  * https://graphicdesign.stackexchange.com/questions/58320/what-is-the-name-or-unicode-for-this-symbol-similar-to-?-dutch-called-krul
  * https://tex.stackexchange.com/questions/313281/how-to-make-a-krul-unofficial-dutch-symbol-for-ok

Following these links, it is easy to see there is widespread adoption 
(with links to NRC, one of the national Dutch newspapers, or a video 
made by NTR, a publicly funded television station).

Note: I'm not a linguist but an IT specialist, and I was highly 
surprised that it was not in Unicode when I needed it some years ago. The 
Wikipedia and other articles express the same surprise. I came across this 
issue again, so I joined Unicode as a member so I can ask this question.

Regards,
Freek Dijkstra


From bortzmeyer at nic.fr  Fri Feb 16 09:50:45 2024
From: bortzmeyer at nic.fr (Stephane Bortzmeyer)
Date: Fri, 16 Feb 2024 16:50:45 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
Message-ID: <Zc-EVStpx_0VE60O@nic.fr>

On Fri, Feb 16, 2024 at 12:27:20PM +0100,
 Freek Dijkstra via Unicode <unicode at corp.unicode.org> wrote 
 a message of 188 lines which said:

> 1. What is the process for submitting assigning a codepoint to a symbol
>    currently missing from the Unicode tables?

http://unicode.org/emoji/proposals.html


From jameskass at code2001.com  Fri Feb 16 10:11:13 2024
From: jameskass at code2001.com (James Kass)
Date: Fri, 16 Feb 2024 16:11:13 +0000
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <Zc-EVStpx_0VE60O@nic.fr>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr>
Message-ID: <8fbef790-93aa-4abd-bd27-5351177f9532@code2001.com>


On 2024-02-16 3:50 PM, Stephane Bortzmeyer via Unicode wrote:
> On Fri, Feb 16, 2024 at 12:27:20PM +0100,
>   Freek Dijkstra via Unicode <unicode at corp.unicode.org> wrote
>   a message of 188 lines which said:
>
>> 1. What is the process for submitting assigning a codepoint to a symbol
>>     currently missing from the Unicode tables?
> http://unicode.org/emoji/proposals.html
>
If the symbol is not an emoji:
https://www.unicode.org/pending/symbol-guidelines.html

Submitting character proposals:
http://www.unicode.org/pending/proposals.html


From asmusf at ix.netcom.com  Fri Feb 16 10:41:43 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Fri, 16 Feb 2024 08:41:43 -0800
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <Zc-EVStpx_0VE60O@nic.fr>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr>
Message-ID: <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>

On 2/16/2024 7:50 AM, Stephane Bortzmeyer via Unicode wrote:
> On Fri, Feb 16, 2024 at 12:27:20PM +0100,
>   Freek Dijkstra via Unicode<unicode at corp.unicode.org>  wrote
>   a message of 188 lines which said:
>
>> 1. What is the process for submitting assigning a codepoint to a symbol
>>     currently missing from the Unicode tables?
> http://unicode.org/emoji/proposals.html
>
This assumes that the "symbol" is an emoji. Which the "Flourish of 
approval" would not necessarily be, unless the idea was to create an 
emoji for it, like the check mark.

The Unicode FAQ has pointers to both emoji and other character proposals.

A./

From doug at ewellic.org  Fri Feb 16 11:38:58 2024
From: doug at ewellic.org (Doug Ewell)
Date: Fri, 16 Feb 2024 17:38:58 +0000
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr>
 <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>
Message-ID: <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>

Asmus Freytag wrote:

>>> 1. What is the process for submitting assigning a codepoint to a
>>> symbol currently missing from the Unicode tables?
>>
>> http://unicode.org/emoji/proposals.html
>
> This assumes that the "symbol" is an emoji. Which the "Flourish of
> approval" would not necessarily be, unless the idea was to create an
> emoji for it, like the check mark.

The OP's post and references seem rather clear that it is intended as a normal character, for use with normal text, often handwritten, and used in plain-text environments (e.g. "mostly in elementary schools" and "for grading schoolwork").

I would think the process for proposing normal characters would need to be followed, and this should not be proposed as an emoji for the purpose of getting it encoded via the easier emoji process.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


From asmusf at ix.netcom.com  Fri Feb 16 12:34:11 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Fri, 16 Feb 2024 10:34:11 -0800
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr>
 <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>
 <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <329f4c0f-c6bc-4d52-8b89-2b5cb6cc9204@ix.netcom.com>

On 2/16/2024 9:38 AM, Doug Ewell via Unicode wrote:
> Asmus Freytag wrote:
>
>>>> 1. What is the process for submitting assigning a codepoint to a
>>>> symbol currently missing from the Unicode tables?
>>> http://unicode.org/emoji/proposals.html
>> This assumes that the "symbol" is an emoji. Which the "Flourish of
>> approval" would not necessarily be, unless the idea was to create an
>> emoji for it, like the check mark.
> The OP's post and references seem rather clear that it is intended as a normal character, for use with normal text, often handwritten, and used in plain-text environments (e.g. "mostly in elementary schools" and "for grading schoolwork").
>
> I would think the process for proposing normal characters would need to be followed, and this should not be proposed as an emoji for the purpose of getting it encoded via the easier emoji process.
>
Well, the similarity to a check mark is there.

We usually don't encode characters intended for use in handwriting, 
except if they are needed to digitally archive manuscripts. Not sure 
grade school papers pass that bar. However, I could be wrong and the 
details depend on how the case for encoding is argued.

In contrast, there are signs that are normally written by hand that also 
qualify as standing for an idea, that would be natural to incorporate in 
informal writing, which is the case for the check mark.

If the mark were placed in a text environment where emoji would normally 
be used, would it be seen and understood as "approved" in Dutch culture? 
Would anyone use it that way? Would a Netherlands-based Consortium have 
long since added it to their collection?

I don't have any of the answers. It's up to the submitters.

A./


From freek at macfreek.nl  Fri Feb 16 12:40:03 2024
From: freek at macfreek.nl (Freek Dijkstra)
Date: Fri, 16 Feb 2024 19:40:03 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr>
 <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>
 <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <8a48abf7-e1b7-4694-a27f-2b614f042d02@macfreek.nl>

All, thank you for the responses!

Indeed, this is not an emoji, but a symbol very akin to a checkmark, 
which is found in the Dingbats table 
(https://www.unicode.org/charts/PDF/U2700.pdf). According to a newspaper 
article on its history, it originates somewhere in the 19th century. So 
I'll follow the normal character proposal process.

In the meantime, I not only found the forms to fill in at 
https://www.unicode.org/L2/summary.html, I even found someone who was 
-just like me- "genuinely mildly irritated" with the fact that there was 
no codepoint in Unicode, and even created a website to fix this: 
https://unicode-krul.nl/en

I'll first try to contact them. I suspect that their genuine irritation 
was mild enough that it was eventually abandoned after seeing the effort 
it seemingly takes to get this done. :) Let's hope we're more successful 
this time.

With kind regards,
Freek Dijkstra



From doug at ewellic.org  Sat Feb 17 13:18:48 2024
From: doug at ewellic.org (Doug Ewell)
Date: Sat, 17 Feb 2024 19:18:48 +0000
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <8a48abf7-e1b7-4694-a27f-2b614f042d02@macfreek.nl>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr>
 <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>
 <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <8a48abf7-e1b7-4694-a27f-2b614f042d02@macfreek.nl>
Message-ID: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>

Freek Dijkstra wrote:

> In the mean time, I not only found the forms to fill in at
> https://www.unicode.org/L2/summary.html, I even find someone who was
> -just like me- "genuinely mildly irritated" with the fact that there
> was no codepoint in Unicode, and even created a website to fix this: 
> https://unicode-krul.nl/en

As you've probably guessed, writing an actual proposal and being available to discuss it with the committees (Script Ad Hoc and Unicode Technical Committee) is much more effective than being irritated that the symbol is not already there. Websites about the symbol and about the irritation, or other lobbying efforts, may feel good but are also not the road to encoding.

In the early days of Unicode and ISO 10646, say 25 or 30 years ago, a missing character might be discovered in an existing, commonly used 8-bit character set, and that was often enough to get it added to Unicode. For quite some time now, just about all of the "obvious" characters have been encoded, and it does take more effort to encode new ones, especially those that have never been represented in digital plain text before.

You will want to show in your proposal that there is demand for representing this symbol, which seems to be a handwritten convention, in computerized text. The Dingbats block is not a good analogy: those characters came from laser printers and symbol fonts, and thus by definition were used extensively on computers.

> I'll first try to contact them. I suspect that their genuine
> irritation was mild enough that is was eventually abandoned after
> seeing the effort it seemingly takes to get this done. :) Let's hope
> we're more successful this time.

As above. The amount of effort required for a symbol like this is reasonable and justified. Not everything that has ever been written is a candidate for encoding as a character. Good justification and evidence for this one will be needed.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


From jukkakk at gmail.com  Sat Feb 17 13:57:47 2024
From: jukkakk at gmail.com (Jukka K. Korpela)
Date: Sat, 17 Feb 2024 21:57:47 +0200
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr> <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>
 <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <8a48abf7-e1b7-4694-a27f-2b614f042d02@macfreek.nl>
 <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <CAGHxYa6L5=rcYju6ue3xFGD8joS-1pV3KDGxZOwMJYVxxw3wNQ@mail.gmail.com>

 Doug Ewell via Unicode (unicode at corp.unicode.org) wrote:

> You will want to show in your proposal that there is demand for representing this symbol,
>  which seems to be a handwritten convention, in computerized text.

I'd like to add to this good advice the point that the symbol should have
demonstrable use in text, as a character, as opposed to a hand-drawn symbol
in the margin.

I think it is less relevant that the symbol has been used in computerized
text, i.e. in text in digital format. Obviously, since the symbol does not
exist in Unicode, any digital format has had to use an image, or perhaps a
Private Use character. But if you can demonstrate use of the symbol as an
image in commonly used digital formats and the need for encoding it as a
character for use in plain text formats, I think you would have a case.

Yucca, https://jkorpela.fi


From jameskass at code2001.com  Sat Feb 17 13:59:50 2024
From: jameskass at code2001.com (James Kass)
Date: Sat, 17 Feb 2024 19:59:50 +0000
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr>
 <101fd78e-673f-43a9-837e-98b018c3c040@ix.netcom.com>
 <SJ0PR03MB6598B4344221401B6A6F362ECA4C2@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <8a48abf7-e1b7-4694-a27f-2b614f042d02@macfreek.nl>
 <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>


On 2024-02-17 7:18 PM, Doug Ewell via Unicode wrote:
> As above. The amount of effort required for a symbol like this is
> reasonable and justified. Not everything that has ever been written is a
>   candidate for encoding as a character. Good justification and evidence
> for this one will be needed.
The Wikipedia page linked earlier,
https://en.wikipedia.org/wiki/Flourish_of_approval
... suggests using the German pfennig symbol (₰) as a substitute for the 
krul. Evidence that U+20B0 GERMAN PENNY SIGN is being used as a krul in 
real world computer data interchange could be helpful to a proposal.
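
One way to start gathering such evidence is to scan existing text for
U+20B0 and collect each occurrence with some surrounding context. The
sketch below does that with the Python standard library. The function
name find_penny_sign_uses and the sample sentence are hypothetical, and
a hit still needs a human to judge whether the sign is meant as a krul
or as an actual pfennig amount.

```python
import unicodedata

PENNY_SIGN = "\u20b0"  # U+20B0 GERMAN PENNY SIGN


def find_penny_sign_uses(text: str, context: int = 10) -> list[str]:
    """Return each occurrence of U+20B0 with up to `context` characters on either side."""
    hits = []
    start = text.find(PENNY_SIGN)
    while start != -1:
        lo = max(0, start - context)
        hits.append(text[lo:start + 1 + context])
        start = text.find(PENNY_SIGN, start + 1)
    return hits


if __name__ == "__main__":
    print(unicodedata.name(PENNY_SIGN))
    # Hypothetical sample, not taken from real data.
    for snippet in find_penny_sign_uses("Goed gedaan \u20b0 !", context=3):
        print(repr(snippet))
```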

From christoph.paeper at crissov.de  Sat Feb 17 14:02:05 2024
From: christoph.paeper at crissov.de (=?utf-8?Q?Christoph_P=C3=A4per?=)
Date: Sat, 17 Feb 2024 21:02:05 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <329f4c0f-c6bc-4d52-8b89-2b5cb6cc9204@ix.netcom.com>
References: <329f4c0f-c6bc-4d52-8b89-2b5cb6cc9204@ix.netcom.com>
Message-ID: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>

Asmus Freytag via Unicode <unicode at corp.unicode.org>:
> 
> We usually don't encode characters intended for use in handwriting, except if they are needed to digitally archive manuscripts. Not sure grade school papers pass that bar.

Every piece of writing might be digitally archived nowadays, even more so in the future. Therefore, every _established_ literal atomic sign should be encodable, so it can be unambiguously read by machines. I strongly believe this includes paralinguistic signs, whereas nonlinguistic signs (e.g. much of ISO 7000) would require an extension of the scope of Unicode (although several graphic symbols from that and other standards already have a codepoint assigned to them). 

This one is clearly well established, i.e. has at least one canonical form and meaning, even if its use is geographically limited. It cannot be represented by a combination of other, already encoded characters. 


From christoph.paeper at crissov.de  Sat Feb 17 14:17:40 2024
From: christoph.paeper at crissov.de (=?utf-8?Q?Christoph_P=C3=A4per?=)
Date: Sat, 17 Feb 2024 21:17:40 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
References: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
Message-ID: <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>

An HTML attachment was scrubbed...
URL: <https://corp.unicode.org/pipermail/unicode/attachments/20240217/a99beb5f/attachment.htm>

From asmusf at ix.netcom.com  Sat Feb 17 15:17:56 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 17 Feb 2024 13:17:56 -0800
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>
References: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
 <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>
Message-ID: <259fc498-fba3-4d28-903b-e269ff59911f@ix.netcom.com>

If someone has made a font or if someone is using a substitute Unicode 
character, that would amount to evidence of the attempt to use the 
symbol in (digital) text. If actual examples of use of such substitutes 
in context can be found, it would suggest the type of use.

The comparison with the dingbats is tricky.

Yes, the whole set is a legacy set, so actual instances can be found in 
digitally prepared documents and a value is attached to being able to 
express that in Unicode plain text.

However, some symbols, like the check mark, are used in ways that might 
be similar to the way the approval mark might be used. For example, it 
can also convey approval and is used in an emojified presentation for 
that purpose.

Being able to express approval with a culturally appropriate icon in 
this manner is potentially an argument in favor.

The details of a proposal, the documentation of actual use, and a clear 
exposition of how this symbol has iconic value all would influence an 
eventual decision.

A./


From freek at macfreek.nl  Sat Feb 17 17:26:59 2024
From: freek at macfreek.nl (Freek Dijkstra)
Date: Sun, 18 Feb 2024 00:26:59 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <259fc498-fba3-4d28-903b-e269ff59911f@ix.netcom.com>
References: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
 <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>
 <259fc498-fba3-4d28-903b-e269ff59911f@ix.netcom.com>
Message-ID: <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>

Hi Asmus and others,

Let me answer a few questions, and at the same time pose some more 
questions :)

/Asmus Freytag wrote:/
> If placing the mark in a text environment where emoji would normally 
> be used, would it be seen and understood as "approved" in Dutch 
> culture? Would anyone use it that way? 
Here is an example use as part of an older logo used by the organisation 
(VVN) that performed mandatory safety inspections for vehicles:
https://upload.wikimedia.org/wikipedia/commons/4/4e/Goedkeuringskrul_VVN.jpg

/Asmus Freytag wrote:/
> If someone has made a font or if someone is using a substitute Unicode 
> character, that would amount to evidence of the attempt to use the 
> symbol in (digital) text. If actual examples of use of such 
> substitutes in context can be found, it would suggest the type of use.
While I'm not aware of any font or substitute Unicode character (except 
for unicode-krul.nl, but that's not an independent source), here is a 
Q&A on StackExchange where a few dozen people try to get the symbol into 
an electronic document after all:
https://tex.stackexchange.com/questions/313281/how-to-make-a-krul-unofficial-dutch-symbol-for-ok

@James Kass, Christoph Päper:
I've also read about the use of the Pfennig symbol or the deleatur as 
a substitute. However, both the glyph and the meaning are distinctly 
different. In the last answer of that SE Q&A you'll see an attempt to 
make it fit nevertheless by hiding part of the glyph - poorly, if I may add.


That said, the SE Q&A does raise a few more serious questions.

1. Would the above be sufficient to show the UTC proof of a need for 
use in electronic form? On one hand, I think it is anecdotal evidence; on 
the other hand, it is real usage. A few decades ago, I participated in a 
standardization body where "running code and rough consensus" was the 
motto. I'm as yet unfamiliar with the mores of the UTC. If the 
above is not sufficient, what would be? A statement from a formal 
linguistic body? Or from a linguistic user group?

2. The Q&A correctly mentions that this character has two distinct 
glyphs. While I have a personal preference (just because of the way I 
was taught to write it), I would rather consult an expert linguist about 
this. It is said to have been around since sometime in the 19th century, 
and I do not know how it has changed over the decades, or how its usage 
varies across regions of the world (besides the Netherlands, it is also 
used in countries that are former Dutch colonies).

/Asmus Freytag wrote:/
> However, some symbols, like the check mark, are used in ways that 
> might be similar to the way the approval mark might be used. For 
> example, it can also convey approval and is used in an emojified 
> presentation for that purpose. 
3. Yes. It can convey "approval" but can also mean "incorrect" in Sweden 
according to 
https://en.wikipedia.org/wiki/Check_mark#International_differences. And 
this actually seems to indicate that there are more symbols missing. On 
that page, the ?/? symbol in Finland is missing from Unicode and 
Wikipedia uses an image instead (oh, horror), and the hanamaru listed on 
https://en.wikipedia.org/wiki/O_mark specifically lists a work-around 
because Unicode is missing that symbol too (last line in the "Unicode" 
paragraph). I almost get the feeling that Unicode has overlooked a 
(small) category of these symbols, and only included the English ones. 
Sadly, my knowledge of those other symbols is limited, so I can only 
make a proposal for the Flourish of Approval. But just to check: Unicode 
codepoints represent a glyph, not a meaning, right? So the English ✓ and 
Swedish ✓ have the same codepoint, even though their meaning is different?

Side note: the check mark seems to come from the letter "v" for "vidit" 
("has seen") according to a professor in a Dutch paper, just like the 
glyph for the Flourish of Approval likely comes from the letter "g", 
from "goed" ("good") or "gezien" ("seen").

4. The discussion on character vs emoji, and the legacy set of symbols 
in the U+2700 table (Dingbats) does raise the question: where should a 
new symbol be placed? It is a symbol, but the miscellaneous symbols in 
the U+2700 table (Dingbats) are currently listed under "Emoji & 
Pictograms". However, this is not a pictogram -- while not a character 
in an alphabet (which has ordering), it is also not a pictogram (it does 
not represent a physical object). So looking at 
https://www.unicode.org/charts/, where should this symbol be placed?


With kind regards,
Freek

From asmusf at ix.netcom.com  Sat Feb 17 19:32:46 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 17 Feb 2024 17:32:46 -0800
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>
References: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
 <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>
 <259fc498-fba3-4d28-903b-e269ff59911f@ix.netcom.com>
 <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>
Message-ID: <7ae9b049-f7fb-4060-9d25-3273ede52dbe@ix.netcom.com>

Remember, this list is just an informal discussion that might give you 
ideas on how to argue the case for encoding and what likely objections 
you may encounter. It otherwise carries no weight and while it's 
archived, it's not something anyone would turn to in making decisions.

That said.

The cited discussion on SE shows that there are reasonable 
scenarios where this is used as a symbol/punctuation in text. That it 
would also be "letter-like", that is, derived from a letter shape, makes 
a case for encoding this as a symbol with text representation.

The standalone use on logos makes me wonder whether, should it be 
available, Dutch users would use it as an emoji (e.g. in text messages). 
It can easily be argued from the evidence already shared, that (1) Dutch 
users would readily recognize it (2) there's a desire to not only have 
it in text, but also, at times to have it stand out and act as a full 
statement of its own, very analogous to a check mark with emoji 
presentation.

I would counsel not to view this as an either/or. Perhaps pursue 
this as a standard (text presentation) symbol at first, and then later 
explore whether it falls in the small range of iconic symbols that exist 
in both text and emoji form -- with the check mark being the obvious analog.
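
As an illustration of a symbol that exists in both text and emoji form, 
here is a minimal Python sketch. It assumes U+2714 HEAVY CHECK MARK as 
the example character and uses the variation selectors U+FE0E and U+FE0F 
to request text or emoji presentation; the helper name describe() is 
made up for this example, and whether a given font or renderer honors 
the request is outside what the code can show.

```python
import unicodedata

HEAVY_CHECK = "\u2714"  # HEAVY CHECK MARK, chosen here as the example
VS15 = "\uFE0E"         # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"         # VARIATION SELECTOR-16: requests emoji presentation


def describe(s):
    """Hypothetical helper: list each code point in s with its name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in s]


text_form = HEAVY_CHECK + VS15
emoji_form = HEAVY_CHECK + VS16

# The base character is identical in both sequences; only the trailing
# variation selector differs.
for label, s in (("text", text_form), ("emoji", emoji_form)):
    print(label, describe(s))
```

The point of the sketch is only that the same encoded character can 
carry either presentation; it says nothing about how a new symbol would 
be treated.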

The evidence presented in form of the safety inspection sticker makes 
the case that this symbol has acquired a generalized use that is not 
limited to marking student papers. That may have been the origin, but it 
should not limit UTC in taking into account its apparently much broader 
use.

While the solution presented in the context of the TeX SE works well for 
TeX / LaTeX, it doesn't work in general typesetting. This would not be 
the first time that Unicode encodes a symbol that (instead of a PUA 
font) has first been created as a special TeX macro. That would be 
useful to point out. Having a macro that creates an outline on the fly 
is very different from placing a bitmap or other picture in running 
text. It definitely has parallels to creating outlines that you access 
with a PUA code - except that the detour via PUA isn't needed in TeX 
because TeX natively supports named (user defined) macros.
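
To make the PUA detour concrete, here is a small Python sketch. It 
assumes U+E000 as an arbitrary private-use code point (nothing in this 
thread assigns the krul to it) and compares it with two encoded symbols 
mentioned in the discussion. The helper is_private_use() is 
hypothetical.

```python
import unicodedata


def is_private_use(ch):
    """Hypothetical helper: True if ch has General_Category Co."""
    return unicodedata.category(ch) == "Co"


samples = {
    "U+E000": "\uE000",  # arbitrary private-use code point (assumption)
    "U+20B0": "\u20B0",  # GERMAN PENNY SIGN
    "U+2713": "\u2713",  # CHECK MARK
}

for label, ch in samples.items():
    # Private-use code points have no character name, so pass a default.
    name = unicodedata.name(ch, "<no name>")
    print(label, unicodedata.category(ch), name, is_private_use(ch))
```

A private-use code point carries no name and no agreed meaning, which is 
why its interpretation depends entirely on the font or private agreement 
in use.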

A./


From jameskass at code2001.com  Sat Feb 17 19:35:06 2024
From: jameskass at code2001.com (James Kass)
Date: Sun, 18 Feb 2024 01:35:06 +0000
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>
References: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
 <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>
 <259fc498-fba3-4d28-903b-e269ff59911f@ix.netcom.com>
 <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>
Message-ID: <d60c6732-31b8-42e9-a646-102d11600a85@code2001.com>


On 2024-02-17 11:26 PM, Freek Dijkstra via Unicode wrote:
> I almost get the feeling that Unicode has overlooked a (small) 
> category of these symbols, and only included the English ones. Sadly, 
> my knowledge of those other symbols is limited, so I can only make a 
> proposal for the Flourish of Approval. But just to check: Unicode 
> codepoints represent a glyph, not a meaning, right? So the English ✓ 
> and Swedish ✓ have the same codepoint, even though their meaning is 
> different?

Unicode encodes characters rather than glyphs. Please see 
http://www.unicode.org/reports/tr17/tr17-3.html for more information, 
specifically section 2.1 for illustrations. The check mark (✓) has one 
code point because of convention: there was no distinction between 
Swedish and English usage of the mark in pre-existing character sets.

The Unicode repertoire might be perceived as favoring English symbols, 
but we need to keep in mind that the original goal of Unicode was to 
standardize existing character sets into a universal encoding which 
would serve everyone. Many of those existing character sets were 
developed by English-speaking users, hence the possible appearance of 
favoritism. Likewise, an even larger batch of those existing character 
sets were developed by "Westerners", which can give the appearance of 
favoritism to non-Western users. But over time, many non-English and 
non-Western characters have been added to the Unicode repertoire because 
somebody took the time and made the effort to submit an encoding 
proposal and escort it through the approval process.


From asmusf at ix.netcom.com  Sat Feb 17 19:52:40 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 17 Feb 2024 17:52:40 -0800
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <d60c6732-31b8-42e9-a646-102d11600a85@code2001.com>
References: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
 <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>
 <259fc498-fba3-4d28-903b-e269ff59911f@ix.netcom.com>
 <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>
 <d60c6732-31b8-42e9-a646-102d11600a85@code2001.com>
Message-ID: <beace9c3-bc7a-416e-b6fd-32efdd458c4d@ix.netcom.com>

On 2/17/2024 5:35 PM, James Kass via Unicode wrote:
>
>
> On 2024-02-17 11:26 PM, Freek Dijkstra via Unicode wrote:
>> I almost get the feeling that Unicode has overlooked a (small) 
>> category of these symbols, and only included the English ones. Sadly, 
>> my knowledge of those other symbols is limited, so I can only make a 
>> proposal for the Flourish of Approval. But just to check: Unicode 
>> codepoints represent a glyph, not a meaning, right? So the English ✓ 
>> and Swedish ✓ have the same codepoint, even though their meaning is 
>> different?
>
> Unicode encodes characters rather than glyphs. Please see 
> http://www.unicode.org/reports/tr17/tr17-3.html for more information, 
> specifically section 2.1 for illustrations. The check mark (✓) has 
> one code point because of convention: there was no distinction 
> between Swedish and English usage of the mark in pre-existing 
> character sets.
The exception might be where some local convention uses both a check 
mark and some other shape in alternation. In such cases, there may be an 
argument in favor of considering the other shape a different symbol 
instead of implausibly suggesting that the check mark now has a range of 
acceptable glyph variations that includes the other shape (which would 
come as a surprise to most users of the existing check mark ...).
>
> The Unicode repertoire might be perceived as favoring English symbols, 
> but we need to keep in mind that the original goal of Unicode was to 
> standardize existing character sets into a universal encoding which 
> would serve everyone. Many of those existing character sets were 
> developed by English-speaking users, hence the possible appearance of 
> favoritism. Likewise, an even larger batch of those existing 
> character sets were developed by "Westerners", which can give the 
> appearance of favoritism to non-Western users. But over time, many 
> non-English and non-Western characters have been added to the Unicode 
> repertoire because somebody took the time and made the effort to 
> submit an encoding proposal and escort it through the approval process.
>
I agree, there's every reason to identify cases where Unicode lacks a 
way for expressing a local written convention, even outside standard 
orthographic writing. We definitely should not - as a matter of 
principle - rule out local equivalents to widely used marks, just 
because the others are either used in English or have become global.

The symbol discussed here is in much more widespread and active use 
than many of the dead alphabets being added, even if it never becomes 
popular outside the Netherlands.

A./


From steffen at sdaoden.eu  Sat Feb 17 17:34:22 2024
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Sun, 18 Feb 2024 00:34:22 +0100
Subject: What's the process for proposing a symbol in the Unicode
 table?
In-Reply-To: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <20240217233422.nbfws3QQ@steffen%sdaoden.eu>

Doug Ewell via Unicode wrote in
 <SJ0PR03MB65986D2522F7C33551161878CA532 at SJ0PR03MB6598.namprd03.prod.outl\
 ook.com>:

That made me think (my local copy is from 2020, i do not recall
anything), are "protective signs" part of Unicode already?
Some are very hard, almost impossible i'd say, for fonts.
But they are very important pictographics.  *Very*.

  https://en.wikipedia.org/wiki/Protective_sign
  https://de.wikipedia.org/wiki/Schutzzeichen

  https://de.wikipedia.org/wiki/Barbarastollen_(Freiburg_im_Breisgau)


--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From ecm.unicode at gmail.com  Sat Feb 17 21:17:15 2024
From: ecm.unicode at gmail.com (Erik Carvalhal Miller)
Date: Sat, 17 Feb 2024 22:17:15 -0500
Subject: Protective signs
In-Reply-To: <20240217233422.nbfws3QQ@steffen%sdaoden.eu>
References: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <20240217233422.nbfws3QQ@steffen%sdaoden.eu>
Message-ID: <CAJTfRPFQS2k1DsRupJZ+BNn6BrcSyAOyONS7aOGAsYc5CGtgmg@mail.gmail.com>

On Sat, Feb 17, 2024 at 9:30 PM Steffen Nurpmeso via Unicode <
unicode at corp.unicode.org> wrote:

> That made me think (my local copy is from 2020, i do not recall
> anything), are "protective signs" part of Unicode already?
> Some are very hard, almost impossible i'd say, for fonts.
> But they are very important pictographics.  *Very*.
>
>   https://en.wikipedia.org/wiki/Protective_sign
>   https://de.wikipedia.org/wiki/Schutzzeichen
>
>   https://de.wikipedia.org/wiki/Barbarastollen_(Freiburg_im_Breisgau)


Some of the signs are in Unicode: the letters P and G (U+0050, U+0047), the
letters P and W (U+0050 again, U+0057), the letters I and C (U+0049,
U+0043), a white flag (U+2690 ⚐) and a waving white flag (U+1F3F3 🏳),
a flag of the United Nations incorporating its emblem (the
regional-indicator sequence U+1F1FA, U+1F1F3 🇺🇳), the letters U and N
(U+0055, U+004E), and the thrice-repeatable large orange circle
(U+1F7E0 🟠).  Those that are not in the Unicode repertoire are dependent
on color and therefore suggest emoji, if ever they should be encoded.
Important as they may be, is there a plaintext use case, such as texting an
enemy to indicate a hospital?  (Note there is also U+1F3E5 HOSPITAL 🏥,
which in the font I'm working in incorporates a symbol similar to the red
cross.)
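
For anyone who wants to check the code points cited above, here is a 
rough Python sketch that looks up their character names with the 
standard unicodedata module. The helper label() is made up for this 
example, and the names reported depend on the Unicode version bundled 
with the Python in use (U+1F7E0 needs a reasonably recent one, so a 
fallback string is supplied).

```python
import unicodedata

# Code points cited in the message above.
cited = [0x0050, 0x0047, 0x0057, 0x0049, 0x0043, 0x0055, 0x004E,
         0x2690, 0x1F3F3, 0x1F7E0, 0x1F3E5]


def label(cp):
    """Hypothetical helper: format a code point with its character name."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"


for cp in cited:
    print(label(cp))

# The United Nations flag is not a single code point but a pair of
# regional indicator symbols spelling "UN".
un_flag = chr(0x1F1FA) + chr(0x1F1F3)
print([label(ord(ch)) for ch in un_flag])
```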

From jameskass at code2001.com  Sat Feb 17 22:22:17 2024
From: jameskass at code2001.com (James Kass)
Date: Sun, 18 Feb 2024 04:22:17 +0000
Subject: Protective signs
In-Reply-To: <CAJTfRPFQS2k1DsRupJZ+BNn6BrcSyAOyONS7aOGAsYc5CGtgmg@mail.gmail.com>
References: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <20240217233422.nbfws3QQ@steffen%sdaoden.eu>
 <CAJTfRPFQS2k1DsRupJZ+BNn6BrcSyAOyONS7aOGAsYc5CGtgmg@mail.gmail.com>
Message-ID: <11db0080-c022-402d-a4a5-d6961a339789@code2001.com>


On 2024-02-18 3:17 AM, Erik Carvalhal Miller via Unicode wrote:
> Important as they may be, is there a plaintext use case, such as 
> texting an enemy to indicate a hospital? 

Wouldn't work if the enemy has our texts blocked. But if the enemy was 
a terrorist organization looking for sensitive targets, they'd probably 
be happy to have us point one out.

Seriously, though, also wondering if any plain-text encoding requirement 
exists for those symbols which aren't already available.


From ecm.unicode at gmail.com  Sat Feb 17 23:11:01 2024
From: ecm.unicode at gmail.com (Erik Carvalhal Miller)
Date: Sun, 18 Feb 2024 00:11:01 -0500
Subject: Protective signs
In-Reply-To: <11db0080-c022-402d-a4a5-d6961a339789@code2001.com>
References: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <20240217233422.nbfws3QQ@steffen%sdaoden.eu>
 <CAJTfRPFQS2k1DsRupJZ+BNn6BrcSyAOyONS7aOGAsYc5CGtgmg@mail.gmail.com>
 <11db0080-c022-402d-a4a5-d6961a339789@code2001.com>
Message-ID: <CAJTfRPF2=yqB2uyOLY2O=yL6XY9Lnkn2Z=0mk8YMPE9XPCaNjg@mail.gmail.com>

On Sat, Feb 17, 2024 at 11:25 PM James Kass via Unicode <
unicode at corp.unicode.org> wrote:

> On 2024-02-18 3:17 AM, Erik Carvalhal Miller via Unicode wrote:
> > Important as they may be, is there a plaintext use case, such as
> > texting an enemy to indicate a hospital?
>
> Wouldn't work if the enemy has our texts blocked.
>

Unfriended!

From asmusf at ix.netcom.com  Sun Feb 18 02:18:20 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 18 Feb 2024 00:18:20 -0800
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
References: <329f4c0f-c6bc-4d52-8b89-2b5cb6cc9204@ix.netcom.com>
 <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
Message-ID: <0615d66d-44b2-4104-a9a9-1178808906cc@ix.netcom.com>

On 2/17/2024 12:02 PM, Christoph Päper via Unicode wrote:
> Asmus Freytag via Unicode <unicode at corp.unicode.org>:
>> We usually don't encode characters intended for use in handwriting, except if they are needed to digitally archive manuscripts. Not sure grade school papers pass that bar.
> Every piece of writing might be digitally archived nowadays, even more so in the future. Therefore, every _established_ literal atomic sign should be encodable, so it can be unambiguously read by machines. I strongly believe this includes paralinguistic signs, whereas nonlinguistic signs (e.g. much of ISO 7000) would require an extension of the scope of Unicode (although several graphic symbols from that and other standards already have a codepoint assigned to them).
>
> This one is clearly well established, i.e. has at least one canonical form and meaning, even if its use is geographically limited. It cannot be represented by a combination of other, already encoded characters.
>
That's an argument a proposal could make, but I'm not sure I'm ready to 
agree with that analysis.

Even if we approach 100% digital archiving, not everything can be, will 
be or needs to be archived as *plain text*. (Or even rich text).

Manuscripts are a good example of handwritten text that benefits from 
conversion to digital text, because they are the subject of intense 
scholarship that would benefit from having the usual array of digital 
text processing available, such as search, and convenient rendering of 
excerpts.

People are studying the marks accompanying cave paintings, such as 
lines, circles or dots. One even resembles a hash mark #, making that 
arguably the oldest uniquely recognizable symbol ever encoded as a 
character. (Aside: dots and lines don't count, because we encode many 
different dots and lines).

For those studies, there's no overriding need to place the symbols into 
running text, or to attempt to show sequences of them as plain text. 
Therefore, such use alone is not sufficient rationale for deciding what 
constitutes an abstract character, providing a standardized encoding, 
and assigning properties such as line breaking behavior.

The Dutch mark in question is interesting in that it's clearly 
associated with a well-defined concept and has a recognizable (and 
conventional) shape. Neither of those two aspects presents any obstacle 
to encoding. However, the need to represent it in plain text needs to be 
established and any successful proposal will have to provide an argument 
that is specific and to the point.

The mere claim of a general principle as suggested above is not 
sufficient to make a persuasive argument for a specific encoding.

A./

From unicode at lindenbergsoftware.com  Sun Feb 18 04:05:13 2024
From: unicode at lindenbergsoftware.com (Norbert Lindenberg)
Date: Sun, 18 Feb 2024 11:05:13 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <8fbef790-93aa-4abd-bd27-5351177f9532@code2001.com>
References: <24be4921-eeed-4e76-9869-1df9cb08e0af@macfreek.nl>
 <Zc-EVStpx_0VE60O@nic.fr> <8fbef790-93aa-4abd-bd27-5351177f9532@code2001.com>
Message-ID: <1D2BEB85-1FF7-4CF1-8347-DE6C371B2FB3@lindenbergsoftware.com>


> On Feb 16, 2024, at 17:11, James Kass via Unicode <unicode at corp.unicode.org> wrote:
> 
> On 2024-02-16 3:50 PM, Stephane Bortzmeyer via Unicode wrote:
>> On Fri, Feb 16, 2024 at 12:27:20PM +0100,
>>  Freek Dijkstra via Unicode <unicode at corp.unicode.org> wrote
>>  a message of 188 lines which said:
>> 
>>> 1. What is the process for submitting assigning a codepoint to a symbol
>>>    currently missing from the Unicode tables?
>> http://unicode.org/emoji/proposals.html
>> 
> If the symbol is not an emoji:
> https://www.unicode.org/pending/symbol-guidelines.html
> 
> Submitting character proposals:
> http://www.unicode.org/pending/proposals.html

Proposals for characters other than emoji and Han are reviewed by the Script Ad Hoc, so this page tells you more about the process:
https://www.unicode.org/consortium/scriptadhoc.html

That page also links to a template for proposing the encoding of a new character:
https://www.unicode.org/L2/L2023/23104r-addl-script-template-april2023.pdf

> On Feb 18, 2024, at 00:26, Freek Dijkstra via Unicode <unicode at corp.unicode.org> wrote:
> 
> So looking at https://www.unicode.org/charts/, where should this symbol be placed?


Don't worry about that; the SAH can find a code point for your character (see page 2 of the template).

Best regards,
Norbert


From marius.spix at web.de  Sun Feb 18 11:25:29 2024
From: marius.spix at web.de (Marius Spix)
Date: Sun, 18 Feb 2024 18:25:29 +0100
Subject: Aw: Re: Protective signs
In-Reply-To: <CAJTfRPFQS2k1DsRupJZ+BNn6BrcSyAOyONS7aOGAsYc5CGtgmg@mail.gmail.com>
References: <SJ0PR03MB65986D2522F7C33551161878CA532@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <20240217233422.nbfws3QQ@steffen%sdaoden.eu>
 <CAJTfRPFQS2k1DsRupJZ+BNn6BrcSyAOyONS7aOGAsYc5CGtgmg@mail.gmail.com>
Message-ID: <trinity-b542e1ee-7956-45de-ba50-866e4d51eb9f-1708277129006@msvc-mesg-web004>

Unicode also has ⛑ HELMET WITH WHITE CROSS (U+26D1), which could also be used to mark medical corps. However, the actual origin of that character is the maintenance symbol in Japanese TV broadcasting. And, like ⚕ STAFF OF AESCULAPIUS (U+2695), it is not an international protective sign. That is the reason why medical corps wear a white armband or patch with a red cross, crescent or crystal. While the red lion with sun is also theoretically protected, it is never used.

> Sent: Sunday, 18 Feb 2024 at 04:17
> From: "Erik Carvalhal Miller via Unicode" <unicode at corp.unicode.org>
> To: "Steffen Nurpmeso" <steffen at sdaoden.eu>
> Cc: "Doug Ewell via Unicode" <unicode at corp.unicode.org>, "Freek Dijkstra" <freek at macfreek.nl>
> Subject: Re: Protective signs
> 
> On Sat, Feb 17, 2024 at 9:30 PM Steffen Nurpmeso via Unicode <
> unicode at corp.unicode.org> wrote:
> 
> > That made me think (my local copy is from 2020, i do not recall
> > anything), are "protective signs" part of Unicode already?
> > Some are very hard, almost impossible i'd say, for fonts.
> > But they are very important pictographics.  *Very*.
> >
> >   https://en.wikipedia.org/wiki/Protective_sign
> >   https://de.wikipedia.org/wiki/Schutzzeichen
> >
> >   https://de.wikipedia.org/wiki/Barbarastollen_(Freiburg_im_Breisgau)
> 
> 
> Some of the signs are in Unicode: the letters P and G (U+0050, U+0047), the
> letters P and W (U+0050 again, U+0057), the letters I and C (U+0049,
> U+0043), a white flag (U+2690 ⚐) and a waving white flag (U+1F3F3 🏳),
> a flag of the United Nations incorporating its emblem (the
> regional-indicator sequence U+1F1FA, U+1F1F3 🇺🇳), the letters U and N
> (U+0055, U+004E), and the thrice-repeatable large orange circle
> (U+1F7E0 🟠).  Those that are not in the Unicode repertoire are dependent
> on color and therefore suggest emoji, if ever they should be encoded.
> Important as they may be, is there a plaintext use case, such as texting an
> enemy to indicate a hospital?  (Note there is also U+1F3E5 HOSPITAL 🏥,
> which in the font I'm working in incorporates a symbol similar to the red
> cross.)
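
For anyone who wants to check the code points cited in this message and in the quoted text, here is a minimal Python sketch. It assumes a Python build whose unicodedata tables cover Unicode 12.0 or later (needed for U+1F7E0); the helper name `describe` and the list name are only illustrative.

```python
import unicodedata

# Code points mentioned in this thread (symbols only, not the Latin letters).
CODE_POINTS = [0x26D1, 0x2695, 0x2690, 0x1F3F3, 0x1F1FA, 0x1F1F3, 0x1F7E0, 0x1F3E5]


def describe(cp):
    """Return 'U+XXXX NAME', or '<unnamed>' if this Python's
    Unicode tables have no name for the code point."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"


for cp in CODE_POINTS:
    print(describe(cp))
```

Note that the United Nations flag is not a single character: it is the pair of regional indicator symbols U+1F1FA and U+1F1F3, which the loop lists separately.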


From wjgo_10009 at btinternet.com  Mon Feb 19 06:53:40 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Mon, 19 Feb 2024 12:53:40 +0000 (GMT)
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
References: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
Message-ID: <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>


I wonder if the encoding rules are no longer fit for purpose.

The encoding process should be helpful to consumers, not lead to an 
agreement to restrict progress.

I get the impression - and if I have got it wrong please correct me - 
that if one were using the krul character in a desktop publishing 
program, the likely scenario is that there is a large rectangular 
text frame filling most of the page and containing text in the Dutch 
language, in, say, 14 point, and there is in the right margin, near the 
lower edge of the page, a small rectangular text frame into which the 
krul character is inserted, quite possibly at a larger size than the 
other text, at, say, 36 point or 48 point.

Thus the krul character is not within a line of running text involving 
other characters as well as itself.

I say that the fact that the krul character is not within a line of 
running text involving other characters as well as itself should not go 
against the encoding of the krul character as a regular Unicode 
character.

This is because, in practice, an end user is likely to want to introduce 
the krul character from a font. So encoding the krul character in 
regular Unicode would be helpful to end users, and in my opinion being 
helpful to end users and consumers is what is important in encoding 
decisions.

William Overington

Monday 19 February 2024


From cate at cateee.net  Mon Feb 19 09:44:58 2024
From: cate at cateee.net (Giacomo Catenazzi)
Date: Mon, 19 Feb 2024 16:44:58 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>
References: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
 <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>
Message-ID: <62a4f367-6b63-49c2-afd2-4897775f0305@cateee.net>


On 19 Feb 2024 13:53, William_J_G Overington via Unicode wrote:

> 
> This is because, in practice an end user is likely to want to introduce 
> the krul character from a font. So encoding the krul character in 
> regular Unicode would be helpful to end users and in my opinion being 
> helpful to end users and consumers is what is important in encoding 
> decisions.

I agree, but I would not formulate it in such a generic way. It must be 
useful in practice, not just potentially useful. Being in the Unicode 
Standard doesn't make a symbol useful to users *per se*, as we see with 
many technical symbols: they are in Unicode, but impossible to use 
because nobody has made a good font (or any font).

IMHO we lack volunteers (or money). Now it seems it is mostly on SIL 
and on Google (Noto font), but they still need to implement a lot of 
missing symbols (and also scripts). This particular case may be simpler: 
there is no lack of people who understand the character and the glyph 
(and no strange script rules), but we should be careful not to fall too 
far behind, which would tell browsers and publishing programs to just 
start ignoring *second class* characters.

So we should weigh more parameters, so that users get something 
useful for real. (Note: with time things will improve.)

giacomo

From pgcon6 at msn.com  Thu Feb 22 13:07:44 2024
From: pgcon6 at msn.com (Peter Constable)
Date: Thu, 22 Feb 2024 19:07:44 +0000
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>
References: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
 <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>
Message-ID: <DS0PR12MB753520FF02F97DB441F15CCF86562@DS0PR12MB7535.namprd12.prod.outlook.com>

> in practice an end user is likely to want to introduce the krul character from a font. So encoding the krul character in regular Unicode would be helpful to end users and in my opinion being helpful to end users and consumers is what is important in encoding decisions.

By this line of reasoning, every icon in any symbol font, such as Font Awesome <https://fontawesome.com/icons>, would be a candidate for encoding. UTC has already explicitly decided against that argument for encoding. Moreover, the successful, widespread use of fonts like Font Awesome clearly demonstrates that encoding in Unicode is not necessary for users to easily use graphic symbols in content.

The Unicode Standard encodes characters, where "character" is understood to mean an element of textual content and the encoding is intended for purposes of text processing. Not every graphic element qualifies for encoding simply because it can be presented using a font and placed in a text frame of a DTP application.

Cf. https://www.unicode.org/versions/Unicode15.0.0/ch01.pdf


Peter


From freek at macfreek.nl  Thu Feb 22 17:11:29 2024
From: freek at macfreek.nl (Freek Dijkstra)
Date: Fri, 23 Feb 2024 00:11:29 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <DS0PR12MB753520FF02F97DB441F15CCF86562@DS0PR12MB7535.namprd12.prod.outlook.com>
References: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
 <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>
 <DS0PR12MB753520FF02F97DB441F15CCF86562@DS0PR12MB7535.namprd12.prod.outlook.com>
Message-ID: <cc5e6b34-0192-40d5-8b5e-99162fe63684@macfreek.nl>

Hi Peter,

Thanks for your references. However, I'm a bit confused by your 
argument. Are you talking about the krul symbol or about icons in 
general in the discussion with William?

I can't find the word "icon" in the referenced chapter 1 of Unicode 15.0, 
so I assume you refer to this text in the document:

> Note, however, that the Unicode Standard does not encode 
> idiosyncratic, personal, novel, or private-use characters, nor does it 
> encode logos or graphics.

In case you refer to the "krul" character I want to propose: that is 
neither an icon nor a personal or private-use character, nor a logo, nor 
a graphic. At least, it is not a graphic in the sense of being a 
graphical representation of a physical object (like all the examples I 
see on the home page of https://fontawesome.com/icons).

If your argument is referring to the general use case, my apologies. I 
do not have any opinion about that.

With kind regards,
Freek Dijkstra


From asmusf at ix.netcom.com  Thu Feb 22 19:08:46 2024
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Thu, 22 Feb 2024 17:08:46 -0800
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <cc5e6b34-0192-40d5-8b5e-99162fe63684@macfreek.nl>
References: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
 <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>
 <DS0PR12MB753520FF02F97DB441F15CCF86562@DS0PR12MB7535.namprd12.prod.outlook.com>
 <cc5e6b34-0192-40d5-8b5e-99162fe63684@macfreek.nl>
Message-ID: <8145e1b0-17a7-405e-af1c-715f01547e2b@ix.netcom.com>

On 2/22/2024 3:11 PM, Freek Dijkstra via Unicode wrote:
> In case you refer to the "krul" character I want to propose: that is 
> neither an icon nor a personal or private-use character, nor a logo, 
> nor a graphic. At least, it is not a graphic in the sense of being a 
> graphical representation of a physical object (like all the examples I 
> see on the home page of https://fontawesome.com/icons).
>
> If your argument is referring to the general use case, my apologies. I 
> do not have any opinion about that.

The way I read the discussion on the list, it had descended into general 
arguments.

For the specific character, what's needed now is submission of a 
well-formed proposal. The return on further discussion of this character 
on this list is probably insignificant.

A./

From pgcon6 at msn.com  Thu Feb 22 20:31:18 2024
From: pgcon6 at msn.com (Peter Constable)
Date: Fri, 23 Feb 2024 02:31:18 +0000
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <cc5e6b34-0192-40d5-8b5e-99162fe63684@macfreek.nl>
References: <E3B09251-5CF6-49BB-BD34-D86908C45705@crissov.de>
 <407348d9.2140.18dc16de220.Webtop.127@btinternet.com>
 <DS0PR12MB753520FF02F97DB441F15CCF86562@DS0PR12MB7535.namprd12.prod.outlook.com>
 <cc5e6b34-0192-40d5-8b5e-99162fe63684@macfreek.nl>
Message-ID: <CH3PR12MB7523DE5C0E85936E6DE3211886552@CH3PR12MB7523.namprd12.prod.outlook.com>

Hi, Freek

I was responding to a general principle being put forth by William. The only concern I was expressing in regard to krul was the application of William's principle as an argument for encoding krul.

If it's clear that a character is an element of text content with an active user community, then existing font implementations can contribute to a proposal for encoding. But the rationale he was suggesting implied that any graphic symbol that users might want to place on a page warranted encoding so that the symbol can be implemented in fonts. UTC will not buy that.

Others have given useful suggestions for what might provide helpful evidence in an encoding proposal. The mention of users finding workarounds to display something _similar_ in text brought to mind the proposal for encoding the Bitcoin currency symbol: UTC found helpful evidence showing that users were interchanging text using characters that were similar to the currency symbol: similar enough that the intended meaning might be understood _in context_, but not the same, and liable to be misinterpreted out of context.


Peter



From freek at macfreek.nl  Fri Feb 23 10:33:28 2024
From: freek at macfreek.nl (Freek Dijkstra)
Date: Fri, 23 Feb 2024 17:33:28 +0100
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>
References: <a5650b88-75c0-4603-84f3-35542acd0694@code2001.com>
 <DDC3C760-0B1D-4191-9B12-4DFB8C546C77@crissov.de>
 <259fc498-fba3-4d28-903b-e269ff59911f@ix.netcom.com>
 <b1a59382-c892-471a-9050-21196b12ee4b@macfreek.nl>
Message-ID: <5401f3a9-6305-48d4-b3b9-2882f3abc79e@macfreek.nl>

Hi all,

A small status update: I got in touch with the webmaster of 
https://unicode-krul.nl/en, which was an effort in 2018 (not 2022 as I 
thought). She is now trying to reach two colleagues from back then, who 
initiated the effort. I'm curious about their reason for abandoning it 
(either lack of interest or because it did not qualify).

Writing the draft is certainly doable (especially thanks to advice from 
Asmus Freytag and others), but I would rather build some support first 
by consulting Dutch language groups, and perhaps getting in touch with a 
linguistic expert, before submitting.

Thanks in particular to Doug Ewell and Norbert Lindenberg for pointing 
me to the Script Ad Hoc committee. After the draft is ready, and I have 
consulted some local experts, I will be back to focus on the process, 
and this committee seems the best place to get started.

I'll post a short message at that time, for those curious folks on this 
list (yes, that's you, if you kept reading till here ;) ).

Regards,
Freek

From wjgo_10009 at btinternet.com  Fri Feb 23 11:54:00 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 23 Feb 2024 17:54:00 +0000 (GMT)
Subject: What's the process for proposing a symbol in the Unicode table?
In-Reply-To: <cc5e6b34-0192-40d5-8b5e-99162fe63684@macfreek.nl>
References: <cc5e6b34-0192-40d5-8b5e-99162fe63684@macfreek.nl>
Message-ID: <5f6c6a1f.5264.18dd71a498f.Webtop.127@btinternet.com>


Asmus Freytag wrote:

> The way I read the discussion on the list, it had descended into 
> general arguments.

Not a matter of "descended": the title of the thread is general, as is 
the first question in the first post in this thread.

Also, it is not an "argument", it is a discussion. Sort of a round 
table, not adversarial.

Peter Constable wrote:

> But the rationale he was suggesting implied that any graphic symbol 
> that users might want to place on a page warranted encoding so that 
> the symbol can be implemented in fonts. UTC will not buy that.

Well, in fairness, I suppose it does imply that, though that was not my 
intention in that post. However, I do tend to favour a policy of 
encoding things that is wider than the policy that is used at present, as 
I opine that that would help progress.

Even if UTC does agree to encode the krul character, it will take some 
years to become implemented. In the meantime a Private Use Area 
character could be used, yet that could lead to ambiguity, though 
possibly not if used in a PDF document and the font, or a subset of the 
font, is embedded in the PDF document.

If using an OpenType font in an application that has OpenType capability, 
one could set it up so that the glyph of a krul is displayed when a 
particular sequence of characters is used. For example, if the sequence 
%k were used for a krul then the glyph for a krul in the font could be 
named, say, krulglyph and the following added to the liga table of the 
font.

sub percent k -> krulglyph;

I have used that technique for various characters that I have devised.
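
As a rough illustration of what such a ligature rule does, here is a toy model in Python. It is not font code and does not use any font library: it simply rewrites a list of glyph names the way a shaping engine would apply the rule above. The function name `apply_liga` is hypothetical; the glyph names percent, k and krulglyph are the ones from the rule.

```python
def apply_liga(glyphs, rule_input=("percent", "k"), rule_output="krulglyph"):
    """Replace every occurrence of the glyph sequence rule_input
    with the single glyph rule_output, scanning left to right."""
    out = []
    i = 0
    n = len(rule_input)
    while i < len(glyphs):
        if tuple(glyphs[i:i + n]) == rule_input:
            out.append(rule_output)  # the whole sequence becomes one glyph
            i += n
        else:
            out.append(glyphs[i])    # no match here: keep the glyph as is
            i += 1
    return out


# The text "a%kb" as a glyph-name sequence.
print(apply_liga(["a", "percent", "k", "b"]))
```

The underlying text still contains the characters % and k; only the displayed glyph changes, which is why this approach needs no new code point but also gives no interoperable encoding.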

For example, at one of the Internationalization and Unicode Conferences 
there was mention of there being no emoji for "I" and "you".

I tried to design some language-independent emoji for those two, and 
some other, personal pronouns. Things I tried just did not seem to work. 
However, I devised a set of abstract emoji-compatible glyphs and I like 
to think that they form a coherent, elegant, colourful, 
language-independent set of glyphs for personal pronoun characters. 
Alas, though, I have been told that the Emoji Subcommittee will not 
encode abstract emoji. I considered that a Private Use Area encoding was 
unsuitable due to ambiguity issues in interoperability.

So I have devised my own encoding system for them, so "I" is encoded as 
%11 and "You" as %21 (that is, 2 for second person, 1 for singular). But 
it is not like having a regular Unicode encoding. I feel that these 
codes are just not going to get applied very much at all. But there we 
are.
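
A minimal sketch of how a receiving program might expand such codes, in Python. Only the two codes given above (%11 and %21) are included; the table name `KNOWN_CODES` and the function name `expand_codes` are hypothetical, and any other percent-digit-digit sequence is deliberately left untouched rather than guessed at.

```python
import re

# Only the two codes stated in this message; nothing else is assumed.
KNOWN_CODES = {"%11": "I", "%21": "You"}

# A percent sign followed by exactly two digits (person, then number).
CODE_PATTERN = re.compile(r"%[0-9][0-9]")


def expand_codes(text):
    """Replace known pronoun codes with their meaning; leave
    unknown codes exactly as they appear in the text."""
    return CODE_PATTERN.sub(lambda m: KNOWN_CODES.get(m.group(0), m.group(0)), text)


print(expand_codes("%11 thank %21"))
```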

The bar for getting newly invented characters encoded into regular 
Unicode is so very high. Is that very high bar reasonable or does it 
impede progress? Does it mean that only large companies with large 
resources are able to reach that very high bar?

For example, newly invented characters that show good potential for 
being applied, and whose application would result in progress, could be 
encoded into regular Unicode as unambiguous sequences (possibly using 
tag characters) without using any new characters. That would mean that 
people could use the characters without being concerned about 
intellectual property rights.

There could be a renaissance of progress.

William Overington

Friday 23 February 2024


From julesbertholet at quoi.xyz  Tue Feb 27 11:19:28 2024
From: julesbertholet at quoi.xyz (Jules Bertholet)
Date: Tue, 27 Feb 2024 17:19:28 +0000 (UTC)
Subject: Should the Yijing symbols be made East Asian Wide?
Message-ID: <61c220e7627f2d487ff169d354867060b922d67e.camel@quoi.xyz>

UAX 11 (https://www.unicode.org/reports/tr11/#ED7) says of the
East_Asian_Width property:

> Neutral (Not East Asian): [...] Neutral characters do not occur in
> legacy East Asian character sets. By extension, they also do not occur
> in East Asian typography.

However, there are several ranges of characters which are assigned a
width of Neutral despite originating from, and being primarily used in,
East Asian text.

- The Yijing symbols: these symbols originate from the Yi Jing
(https://en.wikipedia.org/wiki/I_Ching), an ancient Chinese divination
text. These are encoded in Unicode in the "Yijing Hexagram Symbols"
block (https://www.unicode.org/charts/PDF/U4DC0.pdf), as well as under
the "Yijing monogram and digram symbols" and "Yijing trigram symbols"
subheadings in the "Miscellaneous Symbols" block
(https://www.unicode.org/charts/PDF/U2600.pdf).

- The Tai Xuan Jing symbols: these are from another Chinese divination
text (https://en.wikipedia.org/wiki/Taixuanjing). Encoded in the block
of the same name (https://www.unicode.org/charts/PDF/U1D300.pdf).

- The counting rod units and ideographic tally marks: encoded in the
"Counting Rod Numerals" block
(https://www.unicode.org/charts/PDF/U1D360.pdf), under the respective
subheadings. (This block also contains two Western tally marks which
should not be East Asian Wide).

Given the origin and use of these characters, I believe they should be
considered East Asian Wide, not Neutral as currently specified. As
additional supporting evidence, glibc currently treats the Yijing
hexagrams as wide:
https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/unicode-gen/utf8_gen.py;h=e273607b6710811bbbd713fe204100b248d1f7ec;hb=HEAD#l274
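
To see what a given implementation currently reports for these characters, a minimal Python sketch follows. It queries the East_Asian_Width value of one representative code point per group listed above; the sample selection and the names `SAMPLES` and `width_report` are my own choices for illustration. The result depends on the Unicode version bundled with the Python build being run, so the script prints that version as well.

```python
import unicodedata

# One representative code point per group (chosen for illustration).
SAMPLES = {
    "Yijing hexagram": 0x4DC0,
    "Yijing trigram": 0x2630,
    "Yijing monogram": 0x268A,
    "Tai Xuan Jing symbol": 0x1D300,
    "Counting rod unit digit": 0x1D360,
    "Ideographic tally mark": 0x1D372,
    "Western tally mark": 0x1D377,
}


def width_report():
    """Map each sample label to the East_Asian_Width value found
    in this Python's Unicode tables."""
    return {
        label: unicodedata.east_asian_width(chr(cp))
        for label, cp in SAMPLES.items()
    }


print("Unicode data version:", unicodedata.unidata_version)
for label, width in width_report().items():
    print(f"{label}: {width}")
```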

Jules Bertholet


From markus.icu at gmail.com  Tue Feb 27 12:23:41 2024
From: markus.icu at gmail.com (Markus Scherer)
Date: Tue, 27 Feb 2024 10:23:41 -0800
Subject: Should the Yijing symbols be made East Asian Wide?
In-Reply-To: <61c220e7627f2d487ff169d354867060b922d67e.camel@quoi.xyz>
References: <61c220e7627f2d487ff169d354867060b922d67e.camel@quoi.xyz>
Message-ID: <CAN49p6oe2HA_-bGoE=-Z0b=O2uENjMvGcz4fbUJi3k=P3QxLCA@mail.gmail.com>

Hi Jules,

I can't answer your question, but wanted to note that this mailing list can
be useful for discussion but is not monitored for making changes.
When the discussion settles, and if changes are suggested, remember to
submit a request via https://www.unicode.org/reporting.html

Best regards,
markus