From wjgo_10009 at btinternet.com Tue Oct 5 13:51:05 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Tue, 5 Oct 2021 19:51:05 +0100 (BST)
Subject: Is there an emoji for Thank you
Message-ID: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com>

Is there an emoji for

Thank you

?

If not, could there be, should there be?

What would it look like?

William Overington

Tuesday 5 October 2021

From Andrew.Glass at microsoft.com Tue Oct 5 14:22:36 2021
From: Andrew.Glass at microsoft.com (Andrew Glass)
Date: Tue, 5 Oct 2021 19:22:36 +0000
Subject: [EXTERNAL] Is there an emoji for Thank you
In-Reply-To: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com>
Message-ID:

🙏

________________________________
From: Unicode on behalf of William_J_G Overington via Unicode
Sent: Tuesday, October 5, 2021 11:51 AM
To: unicode at corp.unicode.org
Subject: [EXTERNAL] Is there an emoji for Thank you

Is there an emoji for

Thank you

?

If not, could there be, should there be?

What would it look like?

William Overington

Tuesday 5 October 2021

From mark at kli.org Tue Oct 5 14:24:48 2021
From: mark at kli.org (Mark E. Shoulson)
Date: Tue, 5 Oct 2021 15:24:48 -0400
Subject: Is there an emoji for Thank you
In-Reply-To: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com>
Message-ID: <566358f8-6f1d-9cc5-5ae2-a3383dc26574@shoulson.com>

Emoji are pictures of things. To the extent they convey emotions, it's because they're pictures of things (facial expressions) which we associate with emotions.

If you can't say what it would look like, that almost definitionally excludes it from being an emoji, a picture of a thing, doesn't it?

~mark

On 10/5/21 2:51 PM, William_J_G Overington via Unicode wrote:
> Is there an emoji for
>
> Thank you
>
> ?
>
> If not, could there be, should there be?
>
> What would it look like?
>
> William Overington
>
> Tuesday 5 October 2021

From textexin at xencraft.com Tue Oct 5 15:28:10 2021
From: textexin at xencraft.com (Tex)
Date: Tue, 5 Oct 2021 13:28:10 -0700
Subject: Is there an emoji for Thank you
In-Reply-To: <566358f8-6f1d-9cc5-5ae2-a3383dc26574@shoulson.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <566358f8-6f1d-9cc5-5ae2-a3383dc26574@shoulson.com>
Message-ID: <000901d7ba27$844f9d60$8ceed820$@xencraft.com>

That's a bit unfair Mark. Someone can want to represent an idea (be it an object, emotion, action, or other concept) and not have the visual or artistic skills to know how to depict it, or simply not know the best way to do so, given many options being considered. People should be able to ask a question without there being an implication if you have to ask there is no answer.

And pictures often represent more than a static view of an object. The choice of the view and the context in the image can indicate action or other states.
There are many photographic images that communicate sadness, loneliness or a host of other emotions without showing facial expressions at all.

tex

-----Original Message-----
From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Mark E. Shoulson via Unicode
Sent: Tuesday, October 5, 2021 12:25 PM
To: unicode at corp.unicode.org
Subject: Re: Is there an emoji for Thank you

Emoji are pictures of things. To the extent they convey emotions, it's
because they're pictures of things (facial expressions) which we
associate with emotions.

If you can't say what it would look like, that almost definitionally
excludes it from being an emoji, a picture of a thing, doesn't it?

~mark

On 10/5/21 2:51 PM, William_J_G Overington via Unicode wrote:
> Is there an emoji for
>
> Thank you
>
> ?
>
> If not, could there be, should there be?
>
> What would it look like?
>
> William Overington
>
> Tuesday 5 October 2021

From asmusf at ix.netcom.com Tue Oct 5 15:58:54 2021
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Tue, 5 Oct 2021 13:58:54 -0700
Subject: Is there an emoji for Thank you
In-Reply-To: <000901d7ba27$844f9d60$8ceed820$@xencraft.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <566358f8-6f1d-9cc5-5ae2-a3383dc26574@shoulson.com> <000901d7ba27$844f9d60$8ceed820$@xencraft.com>
Message-ID:

An HTML attachment was scrubbed...

From mark at kli.org Tue Oct 5 17:19:28 2021
From: mark at kli.org (Mark E. Shoulson)
Date: Tue, 5 Oct 2021 18:19:28 -0400
Subject: Is there an emoji for Thank you
In-Reply-To: <000901d7ba27$844f9d60$8ceed820$@xencraft.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <566358f8-6f1d-9cc5-5ae2-a3383dc26574@shoulson.com> <000901d7ba27$844f9d60$8ceed820$@xencraft.com>
Message-ID: <79d360ef-e0ad-fac7-1f04-a3e884b23a1d@shoulson.com>

On 10/5/21 4:28 PM, Tex via Unicode wrote:
> That's a bit unfair Mark. Someone can want to represent an idea (be it an object, emotion, action, or other concept) and not have the visual or artistic skills to know how to depict it, or simply not know the best way to do so, given many options being considered. People should be able to ask a question without there being an implication if you have to ask there is no answer.

Perhaps that is too strong. I guess what I'm saying is if it's something you can't really imagine being captured by an image, then, well, I guess you shouldn't expect it to be captured by an image.

> And pictures often represent more than a static view of an object. The choice of the view and the context in the image can indicate action or other states.
> There are many photographic images that communicate sadness, loneliness or a host of other emotions without showing facial expressions at all.
Facial expressions was not intended as an exhaustive list of things we associate with emotions, though they are the most commonly-used in emoji. You also can't really compare emoji to the corpus of photographic images out there. Emoji are limited, both in size and complexity and also in specification. What I mean by the last is that emoji are not defined as a vector or raster image, but by a description (generally a very short one) which may be interpreted in different ways and needs to remain identifiable in all of them (ideally).

~mark

>
> tex
>
>
> -----Original Message-----
> From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Mark E. Shoulson via Unicode
> Sent: Tuesday, October 5, 2021 12:25 PM
> To: unicode at corp.unicode.org
> Subject: Re: Is there an emoji for Thank you
>
> Emoji are pictures of things. To the extent they convey emotions, it's
> because they're pictures of things (facial expressions) which we
> associate with emotions.
>
> If you can't say what it would look like, that almost definitionally
> excludes it from being an emoji, a picture of a thing, doesn't it?
>
> ~mark
>
> On 10/5/21 2:51 PM, William_J_G Overington via Unicode wrote:
>> Is there an emoji for
>>
>> Thank you
>>
>> ?
>>
>> If not, could there be, should there be?
>>
>> What would it look like?
>>
>> William Overington
>>
>> Tuesday 5 October 2021

From lyratelle at gmx.de Tue Oct 5 17:20:29 2021
From: lyratelle at gmx.de (Dominikus Dittes Scherkl)
Date: Wed, 6 Oct 2021 00:20:29 +0200
Subject: Is there an emoji for Thank you
In-Reply-To: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com>
Message-ID: <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de>

On 05.10.21 at 20:51, William_J_G Overington via Unicode wrote:
> Is there an emoji for
>
> Thank you
>
Something like Heart + Thumbs up?

--
Dominikus Dittes Scherkl

From mark at kli.org Tue Oct 5 17:27:46 2021
From: mark at kli.org (Mark E.
Shoulson)
Date: Tue, 5 Oct 2021 18:27:46 -0400
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <566358f8-6f1d-9cc5-5ae2-a3383dc26574@shoulson.com> <000901d7ba27$844f9d60$8ceed820$@xencraft.com>
Message-ID: <0b8ae6bd-5841-05ca-8d98-49529250689e@shoulson.com>

On 10/5/21 4:58 PM, Asmus Freytag via Unicode wrote:
> There's a HUUUGE distinction between
>
> (1) I wonder whether there's an emoji that's commonly used to express
> "Thank You!" (and if so, what does it look like?).
>
> (2) We should have / add an emoji for "Thank You!" but I have no idea
> what it would look like.

Good distinction/explanation, thank you.

"Thank You" is a "reaction" icon often wished-for in Slack conversations, at least where I work. Slack lets you upload your own "emoji" (even animated ones) so people have supplied their own, but they're generally text-based, like a teeny-tiny post-it note that says "Thank you!" on it, or some animated colorful text like "thx!" or whatever. But those are just pretty ways of using ordinary words and letters.

~mark

>
> The first goes into the direction that Tex perhaps was aiming at:
> there are images that express ideas; some evoke these ideas
> spontaneously, others may be associated with an idea by convention.
> Existing emoji span both dimensions. Asking about how existing emoji
> are used should always be fair game. And should definitely be
> something that gets a decent answer on this list.
>
> The second is a no-go. Only if you are aware of a convention shared by
> others that associates an image or icon with a concept should you
> propose to reify that by adding the image to the emoji set. (You may
> even choose to not propose it, regardless). However, if there's no
> existing convention, and you are not aware of one, you should
> definitely not raise a proposal.
>
> In other words, just because you think something ought to be
> expressible --- unless you have a concrete expression for it, it can't
> even be evaluated or considered, and unless you have an idea that
> (many) others agree both on the need to express something and on the
> proposed expression there's little chance that a proposal would find
> favor.
>
> A./
>
> PS: now (3) "Is there anyone else here who thinks there ought to be a
> visual shorthand for "Thank You!" and if so, what would it look like?"
> would be a fair question. This list may not be the best one to ask it.
>
> On 10/5/2021 1:28 PM, Tex via Unicode wrote:
>> That's a bit unfair Mark. Someone can want to represent an idea (be it an object, emotion, action, or other concept) and not have the visual or artistic skills to know how to depict it, or simply not know the best way to do so, given many options being considered. People should be able to ask a question without there being an implication if you have to ask there is no answer.
>>
>> And pictures often represent more than a static view of an object. The choice of the view and the context in the image can indicate action or other states.
>> There are many photographic images that communicate sadness, loneliness or a host of other emotions without showing facial expressions at all.
>>
>> tex
>>
>>
>> -----Original Message-----
>> From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Mark E. Shoulson via Unicode
>> Sent: Tuesday, October 5, 2021 12:25 PM
>> To: unicode at corp.unicode.org
>> Subject: Re: Is there an emoji for Thank you
>>
>> Emoji are pictures of things. To the extent they convey emotions, it's
>> because they're pictures of things (facial expressions) which we
>> associate with emotions.
>>
>> If you can't say what it would look like, that almost definitionally
>> excludes it from being an emoji, a picture of a thing, doesn't it?
>>
>> ~mark
>>
>> On 10/5/21 2:51 PM, William_J_G Overington via Unicode wrote:
>>> Is there an emoji for
>>>
>>> Thank you
>>>
>>> ?
>>>
>>> If not, could there be, should there be?
>>>
>>> What would it look like?
>>>
>>> William Overington
>>>
>>> Tuesday 5 October 2021
>
>

From mark at macchiato.com Tue Oct 5 17:44:50 2021
From: mark at macchiato.com (Mark Davis ☕️)
Date: Tue, 5 Oct 2021 15:44:50 -0700
Subject: Is there an emoji for Thank you
In-Reply-To: <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de>
Message-ID:

Already representable, so no emoji character necessary: ❤️ 👍

Mark

On Tue, Oct 5, 2021 at 3:22 PM Dominikus Dittes Scherkl via Unicode <unicode at corp.unicode.org> wrote:

> On 05.10.21 at 20:51, William_J_G Overington via Unicode wrote:
> > Is there an emoji for
> >
> > Thank you
> >
> Something like Heart + Thumbs up?
>
> --
> Dominikus Dittes Scherkl
>
>
>

From tom at honermann.net Tue Oct 5 18:06:20 2021
From: tom at honermann.net (Tom Honermann)
Date: Tue, 5 Oct 2021 19:06:20 -0400
Subject: Is there an emoji for Thank you
In-Reply-To: <0b8ae6bd-5841-05ca-8d98-49529250689e@shoulson.com>
References: <0b8ae6bd-5841-05ca-8d98-49529250689e@shoulson.com>
Message-ID: <387F582F-F7CA-4AA5-9497-320F03BE554B@honermann.net>

> On Oct 5, 2021, at 6:28 PM, Mark E. Shoulson via Unicode wrote:
>
> On 10/5/21 4:58 PM, Asmus Freytag via Unicode wrote:
>> There's a HUUUGE distinction between
>>
>> (1) I wonder whether there's an emoji that's commonly used to express "Thank You!" (and if so, what does it look like?).
>>
>> (2) We should have / add an emoji for "Thank You!" but I have no idea what it would look like.
>
> Good distinction/explanation, thank you.
>
> "Thank You" is a "reaction" icon often wished-for in Slack conversations, at least where I work. Slack lets you upload your own "emoji" (even animated ones) so people have supplied their own, but they're generally text-based, like a teeny-tiny post-it note that says "Thank you!" on it, or some animated colorful text like "thx!" or whatever. But those are just pretty ways of using ordinary words and letters.

At my work place, we use an image of Tom Hanks for this. Plenty of inspiration to be found at https://www.etsy.com/market/tom_hanks_thanks!

Tom.

>
> ~mark
>
>
>>
>> The first goes into the direction that Tex perhaps was aiming at: there are images that express ideas; some evoke these ideas spontaneously, others may be associated with an idea by convention. Existing emoji span both dimensions. Asking about how existing emoji are used should always be fair game. And should definitely be something that gets a decent answer on this list.
>>
>> The second is a no-go. Only if you are aware of a convention shared by others that associates an image or icon with a concept should you propose to reify that by adding the image to the emoji set. (You may even choose to not propose it, regardless). However, if there's no existing convention, and you are not aware of one, you should definitely not raise a proposal.
>>
>> In other words, just because you think something ought to be expressible --- unless you have a concrete expression for it, it can't even be evaluated or considered, and unless you have an idea that (many) others agree both on the need to express something and on the proposed expression there's little chance that a proposal would find favor.
>>
>> A./
>>
>> PS: now (3) "Is there anyone else here who thinks there ought to be a visual shorthand for "Thank You!" and if so, what would it look like?" would be a fair question. This list may not be the best one to ask it.
>>
>>> On 10/5/2021 1:28 PM, Tex via Unicode wrote:
>>> That's a bit unfair Mark.
Someone can want to represent an idea (be it an object, emotion, action, or other concept) and not have the visual or artistic skills to know how to depict it, or simply not know the best way to do so, given many options being considered. People should be able to ask a question without there being an implication if you have to ask there is no answer.
>>>
>>> And pictures often represent more than a static view of an object. The choice of the view and the context in the image can indicate action or other states.
>>> There are many photographic images that communicate sadness, loneliness or a host of other emotions without showing facial expressions at all.
>>>
>>> tex
>>>
>>>
>>> -----Original Message-----
>>> From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Mark E. Shoulson via Unicode
>>> Sent: Tuesday, October 5, 2021 12:25 PM
>>> To: unicode at corp.unicode.org
>>> Subject: Re: Is there an emoji for Thank you
>>>
>>> Emoji are pictures of things. To the extent they convey emotions, it's
>>> because they're pictures of things (facial expressions) which we
>>> associate with emotions.
>>>
>>> If you can't say what it would look like, that almost definitionally
>>> excludes it from being an emoji, a picture of a thing, doesn't it?
>>>
>>> ~mark
>>>
>>> On 10/5/21 2:51 PM, William_J_G Overington via Unicode wrote:
>>>> Is there an emoji for
>>>>
>>>> Thank you
>>>>
>>>> ?
>>>>
>>>> If not, could there be, should there be?
>>>>
>>>> What would it look like?
>>>>
>>>> William Overington
>>>>
>>>> Tuesday 5 October 2021
>>
>>

From asmusf at ix.netcom.com Tue Oct 5 18:35:02 2021
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Tue, 5 Oct 2021 16:35:02 -0700
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de>
Message-ID:

An HTML attachment was scrubbed...

From wjgo_10009 at btinternet.com Wed Oct 6 11:49:49 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Wed, 6 Oct 2021 17:49:49 +0100 (BST)
Subject: The encoding of flags
Message-ID: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com>

In

https://www.unicode.org/L2/L2021/21172-esc-recs.pdf

Emoji Subcommittee Report Q4, 2021

There is a section entitled

Closing the Door on Flag Category F4

There is included the following.

> We encourage vendors to instead support inline images, such as
> stickers, GIFs, and so on

My concern is that it is one thing for large organizations to agree on common standards that benefit consumers, but quite another for them to be asked at a meeting of the Unicode Technical Committee to agree to restrict future development. It is not what they are there to do. Are they allowed to do that as there is no basis for that restriction benefitting consumers?

There could be a quite straightforward solution. Simply register with index numbers expressed in a sequence of tag digit characters all flags for which a glyph is supplied, as a white flag followed by a sequence of tag digits followed by a cancel tag. That way people who want flags encoded for unambiguous plain text use get what they want.

William Overington

Wednesday 6 October 2021

From jameskass at code2001.com Wed Oct 6 20:30:01 2021
From: jameskass at code2001.com (James Kass)
Date: Thu, 7 Oct 2021 01:30:01 +0000
Subject: The encoding of flags
In-Reply-To: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com>
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com>
Message-ID:

On 2021-10-06 4:49 PM, William_J_G Overington via Unicode wrote:
> My concern is that it is one thing for large organizations to agree on
> common standards that benefit consumers, but quite another for them to
> be asked at a meeting of the Unicode Technical Committee to agree to
> restrict future development. It is not what they are there to do.
Are
> they allowed to do that as there is no basis for that restriction
> benefitting consumers?
>
> There could be a quite straightforward solution. Simply register with
> index numbers expressed in a sequence of tag digit characters all
> flags for which a glyph is supplied, as a white flag followed by a
> sequence of tag digits followed by a cancel tag. That way people who
> want flags encoded for unambiguous plain text use get what they want.

Flag categories are explained here:
https://unicode.org/emoji/proposals.html#Flags

Closing flags in category "F4" means that future emoji flag proposals will not be considered. (Although "F3" might be opened by a future mechanism.)

"F4" includes identity flags such as those used to support a philosophy, worldview, or sexual identity. Such flags are legion and many are in a constant state of flux with competing designs.

Closing the door on "F4" makes sense. Encouraging vendors to support in-line graphics is the best way to ensure that the recipient sees exactly what the sender intended.

There's at least a couple of alternatives to in-line graphics. One, Unicode could announce that it is now the world's standard for clip-art encoding. Or two, a mechanism like QID Emoji. Supporting in-line graphics seems simpler. And it would even work on non-flag images.

By closing "F4", Unicode is encouraging future development -- either towards in-line graphics support or QID Emoji font creation.

[QID Emoji fonts are possible with existing technology and don't require any action/approval by Unicode. A lack of QID Emoji fonts suggests that there is little demand for them. Rebecca Bettencourt had included six QID emoji strings (kudos for Morgan Freeman!) in the Fairfax HD font for testing purposes. A graphic showing those six designs is here: https://twitter.com/BeckieRGB/status/1198039233526718465 .]
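
For readers who want to see what the quoted suggestion (a white flag, then tag digits, then a cancel tag) would look like as code points, here is a minimal Python sketch. It is only an illustration of the idea as worded in the thread, not an approved Unicode mechanism: the function names and the index value 123 are hypothetical, no registry of flag index numbers exists, and the choice of U+1F3F3 WAVING WHITE FLAG as the base is an assumption drawn from the phrase "white flag". The tag characters themselves (U+E0030..U+E0039 for tag digits, U+E007F CANCEL TAG) are real and mirror ASCII at an offset of U+E0000.

```python
# Sketch of the "white flag + tag digits + cancel tag" idea from the thread.
# Hypothetical scheme: nothing here is registered or approved by Unicode.

WHITE_FLAG = "\U0001F3F3"   # U+1F3F3 WAVING WHITE FLAG (assumed base character)
CANCEL_TAG = "\U000E007F"   # U+E007F CANCEL TAG
TAG_OFFSET = 0xE0000        # tag characters mirror ASCII at this offset


def encode_flag_index(index: int) -> str:
    """Build the sequence for a (hypothetical) registered flag index."""
    if index < 0:
        raise ValueError("index must be non-negative")
    # Map each ASCII digit to the corresponding TAG DIGIT character.
    tag_digits = "".join(chr(TAG_OFFSET + ord(d)) for d in str(index))
    return WHITE_FLAG + tag_digits + CANCEL_TAG


def decode_flag_index(sequence: str) -> int:
    """Recover the index from a sequence built by encode_flag_index."""
    if len(sequence) < 3 or sequence[0] != WHITE_FLAG or sequence[-1] != CANCEL_TAG:
        raise ValueError("not a flag-index sequence")
    middle = sequence[1:-1]
    # Every character between base and terminator must be a TAG DIGIT.
    if not all(0xE0030 <= ord(c) <= 0xE0039 for c in middle):
        raise ValueError("expected only tag digits between base and cancel tag")
    return int("".join(chr(ord(c) - TAG_OFFSET) for c in middle))


if __name__ == "__main__":
    seq = encode_flag_index(123)
    print([f"U+{ord(c):04X}" for c in seq])
    print(decode_flag_index(seq))
```

Under these assumptions, index 123 becomes five code points: the white flag, TAG DIGIT ONE, TAG DIGIT TWO, TAG DIGIT THREE, and CANCEL TAG. Software that does not recognize the sequence would see the base flag followed by characters it may ignore or treat as separate units.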

From wjgo_10009 at btinternet.com Thu Oct 7 04:48:13 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Thu, 7 Oct 2021 10:48:13 +0100 (BST)
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To:
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com>
Message-ID: <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com>

James Kass wrote:

> QID Emoji fonts are possible with existing technology ...

Yes.

> ... and don't require any action/approval by Unicode.

Well, is that correct? The most recent thing I saw was that it went to a recent Unicode Technical Committee meeting and that there was a long discussion. I do not know what was said or what conclusions, if any, were reached.

> A lack of QID Emoji fonts suggests that there is little demand for
> them.

Well, if the format is approved and time has passed since that approval, then that could possibly be true, but has the format been approved?

I produced an experimental font based on the original proposal to try the concept - only one glyph and that was just a test glyph displayed, not a realistic glyph. Yet it worked well and was a useful personal learning experience.

So what is the present situation regarding QID emoji please?

William Overington

Thursday 7 October 2021

From pgcon6 at msn.com Thu Oct 7 13:55:23 2021
From: pgcon6 at msn.com (Peter Constable)
Date: Thu, 7 Oct 2021 18:55:23 +0000
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To: <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com>
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com> <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com>
Message-ID:

When UTC last discussed the QID proposal, there were significant concerns raised, and not a consensus to support. UTC has no plans for further discussion of the QID proposal.

Peter

From: Unicode On Behalf Of William_J_G Overington via Unicode
Sent: October 7, 2021 2:48 AM
To: unicode at corp.unicode.org
Subject: QID emoji (from Re: The encoding of flags)

James Kass wrote:

> QID Emoji fonts are possible with existing technology ...

Yes.

> ... and don't require any action/approval by Unicode.

Well, is that correct? The most recent thing I saw was that it went to a recent Unicode Technical Committee meeting and that there was a long discussion. I do not know what was said or what conclusions, if any, were reached.

> A lack of QID Emoji fonts suggests that there is little demand for them.

Well, if the format is approved and time has passed since that approval, then that could possibly be true, but has the format been approved?

I produced an experimental font based on the original proposal to try the concept - only one glyph and that was just a test glyph displayed, not a realistic glyph. Yet it worked well and was a useful personal learning experience.

So what is the present situation regarding QID emoji please?

William Overington

Thursday 7 October 2021

From wjgo_10009 at btinternet.com Thu Oct 7 13:51:29 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Thu, 7 Oct 2021 19:51:29 +0100 (BST)
Subject: A poem in each of language-independent glyphs, French. and English
Message-ID: <23f2c8bf.3eb8.17c5c189d96.Webtop.100@btinternet.com>

Today is National Poetry Day in the United Kingdom.

Here is a link to my contribution.

The thread includes a poem in each of language-independent glyphs, French, and English.

https://forum.affinity.serif.com/index.php?/topic/150548-national-poetry-day-2021-nationalpoetryday/

William Overington

Thursday 7 October 2021

From wjgo_10009 at btinternet.com Thu Oct 7 14:03:27 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Thu, 7 Oct 2021 20:03:27 +0100 (BST)
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To:
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com> <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com>
Message-ID: <5b3302f2.3f04.17c5c2391ad.Webtop.100@btinternet.com>

Thank you.

William

From jameskass at code2001.com Thu Oct 7 15:16:32 2021
From: jameskass at code2001.com (James Kass)
Date: Thu, 7 Oct 2021 20:16:32 +0000
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To: <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com>
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com> <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com>
Message-ID: <389a0b3b-b277-e0e2-adbf-fdd8da4f9a2b@code2001.com>

On 2021-10-07 9:48 AM, William_J_G Overington via Unicode wrote:
>
> James Kass wrote:
>
>> QID Emoji fonts are possible with existing technology ...
>
> Yes.
>
>> ... and don't require any action/approval by Unicode.
>
> Well, is that correct? ...

Unicode hasn't approved QID Emoji and approval appears unlikely, yet it works!

With limited exceptions, Unicode doesn't standardize strings. Unicode only standardizes the characters of which those strings are composed. Anybody can use any Unicode string as a pointer into anyone's database without seeking Unicode's permission.

What this means is that a vendor like Apple or Google could set up to provide users with a plethora of new emoji glyphs/images using the blueprint offered in the QID Emoji proposal. And Unicode wouldn't need to take any action or make any approval of it.
And if the major vendors lacked interest, third parties could step up to the plate. If there was sufficient user demand for it.

From beckiergb at gmail.com Thu Oct 7 17:27:03 2021
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Thu, 7 Oct 2021 15:27:03 -0700
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To: <389a0b3b-b277-e0e2-adbf-fdd8da4f9a2b@code2001.com>
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com> <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com> <389a0b3b-b277-e0e2-adbf-fdd8da4f9a2b@code2001.com>
Message-ID:

On Thu, Oct 7, 2021 at 1:20 PM James Kass via Unicode <unicode at corp.unicode.org> wrote:

> Unicode hasn't approved QID Emoji and approval appears unlikely, yet it
> works!
>

To a certain extent. For example, it works in Firefox, but not Chrome. Ironic given Mozilla's strong opposition to the proposal.

[image: Screen Shot 2021-10-07 at 2.48.05 PM.png]

But even if it works insofar as it displays the right glyph, I've seen some applications behave suboptimally when you go to select it, allowing the selection of individual tag characters, either as equal-width vertical slices of the glyph or a bunch of zero-width characters following it.

Having it work properly everywhere would only happen if Unicode approved it or a third party with the clout of Apple or Google implemented it. And...

> If there was sufficient user demand for it.
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2021-10-07 at 2.48.05 PM.png
Type: image/png
Size: 244114 bytes
Desc: not available

From jameskass at code2001.com Thu Oct 7 20:15:25 2021
From: jameskass at code2001.com (James Kass)
Date: Fri, 8 Oct 2021 01:15:25 +0000
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To:
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com> <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com> <389a0b3b-b277-e0e2-adbf-fdd8da4f9a2b@code2001.com>
Message-ID:

On 2021-10-07 10:27 PM, Rebecca Bettencourt via Unicode wrote:
> To a certain extent. For example, it works in Firefox, but not Chrome.
> Ironic given Mozilla's strong opposition to the proposal.
>
> [image: Screen Shot 2021-10-07 at 2.48.05 PM.png]
>
> But even if it works insofar as it displays the right glyph, I've seen some
> applications behave suboptimally when you go to select it, allowing the
> selection of individual tag characters, either as equal-width vertical
> slices of the glyph or a bunch of zero-width characters following it.

It works fine in LibreOffice running on Win 7, even for copy/paste of the strings. I could not make it work in Chrome or BabelPad on Win 7. It's possible that it would work in BabelPad and Chrome under newer versions of Windows. I suspect that lack of correct display is caused by obsolete font engines.

If anyone else would like to check it out, here's a link to the Fairfax HD page:
https://www.kreativekorp.com/software/fonts/fairfaxhd.shtml

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 20211007_2_Capture.JPG
Type: image/jpeg
Size: 121808 bytes
Desc: not available

From duerst at it.aoyama.ac.jp Fri Oct 8 04:43:00 2021
From: duerst at it.aoyama.ac.jp (Martin J. Dürst)
Date: Fri, 8 Oct 2021 18:43:00 +0900
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de>
Message-ID:

On 2021-10-06 08:35, Asmus Freytag via Unicode wrote:
> On 10/5/2021 3:44 PM, Mark Davis ☕️ via Unicode wrote:
>> Already representable, so no emoji character necessary: ❤️ 👍
>
> In regular writing, I would distinguish a circumlocution from "a word for it".
> Both can get the meaning across, but they're clearly not the same. A similar
> distinction is applicable to emoji.

Well, ❤️ 👍 can be written ❤️👍, and then it would clearly be a word (of two characters, so very short compared with the average). And "thank you" is a two-word phrase to start with.

> However, sometimes we have a "set phrase". If it's the case that a certain
> string of emoji acquires a conventional meaning, then that would be equivalent
> to a set phrase. And presumably mean that having a single word for it becomes
> much less of a concern.

Yes. There are lots of concepts that use two or three words. It depends on the language, and in many ways is a question of orthography. German is famous for connecting things where other languages don't connect.

> However, if everyone uses a different ad-hoc circumlocution I would not count
> that as "representable" in the sense that matters for encoding decisions.
>
> I would make that as a principled distinction, irrespective of where you come
> down here for "Thank You!".
>
> Andrew Glass had suggested: 🙏
>
> Clearly, neither his nor your suggestion is as universal as the spoken phrase
> (within its language). So, you could say that a clear and unambiguous
> representation in emoji does not (yet) exist. And it may never exist.

🙏, to just take an example, can be used for thank you, but also to represent "please" or "praying/prayer", and probably other things. And that's not something Unicode can decide, it's the users who make things up.

Regards, Martin.

> A./
>

From wjgo_10009 at btinternet.com Fri Oct 8 07:30:23 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 8 Oct 2021 13:30:23 +0100 (BST)
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To:
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com> <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com> <389a0b3b-b277-e0e2-adbf-fdd8da4f9a2b@code2001.com>
Message-ID: <38d37042.525a.17c5fe211c4.Webtop.100@btinternet.com>

Rebecca Bettencourt wrote:

> Having it work properly everywhere would only happen if Unicode
> approved it or a third party with the clout of Apple or Google
> implemented it. And...

>> If there was sufficient user demand for it.

Sort of like the Webdings glyphs.

Is it the case that Microsoft of its own decision, nothing to do with an existing user demand, designed the Webdings glyphs, put them in a font and bundled the font as a "free with" font in with the Windows operating system? So end users found them available, then used them, so then they were widely used.

If I remember correctly, when those glyphs were later proposed to be encoded into The Unicode Standard, there was a little speculation from outsiders as to whether some other large businesses would object, but in the event they did not.
However, suspending your disbelief to the extent necessary as if watching Star Trek with its holodeck, suppose some people read my novels and consider that they would like a system as in the novels all implemented in real life unambiguously and interoperably between platforms in email systems and web pages and on mobile telephones, how could that "user demand" have any effect when the idea is not originated BY a big business, nor AT a university, but by someone NOT representing an organization?

What does "sufficient" in the phrase "sufficient user demand" mean in practice?

Is the bar so high as to be unreachable in practice?

William Overington

Friday 8 October 2021

From wjgo_10009 at btinternet.com Fri Oct 8 08:08:39 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 8 Oct 2021 14:08:39 +0100 (BST)
Subject: QID emoji (from Re: The encoding of flags)
In-Reply-To: <389a0b3b-b277-e0e2-adbf-fdd8da4f9a2b@code2001.com>
References: <12eb8552.1a16.17c5682de51.Webtop.100@btinternet.com> <91d981d.28c4.17c5a273d6d.Webtop.100@btinternet.com> <389a0b3b-b277-e0e2-adbf-fdd8da4f9a2b@code2001.com>
Message-ID: <6ae3212.5441.17c60051a34.Webtop.100@btinternet.com>

James Kass wrote:

> Anybody can use any Unicode string as a pointer into anyone's database
> without seeking Unicode's permission.

At present, with a couple of people, emails are sent between me and them using !123 to mean Good day. or its equivalent in the recipient's language of choice. It is just friendly.

A more rugged format is an integral sign followed by circled digits. Is it a sort of mathematics? Possibly, as it is expressing information in a way that the information can be manipulated. So, no problem.

However, although I suggest the possibility of using, if Unicode Inc.
were to formally encode it, the base character of the QID proposal followed by a TAG EXCLAMATION MARK followed by some TAG DIGITs followed by a CANCEL TAG, it seems to me that it would not be correct to implement it without such formality.

It is hard to say exactly why that is my opinion. I suppose that it is because the first one looks much like using hashtags, the second looks like mathematics, yet the third one looks cy-près (so near) what Unicode Inc. does with sequences such as for some flags that it would not be right to do so without formal Unicode approval.

> What this means is that a vendor like Apple or Google could set up to
> provide users with a plethora of new emoji glyphs/images using the
> blueprint offered in the QID Emoji proposal. And Unicode wouldn't
> need to take any action or make any approval of it. And if the major
> vendors lacked interest, third-partiers could step up to the plate.
> If there was sufficient user demand for it.

Well, possibly, but what about issues of clashes between intellectual property rights and the desire for unambiguous and interoperable use across platforms?

If the scenario you suggest happened it could possibly lead to the sort of problems that happened before The Unicode Standard was produced and resolved them.

William Overington

Friday 8 October 2021

From asmusf at ix.netcom.com Fri Oct 8 11:36:45 2021
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Fri, 8 Oct 2021 09:36:45 -0700
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de>
Message-ID: <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From mark at macchiato.com Fri Oct 8 11:55:48 2021
From: mark at macchiato.com (=?UTF-8?B?TWFyayBEYXZpcyDimJXvuI8=?=)
Date: Fri, 8 Oct 2021 09:55:48 -0700
Subject: Is there an emoji for Thank you
In-Reply-To: <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
Message-ID:

> Something like Heart + Thumbs up?
> Already representable, so no emoji character necessary: ❤️ 👍

I was responding to the first line. There is no need for an emoji of "Heart + Thumbs up" because people can just write "heart" and then "thumbs up". Now, I don't think that would be particularly understood as "thank you".

IMO this thread is pointless. We don't encode emoji for a concept that doesn't have a clear pictorial representation. That is clear when anyone expends a modicum of effort to read the guidelines on https://unicode.org/emoji/proposals.html instead of wasting other people's time.

Mark

On Fri, Oct 8, 2021 at 9:38 AM Asmus Freytag via Unicode <unicode at corp.unicode.org> wrote:

> On 10/8/2021 2:43 AM, Martin J. Dürst via Unicode wrote:
>
> ..., if everyone uses a different ad-hoc circumlocution I would not count
> that as "representable" in the sense that matters for encoding decisions.
>
> I would make that as a principled distinction, irrespective of where you
> come down here for "Thank You!".
>
> Andrew Glass had suggested: 🙏
>
> Clearly, neither his, nor your suggestion are as universal as the spoken
> phrase (within its language). So, you could say that a clear and unambiguous
> representation in emoji does not (yet) exist.
>
>
> And it may never exist. 🙏, to just take an example, can be used for thank
> you, but also to represent "please" or "praying/prayer", and probably
> other things. And that's not something Unicode can decide, it's the users
> who make things up.
>
> Agreed.
>
> The point is, users of English have settled on a pretty universal phrase,
> and you can settle the question whether that is "representable" in written
> English.
>
> For the emoji writing system, users have "agreed" on all sorts of
> conventions, like the secondary meaning given the "egg plant" emoji, but it
> isn't clear to me that "thank you" has a common and recognizable
> representation (yet).
>
> One may evolve, but just because anyone can put together two emoji that
> (to them) express the concept of "thank you" doesn't mean that it is
> "representable". If lots of people agree on such an emoji phrase, so that
> they would use it when writing and recognize it with reasonable certainty
> where they see it written, then we can say that that phrase or idiom is a
> representation of that concept.
>
> Until that point you would have to say that the question is open. I don't
> speak "emoji" well enough to know whether the ❤️👍 idiom has achieved
> critical mass in recognition, but the fact that on this list we immediately
> got an alternate, 🙏, illustrates the problem: the suggested idiom is at
> this point not universal.
>
> This isn't to say that everything has to have a universal representation
> or that all emoji can only have one meaning. Clearly, that's not how the
> writing system works. Just as some languages have a much broader range of
> "thank you!" expressions than others, or are able to use "thank you" to
> mean a request.
>
> But before you call something "representable" in an evolving writing
> system, there should be some expectation of that representation being
> clearly recognized by others. Certainly if you use that verdict of
> "representable" to foreclose other innovations, like adding a new emoji
> (whether with a primary or alternate meaning covering that concept).
>
> A./
>
From kenwhistler at sonic.net Fri Oct 8 12:30:35 2021
From: kenwhistler at sonic.net (Ken Whistler)
Date: Fri, 8 Oct 2021 10:30:35 -0700
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
Message-ID:

On 10/8/2021 9:55 AM, Mark Davis ☕️ (<-- identity signalling emoji) via Unicode wrote:
> IMO this thread is pointless.

Well, not completely. ?

> We don't encode emoji for a concept that doesn't have a clear
> pictorial representation.

Which explains why we have an emoji for "price tag" but not one for "price support", for example.

But in addition to clearly pictographic emoji (which then get used to convey whatever associated connotations folks might connect with them), there are many emoji which convey emotions or other affective expressions through association with gestures, hand shapes and, of course, facial expressions. ? That aspect of emoji use constitutes a kind of minimal, abstracted sign language for the masses.

And what the "thank you" thread then devolves to, IMO, is whether there are some very widely recognized gestures and/or facial expressions that could be reasonably associated with gratitude in a more or less conventional way. If so, then it *might* (but not necessarily) be reasonable to encode an emoji to represent that gesture or facial expression, and expect it then to be associated with gratitude. In other words, it could then serve as "an emoji for 'thank you'".

I'm not holding my breath, though.
--Ken

From asmusf at ix.netcom.com Fri Oct 8 12:42:16 2021
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Fri, 8 Oct 2021 10:42:16 -0700
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
Message-ID: <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From wjgo_10009 at btinternet.com Fri Oct 8 12:15:27 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 8 Oct 2021 18:15:27 +0100 (BST)
Subject: Is there an emoji for Thank you
In-Reply-To: <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
Message-ID: <3519a96c.5f48.17c60e70ca9.Webtop.100@btinternet.com>

I can think of a design for a new emoji for Thank you.

It is simply like a ligature of a zero with an underscore that has been raised up to align with the lower part of the zero and that has been shifted to the left so as to touch the zero, so that it is as if drawn without lifting the pen from the paper, in some colour such as green.

However, the design is abstract. So its meaning would need to be learned. Yet it is language-independent.

So the question is whether an emoji can be an abstract design or must always of necessity be a picture of either one real item or more than one real item.

William Overington

Friday 8 October 2021
From wjgo_10009 at btinternet.com Fri Oct 8 13:08:07 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 8 Oct 2021 19:08:07 +0100 (BST)
Subject: Is there an emoji for Thank you
In-Reply-To: <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com>
Message-ID: <2cba3597.60e7.17c6117464f.Webtop.100@btinternet.com>

I replied to the earlier post by Asmus Freytag, with an abstract design idea and asking if abstract emoji are allowed, and only after I had posted it did I see the post from Mark Davis.

I then looked at the document for which Mark Davis provides a link, but I have not as yet found anything in that document that says that an emoji must be a pictorial representation of a real object or objects.

But even if that is the present policy, it could be changed at some future time. Sometimes change is necessary for progress. Unicode Inc. changed its policy on whether to encode any emoji at all.

Allowing abstract emoji could open up great possibilities. I have already designed some abstract emoji for personal pronouns, colourful, language-independent, clearly distinct each from the others yet within a related design framework.

William Overington

Friday 8 October 2021
From markus.icu at gmail.com Fri Oct 8 13:23:19 2021
From: markus.icu at gmail.com (Markus Scherer)
Date: Fri, 8 Oct 2021 11:23:19 -0700
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
Message-ID:

On Fri, Oct 8, 2021 at 10:34 AM Ken Whistler via Unicode <unicode at corp.unicode.org> wrote:

> > We don't encode emoji for a concept that doesn't have a clear
> > pictorial representation.
>
> Which explains why we have an emoji for "price tag" but not one for
> "price support", for example.
>
> But in addition to clearly pictographic emoji (which then get used to
> convey whatever associated connotations folks might connect with them),
> there are many emoji which convey emotions or other affective
> expressions through association with gestures, hand shapes and, of
> course, facial expressions. ? That aspect of emoji use constitutes a
> kind of minimal, abstracted sign language for the masses.
>

Yes, as long as people recognize those emoji and agree on what something means.

For example, some cultures might associate a "snot bubble" with sleepiness, but elsewhere people will think someone is sick or crying.
https://emojipedia.org/sleepy-face/

We encode certain pictures that we think people will find useful for a variety of things. YMMV

markus
From textexin at xencraft.com Fri Oct 8 14:06:32 2021
From: textexin at xencraft.com (Tex)
Date: Fri, 8 Oct 2021 12:06:32 -0700
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com>
Message-ID: <004f01d7bc77$9bd22690$d37673b0$@xencraft.com>

This thread is getting repetitive, with the additional examples not really adding much new to the discussion. And it seems to only trigger new lines of inquiry along the path of "well we can change policy again" and "what about this design tactic", despite those questions having been addressed.

So is there an emoji for "Please stop this thread" or "No longer responding to this topic"?

Maybe ?? or ?? or [Speak-No-Evil Monkey on Google Android 12.0]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 45844 bytes
Desc: not available
URL:

From root at corp.unicode.org Fri Oct 8 14:35:28 2021
From: root at corp.unicode.org (root at corp.unicode.org)
Date: Fri, 08 Oct 2021 14:35:28 -0500
Subject: Is there an emoji for Thank you
Message-ID: <61609d80.zV+xUyz5SckllE3e%root@corp.unicode.org>

Tex asked:

> So is there an emoji for "Please stop this thread"
> or "No longer responding to this topic"?

Please consider this thread closed, then. Time to move along to other topics.

Thank you for your compliance.
From pgcon6 at msn.com Fri Oct 8 20:46:10 2021
From: pgcon6 at msn.com (Peter Constable)
Date: Sat, 9 Oct 2021 01:46:10 +0000
Subject: Is there an emoji for Thank you
In-Reply-To: <004f01d7bc77$9bd22690$d37673b0$@xencraft.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <004f01d7bc77$9bd22690$d37673b0$@xencraft.com>
Message-ID:

Or how about: ??

From: Unicode On Behalf Of Tex via Unicode
Sent: Friday, October 8, 2021 12:07 PM
To: unicode at corp.unicode.org
Subject: RE: Is there an emoji for Thank you

This thread is getting repetitive, with the additional examples not really adding much new to the discussion. And it seems to only trigger new lines of inquiry along the path of "well we can change policy again" and "what about this design tactic", despite those questions having been addressed.

So is there an emoji for "Please stop this thread" or "No longer responding to this topic"?

Maybe ?? or ?? or [Speak-No-Evil Monkey on Google Android 12.0]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 45844 bytes
Desc: image001.png
URL:

From textexin at xencraft.com Fri Oct 8 23:23:47 2021
From: textexin at xencraft.com (Tex)
Date: Fri, 8 Oct 2021 21:23:47 -0700
Subject: Is there an emoji for Thank you
In-Reply-To:
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <004f01d7bc77$9bd22690$d37673b0$@xencraft.com>
Message-ID: <000901d7bcc5$74898df0$5d9ca9d0$@xencraft.com>

Good, but it might be better if the thread was unravelling?
:)

That said, there is also:

[Shushing Face on Apple]
[Zipper-Mouth Face on Apple]

From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Peter Constable via Unicode
Sent: Friday, October 8, 2021 6:46 PM
To: Tex; unicode at unicode.org
Subject: RE: Is there an emoji for Thank you

Or how about: ??

From: Unicode On Behalf Of Tex via Unicode
Sent: Friday, October 8, 2021 12:07 PM
To: unicode at corp.unicode.org
Subject: RE: Is there an emoji for Thank you

This thread is getting repetitive, with the additional examples not really adding much new to the discussion. And it seems to only trigger new lines of inquiry along the path of "well we can change policy again" and "what about this design tactic", despite those questions having been addressed.

So is there an emoji for "Please stop this thread" or "No longer responding to this topic"?

Maybe ?? or ?? or [Speak-No-Evil Monkey on Google Android 12.0]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 29003 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.png
Type: image/png
Size: 30793 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.png
Type: image/png
Size: 45844 bytes
Desc: not available
URL:

From jameskass at code2001.com Fri Oct 8 23:42:28 2021
From: jameskass at code2001.com (James Kass)
Date: Sat, 9 Oct 2021 04:42:28 +0000
Subject: Encoding ConScripts
In-Reply-To: <2cba3597.60e7.17c6117464f.Webtop.100@btinternet.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com> <2cba3597.60e7.17c6117464f.Webtop.100@btinternet.com>
Message-ID: <28ccca77-8ece-22bf-2001-d0f46e49bc98@code2001.com>

On 2021-10-08 6:08 PM, William_J_G Overington via Unicode wrote:
> Allowing abstract emoji could open up great possibilities. I have
> already designed some abstract emoji for personal pronouns, colourful,
> language-independent, clearly distinct each from the others yet within
> a related design framework.

Anyone approaching Unicode with proposed new characters needs to point to existing use. (Excluding emoji and items such as new currency symbols or era names.)

Anyone designing new glyphs for personal pronouns is not creating "emoji", but rather is inventing a ConScript. Most of us know that there's a registry for ConScripts using the PUA. So it would be necessary to assign PUA code points, generate a font, make the font generally available, publicize it, and hope it catches on.

That seems to be the only clear path to Unicoding novel characters. If the PUA material usage reaches some kind of critical mass, someone would draft a proposal to Unicode.

That could be perceived as a "high bar". But it's quite reasonable and realistic. And it's not insurmountable. And once that critical mass has been achieved, such glyphs would be on-topic for this forum.
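
[Editorial sketch, not part of the original message.] The first step James Kass lists, assigning PUA code points, might look roughly like the following. The three ranges are the Private Use ranges defined by the Unicode Standard; the helper `assign_pua` and the glyph names are hypothetical, chosen only for illustration, and the sketch makes no attempt to avoid code points already allocated in any ConScript registry.

```python
# Private Use ranges defined by the Unicode Standard.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area (BMP)
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(ch):
    """Return True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def assign_pua(names, start=0xE000):
    """Hypothetical helper: map glyph names to consecutive PUA code points.

    Raises ValueError if the run of code points leaves the Private Use ranges.
    """
    mapping = {}
    for offset, name in enumerate(names):
        cp = start + offset
        if cp > 0x10FFFF or not is_private_use(chr(cp)):
            raise ValueError(f"U+{cp:04X} is outside the Private Use ranges")
        mapping[name] = cp
    return mapping


if __name__ == "__main__":
    # Illustrative glyph names only; they come from no real registry.
    table = assign_pua(["pronoun_1sg", "pronoun_2sg", "pronoun_3sg"])
    for name, cp in table.items():
        print(f"U+{cp:04X}  {name}")
```

Such a table is a private agreement only: a reader without the matching font and mapping will see some other glyph, or none, which is why the remaining steps (font, availability, publicity) matter.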
From pandey at umich.edu Sat Oct 9 00:03:01 2021
From: pandey at umich.edu (Anshuman Pandey)
Date: Sat, 9 Oct 2021 00:03:01 -0500
Subject: Encoding ConScripts
Message-ID: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu>

Oh, how timely!

At IUC 45 next Thursday, Deborah Anderson and I will be presenting on "Negotiating Neographies: Approaches for Encoding Newly-Invented Scripts".

I'll be discussing some metrics that may be used for evaluating neographies (nod to Ken W for that term), conscripts, or whatever you'd like to call them.

Such metrics, as James pointed out, are necessary, especially considering the influx of proposals to encode newly-invented scripts, particularly those of Africa and South Asia.

Sorry Mark, we won't be covering Klingon? ??? or whatever it was that y'all decided was the emoji for "thank you".

All my best,
Anshu

> On Oct 8, 2021, at 11:43 PM, James Kass via Unicode wrote:
>
>> On 2021-10-08 6:08 PM, William_J_G Overington via Unicode wrote:
>> Allowing abstract emoji could open up great possibilities. I have already designed some abstract emoji for personal pronouns, colourful, language-independent, clearly distinct each from the others yet within a related design framework.
>
> Anyone approaching Unicode with proposed new characters needs to point to existing use. (Excluding emoji and items such as new currency symbols or era names.)
>
> Anyone designing new glyphs for personal pronouns is not creating "emoji", but rather is inventing a ConScript. Most of us know that there's a registry for ConScripts using the PUA. So it would be necessary to assign PUA code points, generate a font, make the font generally available, publicize it, and hope it catches on.
>
> That seems to be the only clear path to Unicoding novel characters. If the PUA material usage reaches some kind of critical mass, someone would draft a proposal to Unicode.
>
> That could be perceived as a "high bar". But it's quite reasonable and realistic.
> And it's not insurmountable. And once that critical mass has been achieved, such glyphs would be on-topic for this forum.

From richard.wordingham at ntlworld.com Sat Oct 9 04:48:29 2021
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sat, 9 Oct 2021 10:48:29 +0100
Subject: Saravasti
Message-ID: <20211009104829.393f8927@JRWUBU2>

Has Saravasti retired as moderator?

Richard.

From steffen at sdaoden.eu Sat Oct 9 09:49:55 2021
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Sat, 09 Oct 2021 16:49:55 +0200
Subject: Saravasti
In-Reply-To: <20211009104829.393f8927@JRWUBU2>
References: <20211009104829.393f8927@JRWUBU2>
Message-ID: <20211009144955.jgUFB%steffen@sdaoden.eu>

Richard Wordingham via Unicode wrote in <20211009104829.393f8927 at JRWUBU2>:
 |Has Saravasti retired as moderator?

Saraswati. Saraswati.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From wjgo_10009 at btinternet.com Sat Oct 9 06:45:09 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Sat, 9 Oct 2021 12:45:09 +0100 (BST)
Subject: Encoding ConScripts
In-Reply-To: <28ccca77-8ece-22bf-2001-d0f46e49bc98@code2001.com>
References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com> <2cba3597.60e7.17c6117464f.Webtop.100@btinternet.com> <28ccca77-8ece-22bf-2001-d0f46e49bc98@code2001.com>
Message-ID: <52687cbc.6ce8.17c64df0480.Webtop.100@btinternet.com>

James Kass wrote:

> Anyone designing new glyphs for personal pronouns is not creating
> "emoji", but rather is inventing a ConScript.
Well, my idea for trying to produce designs for emoji for personal pronouns is as a result of a comment made by a gentleman in the discussion after the lecture in the following video videographed at the Unicode and Internationalization Conference in 2015.

Unicode Emoji: How do we standardize that je ne sais ?? at IUC39

https://www.youtube.com/watch?v=9ldSVbXbjl4

Starting at 38 minutes 40 seconds into the video.

> Most of us know that there's a registry for ConScripts using the PUA.

That is one initiative. Also, anyone can use the Private Use Area on his or her own independent initiative.

> So it would be necessary to assign PUA code points, ...

Not necessarily. I decided to use an encoding system of my own design that I have named the Mariposa system. In my opinion, the Mariposa system is far more effective for this purpose than would be a Private Use Area encoding as the codes can be entered straightforwardly on a wide variety of devices and the Mariposa system avoids many of the problems of the wrong glyph being displayed that can occur when using a Private Use Area encoding more widely than in a carefully arranged sandbox-style situation, wherein a Private Use Area encoding can indeed be very effective.

> ... generate a font, make the font generally available, publicize it,
> and hope it catches on.

The font has been produced and is available and has been publicised.

https://corp.unicode.org/pipermail/unicode/2021-January/009300.html

> That seems to be the only clear path to Unicoding novel characters.

Well, the Mariposa system is a path that does not use the Private Use Area.

> If the PUA material usage reaches some kind of critical mass, someone
> would draft a proposal to Unicode.

As they are emoji, is that needed?

William Overington

Saturday 9 October 2021
From wjgo_10009 at btinternet.com Sat Oct 9 10:17:08 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Sat, 9 Oct 2021 16:17:08 +0100 (BST)
Subject: Encoding ConScripts
In-Reply-To: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu>
References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu>
Message-ID: <6e8d661f.7100.17c65a117d8.Webtop.100@btinternet.com>

Anshuman Pandey wrote:

> I'll be discussing some metrics that may be used for evaluating
> neographies (nod to Ken W for that term), conscripts, or whatever
> you'd like to call them.

One way I have of looking at an invention of mine is to regard it as a constructed language.

So, I regard it as Language Y and then I am able for it to have a language code x-y which uses the Private Use facility of the language code system.

Language Y has whole sentences, but no individual words, and each of those sentences is grammatically independent of all of the other sentences.

Each sentence has a glyph and an encoding for the sentence.

The idea is that Language Y can be used as a pivot language between natural languages, simply, and using computing or manual process, with precision of result, by having, for each natural language supported, what I call a sentence.dat file that has a list of pairs of sentences, one in Language Y, represented by its encoding, and one in the natural language, expressed in Unicode plain text.

I fully appreciate that there are a vast number of possible sentences in a language, yet I am not purporting that this invention will do everything, just that it could potentially be useful in some specific situations.

I am not a linguist. I am interested in languages and in communication through the language barrier. My background is in applied physics and mathematics and so I have looked at the issue as one of a mathematical nature, so it is just possible that that approach, linked with consideration by an expert linguist could produce something with useful originality.
I accept that I may have missed something basic, but if any such issues are pointed out to me, maybe they can be resolved.

I am hoping that one day Language Y will be encoded into The Unicode Standard.

William Overington

Saturday 9 October 2021

From jameskass at code2001.com Sat Oct 9 12:36:57 2021
From: jameskass at code2001.com (James Kass)
Date: Sat, 9 Oct 2021 17:36:57 +0000
Subject: Encoding ConScripts
In-Reply-To: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu>
References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu>
Message-ID: <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com>

On 2021-10-09 5:03 AM, Anshuman Pandey via Unicode wrote:
> At IUC 45 next Thursday, Deborah Anderson and I will be presenting on "Negotiating Neographies: Approaches for Encoding Newly-Invented Scripts".
>
> I'll be discussing some metrics that may be used for evaluating neographies (nod to Ken W for that term), conscripts, or whatever you'd like to call them.
>
> Such metrics, as James pointed out, are necessary, especially considering the influx of proposals to encode newly-invented scripts, particularly those of Africa and South Asia.

Neography is a splendid coinage. This is a fascinating topic and I hope that there will be a video of the presentation.

Consider the following two imaginary proposals:

1) I am an inventor and I have designed a brand new writing system for my people. I would like for it to be in Unicode so that it will be standard. Once it is in the Standard I will have something to point to, which might help me persuade my people to use it.

2) We have developed a new writing system for our people. We are using this new writing system to publish books and periodicals. Our new writing system is being taught in our schools and our people have embraced the writing system wholeheartedly.
Most everyone here will agree that proposal 1 is a complete non-starter and that proposal 2 would be given due consideration with a high probability of eventual acceptance.

But many proposals will probably be somewhere in between those two scenarios. Where does one draw the line? I expect that the upcoming presentation will thoughtfully address this matter.

From jameskass at code2001.com Sat Oct 9 12:41:22 2021
From: jameskass at code2001.com (James Kass)
Date: Sat, 9 Oct 2021 17:41:22 +0000
Subject: Saravasti
In-Reply-To: <20211009144955.jgUFB%steffen@sdaoden.eu>
References: <20211009104829.393f8927@JRWUBU2> <20211009144955.jgUFB%steffen@sdaoden.eu>
Message-ID: <3a739b5a-53e3-fd58-d37e-80a49f21cd2b@code2001.com>

[as a generalization...]
It is a wise list moderator who understands that not everybody can spell well.

From doug at ewellic.org Sat Oct 9 13:21:22 2021
From: doug at ewellic.org (Doug Ewell)
Date: Sat, 9 Oct 2021 12:21:22 -0600
Subject: Saravasti
In-Reply-To: <3a739b5a-53e3-fd58-d37e-80a49f21cd2b@code2001.com>
References: <20211009104829.393f8927@JRWUBU2> <20211009144955.jgUFB%steffen@sdaoden.eu> <3a739b5a-53e3-fd58-d37e-80a49f21cd2b@code2001.com>
Message-ID: <009401d7bd3a$76a7f2f0$63f7d8d0$@ewellic.org>

James Kass wrote:

> [as a generalization...]
> It is a wise list moderator who understands that not everybody can
> spell well.

Especially when the spelling in question is a transliteration.

I have a relative who likes to point out that their name has "the biblical spelling," the one that appears in the KJV, not the spelling used today in most English-speaking contexts. This is in spite of the fact that all such spellings are transliterations from 6th-century BCE Hebrew, which had different phonology from modern English.

Both spellings are, of course, "correct," especially when you consider that nearly anything goes when it comes to personal names.
--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From doug at ewellic.org Sat Oct 9 13:42:38 2021
From: doug at ewellic.org (Doug Ewell)
Date: Sat, 9 Oct 2021 12:42:38 -0600
Subject: Saravasti
In-Reply-To: <009401d7bd3a$76a7f2f0$63f7d8d0$@ewellic.org>
References: <20211009104829.393f8927@JRWUBU2> <20211009144955.jgUFB%steffen@sdaoden.eu> <3a739b5a-53e3-fd58-d37e-80a49f21cd2b@code2001.com> <009401d7bd3a$76a7f2f0$63f7d8d0$@ewellic.org>
Message-ID: <009501d7bd3d$6f562410$4e026c30$@ewellic.org>

It was politely pointed out to me that the spelling difference in question was not about V versus W in "Sarasvati", which is a matter of transliteration, but rather the reversal of sounds in "Saravasti". That is clearly not correct, but likely just a typo which escaped the writer's eye (as it did mine).

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From asmusf at ix.netcom.com Sat Oct 9 14:03:46 2021
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 9 Oct 2021 12:03:46 -0700
Subject: Saravasti
In-Reply-To: <009501d7bd3d$6f562410$4e026c30$@ewellic.org>
References: <20211009104829.393f8927@JRWUBU2> <20211009144955.jgUFB%steffen@sdaoden.eu> <3a739b5a-53e3-fd58-d37e-80a49f21cd2b@code2001.com> <009401d7bd3a$76a7f2f0$63f7d8d0$@ewellic.org> <009501d7bd3d$6f562410$4e026c30$@ewellic.org>
Message-ID:

An HTML attachment was scrubbed...
URL:

From wjgo_10009 at btinternet.com Sat Oct 9 13:16:49 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Sat, 9 Oct 2021 19:16:49 +0100 (BST)
Subject: Encoding ConScripts
In-Reply-To: <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com>
References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com>
Message-ID: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com>

James Kass wrote:

> Neography is a splendid coinage.

How is the word 'neography' pronounced please?
It seems to be made up as including neo- and graph, but is it pronounced similarly to geography, as nee-og-raphy? Or how please? William Overington Saturday 9 October 2021 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jameskass at code2001.com Sat Oct 9 17:20:11 2021 From: jameskass at code2001.com (James Kass) Date: Sat, 9 Oct 2021 22:20:11 +0000 Subject: Encoding ConScripts In-Reply-To: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: <865d5e90-5fc6-b7a0-6e11-fde03fa7f494@code2001.com> To clarify my original post, I'd failed to consider natural language neographies until Anshuman Pandey's reply. My comments about PUA usage should only be applied to neographies for which no defined user community exists. Widespread PUA usage should not be required as a metric for natural language neographies. From asmusf at ix.netcom.com Sat Oct 9 17:33:38 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Sat, 9 Oct 2021 15:33:38 -0700 Subject: Encoding ConScripts In-Reply-To: <865d5e90-5fc6-b7a0-6e11-fde03fa7f494@code2001.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <865d5e90-5fc6-b7a0-6e11-fde03fa7f494@code2001.com> Message-ID: An HTML attachment was scrubbed... URL: From mark at kli.org Sat Oct 9 20:56:21 2021 From: mark at kli.org (Mark E. Shoulson) Date: Sat, 9 Oct 2021 21:56:21 -0400 Subject: Encoding ConScripts In-Reply-To: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> Message-ID: <5db166ec-25be-989b-2f63-e045d3329610@shoulson.com> On 10/9/21 01:03, Anshuman Pandey via Unicode wrote: > Oh, how timely! 
> > At IUC 45 next Thursday, Deborah Anderson and I will be presenting on "Negotiating Neographies: Approaches for Encoding Newly-Invented Scripts". With similar timeliness, I followed Rebecca's link about QID emoji to her excellent pages on her fonts, which are positively dripping with ConScripts from the (U)CSUR registry of which she has taken stewardship. (Particularly amazed at the Seussian Extensions, which I've never been able to work out to "fit" even remotely with a normal font. And some of those unofficial punctuation marks deserve some serious consideration; I bet the andorpersand could get popular enough to deserve encoding someday.) > I'll be discussing some metrics that may be used for evaluating neographies (nod to Ken W for that term), conscripts, or whatever you'd like to call them. > > Such metrics, as James pointed out, are necessary, especially considering the influx of proposals to encode newly-invented scripts, particularly those of Africa and South Asia. Yes, some sort of way to quantify need to encode. The bar for the CSUR is quite low; I think a lot of those scripts have never been used by anyone but their inventors (which is fine for PUA assignments as in the CSUR). I find the distinction growing in my mind between neato inventions someone invented and wants to share with the world and scripts that have some historical or literary weight and at least _some_ community of usage (which generally has to pass the rather subjective criterion of being big enough that I've heard of it.) > Sorry Mark, we won't be covering Klingon? Yeah, I was at the UTC when the report from the Script Ad-Hoc was presented, and probably to everyone's great relief I decided not to comment on the response to Klingon there. I figured I'd said pretty much everything I was going to say and wouldn't really be adding anything by bringing it up again. 
At this point, I have to assume that Klingon easily satisfies the "usage" requirement that was given as the ostensible reason for refusal back at the beginning. I complained long about the chicken-and-egg problem we were faced with, but as I showed a few years ago, it is hard to dispute that we've overcome it and the script has seen a non-trivial amount of use. (At least, I hope it's hard to dispute that!) It seems there won't be any official movement, not even a smidgen, until we can resolve the potential IP issues. I can only hope that when and if that's cleared up, there won't still remain the wholly undignified "dignity" argument. (If anyone disagrees and is willing to discuss fitness for encoding apart from IP problems, I'd be happy to engage, but my understanding is that nobody wants to re-examine _any_ criteria until they can look at _all_ the criteria, including the big one.) But yeah. The bottom line is, Unicode *DOES NOT* encode things on the grounds that "this would be great if people used it," only on the grounds that people DO use it. The exception is emoji, and they have their own rules. ~mark > > ??? or whatever it was that y'all decided was the emoji for "thank you". > > All my best, > Anshu > >> On Oct 8, 2021, at 11:43 PM, James Kass via Unicode wrote: >> >> >> >>> On 2021-10-08 6:08 PM, William_J_G Overington via Unicode wrote: >>> Allowing abstract emoji could open up great possibilities. I have already designed some abstract emoji for personal pronouns, colourful, language-independent, clearly distinct each from the others yet within a related design framework. >> Anyone approaching Unicode with proposed new characters needs to point to existing use. (Excluding emoji and items such as new currency symbols or era names.) >> >> Anyone designing new glyphs for personal pronouns is not creating "emoji", but rather is inventing a ConScript. Most of us know that there's a registry for ConScripts using the PUA. 
So it would be necessary to assign PUA code points, generate a font, make the font generally available, publicize it, and hope it catches on. >> >> That seems to be the only clear path to Unicoding novel characters. If the PUA material usage reaches some kind of critical mass, someone would draft a proposal to Unicode. >> >> That could be perceived as a "high bar". But it's quite reasonable and realistic. And it's not insurmountable. And once that critical mass has been achieved, such glyphs would be on-topic for this forum. From mark at kli.org Sat Oct 9 20:59:03 2021 From: mark at kli.org (Mark E. Shoulson) Date: Sat, 9 Oct 2021 21:59:03 -0400 Subject: Encoding ConScripts In-Reply-To: <6e8d661f.7100.17c65a117d8.Webtop.100@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <6e8d661f.7100.17c65a117d8.Webtop.100@btinternet.com> Message-ID: <5ad12753-87f8-d239-0805-7f64f4944e5a@shoulson.com> On 10/9/21 11:17, William_J_G Overington via Unicode wrote: > > One way I have of looking at an invention of mine is to regard it as a > constructed language. > > So, I regard it as Language Y and then I am able for it to have a > language code > > x-y > > which uses the Private Use facility of the language code system. > > ... > > I am hoping that one day that Language Y will become encoded into The > Unicode Standard. > > And it may well be, and good luck with it. But such encoding will only take place *after* language Y acquires some following and usage beyond a very small group, and when there is some corpus of literature being produced in it, etc. Not before. ~mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at kli.org Sat Oct 9 21:02:29 2021 From: mark at kli.org (Mark E. 
Shoulson) Date: Sat, 9 Oct 2021 22:02:29 -0400 Subject: Encoding ConScripts In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <865d5e90-5fc6-b7a0-6e11-fde03fa7f494@code2001.com> Message-ID: <4174e63e-2bba-bb89-f20c-7b706b95f12a@shoulson.com> On 10/9/21 18:33, Asmus Freytag via Unicode wrote: > On 10/9/2021 3:20 PM, James Kass via Unicode wrote: >> >> To clarify my original post, I'd failed to consider natural language >> neographies until Anshuman Pandey's reply. My comments about PUA >> usage should only be applied to neographies for which no defined user >> community exists. Widespread PUA usage should not be required as a >> metric for natural language neographies. >> > Which means some other metric might need to be applied, like > institutional support or anticipated/promised official stature. > Or just generally non-computer usage. Some of these proposals haven't been encoded even in PUA, but we are told about classes being taught in them (and people actually attending those classes!) and periodicals published (and people reading them!) and so on. PUA usage isn't the only way to demonstrate usage. Ink and paper work too. ~mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.wordingham at ntlworld.com Sun Oct 10 05:50:40 2021 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Sun, 10 Oct 2021 11:50:40 +0100 Subject: Saravasti In-Reply-To: References: <20211009104829.393f8927@JRWUBU2> <20211009144955.jgUFB%steffen@sdaoden.eu> <3a739b5a-53e3-fd58-d37e-80a49f21cd2b@code2001.com> <009401d7bd3a$76a7f2f0$63f7d8d0$@ewellic.org> <009501d7bd3d$6f562410$4e026c30$@ewellic.org> Message-ID: <20211010115040.461acd76@JRWUBU2> On Sat, 9 Oct 2021 12:03:46 -0700 Asmus Freytag via Unicode wrote: > But it is funny, when the correction switches transliteration variant. 
Though it probably just reflects which Indospheric language the corrector has been working with recently. I find it easy to make myself oblivious to the v-w difference. Anyway, I take it we have no direct news about Sarasvati? Richard. From prosfilaes at gmail.com Sun Oct 10 07:24:22 2021 From: prosfilaes at gmail.com (David Starner) Date: Sun, 10 Oct 2021 05:24:22 -0700 Subject: Arabic for South Sudan languages Message-ID: I know someone in South Sudan, and he mentioned in passing that the Sudanese military dictatorship, in pushing its Islamic policies, made Arabic orthographies for many South Sudanese languages (then south Sudanese, now South Sudanese) that were quite unsuccessful; the languages had too many vowels to map easily to Arabic. Is it likely too marginal for encoding? It's not in current use, nor do I know that anyone local cares about getting it encoded. -- The standard is written in English . If you have trouble understanding a particular section, read it again and again and again . . . Sit up straight. Eat your vegetables. Do not mumble. -- _Pascal_, ISO 7185 (1991) From jameskass at code2001.com Sun Oct 10 16:43:37 2021 From: jameskass at code2001.com (James Kass) Date: Sun, 10 Oct 2021 21:43:37 +0000 Subject: Arabic for South Sudan languages In-Reply-To: References: Message-ID: Has any effort been made to compare these orthographies with the current state of Unicode Arabic? From jameskass at code2001.com Sun Oct 10 17:35:59 2021 From: jameskass at code2001.com (James Kass) Date: Sun, 10 Oct 2021 22:35:59 +0000 Subject: Encoding ConScripts In-Reply-To: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: <1958fbdf-342e-d286-00e5-5e435cb0432f@code2001.com> On 2021-10-09 6:16 PM, William_J_G Overington via Unicode wrote: > How is the word 'neography' pronounced please? 
Wouldn't it rhyme with 'orthography'? From abrahamgross at disroot.org Sun Oct 10 21:27:35 2021 From: abrahamgross at disroot.org (abrahamgross at disroot.org) Date: Mon, 11 Oct 2021 02:27:35 +0000 (UTC) Subject: Encoding ConScripts In-Reply-To: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: Yes, it's pronounced exactly like geography. (but with an "n" as the first letter) -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmusf at ix.netcom.com Sun Oct 10 21:45:25 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Sun, 10 Oct 2021 19:45:25 -0700 Subject: Encoding ConScripts In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: An HTML attachment was scrubbed... URL: From pandey at umich.edu Sun Oct 10 22:04:54 2021 From: pandey at umich.edu (Anshuman Pandey) Date: Sun, 10 Oct 2021 22:04:54 -0500 Subject: Encoding ConScripts In-Reply-To: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> References: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: > On Oct 9, 2021, at 3:08 PM, William_J_G Overington via Unicode wrote: > > James Kass wrote: > > > Neography is a splendid coinage. > > How is the word 'neography' pronounced please? > > It seems to be made up as including neo- and graph > , > but is it pronounced similarly to geography, as nee-og-raphy? > > Or how please? The pronunciation would likely vary between British, American, Indian, and other world Englishes. William, let's follow your cue and model the pronunciation of "neography" upon that of "geography". So just replace the "g" with "n"... 
but no matter how you pronounce it, rest assured that we will understand what you mean whatever your optiphonetic analysis! All my best, Anshu -------------- next part -------------- An HTML attachment was scrubbed... URL: From pandey at umich.edu Sun Oct 10 22:08:49 2021 From: pandey at umich.edu (Anshuman Pandey) Date: Sun, 10 Oct 2021 22:08:49 -0500 Subject: Encoding ConScripts In-Reply-To: References: Message-ID: > On Oct 10, 2021, at 9:46 PM, Asmus Freytag via Unicode wrote: > > > On 10/10/2021 7:27 PM, abrahamgross--- via Unicode wrote: >> Yes, it's pronounced exactly like geography. (but with an "n" as the first letter) > I knew it: /nzhografi/. > Exactly! ? Same morphophonology as the capital of Chad. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pandey at umich.edu Sun Oct 10 22:13:43 2021 From: pandey at umich.edu (Anshuman Pandey) Date: Sun, 10 Oct 2021 22:13:43 -0500 Subject: Encoding ConScripts In-Reply-To: <1958fbdf-342e-d286-00e5-5e435cb0432f@code2001.com> References: <1958fbdf-342e-d286-00e5-5e435cb0432f@code2001.com> Message-ID: <4E0F8D32-D3AA-4585-8E55-540521DB8480@umich.edu> > On Oct 10, 2021, at 5:36 PM, James Kass via Unicode wrote: > > >> On 2021-10-09 6:16 PM, William_J_G Overington via Unicode wrote: >> How is the word 'neography' pronounced please? > > Wouldn't it rhyme with 'orthography'? It might, depending upon which part of the world you're reading it in: - ne-og-raphy - ne-ogra-phy - neo-graph-y From prosfilaes at gmail.com Sun Oct 10 22:28:29 2021 From: prosfilaes at gmail.com (David Starner) Date: Sun, 10 Oct 2021 20:28:29 -0700 Subject: Arabic for South Sudan languages In-Reply-To: References: Message-ID: On Sun, Oct 10, 2021 at 2:46 PM James Kass via Unicode wrote: > > Has any effort been made to compare these orthographies with the current > state of Unicode Arabic? My friend in South Sudan hasn't. 
I searched the Unicode code charts for mention of the major African languages in the area, to no avail. It's possible they're included under Extended vowel signs for African languages, 08F4-08FD. -- The standard is written in English . If you have trouble understanding a particular section, read it again and again and again . . . Sit up straight. Eat your vegetables. Do not mumble. -- _Pascal_, ISO 7185 (1991) From pandey at umich.edu Sun Oct 10 23:15:47 2021 From: pandey at umich.edu (Anshuman Pandey) Date: Sun, 10 Oct 2021 23:15:47 -0500 Subject: Arabic for South Sudan languages In-Reply-To: References: Message-ID: <68248E1B-8152-452B-910D-CF7139C47B9D@umich.edu> > On Oct 10, 2021, at 10:29 PM, David Starner via Unicode wrote: > > On Sun, Oct 10, 2021 at 2:46 PM James Kass via Unicode > wrote: >> >> Has any effort been made to compare these orthographies with the current >> state of Unicode Arabic? > > My friend in South Sudan hasn't. I searched the Unicode code charts > for mention of the major African languages in the area, to no avail. > It's possible they're included under Extended vowel signs for African > languages, 08F4-08FD. Such additions to Arabic are neography-adjacent IMHO. We would need examples of attested usage. But, I denounce the encoding in the Unicode standard of any new sign that results from coercive practices. I want to see *natural* support and usage. Not people being forced to use new signs. Disgusting. Just because a particular political group took over some area, does not compel Unicode to accept their coercion. Do I need a new metric for neographic signs: number of people murdered for institutional change? 
All my best, Anshu From richard.wordingham at ntlworld.com Sun Oct 10 23:34:20 2021 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Mon, 11 Oct 2021 05:34:20 +0100 Subject: Encoding ConScripts In-Reply-To: References: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: <20211011053420.1293dc89@JRWUBU2> On Sun, 10 Oct 2021 22:04:54 -0500 Anshuman Pandey via Unicode wrote: > > On Oct 9, 2021, at 3:08 PM, William_J_G Overington via Unicode > > wrote: > > James Kass wrote: > > > > > Neography is a splendid coinage. > > > > How is the word 'neography' pronounced please? > > > > It seems to be made up as including neo- and graph > > , > > but is it pronounced similarly to geography, as nee-og-raphy? > > > > Or how please? > > The pronunciation would likely vary between British, American, > Indian, and other world Englishes. > > William, let's follow your cue and model the pronunciation of > "neography" upon that of "geography". > > So just replace the "g" with "n"... but no matter how you pronounce it, > rest assured that we will understand what you mean whatever your > optiphonetic analysis! I think a one foot /?n??r?f?/ would be unintelligible Richard. From pandey at umich.edu Sun Oct 10 23:41:30 2021 From: pandey at umich.edu (Anshuman Pandey) Date: Sun, 10 Oct 2021 23:41:30 -0500 Subject: Encoding ConScripts In-Reply-To: <20211011053420.1293dc89@JRWUBU2> References: <20211011053420.1293dc89@JRWUBU2> Message-ID: > On Oct 10, 2021, at 11:34 PM, Richard Wordingham wrote: > > On Sun, 10 Oct 2021 22:04:54 -0500 > Anshuman Pandey via Unicode wrote: > >>> On Oct 9, 2021, at 3:08 PM, William_J_G Overington via Unicode >>> wrote: >>> James Kass wrote: >>> >>>> Neography is a splendid coinage. >>> >>> How is the word 'neography' pronounced please? >>> >>> It seems to be made up as including neo- and graph >>> , >>> but is it pronounced similarly to geography, as nee-og-raphy? >>> >>> Or how please? 
>> >> The pronunciation would likely vary between British, American, >> Indian, and other world Englishes. >> >> William, let's follow your cue and model the pronunciation of >> "neography" upon that of "geography". >> >> So just replace the "g" with "n"... but no matter how you pronounce it, >> rest assured that we will understand what you mean whatever your >> optiphonetic analysis! > > I think a one foot /?n??r?f?/ would be unintelligible True, if there are folks who might pronounce onset "neo-" as /?n?/ From prosfilaes at gmail.com Mon Oct 11 02:27:57 2021 From: prosfilaes at gmail.com (David Starner) Date: Mon, 11 Oct 2021 00:27:57 -0700 Subject: Arabic for South Sudan languages In-Reply-To: <68248E1B-8152-452B-910D-CF7139C47B9D@umich.edu> References: <68248E1B-8152-452B-910D-CF7139C47B9D@umich.edu> Message-ID: On Sun, Oct 10, 2021 at 9:15 PM Anshuman Pandey wrote: > > > > On Oct 10, 2021, at 10:29 PM, David Starner via Unicode wrote: > > > > On Sun, Oct 10, 2021 at 2:46 PM James Kass via Unicode > > wrote: > >> > >> Has any effort been made to compare these orthographies with the current > >> state of Unicode Arabic? > > > > My friend in South Sudan hasn't. I searched the Unicode code charts > > for mention of the major African languages in the area, to no avail. > > It's possible they're included under Extended vowel signs for African > > languages, 08F4-08FD. > > Such additions to Arabic are neography-adjacent IMHO. We would need examples of attested usage. The rules of such stuff seem to be weaker, or at least governments seem to be able to churn a decent sample of texts far quicker if they want. > But, I denounce the encoding in the Unicode standard of any new sign that results from coercive practices. I want to see *natural* support and usage. Not people being forced to use new signs. Disgusting. Just because a particular political group took over some area, does not compel Unicode to accept their coercion. What's natural? 
Half of the Latin and Cyrillic blocks are because some government declared "okay, we're all writing in Latin or Cyrillic today" and had the change made for minority languages. A huge amount of language and writing system change came at the point of the sword or gun. There's quite a few characters created by Soviet committees, who did a lot of languages in Latin then in Cyrillic due to political changes in the Soviet Union, and these are now encoded, even for reforms that changed direction after five or ten years. These characters seem quite similar; they're newer, but they're still historical characters included for recording historical texts. Even if they were current, "we're trying to unify our country on the Arabic script, and here's a thousand pages of printing in Dinka using the characters." Is Unicode really the place to fight the issue? -- The standard is written in English . If you have trouble understanding a particular section, read it again and again and again . . . Sit up straight. Eat your vegetables. Do not mumble. -- _Pascal_, ISO 7185 (1991) From jameskass at code2001.com Mon Oct 11 03:06:26 2021 From: jameskass at code2001.com (James Kass) Date: Mon, 11 Oct 2021 08:06:26 +0000 Subject: Arabic for South Sudan languages In-Reply-To: References: <68248E1B-8152-452B-910D-CF7139C47B9D@umich.edu> Message-ID: <14128d6c-a993-7da6-2ec6-34c978b6e2f7@code2001.com> On 2021-10-11 7:27 AM, David Starner via Unicode wrote: > What's natural? Half of the Latin and Cyrillic blocks are because some > government declared "okay, we're all writing in Latin or Cyrillic > today" and had the change made for minority languages. A huge amount > of language and writing system change came at the point of the sword > or gun. There's quite a few characters created by Soviet committees, > who did a lot of languages in Latin then in Cyrillic due to political > changes in the Soviet Union, and these are now encoded, even for > reforms that changed direction after five or ten years. 
These > characters seem quite similar; they're newer, but they're still > historical characters included for recording historical texts. Coercion is despicable, but historic text preservation is essential. Denying or burying unpleasant aspects of our collective heritage denies our peoples the opportunity to learn from past mistakes. Were we to impose value judgments based on script origin (be it from a military dictatorship or a deity), we'd risk passing along inaccurate and incomplete data to future historians, if any. (This is moot if Unicode's coverage of these orthographies has no gaps.) From richard.wordingham at ntlworld.com Mon Oct 11 04:38:10 2021 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Mon, 11 Oct 2021 10:38:10 +0100 Subject: Encoding ConScripts In-Reply-To: References: <20211011053420.1293dc89@JRWUBU2> Message-ID: <20211011103810.1309e3fa@JRWUBU2> On Sun, 10 Oct 2021 23:41:30 -0500 Anshuman Pandey via Unicode wrote: > > On Oct 10, 2021, at 11:34 PM, Richard Wordingham > > wrote: > > > > On Sun, 10 Oct 2021 22:04:54 -0500 > > Anshuman Pandey via Unicode wrote: > >> So just replace the "g" with "n"... but no matter how you pronounce > >> it, rest assured that we will understand what you mean whatever > >> your optiphonetic analysis! > > > > I think a one foot /?n??r?f?/ would be unintelligible > > True, if there are folks who might pronounce onset "neo-" as /?n?/ I was thinking of people like me who tend to say /?d????r?fi/ for _geography_ could misinterpret the instruction, Richard. 
From jameskass at code2001.com Mon Oct 11 09:41:05 2021 From: jameskass at code2001.com (James Kass) Date: Mon, 11 Oct 2021 14:41:05 +0000 Subject: Encoding ConScripts In-Reply-To: <20211011103810.1309e3fa@JRWUBU2> References: <20211011053420.1293dc89@JRWUBU2> <20211011103810.1309e3fa@JRWUBU2> Message-ID: <006e61db-0aa9-b5f4-ded9-a4b3933019a8@code2001.com> I don't think that PUA use should be a required metric for a proposed neography, but that its existence would improve a proposal's chances. The best way to exchange non-standard text as bytes is to use the PUA. PUA interchange demonstrates use, and use is a factor. Enhancing a proposal's exhibits with screen-shots of web pages, especially something like a chat room or message board, also demonstrates use. These exhibits would make a proposal more robust, which means that a different script proposal lacking PUA interchange exhibits might be less robust. Not having PUA interchange also means that they wouldn't be having any chat rooms in their own script, either, pending encoding and system support. Users will weigh the pros and cons of PUA and make their own decisions. Informed decisions are the best kind, so it might be desirable for Unicode to provide some guidance/education about PUA pros and cons specifically for neographic issues. From wjgo_10009 at btinternet.com Mon Oct 11 08:18:18 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Mon, 11 Oct 2021 14:18:18 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <5ad12753-87f8-d239-0805-7f64f4944e5a@shoulson.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <6e8d661f.7100.17c65a117d8.Webtop.100@btinternet.com> <5ad12753-87f8-d239-0805-7f64f4944e5a@shoulson.com> Message-ID: <17d99517.882e.17c6f8102cc.Webtop.111@btinternet.com> Mark E. Shoulson wrote: > And it may well be, and good luck with it. 
But such encoding will > only take place *after* language Y acquires some following and usage > beyond a very small group, and when there is some corpus of literature > being produced in it, etc. Not before. Inspired by the idea of a corpus of literature in Language Y, I have written and published a poem. https://forum.affinity.serif.com/index.php?/topic/150736-a-poem-in-language-y/ William Overington Monday 11 October 2021 -------------- next part -------------- An HTML attachment was scrubbed... URL: From john_h_jenkins at apple.com Mon Oct 11 11:37:03 2021 From: john_h_jenkins at apple.com (john_h_jenkins) Date: Mon, 11 Oct 2021 10:37:03 -0600 Subject: Encoding ConScripts In-Reply-To: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> > On Oct 9, 2021, at 12:16 PM, William_J_G Overington via Unicode wrote: > > James Kass wrote: > > > Neography is a splendid coinage. > > How is the word 'neography' pronounced please? > Well, obviously it should be ????????. Or, if you prefer, ????????. ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From wjgo_10009 at btinternet.com Mon Oct 11 11:49:35 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Mon, 11 Oct 2021 17:49:35 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> Message-ID: <6732e086.931e.17c704271eb.Webtop.111@btinternet.com> > Well, obviously it should be ????????. Or, if you prefer, > ????????. Deseret and Shavian How about that! 
I used the Search facility in the Editor of High-Logic FontCreator to find the code points and then I used the search facility on the https://www.unicode.org/charts/ web page to find the code charts. Six minutes! William -------------- next part -------------- An HTML attachment was scrubbed... URL: From wjgo_10009 at btinternet.com Mon Oct 11 12:21:24 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Mon, 11 Oct 2021 18:21:24 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> Message-ID: <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> James Kass wrote: > 1) I am an inventor and I have designed a brand new writing system > for my people. I would like for it to be in Unicode so that it will > be standard. Once it is in the Standard I will have something to > point to, which might help me persuade my people to use it. ... > Most everyone here will agree that proposal 1 is a complete > non-starter ... The inclusion of the word "Most" is good. Please note in particular the inclusion of "for my people". So we are not talking here of someone putting together what might be called a work of art, just as some people write novels, some paint pictures, some make clay pottery. So if someone proposed a well-designed script based on well-researched potential use in a culture, set forth in a substantial thesis at a level of around that of a thesis for a Master degree, I opine that it would be good to seriously study that thesis, conduct the equivalent of a viva voce examination, with the clear possibility of encoding that proposed script, notwithstanding there being no evidence of existing substantial use. I opine that if someone has put in that amount of effort and enthusiasm for his or her dream, he or she should be applauded and assisted to fulfil his or her dream. 
William Overington Monday 11 October 2021 ------ Original Message ------ From: "James Kass via Unicode" To: unicode at corp.unicode.org Sent: Saturday, 2021 Oct 9 At 18:36 Subject: Re: Encoding ConScripts On 2021-10-09 5:03 AM, Anshuman Pandey via Unicode wrote: At IUC 45 next Thursday, Deborah Anderson and I will be presenting on "Negotiating Neographies: Approaches for Encoding Newly-Invented Scripts". I'll be discussing some metrics that may be used for evaluating neographies (nod to Ken W for that term), conscripts, or whatever you'd like to call them. Such metrics, as James pointed out, are necessary, especially considering the influx of proposals to encode newly-invented scripts, particularly those of Africa and South Asia. Neography is a splendid coinage. This is a fascinating topic and I hope that there will be a video of the presentation. Consider the following two imaginary proposals: 1) I am an inventor and I have designed a brand new writing system for my people. I would like for it to be in Unicode so that it will be standard. Once it is in the Standard I will have something to point to, which might help me persuade my people to use it. 2) We have developed a new writing system for our people. We are using this new writing system to publish books and periodicals. Our new writing system is being taught in our schools and our people have embraced the writing system wholeheartedly. Most everyone here will agree that proposal 1 is a complete non-starter and that proposal 2 would be given due consideration with a high probability of eventual acceptance. But many proposals will probably be somewhere in between those two scenarios. Where does one draw the line? I expect that the upcoming presentation will thoughtfully address this matter. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From haberg-1 at telia.com Mon Oct 11 13:41:15 2021 From: haberg-1 at telia.com (=?utf-8?Q?Hans_=C3=85berg?=) Date: Mon, 11 Oct 2021 20:41:15 +0200 Subject: Encoding ConScripts In-Reply-To: <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> Message-ID: <4E2532F8-DF7B-4797-ADC8-42CE56F41EA4@telia.com> > On 9 Oct 2021, at 20:16, William_J_G Overington via Unicode wrote: > > James Kass wrote: > > > Neography is a splendid coinage. > > How is the word 'neography' pronounced please? > > It seems to be made up as including neo- and graph neo- AmE |?ni?o?|, BrE |?ni???| -graphy AmE |?r?fi|, BrE |?r?fi| neography AmE |nio???r?fi|, |?nio???r?fi|, BrE |ni?????r?fi|, |?ni?????r?fi| > but is it pronounced similarly to geography, as nee-og-raphy? geo- AmE |?d?io?|, BrE |?j??, ?d?i???| geography AmE |d?i???r?fi|, BrE |d?????r?fi, ?d???r?fi| > Or how please? Just a suggestion.? From richard.wordingham at ntlworld.com Mon Oct 11 14:09:03 2021 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Mon, 11 Oct 2021 20:09:03 +0100 Subject: Encoding ConScripts In-Reply-To: <6732e086.931e.17c704271eb.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> <6732e086.931e.17c704271eb.Webtop.111@btinternet.com> Message-ID: <20211011200903.2ee260d7@JRWUBU2> On Mon, 11 Oct 2021 17:49:35 +0100 (BST) William_J_G Overington via Unicode wrote: > > Well, obviously it should be ????????. Or, if you prefer, > > ????????. > > Deseret and Shavian > > How about that! 
> > I used the Search facility in the Editor of High-Logic FontCreator to > find the code points and then I used the search facility on the > > https://www.unicode.org/charts/ > > web page to find the code charts. > > Six minutes! > > William As I get the characters not obviously supported by fonts rendered as hex, grepping UnicodeData.txt was even quicker! Richard. From richard.wordingham at ntlworld.com Mon Oct 11 14:11:18 2021 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Mon, 11 Oct 2021 20:11:18 +0100 Subject: Encoding ConScripts In-Reply-To: <6732e086.931e.17c704271eb.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> <6732e086.931e.17c704271eb.Webtop.111@btinternet.com> Message-ID: <20211011201118.23a4daef@JRWUBU2> On Mon, 11 Oct 2021 17:49:35 +0100 (BST) William_J_G Overington via Unicode wrote: > > Well, obviously it should be ????????. Or, if you prefer, > > ????????. > > Deseret and Shavian > > How about that! > > I used the Search facility in the Editor of High-Logic FontCreator to > find the code points and then I used the search facility on the > > https://www.unicode.org/charts/ > > web page to find the code charts. > > Six minutes! > > William As I get the not obviously supported characters rendered as the hex characters, just grepping the UCD file UnicodeData.txt was even quicker. Richard. 
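The lookup Richard describes (identifying unfamiliar characters by their code points and names) can be approximated without a local copy of UnicodeData.txt. What follows is only a minimal sketch, assuming Python's standard unicodedata module, whose character names come from the Unicode Character Database. The helper name describe and the two sample code points (U+10400 and U+10450) are illustrative assumptions, not the characters from the original message, which did not survive in this archive.

```python
import unicodedata

def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    rows = []
    for ch in text:
        # unicodedata.name raises ValueError for unnamed code points
        # unless a default is given, so supply a fallback label.
        name = unicodedata.name(ch, "<no name>")
        rows.append((f"U+{ord(ch):04X}", name))
    return rows

# Sample input: one Deseret and one Shavian letter (illustrative only).
for code_point, name in describe("\U00010400\U00010450"):
    print(code_point, name)
```

The script a character belongs to can then be read off the start of its name, which is roughly what grepping the name field of UnicodeData.txt gives.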
From jameskass at code2001.com Mon Oct 11 17:22:52 2021 From: jameskass at code2001.com (James Kass) Date: Mon, 11 Oct 2021 22:22:52 +0000 Subject: Encoding ConScripts In-Reply-To: <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> Message-ID: <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> On 2021-10-11 5:21 PM, William_J_G Overington via Unicode opined: > >> Most everyone here will agree that proposal 1 is a complete >> non-starter ... > > The inclusion of the word "Most" is good. The word "most" seemed kinder than "with one exception,". From mark at kli.org Mon Oct 11 17:45:25 2021 From: mark at kli.org (Mark E. Shoulson) Date: Mon, 11 Oct 2021 18:45:25 -0400 Subject: Powerline symbols? Message-ID: The so-called "Powerline symbols" seem to have become practically an industry standard in coding fonts. At least some of them (the "branch" symbol has become very popular, it seems). I don't know the source of them, why they're called that, etc; you can google as well as I can (and probably some of you actually already know.) https://awesomeopensource.com/project/ryanoasis/powerline-extra-symbols is a page that showed up in my googling, for example. Is there a reason _not_ to encode these? Are they copyrighted or something? Or is it just a matter of needing a proposal? 
~mark From jameskass at code2001.com Mon Oct 11 17:49:34 2021 From: jameskass at code2001.com (James Kass) Date: Mon, 11 Oct 2021 22:49:34 +0000 Subject: Encoding ConScripts In-Reply-To: <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> Message-ID: <39bb3057-b0fd-cd99-c7a3-e794b307348f@code2001.com> On 2021-10-11 4:37 PM, john_h_jenkins via Unicode wrote: > Well, obviously it should be ????????. Or, if you prefer, ????????. > > ? Or even ???????? / ?????????. From mark at kli.org Mon Oct 11 18:05:22 2021 From: mark at kli.org (Mark E. Shoulson) Date: Mon, 11 Oct 2021 19:05:22 -0400 Subject: Powerline symbols? In-Reply-To: References: Message-ID: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> https://readthedocs.org/projects/powerline/downloads/pdf/latest/ may be a point of origin, as this is a piece of software actually named "Powerline" and dating from a few years ago. It requires its own special fonts (obviously, since the characters aren't encoded), and credits Fabrizio Schiavi for the glyphs (which is to say, he drew them, not necessarily a claim of copyright or whatever.) I... guess the only thing missing is a proposal? (Well, and approval of the proposal, of course.) ~mark On 10/11/21 18:45, Mark E. Shoulson via Unicode wrote: > The so-called "Powerline symbols" seem to have become practically an > industry standard in coding fonts. At least some of them (the > "branch" symbol has become very popular, it seems). I don't know the > source of them, why they're called that, etc; you can google as well > as I can (and probably some of you actually already know.) > https://awesomeopensource.com/project/ryanoasis/powerline-extra-symbols > is a page that showed up in my googling, for example. Is there a > reason _not_ to encode these? 
Are they copyrighted or something? Or > is it just a matter of needing a proposal? > > ~mark From beckiergb at gmail.com Mon Oct 11 18:16:48 2021 From: beckiergb at gmail.com (Rebecca Bettencourt) Date: Mon, 11 Oct 2021 16:16:48 -0700 Subject: Powerline symbols? In-Reply-To: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> Message-ID: As you (Mark) discovered, the name originates from the piece of software which first used these characters, called Powerline. It's a plugin for vim, tmux, bash, i3, and several other environments that adds a fancy status line to the terminal. The characters have been proposed before, in document L2/19-068R2. The SAH recommended encoding three of them (the branch symbol and the row and column number symbols) but the UTC took no action. I vaguely recall a recommendation (from the SAH?) for the author, Renzhi Li, to contact the "Terminals Working Group" (Doug Ewell, me, and a few other individuals) to work out integrating them into a "round 2" Symbols for Legacy Computing proposal. We were never contacted by the author but we integrated them into a "round 2" proposal anyway, with the suggestion to use the same code points as were recommended by the SAH. That "round 2" proposal was brought to the UTC but for some reason was never added to the document register. We had an hour-long meeting in which the UTC reviewed it and had several concerns that were not resolved within that hour. The proposal has not progressed further since then. -- Rebecca Bettencourt From mark at kli.org Mon Oct 11 18:32:41 2021 From: mark at kli.org (Mark E. Shoulson) Date: Mon, 11 Oct 2021 19:32:41 -0400 Subject: Powerline symbols? 
In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> Message-ID: <49e274ef-50c4-0bbd-57fb-48f28b9cd99b@shoulson.com> Ugh, sorry. I knew they looked familiar from a Unicode standpoint, but I checked the pipeline and they weren't there. Should have googled for them in a Unicode context. The branch symbol and the row- and column-number characters should be encoded. They seem to be quite popular. The neatocool path separators are questionable. I was reminded of the Powerline symbols upon coming across https://starship.rs/, a feature-heavy prompt-setting program for various command shells, which uses the branch symbol by default. https://awesomeopensource.com/projects/powerline/theme lists eighteen projects using or involving the Powerline symbols. This really shouldn't be an issue. Is it odd for the Script Ad-Hoc to recommend characters but the UTC not to pick them up (without saying why)? Seems strange to me. And then "round 2" proposal was dropped and not added to the register? Also odd. Anyway, like I said, they seem to be used heavily by a lot of projects. The SAH's points about using the existing LOCK character and not encoding the triangles are reasonable, at least at this point. But the branch symbol has become very common. ~mark On 10/11/21 19:16, Rebecca Bettencourt via Unicode wrote: > As you (Mark) discovered, the name originates from the piece of > software which first used these characters, called Powerline. It's a > plugin for vim, tmux, bash, i3, and several other environments > that adds a fancy status line to the terminal. > > The characters have been proposed before, in document L2/19-068R2. The > SAH recommended encoding three of them (the branch symbol and the row > and column number symbols) but the UTC took no action. I vaguely > recall a recommendation (from the SAH?) 
for the author, Renzhi Li, to > contact the "Terminals Working Group" (Doug Ewell, me, and a few other > individuals) to work out integrating them into a "round 2" Symbols for > Legacy Computing proposal. We were never contacted by the author but > we integrated them into a "round 2" proposal anyway, with the > suggestion to use the same code points as were recommended by the SAH. > > That "round 2" proposal was brought to the UTC but for some reason was > never added to the document register. We had an hour-long meeting in > which the UTC reviewed it and had several concerns that were not > resolved within that hour. The proposal has not progressed further > since then. > > -- Rebecca Bettencourt > From duerst at it.aoyama.ac.jp Mon Oct 11 19:15:52 2021 From: duerst at it.aoyama.ac.jp (Martin J. Dürst) Date: Tue, 12 Oct 2021 09:15:52 +0900 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> Message-ID: <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> The idea of making status lines and prompts more informative by using color and various graphics looks very convenient. But what visual form to use for what semantics is wide open to configuration and personal preferences, and may develop in various directions as the idea catches on further, which may mean that it's premature for encoding. But I'm sure this has been discussed in the meetings mentioned below. Regards, Martin. On 2021-10-12 08:16, Rebecca Bettencourt via Unicode wrote: > As you (Mark) discovered, the name originates from the piece of software > which first used these characters, called Powerline. It's a plugin for vim, > tmux, bash, i3, and several other environments that adds a fancy status > line to the terminal. > > The characters have been proposed before, in document L2/19-068R2. 
The SAH > recommended encoding three of them (the branch symbol and the row and > column number symbols) but the UTC took no action. I vaguely recall a > recommendation (from the SAH?) for the author, Renzhi Li, to contact the > "Terminals Working Group" (Doug Ewell, me, and a few other individuals) to > work out integrating them into a "round 2" Symbols for Legacy Computing > proposal. We were never contacted by the author but we integrated them into > a "round 2" proposal anyway, with the suggestion to use the same code > points as were recommended by the SAH. > > That "round 2" proposal was brought to the UTC but for some reason was > never added to the document register. We had an hour-long meeting in which > the UTC reviewed it and had several concerns that were not resolved within > that hour. The proposal has not progressed further since then. > > -- Rebecca Bettencourt > From duerst at it.aoyama.ac.jp Mon Oct 11 19:19:43 2021 From: duerst at it.aoyama.ac.jp (Martin J. Dürst) Date: Tue, 12 Oct 2021 09:19:43 +0900 Subject: Encoding ConScripts In-Reply-To: <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> Message-ID: <6f90ad8e-7458-2023-f819-520dd6bfdd13@it.aoyama.ac.jp> Maybe Unicode has to add the following two items to its encoding guidelines: - Unicode doesn't encode dreams. - Unicode doesn't encode (master/...) theses. That could help avoid useless discussions. Regards, Martin. From mark at kli.org Mon Oct 11 19:35:20 2021 From: mark at kli.org (Mark E. Shoulson) Date: Mon, 11 Oct 2021 20:35:20 -0400 Subject: Powerline symbols? 
In-Reply-To: <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> Message-ID: <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Ah, but that is precisely a question Unicode need not answer or worry about! If the meaning changes, then the meaning changes, and maybe the name is obsolete. But the character is still a character, and still the same one! But Rick McGowan mentioned to me (off-list) the potential argument someone could raise that status lines don't count as "text" usage, which is a fair point. Hence the sensible resistance to encoding the Powerline "triangle" characters. To me, a prompt is more "text" than a status line, but not as much as true "text"; is it "text" enough? I could totally see the branch symbol catching on as a conventional sigil for marking a version-control branch in email and conversation, which would be very texty indeed, but I can't say that I've seen that happen yet, so it doesn't count. But we have other UI symbols already, like CANCELLATION X, MINIMIZE, MAXIMIZE, etc. Is this really less worthy? Perhaps things will be considered differently when/if Rebecca can dust off that proposal and re-submit it. Times change, maybe this is texty enough now. Me, I think it probably is, but it isn't (and shouldn't be!) my decision. ~mark On 10/11/21 20:15, Martin J. Dürst via Unicode wrote: > The idea of making status lines and prompts more informative by using > color and various graphics looks very convenient. But what visual form > to use for what semantics is wide open to configuration and personal > preferences, and may develop in various directions as the idea catches > on further, which may mean that it's premature for encoding. But I'm > sure this has been discussed in the meetings mentioned below. > > Regards, Martin. 
> > On 2021-10-12 08:16, Rebecca Bettencourt via Unicode wrote: >> As you (Mark) discovered, the name originates from the piece of software >> which first used these characters, called Powerline. It's a plugin >> for vim, >> tmux, bash, i3, and several other environments that adds a fancy status >> line to the terminal. >> >> The characters have been proposed before, in document L2/19-068R2. >> The SAH >> recommended encoding three of them (the branch symbol and the row and >> column number symbols) but the UTC took no action. I vaguely recall a >> recommendation (from the SAH?) for the author, Renzhi Li, to contact the >> "Terminals Working Group" (Doug Ewell, me, and a few other >> individuals) to >> work out integrating them into a "round 2" Symbols for Legacy Computing >> proposal. We were never contacted by the author but we integrated >> them into >> a "round 2" proposal anyway, with the suggestion to use the same code >> points as were recommended by the SAH. >> >> That "round 2" proposal was brought to the UTC but for some reason was >> never added to the document register. We had an hour-long meeting in >> which >> the UTC reviewed it and had several concerns that were not resolved >> within >> that hour. The proposal has not progressed further since then. >> >> -- Rebecca Bettencourt >> From abrahamgross at disroot.org Mon Oct 11 20:02:22 2021 From: abrahamgross at disroot.org (abrahamgross at disroot.org) Date: Tue, 12 Oct 2021 01:02:22 +0000 (UTC) Subject: Powerline symbols? In-Reply-To: <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: <9bbc5050-4c2e-434d-b877-9aa83650843e@disroot.org> I'd wager and say that many awesome font symbols (which powerline is a part of) should also be encoded as theyre heavily used all over. 
From beckiergb at gmail.com Mon Oct 11 20:24:44 2021 From: beckiergb at gmail.com (Rebecca Bettencourt) Date: Mon, 11 Oct 2021 18:24:44 -0700 Subject: Powerline symbols? In-Reply-To: <9bbc5050-4c2e-434d-b877-9aa83650843e@disroot.org> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> <9bbc5050-4c2e-434d-b877-9aa83650843e@disroot.org> Message-ID: You mean Font Awesome? They're used in (mostly web) UIs as icons, not as part of text (like Dingbats/Wingdings/Webdings are) or in a terminal (like Box Drawing/Symbols for Legacy Computing/Powerline symbols are). That's an important distinction. -- Rebecca Bettencourt On Mon, Oct 11, 2021 at 6:06 PM abrahamgross--- via Unicode < unicode at corp.unicode.org> wrote: > I'd wager and say that many awesome font symbols (which powerline is a > part of) should also be encoded as theyre heavily used all over. > From duerst at it.aoyama.ac.jp Mon Oct 11 21:21:38 2021 From: duerst at it.aoyama.ac.jp (Martin J. Dürst) Date: Tue, 12 Oct 2021 11:21:38 +0900 Subject: Powerline symbols? In-Reply-To: <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: On 2021-10-12 09:35, Mark E. Shoulson via Unicode wrote: > Ah, but that is precisely a question Unicode need not answer or worry > about! If the meaning changes, then the meaning changes, and maybe the > name is obsolete. But the character is still a character, and still the > same one! I meant that the shapes are still quite in flux. Regards, Martin. 
From ratmice at gmail.com Mon Oct 11 22:15:27 2021 From: ratmice at gmail.com (Matt Rice) Date: Tue, 12 Oct 2021 03:15:27 +0000 Subject: Powerline symbols? In-Reply-To: <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: On Tue, Oct 12, 2021 at 12:38 AM Mark E. Shoulson via Unicode wrote: > > To me, a prompt is more "text" than a > status line, but not as much as true "text"; is it "text" enough? Since we're talking about prompts and status lines, I haven't actually messed with ZWJ's, but one thing I noticed is that the thermometer emoji seems to just have the one warm character, rather than a sequence of characters from empty to full, https://fontawesome.com/v5.15/icons/thermometer-empty?style=solid https://fontawesome.com/v5.15/icons/thermometer-full?style=solid Not going to bother linking to them all, but there are more. Status lines are a bit strange in that they are almost animated unlike prompts. From christoph.paeper at crissov.de Tue Oct 12 00:29:30 2021 From: christoph.paeper at crissov.de (Christoph Päper) Date: Tue, 12 Oct 2021 07:29:30 +0200 Subject: Powerline symbols? In-Reply-To: References: Message-ID: <46F7D767-121B-4DAA-87CC-4EA45B416913@crissov.de> Font Awesome only recently started adding Unicode mapping to many of its characters, finally. There are many other symbol fonts out there that are used primarily in web design. A lot of the symbols don't really differ much, graphically and semantically, from ones already encoded. Different fonts use different PUA assignments, though. As with original Japanese emojis, Unicode should step in and lead an effort to harmonize and standardize the code points used for (non-logo) pictographs. Cheers Christoph Päper > Am 12.10.2021 um 03:26 schrieb Rebecca Bettencourt via Unicode : > > You mean Font Awesome? 
They're used in (mostly web) UIs as icons, not as part of text (like Dingbats/Wingdings/Webdings are) or in a terminal (like Box Drawing/Symbols for Legacy Computing/Powerline symbols are). That's an important distinction. > > -- Rebecca Bettencourt > > >> On Mon, Oct 11, 2021 at 6:06 PM abrahamgross--- via Unicode wrote: >> I'd wager and say that many awesome font symbols (which powerline is a part of) should also be encoded as theyre heavily used all over. From wjgo_10009 at btinternet.com Tue Oct 12 04:20:09 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Tue, 12 Oct 2021 10:20:09 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <6f90ad8e-7458-2023-f819-520dd6bfdd13@it.aoyama.ac.jp> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <6f90ad8e-7458-2023-f819-520dd6bfdd13@it.aoyama.ac.jp> Message-ID: <6a548a99.a524.17c73cd583c.Webtop.111@btinternet.com> Martin J. Dürst wrote: > Maybe Unicode has to add the following two items to its encoding > guidelines: > - Unicode doesn't encode dreams. > - Unicode doesn't encode (master/...) theses. Yet maybe the Unicode Technical Committee will establish a Neographies Subcommittee and appoint to it a number of well-qualified and experienced linguists who will advise on encoding neographies into plane 7, including on the basis that I have suggested. 
William Overington Tuesday 12 October 2021 
From sosipiuk at gmail.com Tue Oct 12 10:21:20 2021 From: sosipiuk at gmail.com (Sławomir Osipiuk) Date: Tue, 12 Oct 2021 11:21:20 -0400 Subject: Encoding ConScripts In-Reply-To: <4E2532F8-DF7B-4797-ADC8-42CE56F41EA4@telia.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <4E2532F8-DF7B-4797-ADC8-42CE56F41EA4@telia.com> Message-ID: On Mon, Oct 11, 2021 at 2:44 PM Hans Åberg via Unicode wrote: > > neography AmE |nio???r?fi|, |?nio???r?fi|, BrE |ni?????r?fi|, |?ni?????r?fi| The ? definitely doesn't belong. When saying "neography" I'm not completely including the name of Neo from the Matrix. It is, to me, exactly the same as "geography", except for that very first letter. All vowel sounds match perfectly between the two words, and I would expect that to be true no matter how your local accent affects those vowels. Sławomir Osipiuk From haberg-1 at telia.com Tue Oct 12 11:11:10 2021 From: haberg-1 at telia.com (Hans Åberg) Date: Tue, 12 Oct 2021 18:11:10 +0200 Subject: Encoding ConScripts In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <4E2532F8-DF7B-4797-ADC8-42CE56F41EA4@telia.com> Message-ID: > On 12 Oct 2021, at 17:21, Sławomir Osipiuk wrote: > > On Mon, Oct 11, 2021 at 2:44 PM Hans Åberg via Unicode > wrote: >> >> neography AmE |nio???r?fi|, |?nio???r?fi|, BrE |ni?????r?fi|, |?ni?????r?fi| > > The ? definitely doesn't belong. When saying "neography" I'm not > completely including the name of Neo from the Matrix. It varies, some dictionaries, particularly BrE oriented, write |?(?)?|. The AmE sound files I have checked for "neodymium" have it included, whereas the BrE excluded it. > It is, to me, exactly the same as "geography", except for that very > first letter. 
All vowel sounds match perfectly between the two words, > and I would expect that to be true no matter how your local accent > affects those vowels. It is very different, cf. the IPA I gave. From sosipiuk at gmail.com Tue Oct 12 11:53:34 2021 From: sosipiuk at gmail.com (Sławomir Osipiuk) Date: Tue, 12 Oct 2021 12:53:34 -0400 Subject: Encoding ConScripts In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <4E2532F8-DF7B-4797-ADC8-42CE56F41EA4@telia.com> Message-ID: On Tue, Oct 12, 2021 at 12:11 PM Hans Åberg wrote: > > It is very different, cf. the IPA I gave. I'm questioning the IPA you gave. I had a look through my printed dictionary and it unfortunately doesn't include "neography" but I found two distinct patterns anyway. The first has the initial vowel sound in Neo (from the Matrix), neoprene, and geodesic. These are all the same. The second has the vowel sound of neologism, geography, and in my view, neography. These are all the same. You seem to be saying that neography should belong to the first group. I don't agree. It falls very naturally into the second group. From haberg-1 at telia.com Tue Oct 12 12:17:37 2021 From: haberg-1 at telia.com (Hans Åberg) Date: Tue, 12 Oct 2021 19:17:37 +0200 Subject: Encoding ConScripts In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <4E2532F8-DF7B-4797-ADC8-42CE56F41EA4@telia.com> Message-ID: > On 12 Oct 2021, at 18:53, Sławomir Osipiuk via Unicode wrote: > > On Tue, Oct 12, 2021 at 12:11 PM Hans Åberg wrote: >> >> It is very different, cf. the IPA I gave. > > I'm questioning the IPA you gave. > > I had a look through my printed dictionary and it unfortunately > doesn't include "neography" but I found two distinct patterns anyway. 
Clearly not, as it is a newly invented word we discuss suggestions for pronunciation of. > The first has the initial vowel sound in Neo (from the Matrix), > neoprene, and geodesic. These are all the same. > > The second has the vowel sound of neologism, geography, and in my > view, neography. These are all the same. > > You seem to be saying that neography should belong to the first group. > I don't agree. It falls very naturally into the second group. Language is not logical. There are words deriving from "neo-", Ancient Greek νέος, "new", with primary stress both on it and the following syllable. However, in "neoprene" there is only one syllable in "-prene", and in "neodymium" there are two in "-dymium". Perhaps this is a reason for different stress; linguists might detail. The word "neologism" however comes from French "néologisme", which might explain why it gets a different stress, like "geography" which also comes from French, "géographie". From wjgo_10009 at btinternet.com Tue Oct 12 13:18:41 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Tue, 12 Oct 2021 19:18:41 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> Message-ID: <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> James Kass wrote: > On 2021-10-11 5:21 PM, William_J_G Overington via Unicode opined: Most everyone here will agree that proposal 1 is a complete non-starter ... The inclusion of the word "Most" is good. > The word "most" seemed kinder than "with one exception,". Indeed. Thank you. It also has the advantage that if one or more people agree with me that it will still be true. Policies often change more by a catastrophe theory jump than by a smooth process, due to positive feedback effects. 
The character map has plenty of unused space. If an exception can be made for emoji then an exception can be made for encoding by examined thesis. Sauce for pasta is sauce for rice. Futuristic policies are needed for the future. William Overington Tuesday 12 October 2021 From asmusf at ix.netcom.com Mon Oct 11 19:18:51 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Mon, 11 Oct 2021 17:18:51 -0700 Subject: Arabic for South Sudan languages In-Reply-To: <68248E1B-8152-452B-910D-CF7139C47B9D@umich.edu> References: <68248E1B-8152-452B-910D-CF7139C47B9D@umich.edu> Message-ID: <7ec4e1d9-e269-5031-fcbd-a6e92c229695@ix.netcom.com> On 10/10/2021 9:15 PM, Anshuman Pandey via Unicode wrote: > But, I denounce the encoding in the Unicode standard of any new sign that results from coercive practices. I want to see *natural* support and usage. Not people being forced to use new signs. Disgusting. Just because a particular political group took over some area, does not compel Unicode to accept their coercion. Preventing people from using an orthography because you dislike their reason for using it shouldn't be one of the process goals. Where do you draw the line? Prescriptive orthography reform? Now, the reverse would be true. If someone came and said "we've outlawed the following orthography/characters" that's not a reason for Unicode to mark them deprecated or to overturn stability and delete them. A./ From kent.b.karlsson at bahnhof.se Tue Oct 12 16:04:32 2021 From: kent.b.karlsson at bahnhof.se (Kent Karlsson) Date: Tue, 12 Oct 2021 23:04:32 +0200 Subject: Powerline symbols? 
In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> Message-ID: <6B397FAB-43B9-481A-9DCB-816EA933F2FF@bahnhof.se> I found https://www.unicode.org/L2/L2019/19343-script-adhoc-recs.pdf, discussing the proposal in L2/19-068R2. There are some recommendations there to update the proposal. But apparently that has not been done since, IIUC. As to some of the comments and recommendations in 19343-script-adhoc-recs.pdf: "Some Script Ad Hoc members felt these are entities used in a closed system, inappropriate for interchange as characters, and could be handled as PUA characters" I would not consider (just about) every terminal emulator used in the world as a "closed system". I could agree that the Powerline symbols are a bit of a hack. But it is a very neat hack for the purpose. But "(only) used in a closed system", no that is definitely not the case. Several of the symbols are useful (though many are fanciful, looking outside of the proposal itself), but I disagree with the selection noted in the Ad Hoc committee document: U+1FBCB VERSION CONTROL BRANCH SYMBOL OK! Great for those of us that use version control systems. Not sure one needs to include "version control" in the name of the character. OCR FORK could work as a replacement, but nah... U+1FBCC COLUMN NUMBER INDICATOR Not great, too English oriented (and ugly?). I'd recommend a right-ish arrow instead. U+1FBCD LINE NUMBER INDICATOR Not great, too English oriented (and ugly?). I'd recommend a down-ish arrow instead. The triangles and line triangles are better; useful, and not English oriented. But, granted, mostly, though not entirely (separators/terminators), decorative. But as noted in the proposal, there are already encoded variants of these. But there is no need to duplicate the lock symbol, as noted in the Script Ad Hoc notes. /Kent K > 12 okt. 2021 kl. 
01:16 skrev Rebecca Bettencourt via Unicode : > > As you (Mark) discovered, the name originates from the piece of software which first used these characters, called Powerline. It's a plugin for vim, tmux, bash, i3, and several other environments that adds a fancy status line to the terminal. > > The characters have been proposed before, in document L2/19-068R2. The SAH recommended encoding three of them (the branch symbol and the row and column number symbols) but the UTC took no action. I vaguely recall a recommendation (from the SAH?) for the author, Renzhi Li, to contact the "Terminals Working Group" (Doug Ewell, me, and a few other individuals) to work out integrating them into a "round 2" Symbols for Legacy Computing proposal. We were never contacted by the author but we integrated them into a "round 2" proposal anyway, with the suggestion to use the same code points as were recommended by the SAH. > > That "round 2" proposal was brought to the UTC but for some reason was never added to the document register. We had an hour-long meeting in which the UTC reviewed it and had several concerns that were not resolved within that hour. The proposal has not progressed further since then. > > -- Rebecca Bettencourt > From mark at kli.org Tue Oct 12 16:39:29 2021 From: mark at kli.org (Mark E. Shoulson) Date: Tue, 12 Oct 2021 17:39:29 -0400 Subject: Encoding ConScripts In-Reply-To: <6a548a99.a524.17c73cd583c.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <6f90ad8e-7458-2023-f819-520dd6bfdd13@it.aoyama.ac.jp> <6a548a99.a524.17c73cd583c.Webtop.111@btinternet.com> Message-ID: <2805ea13-f1d7-4ec2-1d02-9b9941e3f39a@shoulson.com> That sounds like what the Scripts Ad-Hoc is for, isn't it?
I mean, SAH's purpose includes this one (as well as non-new orthographies.) Nobody knows the future; for the currently-foreseeable future I don't think encoding masters theses/dreams is going to come to pass. ~mark On 10/12/21 05:20, William_J_G Overington via Unicode wrote: > Martin J. Dürst wrote: > > > > Maybe Unicode has to add the following two items to its encoding > guidelines: > > > - Unicode doesn't encode dreams. > > - Unicode doesn't encode (master/...) theses. > Yet maybe the Unicode Technical Committee will establish a Neographies > Subcommittee and appoint to it a number of well-qualified and > experienced linguists who will advise on encoding neographies into > plane 7, including on the basis that I have suggested. > > > William Overington > > > Tuesday 12 October 2021 > > From jameskass at code2001.com Tue Oct 12 19:59:38 2021 From: jameskass at code2001.com (James Kass) Date: Wed, 13 Oct 2021 00:59:38 +0000 Subject: Encoding ConScripts In-Reply-To: <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> Message-ID: <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> On 2021-10-12 6:18 PM, William_J_G Overington via Unicode wrote: > > The word "most" seemed kinder than "with one exception,". > > Indeed. Thank you. > > It also has the advantage that if one or more people agree with me > that it will still be true. Yes. And if you could find just one person to agree with you, then you'd both be wrong. > > ... The character map has plenty of unused space. It's not like a penny collection folder where the goal is to fill every slot.
> > If an exception can be made for emoji then an exception can be made > for encoding by examined thesis. Encoding emoji was an anomaly prompted by peculiar circumstances and a unique chain of events. Nobody should ever cite it as precedent, so of course everybody will. Sigh. > > Futuristic policies are needed for the future. Anybody can predict the future if accuracy isn't important. In the event, policies must be based on current realities rather than speculation. From pandey at umich.edu Tue Oct 12 20:37:54 2021 From: pandey at umich.edu (Anshuman Pandey) Date: Tue, 12 Oct 2021 20:37:54 -0500 Subject: Arabic for South Sudan languages In-Reply-To: <7ec4e1d9-e269-5031-fcbd-a6e92c229695@ix.netcom.com> References: <68248E1B-8152-452B-910D-CF7139C47B9D@umich.edu> <7ec4e1d9-e269-5031-fcbd-a6e92c229695@ix.netcom.com> Message-ID: On Tue, Oct 12, 2021 at 1:24 PM Asmus Freytag wrote: > On 10/10/2021 9:15 PM, Anshuman Pandey via Unicode wrote: > > But, I denounce the encoding in the Unicode standard of any new sign that results from coercive practices. I want to see **natural** support and usage. Not people being forced to use new signs. Disgusting. Just because a particular political group took over some area, does not compel Unicode to accept their coercion. > > Preventing people from using an orthography because you dislike their > reason for using it shouldn't be one of the process goals. Where do you > draw the line? Prescriptive orthography reform? > > Now, the reverse would be true. If someone came and said "we've outlawed > the following orthography/characters" that's not a reason for Unicode to > mark them deprecated or to overturn stability and delete them. > You're absolutely right. Actually, I'm a bit ashamed of myself for letting my historian's objectivity get clouded by the phantoms of 'coercive practices'. It was arrogant to gauge the suitability of encoding characters and scripts by factoring in their political provenance or manner of origin.
After all, as David Starner mentioned, the proliferation of dominant world writing systems was likely the result of less-than-democratic practices. James Kass is also spot on in his remark that historical texts require preservation, without bias. I renounce my denounce. All my best, Anshu From jameskass at code2001.com Wed Oct 13 03:14:02 2021 From: jameskass at code2001.com (James Kass) Date: Wed, 13 Oct 2021 08:14:02 +0000 Subject: Encoding ConScripts In-Reply-To: <52687cbc.6ce8.17c64df0480.Webtop.100@btinternet.com> References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com> <2cba3597.60e7.17c6117464f.Webtop.100@btinternet.com> <28ccca77-8ece-22bf-2001-d0f46e49bc98@code2001.com> <52687cbc.6ce8.17c64df0480.Webtop.100@btinternet.com> Message-ID: On 2021-10-09 11:45 AM, William_J_G Overington via Unicode wrote: > Well, my idea for trying to produce designs for emoji for personal > pronouns is as a result of a comment made by a gentleman in the > discussion after the lecture in the following video videographed at > the Unicode and Internationalization Conference in 2015. > > Unicode Emoji: How do we standardize that je ne sais ?? at IUC39 > > https://www.youtube.com/watch?v=9ldSVbXbjl4 > > Starting at 38 minutes 40 seconds into the video. There's nothing wrong with designing new glyphs covering pronouns in response to a lament about their lack in emoji made at a conference several years back. Although met with skepticism on this list (and elsewhere), there's always a possibility that emoji users might welcome pronoun coverage. The way to find out is to float a proposal.
Here's the guidelines: https://unicode.org/emoji/proposals.html This forum would be a good place to discuss anything unclear in the guidelines as well as to ask questions regarding any feedback generated further on down the line. Glyph design and so forth is mostly off-topic here. And since we already know that many people find pronouns useful, there would be no need to discuss their potential usefulness here. Good luck! From jameskass at code2001.com Wed Oct 13 03:57:13 2021 From: jameskass at code2001.com (James Kass) Date: Wed, 13 Oct 2021 08:57:13 +0000 Subject: Encoding ConScripts In-Reply-To: <6f90ad8e-7458-2023-f819-520dd6bfdd13@it.aoyama.ac.jp> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <6f90ad8e-7458-2023-f819-520dd6bfdd13@it.aoyama.ac.jp> Message-ID: <8453d4ee-1d77-0dfa-d3c2-c1292bd6778f@code2001.com> On 2021-10-12 12:19 AM, Martin J. Dürst via Unicode wrote: > Maybe Unicode has to add the following two items to its encoding > guidelines: > > - Unicode doesn't encode dreams. > - Unicode doesn't encode (master/...) theses. > > That could help avoid useless discussions. Anyone who has actually read the guidelines should already know this, but it couldn't hurt. Unicode also doesn't grant encoding as a participation trophy, no matter how admirable one's persistence might be over a span of many years.
From doug at ewellic.org Wed Oct 13 10:22:39 2021 From: doug at ewellic.org (Doug Ewell) Date: Wed, 13 Oct 2021 09:22:39 -0600 Subject: Encoding ConScripts In-Reply-To: <39bb3057-b0fd-cd99-c7a3-e794b307348f@code2001.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> <39bb3057-b0fd-cd99-c7a3-e794b307348f@code2001.com> Message-ID: <000001d7c046$295aa8d0$7c0ffa70$@ewellic.org> OK, James forced me into this thread. >> Well, obviously it should be ????????. Or, if you prefer, ????????. > > Or even ???????? / ?????????. Pronunciation of newly coined English words is notoriously unstandardized, but I agree with those who notice the similarity between "geography" (which I pronounce /?i???????fi?/) and "neography" (hence /ni???????fi?/). There's no /o?/ as in "go" in either of these words, not for me anyway. So the latter would be ?????????, or really ????????? as the schwa should be used minimally. Don't forget the accent on multi-syllable words. This is probably not much farther out of scope than some other recent posts. -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From doug at ewellic.org Wed Oct 13 11:31:38 2021 From: doug at ewellic.org (Doug Ewell) Date: Wed, 13 Oct 2021 10:31:38 -0600 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> Message-ID: <000c01d7c04f$cc51eef0$64f5ccd0$@ewellic.org> I've responded privately to Rebecca and to Debbie Anderson from SAH. We did make a significant effort to get Powerline and other symbols encoded, comparable to the successful multi-year effort to encode Symbols for Legacy Computing. We encountered additional challenges with this newer proposal, and I was unable to devote time in the first half of 2021 to resolving them, but we will try again.
-- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From: Unicode On Behalf Of Rebecca Bettencourt via Unicode Sent: Monday, October 11, 2021 17:17 To: Mark E. Shoulson Cc: unicode at corp.unicode.org Subject: Re: Powerline symbols? As you (Mark) discovered, the name originates from the piece of software which first used these characters, called Powerline. It's a plugin for vim, tmux, bash, i3, and several other environments that adds a fancy status line to the terminal. The characters have been proposed before, in document L2/19-068R2. The SAH recommended encoding three of them (the branch symbol and the row and column number symbols) but the UTC took no action. I vaguely recall a recommendation (from the SAH?) for the author, Renzhi Li, to contact the "Terminals Working Group" (Doug Ewell, me, and a few other individuals) to work out integrating them into a "round 2" Symbols for Legacy Computing proposal. We were never contacted by the author but we integrated them into a "round 2" proposal anyway, with the suggestion to use the same code points as were recommended by the SAH. That "round 2" proposal was brought to the UTC but for some reason was never added to the document register. We had an hour-long meeting in which the UTC reviewed it and had several concerns that were not resolved within that hour. The proposal has not progressed further since then. -- Rebecca Bettencourt From kent.b.karlsson at bahnhof.se Wed Oct 13 12:32:19 2021 From: kent.b.karlsson at bahnhof.se (Kent Karlsson) Date: Wed, 13 Oct 2021 19:32:19 +0200 Subject: Powerline symbols? In-Reply-To: <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: > 12 okt. 2021 kl. 02:35 skrev Mark E.
Shoulson via Unicode : > > But Rick McGowan mentioned to me (off-list) the potential argument someone could raise that status lines don't count as "text" usage, which is a fair point. No, no, no. It is normal practice to copy text from a terminal emulator window to something else: a document (notes, or instructions, or examples), a chat, an email. That would include the prompts. Preferably preserving the formatting (colour, bold, underline, ...) of both prompts and other texts (commands (normally not styled other than default) and output text, which is often coloured and sometimes have other styling). The practice of taking image snippets of the terminal emulator window is common, but mostly annoying. While it preserves the formatting (in a way), often the resolution is too low (having shrunk the image) making it hard to read, and... one cannot copy-paste text from an image. /Kent K From doug at ewellic.org Wed Oct 13 12:46:32 2021 From: doug at ewellic.org (Doug Ewell) Date: Wed, 13 Oct 2021 11:46:32 -0600 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: <001c01d7c05a$42b1b0d0$c8151270$@ewellic.org> Kent Karlsson wrote: > It is normal practice to copy text from a terminal emulator window to > something else: a document (notes, or instructions, or examples), a > chat, an email. That would include the prompts. Preferably preserving > the formatting (colour, bold, underline, ...) of both prompts and other > texts (commands (normally not styled other than default) and output > text, which is often coloured and sometimes have other styling). It's important to note that this would have to be a rich-text copy and paste, unless the color and styling could be copied and pasted via ECMA-48 sequences. > The practice of taking image snippets of the terminal emulator window > is common, but mostly annoying.
While it preserves the formatting (in > a way), often the resolution is too low (having shrunk the image) > making it hard to read, and... one cannot copy-paste text from an image. +1 -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From wjgo_10009 at btinternet.com Wed Oct 13 05:39:28 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Wed, 13 Oct 2021 11:39:28 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com> <2cba3597.60e7.17c6117464f.Webtop.100@btinternet.com> <28ccca77-8ece-22bf-2001-d0f46e49bc98@code2001.com> <52687cbc.6ce8.17c64df0480.Webtop.100@btinternet.com> Message-ID: <6c04bbd2.d0d7.17c793c51d7.Webtop.111@btinternet.com> The following request for allowing abstract emoji to become in scope has received no reply as far as I am aware. https://www.unicode.org/L2/L2021/21068-pubrev.html#Emoji_Feedback If abstract emoji become in scope for consideration, then a proposal document would be allowed to go forward for consideration. William ------ Original Message ------ From: "James Kass via Unicode" To: unicode at corp.unicode.org Sent: Wednesday, 2021 Oct 13 At 09:14 Subject: Re: Encoding ConScripts On 2021-10-09 11:45 AM, William_J_G Overington via Unicode wrote: Well, my idea for trying to produce designs for emoji for personal pronouns is as a result of a comment made by a gentleman in the discussion after the lecture in the following video videographed at the Unicode and Internationalization Conference in 2015. Unicode Emoji: How do we standardize that je ne sais ?? at IUC39 https://www.youtube.com/watch?v=9ldSVbXbjl4 Starting at 38 minutes 40 seconds into the video.
There's nothing wrong with designing new glyphs covering pronouns in response to a lament about their lack in emoji made at a conference several years back. Although met with skepticism on this list (and elsewhere), there's always a possibility that emoji users might welcome pronoun coverage. The way to find out is to float a proposal. Here's the guidelines: https://unicode.org/emoji/proposals.html This forum would be a good place to discuss anything unclear in the guidelines as well as to ask questions regarding any feedback generated further on down the line. Glyph design and so forth is mostly off-topic here. And since we already know that many people find pronouns useful, there would be no need to discuss their potential usefulness here. Good luck! From wjgo_10009 at btinternet.com Wed Oct 13 06:37:01 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Wed, 13 Oct 2021 12:37:01 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> Message-ID: <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> James Kass wrote: > On 2021-10-12 6:18 PM, William_J_G Overington via Unicode wrote: The word "most" seemed kinder than "with one exception,". Indeed. Thank you. It also has the advantage that if one or more people agree with me that it will still be true. > Yes. And if you could find just one person to agree with you, then > you'd both be wrong.
Well, I opine that if one or more people do choose to agree with me on this, then his, her, or their posts should be considered on the basis of what he, she, or they post, and not on the basis of a prejudging before the posting takes place. ... The character map has plenty of unused space. > It's not like a penny collection folder where the goal is to fill > every slot. That is correct. However, if a proposal by thesis is accepted for encoding without evidence of substantial existing use it may be around a hundred characters each needing a code point or fewer out of well over half a million thus far unused code points. So the goal is very very much less than filling every slot. It is a matter of the risk of the allowing of an encoding and it not being used balanced against the amount of good that could be done if the encoding being accepted leads to widespread use. If an exception can be made for emoji then an exception can be made for encoding by examined thesis. > Encoding emoji was an anomaly prompted by peculiar circumstances and a > unique chain of events. I would say 'particular' rather than 'peculiar'. > Nobody should ever cite it as precedent, so of course everybody will. > Sigh. No, it is a precedent. The rules were changed for emoji. So on a sauce for pasta is sauce for rice basis, as the Unicode Technical Committee has changed the rules for one set of particular circumstances, it can change them again for other particular circumstances. Whether the Unicode Technical Committee does actually change the rules for some specific particular circumstances is for the future, yet the precedent exists and it is perfectly reasonable to refer to it if one is told that something is not possible. Futuristic policies are needed for the future. > Anybody can predict the future if accuracy isn't important. I don't understand why you have stated that. Is it relevant here? If so, how?
> In the event, policies must be based on current realities rather than > speculation. No. One needs to consider both current realities and speculative possibilities for the future and make a balanced decision as to how to proceed going forwards, bearing in mind traditions but not being shackled by them such that they are damaging the future. William Overington Wednesday 13 October 2021 From wjgo_10009 at btinternet.com Wed Oct 13 11:11:32 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Wed, 13 Oct 2021 17:11:32 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <000001d7c046$295aa8d0$7c0ffa70$@ewellic.org> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> <39bb3057-b0fd-cd99-c7a3-e794b307348f@code2001.com> <000001d7c046$295aa8d0$7c0ffa70$@ewellic.org> Message-ID: <249e52e2.e0b7.17c7a6c559a.Webtop.111@btinternet.com> As this appears to be about Ewellic, and it comes up as tofu boxes here, could there possibly be some graphics posted please? https://en.wikipedia.org/wiki/Ewellic_alphabet William Overington Wednesday 13 October 2021 ------ Original Message ------ From: "Doug Ewell via Unicode" To: unicode at corp.unicode.org Cc: "'James Kass'" Sent: Wednesday, 2021 Oct 13 At 16:22 Subject: RE: Encoding ConScripts OK, James forced me into this thread. Well, obviously it should be ????????. Or, if you prefer, ????????. Or even ???????? / ?????????. Pronunciation of newly coined English words is notoriously unstandardized, but I agree with those who notice the similarity between "geography" (which I pronounce /?i???????fi?/) and "neography" (hence /ni???????fi?/). There's no /o?/ as in "go" in either of these words, not for me anyway. So the latter would be ?????????, or really ?????????
as the schwa should be used minimally. Don't forget the accent on multi-syllable words. This is probably not much farther out of scope than some other recent posts. -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From jameskass at code2001.com Wed Oct 13 17:12:59 2021 From: jameskass at code2001.com (James Kass) Date: Wed, 13 Oct 2021 22:12:59 +0000 Subject: Encoding ConScripts In-Reply-To: <6c04bbd2.d0d7.17c793c51d7.Webtop.111@btinternet.com> References: <6e540736.6d537.17c51cb87d8.Webtop.100@btinternet.com> <05b47c66-69d3-18b3-9777-1013753008c7@gmx.de> <081f6159-b514-e273-53b4-973ee8131eb8@ix.netcom.com> <5f58525c-c1bc-209a-9607-9d701e997e07@ix.netcom.com> <2cba3597.60e7.17c6117464f.Webtop.100@btinternet.com> <28ccca77-8ece-22bf-2001-d0f46e49bc98@code2001.com> <52687cbc.6ce8.17c64df0480.Webtop.100@btinternet.com> <6c04bbd2.d0d7.17c793c51d7.Webtop.111@btinternet.com> Message-ID: <2566ad25-4bab-90c2-1664-c357270d871a@code2001.com> On 2021-10-13 10:39 AM, William_J_G Overington via Unicode wrote: > The following request for allowing abstract emoji to become in scope > has received no reply as far as I am aware. > > https://www.unicode.org/L2/L2021/21068-pubrev.html#Emoji_Feedback > > If abstract emoji become in scope for consideration, then a proposal > document would be allowed to go forward for consideration. > > William Your request may not have gone to the right place. We discussed this back in mid-August of 2018 on this list. I think that a proposal, even if initially rejected as out-of-scope, would be more likely to be read by actual emoji users (and vendors) than feedback on an obscure web page. If any kind of demand for such characters ensued, surely any abstract objections would be waived in this case. But don't you think it more likely that if users and vendors wanted pronoun emoji, they would design language-independent pictographs?
I do, and that's why I pointed out that you were devising a ConScript rather than inventing "emoji". From jameskass at code2001.com Wed Oct 13 17:24:07 2021 From: jameskass at code2001.com (James Kass) Date: Wed, 13 Oct 2021 22:24:07 +0000 Subject: Encoding ConScripts In-Reply-To: <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> Message-ID: <67197359-938f-751d-0c6a-bdf01f7787ee@code2001.com> On 2021-10-13 11:37 AM, William_J_G Overington wrote: > Well, I opine that if one or more people do choose to agree with me on > this, then his, her, or their posts should be considered on the basis > of what he, she, or they post, and not on the basis of a prejudging > before the posting takes place. > This isn't a matter of opinion. Proposal 1 in the example is a non-starter because of Unicode's encoding principles. > > Encoding emoji was an anomaly prompted by peculiar circumstances and > a unique chain of events. > > I would say 'particular' rather than 'peculiar'. > You're entitled to do so. > > Anybody can predict the future if accuracy isn't important. > > I don't understand why you have stated that. Is it relevant here? If > so, how? > Because predicting the future with no guarantee of accuracy is akin to speculation, which ties in with my subsequent statement. >> In the event, policies must be based on current realities rather than >> speculation. > Now /this/ one *is* a matter of opinion, and we apparently disagree.
From jameskass at code2001.com Wed Oct 13 18:02:47 2021 From: jameskass at code2001.com (James Kass) Date: Wed, 13 Oct 2021 23:02:47 +0000 Subject: Encoding ConScripts In-Reply-To: <249e52e2.e0b7.17c7a6c559a.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <7a7b14c.74a2.17c664599be.Webtop.100@btinternet.com> <2CC66AE8-87C1-472B-8FC8-80F129A00876@apple.com> <39bb3057-b0fd-cd99-c7a3-e794b307348f@code2001.com> <000001d7c046$295aa8d0$7c0ffa70$@ewellic.org> <249e52e2.e0b7.17c7a6c559a.Webtop.111@btinternet.com> Message-ID: <6b3ba0de-715d-8f4c-fe34-1ede9275e194@code2001.com> On 2021-10-13 4:11 PM, William_J_G Overington via Unicode wrote: > As this appears to be about Ewellic, and it comes up as tofu boxes > here, could there possibly be some graphics posted please? Fairfax HD covers Ewellic and so do a couple of my fonts. Part of the mystique of PUA is downloading and installing appropriate fonts. From doug at ewellic.org Wed Oct 13 18:32:37 2021 From: doug at ewellic.org (Doug Ewell) Date: Wed, 13 Oct 2021 17:32:37 -0600 Subject: [OT] Ewellic (was: RE: Encoding ConScripts) Message-ID: <002301d7c08a$9bdde770$d399b650$@ewellic.org> James Kass wrote: > Fairfax HD covers Ewellic and so do a couple of my fonts. Part of the > mystique of PUA is downloading and installing appropriate fonts. Here's the list of fonts I have that support it; there may be others out there which I don't have: From James: - Code2000 - Code2001 From Rebecca: - Constructium - Fairfax HD (but not Fairfax SM HD, which doesn't support U+0301 COMBINING ACUTE ACCENT) Commissioned from Michael Everson, but not yet released due to my own neglect over many years: - Everewellic Others: - Nishiki-teki - unscii (doesn't display the accent correctly, plus has really ugly glyphs) All of these except Everewellic can be found with a quick web search. Beware of unauthorized clones of James's fonts.
Andrew West's wonderful BabelPad editor will mix and match the fonts on one's Windows system or allow one to pick a single font, to provide optimum coverage of the code space (including PUAs). It also allows one to add one or more fonts for just that editing session, without actually installing them. https://babelstone.co.uk/Software/BabelPad.html -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From beckiergb at gmail.com Wed Oct 13 19:18:17 2021 From: beckiergb at gmail.com (Rebecca Bettencourt) Date: Wed, 13 Oct 2021 17:18:17 -0700 Subject: [OT] Ewellic (was: RE: Encoding ConScripts) In-Reply-To: <002301d7c08a$9bdde770$d399b650$@ewellic.org> References: <002301d7c08a$9bdde770$d399b650$@ewellic.org> Message-ID: Unifont CSUR also supports it, but I'm not sure how well it renders the combining acute accent. -- Rebecca Bettencourt On Wed, Oct 13, 2021 at 4:36 PM Doug Ewell via Unicode < unicode at corp.unicode.org> wrote: > James Kass wrote: > > > Fairfax HD covers Ewellic and so do a couple of my fonts. Part of the > > mystique of PUA is downloading and installing appropriate fonts. > > Here's the list of fonts I have that support it; there may be others out > there which I don't have: > > From James: > - Code2000 > - Code2001 > > From Rebecca: > - Constructium > - Fairfax HD (but not Fairfax SM HD, which doesn't support U+0301 > COMBINING ACUTE ACCENT) > > Commissioned from Michael Everson, but not yet released due to my own > neglect over many years: > - Everewellic > > Others: > - Nishiki-teki > - unscii (doesn't display the accent correctly, plus has really ugly > glyphs) > > All of these except Everewellic can be found with a quick web search. > Beware of unauthorized clones of James's fonts. > > Andrew West's wonderful BabelPad editor will mix and match the fonts on > one's Windows system or allow one to pick a single font, to provide optimum > coverage of the code space (including PUAs).
It also allows one to add one > or more fonts for just that editing session, without actually installing > them. > > https://babelstone.co.uk/Software/BabelPad.html > > -- > Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org > > > > From jameskass at code2001.com Wed Oct 13 20:11:35 2021 From: jameskass at code2001.com (James Kass) Date: Thu, 14 Oct 2021 01:11:35 +0000 Subject: [OT] Bytext (was Re: Encoding ConScripts) In-Reply-To: <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> Message-ID: On 2021-10-13 11:37 AM, William_J_G Overington via Unicode wrote: > Whether the Unicode Technical Committee does actually change the rules > for some specific particular circumstances is for the future, ... Quoting William Overington from a comment to a Michael S. Kaplan blog page archived here: http://archives.miloush.net/michkap/archive/2007/08/18/4455146.html > Something which I particularly like about Bytext is the arrowed > brackets which are intended for use with superscripts and subscripts > and for designating in a linear run of text the lower and upper > limits of integrals and summations in mathematics. Perhaps your > committee could have a look at those please? Bytext! Anyone dissatisfied with Unicode principles and stability would be free to join the Bytext community and advance suggestions there, except the community appears to have disappeared. But it might be possible to find original documents in Wayback archives. Then Bytext could be revived and reanimated. "Phoenix Bytext"? "Bytext 2.0"?
The point being that whoever revives Bytext (or builds something new from scratch) would be in charge of establishing policy, which might open the door to all kinds of personal and idiosyncratic glyphs. My prediction is that such an effort would flop, other predictions may vary. Meanwhile, Unicode still has the PUA with all of its charm, allure, and mystique. From mark at kli.org Wed Oct 13 20:34:03 2021 From: mark at kli.org (Mark E. Shoulson) Date: Wed, 13 Oct 2021 21:34:03 -0400 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: On 10/13/21 13:32, Kent Karlsson via Unicode wrote: > The practice of taking image snippets of the terminal emulator window is common, but mostly annoying. While it preserves the formatting (in a way), often the resolution is too low (having shrunk the image) making it hard to read, and one cannot copy-paste text from an image. > > /Kent K I know, right? Drives me crazy. It seems to me that at the very least the "branch" symbol has some claim to be text. Moreover, we have already encoded lots of characters that have less. MINIMIZE, MAXIMIZE, OVERLAP, and CANCELLATION X are all strictly graphic UI characters, as are quite a few others in that block. The "map" symbols also can't really make much of a claim to being plain text. ~mark From mark at kli.org Wed Oct 13 20:44:17 2021 From: mark at kli.org (Mark E. 
Shoulson) Date: Wed, 13 Oct 2021 21:44:17 -0400 Subject: [OT] Bytext (was Re: Encoding ConScripts) In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> Message-ID: <24cb07bf-beda-0cdf-7724-e19f25acfb16@shoulson.com> On 10/13/21 21:11, James Kass via Unicode wrote: > > On 2021-10-13 11:37 AM, William_J_G Overington via Unicode wrote: >> Whether the Unicode Technical Committee does actually change the >> rules for some specific particular circumstances is for the future, ... Perhaps. But at some point the conversation of "this is out of scope." "can you make it in scope?" "no." "how about now?" starts to wear thin. > Bytext! Anyone dissatisfied with Unicode principles and stability > would be free to join the Bytext community and advance suggestions > there, except the community appears to have disappeared. But it might > be possible to find original documents in Wayback archives. Then > Bytext could be revived and reanimated. "Phoenix Bytext"? "Bytext > 2.0"? The point being that whoever revives Bytext (or builds > something new from scratch) would be in charge of establishing policy, > which might open the door to all kinds of personal and idiosyncratic > glyphs. > > My prediction is that such an effort would flop, other predictions may > vary. > > Meanwhile, Unicode still has the PUA with all of its charm, allure, > and mystique. I've been wondering what the need is to tilt at windmills. There *are* all kinds of ways to make your ConScript/emoji used. There is the PUA, people use graphics "stickers", there is rich text, there are alternate encodings... 
if you won't make the effort to make things available to even see if there is interest out there apart from you, if you won't form a community asking for this, why should Unicode? ~mark From jameskass at code2001.com Wed Oct 13 23:00:07 2021 From: jameskass at code2001.com (James Kass) Date: Thu, 14 Oct 2021 04:00:07 +0000 Subject: [OT] Bytext (was Re: Encoding ConScripts) In-Reply-To: <24cb07bf-beda-0cdf-7724-e19f25acfb16@shoulson.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> <24cb07bf-beda-0cdf-7724-e19f25acfb16@shoulson.com> Message-ID: On 2021-10-14 1:44 AM, Mark E. Shoulson via Unicode wrote: > I've been wondering what the need is to tilt at windmills. There > *are* all kinds of ways to make your ConScript/emoji used. There is > the PUA, people use graphics "stickers", there is rich text, there are > alternate encodings... if you won't make the effort to make things > available to even see if there is interest out there apart from you, > if you won't form a community asking for this, why should Unicode? > To be fair, William Overington has produced a font with these glyphs and made it available to the public. It's called Mariposa font, which is unfortunate because that font name is already registered by a different developer. The font is available here: http://www.users.globalnet.co.uk/~ngo/mariposa_novel.htm (And you don't have to read the novel to download it.) The font is fixed up to generate glyphs using OpenType substitution of ASCII strings in the format of a percentage sign followed by ASCII digits. Since ASCII is covered in Unicode, this material can already be interchanged with no action required from Unicode. 
Such data is already "regular Unicode". I speculate that little-to-no interest has been generated as yet, but I do not have access to the download stats, if any, at William's family web site. I'm mystified at the persistence (or obsession) directed to getting these novel abstract symbols enshrined in Unicode when interchange is already fully enabled. Alleging that Unicode's principles are somehow unfairly preventing his work from being available to the world with analogies to rice and pasta isn't compelling. From marius.spix at web.de Thu Oct 14 02:14:28 2021 From: marius.spix at web.de (Marius Spix) Date: Thu, 14 Oct 2021 09:14:28 +0200 Subject: Aw: Re: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: An HTML attachment was scrubbed... URL: From mark at kli.org Thu Oct 14 07:24:10 2021 From: mark at kli.org (Mark E. Shoulson) Date: Thu, 14 Oct 2021 08:24:10 -0400 Subject: [OT] Bytext (was Re: Encoding ConScripts) In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> <24cb07bf-beda-0cdf-7724-e19f25acfb16@shoulson.com> Message-ID: <9a8831f8-0afd-9a7c-d0a2-a7be2d8758a3@shoulson.com> On 10/14/21 00:00, James Kass via Unicode wrote: > > On 2021-10-14 1:44 AM, Mark E. Shoulson via Unicode wrote: >> I've been wondering what the need is to tilt at windmills. There >> *are* all kinds of ways to make your ConScript/emoji used. There is >> the PUA, people use graphics "stickers", there is rich text, there >> are alternate encodings... 
if you won't make the effort to make >> things available to even see if there is interest out there apart >> from you, if you won't form a community asking for this, why should >> Unicode? >> > To be fair, William Overington has produced a font with these glyphs > and made it available to the public. It's called Mariposa font, which > is unfortunate because that font name is already registered by a > different developer. That does count as "making things available," I'll grant. But it's a very far cry from forming a community. To get this happening, you have to convince people to agree with you. Convincing Unicode is not happening; instead, turn those efforts into convincing other people to agree with you, and then you'll have a community of users ready-made to come to Unicode with, to assure them that there will be usage. You'll even already have a corpus of communication using it, which is another thing that is desirable when deciding to encode a script. I raised the chicken-and-egg problem when arguing for Klingon, but the fact is that even back in 1997, there _was_ a community that would almost certainly have started using the script if it had been available (actually, it was available in the PUA and indeed people were using it, just allegedly "not enough".) And indeed they did start using it, Unicode or no. Take your arguments someplace where they'll do some good and try to convince other people to USE your system, instead of trying to convince Unicode to change its basic principles. (Just don't do it here, as it's off-topic. Yes, it is, even though it's sort of about writing and stuff.) > > The font is available here: > http://www.users.globalnet.co.uk/~ngo/mariposa_novel.htm > (And you don't have to read the novel to download it.) > > The font is fixed up to generate glyphs using OpenType substitution of > ASCII strings in the format of a percentage sign followed by ASCII > digits. 
Since ASCII is covered in Unicode, this material can already > be interchanged with no action required from Unicode. Such data is > already "regular Unicode". Yes, there are all kinds of fun things one can do with OpenType (I once saw a font that would change ostensibly socially charged words into less charged ones, like "fat" into "overweight" or something.) And it can be a lot of fun to use such things, and indeed using ASCII gives a nice fallback when rendered in a different font, making the system actually usable. So get people using it! > I speculate that little-to-no interest has been generated as yet, but > I do not have access to the download stats, if any, at William's > family web site. > > I'm mystified at the persistence (or obsession) directed to getting > these novel abstract symbols enshrined in Unicode when interchange is > already fully enabled. Alleging that Unicode's principles are somehow > unfairly preventing his work from being available to the world with > analogies to rice and pasta isn't compelling. Well, there _is_ a difference between being able to do something and being able to do it "right." The old code-switching ISO-8859-X (that the right number?) was capable of communicating in Cyrillic and Hebrew and whatever, and similar font tricks or search-and-replace can change romanized Klingon into pIqaD. But as these methods can stumble along and manage, they indeed _must_ be used to do so in order to demonstrate demand in order to get encoded. Unicode does not encode dreams, and isn't there to create orthographies, but to enable those already in use. So get using!! ~mark From mark at kli.org Thu Oct 14 07:36:47 2021 From: mark at kli.org (Mark E. 
Shoulson) Date: Thu, 14 Oct 2021 08:36:47 -0400 Subject: [OT] Bytext (was Re: Encoding ConScripts) In-Reply-To: <9a8831f8-0afd-9a7c-d0a2-a7be2d8758a3@shoulson.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> <24cb07bf-beda-0cdf-7724-e19f25acfb16@shoulson.com> <9a8831f8-0afd-9a7c-d0a2-a7be2d8758a3@shoulson.com> Message-ID: On 10/14/21 08:24, Mark E. Shoulson via Unicode wrote: > Yes, there are all kinds of fun things one can do with OpenType (I > once saw a font that would change ostensibly socially charged words > into less charged ones, like "fat" into "overweight" or something.) https://www.thepolitetype.com/ ~mark From jameskass at code2001.com Thu Oct 14 23:11:00 2021 From: jameskass at code2001.com (James Kass) Date: Fri, 15 Oct 2021 04:11:00 +0000 Subject: Encoding ConScripts In-Reply-To: <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> Message-ID: <207a0a26-126d-4cbe-9668-006b7ca1640d@code2001.com> On 2021-10-13 11:37 AM, William_J_G Overington via Unicode responded: > >> Encoding emoji was an anomaly prompted by peculiar circumstances and >> a unique chain of events. > > I would say 'particular' rather than 'peculiar'. > >> Nobody should ever cite it as precedent, so of course everybody will. >> Sigh. > > No, it is a precedent. 
The rules were changed for emoji. So on a sauce > for pasta is sauce for rice basis, as the Unicode Technical Committee > has changed the rules for one set of particular circumstances, it can > change them again for other particular circumstances. Certain principles were violated when emoji penetrated Unicode plain-text. In order to accommodate emoji, a special class was established along with a different set of encoding principles covering that new class. A crucial difference exists between most everything William has ever proposed and initial emoji encoding. Pre-existing "characters" were already being interchanged by users and were very, very popular. Conflicting character sets existed which impacted cross-platform interchange. Decisions were made to move forward in spite of vehement opposition. (There was also an adamant group of emoji supporters, of course.) For additions to the emoji set, the special emoji principles admit speculative characters based on certain special criteria and hurdles. So it would be tempting to anyone inventing novel glyphs to try to get those inventions into Unicode as "emoji", because there is no other path. But the emoji principles reject "abstract emoji" from consideration for a pictographic character set. A request to review that abstraction rejection on an unrelated or marginally related public feedback blog may not really put anything on the table for the committee to review, so it may well have been overlooked. A formal proposal might be better for prompting discussion and consideration. William acknowledges that the Mariposa System is essentially mark-up. See here: https://lists.aau.at/pipermail/mpeg-otspec/2021-May/002766.html (A fascinating thread, BTW. My respect to Peter Constable and Vladimir Levantovsky for their professionalism in that thread.) But William apparently considers it desirable for these glyphs also to be encoded as characters, although I've no idea why, saucy pasta notwithstanding. 
Over the years, many of us have tried to provide helpful, honest suggestions and pointers while remaining tolerably polite. Trying to drum up support for abstract emoji (and whatnot) on technical discussion lists doesn't seem to be working out. Social media platforms and other places where emoji users congregate might be a better target for publicity and for garnering support / generating usage. From pgcon6 at msn.com Fri Oct 15 12:42:17 2021 From: pgcon6 at msn.com (Peter Constable) Date: Fri, 15 Oct 2021 17:42:17 +0000 Subject: Powerline symbols? In-Reply-To: <6B397FAB-43B9-481A-9DCB-816EA933F2FF@bahnhof.se> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <6B397FAB-43B9-481A-9DCB-816EA933F2FF@bahnhof.se> Message-ID: From: Unicode on behalf of Kent Karlsson via Unicode Date: Tuesday, October 12, 2021 at 2:12 PM ... >I would not consider (just about) every terminal emulator used in the world as a "closed system". The important consideration is interoperability of character data. There may be many terminal emulators that use a particular symbol. But is there a public data interchange scenario involving the symbol in which interoperability between different content and software is required? Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgcon6 at msn.com Fri Oct 15 12:50:10 2021 From: pgcon6 at msn.com (Peter Constable) Date: Fri, 15 Oct 2021 17:50:10 +0000 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: From: Unicode on behalf of Kent Karlsson via Unicode Date: Wednesday, October 13, 2021 at 10:39 AM ... >No, no, no. It is normal practice to copy text from a terminal emulator window to something else... Prompts, yes. But do the terminals have symbols in UI affordances with symbols that can't be selected and copied as text? No, to those. 
Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.w.kennedy at gmail.com Fri Oct 15 15:09:26 2021 From: john.w.kennedy at gmail.com (John W Kennedy) Date: Fri, 15 Oct 2021 16:09:26 -0400 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: > On Oct 15, 2021, at 1:50 PM, Peter Constable via Unicode wrote: > > From: Unicode > on behalf of Kent Karlsson via Unicode > > Date: Wednesday, October 13, 2021 at 10:39 AM > > ... > > >No, no, no. It is normal practice to copy text from a terminal emulator window to something else... > > Prompts, yes. > > But do the terminals have symbols in UI affordances with symbols that can't be selected and copied as text? No, to those. Since you ask, line 25 on the IBM 3278-2 and other below-the-bottom lines on its many descendants. (The previous 3277 had three solid rectangles to the right of the right margin.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmusf at ix.netcom.com Fri Oct 15 21:26:55 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 15 Oct 2021 19:26:55 -0700 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <6B397FAB-43B9-481A-9DCB-816EA933F2FF@bahnhof.se> Message-ID: <3761ba02-079d-6e97-5440-41c11c663cda@ix.netcom.com> An HTML attachment was scrubbed... URL: From asmusf at ix.netcom.com Fri Oct 15 21:28:20 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 15 Oct 2021 19:28:20 -0700 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: <2767f6f6-65eb-a9ab-3aea-5c5e8bdad975@ix.netcom.com> An HTML attachment was scrubbed... 
URL: From jameskass at code2001.com Fri Oct 15 21:39:44 2021 From: jameskass at code2001.com (James Kass) Date: Sat, 16 Oct 2021 02:39:44 +0000 Subject: Powerline symbols? In-Reply-To: <2767f6f6-65eb-a9ab-3aea-5c5e8bdad975@ix.netcom.com> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> <2767f6f6-65eb-a9ab-3aea-5c5e8bdad975@ix.netcom.com> Message-ID: <471e9a23-9fa8-df5a-a018-87762632850e@code2001.com> On 2021-10-16 2:28 AM, Asmus Freytag via Unicode wrote: > I'm not so versed in dancing: what are affordances? (From Merriam-Webster) Definition of affordance : the quality or property of an object that defines its possible uses or makes clear how it can or should be used From asmusf at ix.netcom.com Sat Oct 16 00:53:57 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 15 Oct 2021 22:53:57 -0700 Subject: Powerline symbols? In-Reply-To: <2767f6f6-65eb-a9ab-3aea-5c5e8bdad975@ix.netcom.com> References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> <2767f6f6-65eb-a9ab-3aea-5c5e8bdad975@ix.netcom.com> Message-ID: <34cff8f3-a31a-0d9b-a403-1d8e0b8436c2@ix.netcom.com> An HTML attachment was scrubbed... URL: From kent.b.karlsson at bahnhof.se Sat Oct 16 11:27:09 2021 From: kent.b.karlsson at bahnhof.se (Kent Karlsson) Date: Sat, 16 Oct 2021 18:27:09 +0200 Subject: Powerline symbols? In-Reply-To: References: <3f8b2fe3-3ee1-9959-cb19-7d7c285d0242@shoulson.com> <056b177c-30cf-7392-84a1-7aefecbc9fdc@it.aoyama.ac.jp> <6f1b6a26-99d8-5b88-2d5b-e8793aae4d06@shoulson.com> Message-ID: <6EBFDA01-469F-4E81-BBED-4475BFCF5B85@bahnhof.se> > 15 okt. 2021 kl. 
22:09 skrev John W Kennedy via Unicode : > > > >> On Oct 15, 2021, at 1:50 PM, Peter Constable via Unicode > wrote: >> >> From: Unicode > on behalf of Kent Karlsson via Unicode > >> Date: Wednesday, October 13, 2021 at 10:39 AM >> >> ... >> >> >No, no, no. It is normal practice to copy text from a terminal emulator window to something else... >> >> Prompts, yes. >> >> But do the terminals have symbols in UI affordances with symbols that can't be selected and copied as text? No, to those. > > Since you ask, line 25 on the IBM 3278-2 and other below-the-bottom lines on its many descendants. (The previous 3277 had three solid rectangles to the right of the right margin.) A little perplexed here. We were, or at least I thought we were, talking about "selected and copied as text" (a.k.a. copy-(and-paste)). For the physical devices called "terminals", that is just about impossible, since those basically became museum pieces before cut-and-paste was invented... (and cut-and-paste would not be practical for them anyway). For terminal emulators, which are commonly used today, for Linux and other Unix-like systems like MacOSX, as well as Windows (well the NT-based ones (which includes current Windows systems), perhaps not the since long dead DOS-based ones), cut-and-paste of text, both from the terminal emulator to such things as Word documents (for notes), emails, and to "chat" applications, as well as to a terminal emulator (usually a command, but often other text as well, e.g. when entering text to a terminal based text editor) from a Word document, Notepad++ document, email, or from a chat. That is NORMAL EVERYDAY PRACTICE. I cannot tell how many it is that used terminal emulators for modern up-to-date systems, but I would guess somewhere in the range of millions. The Powerline symbols are mostly intended for terminal emulators running bash (as login shell), and the "branch" 
symbol is intended for bash prompts that include git status indication; quite popular for those of us that use git. I often get such a prompt as default when I log in to certain systems, prompts designed by someone else... Though not yet using the Powerline symbols. But people make terminal emulators for all sorts of old things, not just up-to-date Linux/etc. Apparently there are terminal emulators also for IBM 3278, but I'm not familiar with those. If for some reason you cannot do "selected and copy as text" for line 25, then you should report that as a bug to whoever made that terminal emulator. /Kent K PS If you don't have easy access to a "real" terminal emulator, there are online ones (with a very sandboxed environment; nobody wants you to go and hack their system...). Here is a list someone made: https://itsfoss.com/online-linux-terminals/. You can try some of them out, they have different properties/capabilities. Here's a direct link to one: https://xtermjs.org/; it, like the "real" ones (but often not the web ones) allow for copy-and-paste (to outside of this web based terminal emulator demo): Unicode support And much more... ? ? Supports CJK ? and emoji ?? (Styling is not preserved in this copy-paste, unfortunately; I know of one terminal emulator that has some (imperfect) support for preserving styling when copy-pasting. There may be a few more, I haven't done a survey.) /K -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From asmusf at ix.netcom.com Sat Oct 16 16:13:44 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Sat, 16 Oct 2021 14:13:44 -0700 Subject: Encoding ConScripts In-Reply-To: <207a0a26-126d-4cbe-9668-006b7ca1640d@code2001.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> <207a0a26-126d-4cbe-9668-006b7ca1640d@code2001.com> Message-ID: An HTML attachment was scrubbed... URL: From wjgo_10009 at btinternet.com Fri Oct 15 04:29:21 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Fri, 15 Oct 2021 10:29:21 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: <207a0a26-126d-4cbe-9668-006b7ca1640d@code2001.com> References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> <207a0a26-126d-4cbe-9668-006b7ca1640d@code2001.com> Message-ID: <35403a88.2627.17c8348d89c.Webtop.111@btinternet.com> James Kass wrote: > A request to review that abstraction rejection on an unrelated or > marginally related public feedback blog may not really put anything on > the table for the committee to review, so it may well have been > overlooked. I sent in that request on the then version of the official contact form. So not on a blog at all. William Overington Friday 15 October 2021 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wjgo_10009 at btinternet.com Mon Oct 18 14:00:38 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Mon, 18 Oct 2021 20:00:38 +0100 (BST) Subject: Encoding ConScripts In-Reply-To: References: <43CB86F5-980D-4E64-9E2B-F9C064ABBD34@umich.edu> <9a2d6508-c42f-87fc-27e2-3684b7eaa742@code2001.com> <18fe7c38.94aa.17c705f9bfd.Webtop.111@btinternet.com> <8a6370a4-2e27-968a-2c00-513a354007c3@code2001.com> <4a203144.be9c.17c75ba63af.Webtop.111@btinternet.com> <29585c01-7da9-9038-1c9f-340efb500944@code2001.com> <4614e819.d3d4.17c7971008b.Webtop.111@btinternet.com> <207a0a26-126d-4cbe-9668-006b7ca1640d@code2001.com> Message-ID: <23977afa.90c0.17c94c6f135.Webtop.111@btinternet.com> I have written a poem, which I hope will be of interest and possibly of help to some people. Alt sixty thousand on the keys is E A six zero if you please The poem is intended as a useful poem, and could be very useful for people making fonts, and people using those fonts, where the font includes one or more characters in the Private Use Area. The thing is, for an end user, the getting of a Private Use Area character into a document from a font can be awkward at times, particularly if one does not have access to much software, such as if using a basic Windows system and one is doing what one can using WordPad. So if a font has, say, one non-standard character and that character is in the Private Use Area, then placing it at U+EA60 means that when using a program such as WordPad one can access that character easily by using Alt 60000 as the way to access the character. Also, if, say, a new script of twenty characters is being added to a font, adding the characters starting at U+EA61 allows access from WordPad using an Alt code of sixty thousand plus the index number of the character in the new script. 
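The arithmetic behind the poem can be checked with a short sketch. This is only an illustration in Python: the helper names `alt_code_for` and `describe` are hypothetical, and the code verifies the numbers only, not how WordPad or any other program handles Alt codes. It assumes the layout described above, with a marker glyph at U+EA60 and the script's characters numbered from 1 starting at U+EA61.

```python
# Illustrative sketch; the helper names are hypothetical, not part of any font tool.
PUA_START = 0xE000  # first code point of the BMP Private Use Area
PUA_END = 0xF8FF    # last code point of the BMP Private Use Area
BASE = 0xEA60       # hexadecimal EA60 is decimal 60000


def alt_code_for(index: int) -> int:
    """Return the decimal value to type after Alt for the character at `index`.

    Index 0 is the marker glyph at U+EA60; indices 1, 2, 3, ... are the
    characters of a script laid out from U+EA61 onward.
    """
    if index < 0:
        raise ValueError("index must not be negative")
    code_point = BASE + index
    if not PUA_START <= code_point <= PUA_END:
        raise ValueError(f"index {index} falls outside the BMP Private Use Area")
    return code_point


def describe(index: int) -> str:
    """Show the decimal Alt code next to the usual U+XXXX notation."""
    code_point = alt_code_for(index)
    return f"Alt {code_point} -> U+{code_point:04X}"


if __name__ == "__main__":
    print(describe(0))   # the single character, or the script's name glyph
    print(describe(1))   # first character of a script starting at U+EA61
    print(describe(20))  # twentieth character of a twenty-character script
```

Since the BMP Private Use Area ends at U+F8FF (decimal 63743), this numbering leaves room for 3743 characters after U+EA60.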
Mostly one Private Use Area code point is as good as any other, so if people trying to develop a new script choose to add in the characters starting at U+EA61 that could be a good choice as, compared with other choices, it can, in some circumstances, give better access to the characters. In that case the glyph at U+EA60 could be used to provide a visual indication of the name of the script. Choosing to use this method can be helpful in some circumstances, yet for people with access to more specialised software the using of this method rather than placing the characters elsewhere in the Private Use Area does no harm. William Overington Monday 18 October 2021 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jameskass at code2001.com Thu Oct 21 02:42:10 2021 From: jameskass at code2001.com (James Kass) Date: Thu, 21 Oct 2021 07:42:10 +0000 Subject: Breaking barriers Message-ID: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> A recent article announced a new phone with superior translation abilities. The phone translates text, speech, and images of text. If interested, here's the article: https://www.xda-developers.com/google-pixel-6-live-translate-screenshots/ The article doesn't reveal the inner workings, but it's likely that any computer text entered to or produced by Live Translate would be in Unicode. Although this is emerging technology and the translation modules may not yet be as robust or numerous as we wish, it might be expected that this software and any spin-offs will become powerful and versatile enough to handle most any kind of source text. This would mean that if an image of text can be scanned from a computer monitor, it could be translated. The underlying source encoding wouldn't matter, it could be some obscure code page, Unicode PUA, or even a specialty custom ASCII font as long as the source display is correctly enabled and the translation software handles the source language(s). 
Since the resulting data would likely be stored in Unicode, both pre- and post-translation -- the barrier between conflicting older encodings which Unicode has practically removed would then be completely demolished. P.S. - Too bad about human translators, though. Being a translator used to be a lucrative field with skilled translators in high demand. Newer technology, as it breaks down the communication barrier between languages, will probably have an effect on translator employment, if it hasn't already. From haberg-1 at telia.com Thu Oct 21 03:01:03 2021 From: haberg-1 at telia.com (Hans Åberg) Date: Thu, 21 Oct 2021 10:01:03 +0200 Subject: Breaking barriers In-Reply-To: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: > On 21 Oct 2021, at 09:42, James Kass via Unicode wrote: > > This would mean that if an image of text can be scanned from a computer monitor, it could be translated. The underlying source encoding wouldn't matter, it could be some obscure code page, Unicode PUA, or even a specialty custom ASCII font as long as the source display is correctly enabled and the translation software handles the source language(s). Since the resulting data would likely be stored in Unicode, both pre- and post-translation -- the barrier between conflicting older encodings which Unicode has practically removed would then be completely demolished. This is the case in iOS 15, requires faster (later) devices, I think. https://www.macrumors.com/guide/ios-15-translate-app/ > P.S. - Too bad about human translators, though. Being a translator used to be a lucrative field with skilled translators in high demand. Newer technology, as it breaks down the communication barrier between languages, will probably have an effect on translator employment, if it hasn't already. It helps human translators; the computer can make a quick rough draft, but it isn't very accurate. 
From albrecht.dreiheller at siemens.com Thu Oct 21 04:41:39 2021 From: albrecht.dreiheller at siemens.com (Dreiheller, Albrecht) Date: Thu, 21 Oct 2021 09:41:39 +0000 Subject: AW: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: > Von: Unicode Im Auftrag von Hans Åberg via Unicode > Gesendet: Donnerstag, 21. Oktober 2021 10:01 >> P.S. - Too bad about human translators, though. Being a translator used to be a lucrative field with skilled translators in high demand. Newer technology, as it breaks down the communication barrier between languages, will probably have an effect on translator employment, if it hasn't already. > It helps human translators; the computer can make a quick rough draft, but it isn't very accurate. Without understanding the context, Live Translate won't have a good chance to find the right translation. Machine translation often only pretends to know the meaning but in fact it fails. I'm not worried about human translators. /Albrecht From jameskasskrv at gmail.com Thu Oct 21 16:11:08 2021 From: jameskasskrv at gmail.com (James Kass) Date: Thu, 21 Oct 2021 21:11:08 +0000 Subject: AW: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: On 2021-10-21 9:41 AM, Dreiheller, Albrecht via Unicode wrote: > Without understanding the context, Live Translate won't have a good chance to find the right translation. > Machine translation often only pretends to know the meaning but in fact it fails. > I'm not worried about human translators. They may be safe in the short-term. Machine translation is much, much better than it was at the onset. I recall translating a German web page about the Phaistos disk into English and the page title was translated as "The Discotheques of Phaistos". (That was a foreshadowing of what to expect in the article, which was amusing to read through.) Even before machine translation, translations could be humorous. 
A French-speaking friend once told me that the French title of a certain Steinbeck novel could translate back into English as "The Raisins of Anger". The web page of Google Translate offers an option for the end user to contribute suggestions for improving the specific translation. This would be expected to make the machine translations better over time. If this option is also offered in Live Translate, which travels around town in pockets and purses and accepts source material from more than plain text, wouldn't that expedite machine translation improvement? And what will happen when AI is added to the mix? From jameskass at code2001.com Thu Oct 21 16:36:02 2021 From: jameskass at code2001.com (James Kass) Date: Thu, 21 Oct 2021 21:36:02 +0000 Subject: AW: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: <5e028def-015d-ecab-0938-9f412c74954b@code2001.com> On 2021-10-21 9:11 PM, James Kass via Unicode wrote: > I recall translating a German web page... * I recall running a German web page through machine translation ... (Sorry) From mark at kli.org Thu Oct 21 17:40:47 2021 From: mark at kli.org (Mark E. Shoulson) Date: Thu, 21 Oct 2021 18:40:47 -0400 Subject: AW: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> If I recall correctly, someone has proved that "fully automatic high-quality translation" is AI-hard. Meaning that it's basically the same as making a fully aware, human-intelligence AI. Now, that probably depends a lot on the details of "high-quality." There are probably sentences and texts one could cook up that a would-be translator would need arbitrarily good understanding of the context, situation, shared cultural memories and references, etc etc for, and I guess that would be what the "proof" was about. Obviously, machine translation has improved in ways nobody(?)
would have expected it to when the field was in its infancy, and has done it by a completely different method. Instead of making more and more sophisticated programs to understand and parse the grammars of various languages and build networks of subjects and predicates, modern translation, afaik, depends greatly on throwing _vast_ amounts of known text into the mix and doing some heavy-duty number- and memory-crunching to almost "guess" at what's probably the best translation, without necessarily actually "understanding" what it means. (BTW, am I totally wrong about this?) It seems to me that that does have farther to take us, and we'll probably see a lot more improvement, but it can only take us so far. Then again, "so far" might be far enough. If you have a translator whose results are semantically satisfactory, say, 97% of the time, and sound only a little awkwardnessful to a native speaker in the target language... well, customers' standards may be willing to duck a little. ~mark On 10/21/21 17:11, James Kass via Unicode wrote: > > > On 2021-10-21 9:41 AM, Dreiheller, Albrecht via Unicode wrote: >> Without understanding the context, Live Translate won't have a good >> chance to find the right translation. >> Machine translation often only pretends to know the meaning but in >> fact it fails. >> I'm not worried about human translators. > They may be safe in the short-term. Machine translation is much, much > better than it was at the onset. I recall translating a German web > page about the Phaistos disk into English and the page title was > translated as "The Discotheques of Phaistos". (That was a > foreshadowing of what to expect in the article, which was amusing to > read through.) Even before machine translation, translations could be > humorous. A French speaking friend once told me that the French title > of a certain Steinbeck novel could translate back into English as "The > Raisins of Anger".
> > The web page of Google Translate offers an option for the end user to > contribute suggestions for improving the specific translation. This > would be expected to make the machine translations better over time. > If this option is also offered in Live Translate, which travels around > town in pockets and purses and accepts source material from more than > plain-text, wouldn't that expedite machine translation improvement? > > And what will happen when AI is added to the mix? From asmusf at ix.netcom.com Fri Oct 22 11:17:33 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 22 Oct 2021 09:17:33 -0700 Subject: AW: Breaking barriers In-Reply-To: <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> Message-ID: An HTML attachment was scrubbed... URL: From mark at kli.org Fri Oct 22 15:51:04 2021 From: mark at kli.org (Mark E. Shoulson) Date: Fri, 22 Oct 2021 16:51:04 -0400 Subject: AW: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> Message-ID: <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> On 10/22/21 12:17, Asmus Freytag via Unicode wrote: > On 10/21/2021 3:40 PM, Mark E. Shoulson via Unicode wrote: >> If I recall correctly, someone has proved that "fully automatic >> high-quality translation" is AI-hard. Meaning that it's basically >> the same as making a fully aware, human-intelligence AI. Now, that >> probably depends a lot on the details of "high-quality." There are >> probably sentences and texts one could cook up that a would-be >> translator would need arbitrarily good understanding of the context, >> situation, shared cultural memories and references, etc etc for, and >> I guess that would be what the "proof" was about.
> > Sentences that require some understanding of the meaning for a > successful translation, even if you only consider factual accuracy, > are not hard to come by: they do crop up regularly. > Yeah, you're right. I was wrong to imply (or think) that it only mattered in rarefied corner cases. You give some fun examples of languages that don't mesh because they encode different information, and I'm sure a lot of us could come up with more. That makes any kind of language-independent representation difficult or impossible -- if used or envisioned as a translation intermediate or codes "equivalent" to some sentence (because sentences may not be capable of being equivalent.) You can use it on its own to express concepts in its own way, but at that point it isn't a translation intermediate, nor even language-independent, but is a language in its own way (see Blissymbolics, which, fair warning, I really hardly know anything about, so maybe you shouldn't see them.) > > >> It seems to me that that does have farther to take us, and we'll >> probably see a lot more improvement, but it can only take us so far. >> Then again, "so far" might be far enough. If you have a translator >> whose results are semantically satisfactory, say, 97% of the time, >> and sound only a little awkwardnessful to a native speaker in the >> target language... well, customers' standards may be willing to duck >> a little. > > There's a level of "quality" that equates to "a human looking at the > translation can guess what might have been in the original". > And it is over-optimistic to expect the level I expressed any time soon, yes. Thanks!
From prosfilaes at gmail.com Fri Oct 22 16:04:41 2021 From: prosfilaes at gmail.com (David Starner) Date: Fri, 22 Oct 2021 14:04:41 -0700 Subject: Breaking barriers In-Reply-To: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: On Thu, Oct 21, 2021 at 12:46 AM James Kass via Unicode wrote: > This would mean that if an image of text can be scanned from a computer > monitor, it could be translated. The underlying source encoding > wouldn't matter, it could be some obscure code page, Unicode PUA, or > even a specialty custom ASCII font as long as the source display is > correctly enabled and the translation software handles the source > language(s). Since the resulting data would likely be stored in > Unicode, both pre- and post-translation -- the barrier between > conflicting older encodings which Unicode has practically removed would > then be completely demolished. "as long as the source display is correctly enabled and the translation software handles the source language(s)." So in no interesting cases. Project Gutenberg had a Swedish bible translation in an unknown encoding (a variant of the DOS encoding that doesn't seem to have corresponded to anything documented); getting it to display correctly was basically the same challenge as translating it to Unicode, which was eventually done by figuring out what the unknown codepoints (obviously quotes) must have been. The set of languages in PUA and that have reliable transcription and translation is going to be virtually empty, and if you care about correctness and you have the font, directly convert the encoding. > P.S. - Too bad about human translators, though. Being a translator used > to be a lucrative field with skilled translators in high demand. Newer > technology, as it breaks down the communication barrier between > languages, will probably have an effect on translator employment, if it > hasn't already. 
Haven't you seen photos of billboards saying "Translation server is down" or the like? It certainly already has impacted translator employment. I recall an older story, from the 1970s, where a tobacco firm was keeping track of a Brazilian anti-smoking group via a hired translator; said translator eventually proceeded to give a copy of all translated works to the Brazilian group (discreetly, or so he thought) at which point the company never called him again. Translation programs don't tend to do stuff like that. On the other hand, someone called translation AI-hard; it's not, it's impossible, in league with the halting problem. One example is Harry Potter and the Half-Blood Prince, which has a character referred to only as R.A.B. This is a preexisting character in the series, but which? Translators had to ask Rowling in order to translate the initials correctly. Now, a reference to the seventh book will answer the question, moving it to AI-hard, but such deliberate or accidental ambiguity is part of the reason translators are traitors. ("traduttore, traditore".) -- The standard is written in English . If you have trouble understanding a particular section, read it again and again and again . . . Sit up straight. Eat your vegetables. Do not mumble. -- _Pascal_, ISO 7185 (1991) From doug at ewellic.org Fri Oct 22 16:30:51 2021 From: doug at ewellic.org (Doug Ewell) Date: Fri, 22 Oct 2021 15:30:51 -0600 Subject: AW: Breaking barriers In-Reply-To: <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> Message-ID: <000001d7c78c$168e1c80$43aa5580$@ewellic.org> Miscommunication can happen in almost any translation situation, even between two educated, literate, fluent humans, and for that matter even within a single language.
Translating between Spanish and English is supposed to be one of the easiest scenarios in the field, but there is still the Spanish verb 'deber' which can mean either "must" or "should" in English. Getting this right can be tricky; getting it wrong can cause any number of problems. Achieving 100% perfection is probably never going to happen. Getting within epsilon -- using actual software solutions, to translate arbitrary content -- happens every day, and the value of epsilon is shrinking. -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From everson at evertype.com Fri Oct 22 16:43:10 2021 From: everson at evertype.com (Michael Everson) Date: Fri, 22 Oct 2021 22:43:10 +0100 Subject: Breaking barriers In-Reply-To: <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> References: <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> Message-ID: I wonder why you have brought up Blissymbols. Michael Everson http://evertype.com > On 22 Oct 2021, at 21:52, Mark E. Shoulson via Unicode wrote: > > On 10/22/21 12:17, Asmus Freytag via Unicode wrote: >>> On 10/21/2021 3:40 PM, Mark E. Shoulson via Unicode wrote: >>> If I recall correctly, someone has proved that "fully automatic high-quality translation" is AI-hard. Meaning that it's basically the same as making a fully aware, human-intelligence AI. Now, that probably depends a lot on the details of "high-quality." There are probably sentences and texts one could cook up that a would-be translator would need arbitrarily good understanding of the context, situation, shared cultural memories and references, etc etc for, and I guess that would be what the "proof" was about. >> >> Sentences that require some understanding of the meaning for a successful translation, even if you only consider factual accuracy, are not hard to come by: they do prop up regularly. >> > > Yeah, you're right. I was wrong to imply (or think) that it only mattered in rarefied corner cases.
You give some fun examples of languages that don't mesh because they encode different information, and I'm sure a lot of us could come up with more. That makes any kind of language-independent representation difficult or impossible -- if used or envisioned as a translation intermediate or codes "equivalent" to some sentence (because sentences may not be capable of being equivalent.) You can use it on its own to express concepts in its own way, but at that point it isn't a translation intermediate, nor even language-independent, but is a language in its own way (see Blissymbolics, which, fair warning, I really hardly know anything about, so maybe you shouldn't see them.) > >> >> >>> It seems to me that that does have farther to take us, and we'll probably see a lot more improvement, but it can only take us so far. Then again, "so far" might be far enough. If you have a translator whose results are semantically satisfactory, say, 97% of the time, and sound only a little awkwardnessful to a native speaker in the target language... well, customers' standards may be willing to duck a little. >> >> There's a level of "quality" that equates to "a human looking at the translation can guess what might have been in the original". >> > > And it is over-optimistic to expect the level I expressed any time soon, yes. > > Thanks!
> > From richard.wordingham at ntlworld.com Fri Oct 22 18:07:04 2021 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Sat, 23 Oct 2021 00:07:04 +0100 Subject: Breaking barriers In-Reply-To: <000001d7c78c$168e1c80$43aa5580$@ewellic.org> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> Message-ID: <20211023000704.34f03ca0@JRWUBU2> On Fri, 22 Oct 2021 15:30:51 -0600 Doug Ewell via Unicode wrote: > Miscommunication can happen in almost any translation situation, even > between two educated, literate, fluent humans, and for that matter > even within a single language. > > Translating between Spanish and English is supposed to be one of the > easiest scenarios in the field, but there is still the Spanish verb > 'deber' which can mean either "must" or "should" in English. Getting > this right can be tricky; getting it wrong can cause any number of > problems. Indeed, I have trouble with the auxiliary 'should' in TUS. If I read TUS as a specification, vast swathes evaporate. For the specification language I am used to, a Lucifer's lexicon will interpret it as an auxiliary cancelling a sentence in which it appears in the principal clause. Can't Spanish 'deber' mean 'shall' in the language of specifications? Richard. From asmusf at ix.netcom.com Fri Oct 22 18:29:50 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 22 Oct 2021 16:29:50 -0700 Subject: AW: Breaking barriers In-Reply-To: <000001d7c78c$168e1c80$43aa5580$@ewellic.org> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> Message-ID: An HTML attachment was scrubbed... 
URL: From asmusf at ix.netcom.com Fri Oct 22 18:48:44 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 22 Oct 2021 16:48:44 -0700 Subject: Meaning of "should" (was Re: Breaking barriers) In-Reply-To: <20211023000704.34f03ca0@JRWUBU2> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> <20211023000704.34f03ca0@JRWUBU2> Message-ID: <43fb397f-cf18-2916-3230-bfc430e301a7@ix.netcom.com> An HTML attachment was scrubbed... URL: From asmusf at ix.netcom.com Fri Oct 22 19:03:24 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 22 Oct 2021 17:03:24 -0700 Subject: AW: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> Message-ID: <7b84b7be-7709-7f41-3e28-4343c4cce9af@ix.netcom.com> An HTML attachment was scrubbed... URL: From jameskass at code2001.com Fri Oct 22 23:31:02 2021 From: jameskass at code2001.com (James Kass) Date: Sat, 23 Oct 2021 04:31:02 +0000 Subject: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: On 2021-10-22 9:04 PM, David Starner via Unicode wrote: > "as long as the source display is correctly enabled and the > translation software handles the source language(s)." So in no > interesting cases. Project Gutenberg had a Swedish bible translation > in an unknown encoding (a variant of the DOS encoding that doesn't > seem to have corresponded to anything documented); getting it to > display correctly was basically the same challenge as translating it > to Unicode, which was eventually done by figuring out what the unknown > codepoints (obviously quotes) must have been. 
The set of languages in > PUA and that have reliable transcription and translation is going to > be virtually empty, and if you care about correctness and you have the > font, directly convert the encoding. Yes, it's best to directly convert old source data when it's feasible. When the source data is in pre-Unicode Indic languages/scripts (or even in pre-shaping support Unicode), this often cannot be accomplished simply. If you know the font and can find a cross-reference table, then you're off to a good start. If you can't find an existing cross-reference and have to "roll your own", it's not as fun as it sounds. Some legacy fonts combine standard encoding with PUA for presentation forms, others use ISO-8859-hacks. Any presentation form might be covered with a dedicated glyph in one font, yet the same presentation form might be constructed from two or three component glyphs in other fonts. And, crucially, even after you've set up the basic cross-reference table, there's still reordering which must be accomplished. (Pre-Unicode Indic was of necessity entered in visual order. Same for pre-shaping Unicode Indic.) Instead of going through all that rigamarole, most users would probably prefer to just take a picture of the text with their phone and be done with it. And if the source data is PDF, in a perfect world the PDF file could be dragged and dropped directly into the app, which would then prompt the user to choose whether the source should be processed as text or graphic. I don't know enough about the current state of OCR to evaluate the challenge of training software to recognize unsupported scripts. An open source OCR system like Tesseract may already be set up for the common Indic scripts, and since it's crowd-sourced might eventually ease or simplify the training process, if it hasn't already.
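A minimal sketch of the two steps described above (cross-reference table first, reordering second), written in Python. Everything specific in it is an assumption made for illustration: the byte values in LEGACY_TO_UNICODE belong to an imaginary legacy font rather than any real one, and the reordering rule covers only a single Devanagari case, the vowel sign I (U+093F), which visual-order data stores before the consonant cluster that it follows in Unicode's logical order. A real converter would need the actual font's table and a rule for every reordered sign.

```python
import re

# Hypothetical cross-reference table for an imaginary legacy font.
# One legacy byte may map to one or several Unicode code points.
LEGACY_TO_UNICODE = {
    0x20: " ",
    0x41: "\u0915",        # DEVANAGARI LETTER KA
    0x42: "\u0924",        # DEVANAGARI LETTER TA
    0x43: "\u093F",        # DEVANAGARI VOWEL SIGN I (drawn left of the base)
    0x44: "\u094D",        # DEVANAGARI SIGN VIRAMA
    0x45: "\u0938\u094D",  # half-form SA glyph -> SA + VIRAMA
}

PREBASE_VOWEL = "\u093F"
CONSONANT = "[\u0915-\u0939]"   # Devanagari consonants KA..HA
VIRAMA = "\u094D"

# Visual order: vowel sign I, then a cluster of the form (C VIRAMA)* C.
VISUAL_ORDER = re.compile(
    f"({PREBASE_VOWEL})((?:{CONSONANT}{VIRAMA})*{CONSONANT})"
)


def transcode(data: bytes) -> str:
    """Step 1: map each legacy byte through the cross-reference table."""
    out = []
    for offset, byte in enumerate(data):
        if byte not in LEGACY_TO_UNICODE:
            raise ValueError(f"unmapped legacy byte 0x{byte:02X} at offset {offset}")
        out.append(LEGACY_TO_UNICODE[byte])
    return "".join(out)


def reorder(text: str) -> str:
    """Step 2: move the pre-base vowel sign after its consonant cluster."""
    return VISUAL_ORDER.sub(r"\2\1", text)


def convert(data: bytes) -> str:
    """Full conversion: table lookup followed by reordering."""
    return reorder(transcode(data))


if __name__ == "__main__":
    # Visual order "I-sign, half SA, TA" becomes logical SA VIRAMA TA I-sign.
    print([f"U+{ord(c):04X}" for c in convert(b"\x43\x45\x42")])
```

The lookup deliberately fails on an unmapped byte instead of guessing, since a silently wrong mapping is harder to find later than an exception. Keeping the table and the reordering as separate passes also mirrors the point made above: a complete table alone still leaves visual-order text unconverted.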
From asmusf at ix.netcom.com Fri Oct 22 23:57:51 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Fri, 22 Oct 2021 21:57:51 -0700 Subject: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: <04223d35-5669-e21b-5153-14225aec3614@ix.netcom.com> An HTML attachment was scrubbed... URL: From jameskass at code2001.com Sat Oct 23 00:36:15 2021 From: jameskass at code2001.com (James Kass) Date: Sat, 23 Oct 2021 05:36:15 +0000 Subject: AW: Breaking barriers In-Reply-To: <000001d7c78c$168e1c80$43aa5580$@ewellic.org> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> Message-ID: On 2021-10-22 9:30 PM, Doug Ewell via Unicode wrote: > Miscommunication can happen in almost any translation situation, even between two educated, literate, fluent humans, and for that matter even within a single language. In addition to idiomatic or regional issues, there's also a temporal barrier to communication even within the same language. This is because meanings of words (and even phrases) shift over time. Some words I used as a kid now mean something completely different. The phrase "punk rock" wouldn't mean much to a 19th century denizen, and some future translator might output something meaning "rotting wood stone" for it. I came across a lyrics web page which offered the words for Washboard Sam's hit recording "Let Me Play Your Vendor". A phrase in the song was transcribed as "Let me play your sea bird". The song uses phrases related to juke boxes in order to convey its sexual imagery. The lyric transcriber might have missed that completely, though. The word "vendor" itself in the song title refers to a vending machine / machine that gives you something for a nickel / juke box.
The lyric transcriber, apparently unfamiliar with 1940s culture, could be excused for not correctly interpreting the phrase as "Let me play your Seeburg". (Seeburg was a popular juke box brand name. I filed a lyric correction on that particular web site.) Bottom line - any machine or human translator should take steps to determine the era in which the source material originated. From stas624-uni at yahoo.com Fri Oct 22 22:17:43 2021 From: stas624-uni at yahoo.com (stas) Date: Sat, 23 Oct 2021 03:17:43 +0000 (UTC) Subject: Gap at U+2FE0 References: <1153443827.836868.1634959063585.ref@mail.yahoo.com> Message-ID: <1153443827.836868.1634959063585@mail.yahoo.com> It bothers me that there is still empty space at U+2FE0 (see https://www.unicode.org/roadmaps/bmp/). I find it weird considering there are already two one-column blocks in the SMP which are extensions of scripts encoded in the BMP: Lisu and UCAS (see https://www.unicode.org/roadmaps/smp/). They would be a perfect fit for that spot. This message: https://www.unicode.org/mail-arch/unicode-ml/y2007-m12/0035.html mentions a proposal for additional Ideographic Description Characters (I guess it is https://www.unicode.org/L2/L2002/02221-cdp-idc.pdf), but almost 20 years have passed and it's still not even mentioned on the roadmap, so I guess it is rejected for good. This document: https://www.unicode.org/L2/L2021/21016r-script-adhoc-rept.pdf states: A strong case would need to be made to place characters on the BMP, and in our view, the single open column at U+2FE0..U+2FEF should be used for characters with a valid case for encoding on the BMP. The Kanbun Extended block does not, in our opinion, fit this criterium. Ken Lunde also agrees with this view. What is a stronger case than an extension of a block already encoded in the BMP? It won't get any better than this. It looks like this spot has become a psychological golden place and no new proposal would be good enough for it.
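For anyone who wants to check the state of that column locally, here is a small Python sketch. It relies only on the standard unicodedata module; the helper name unassigned_in is made up for this example, and the answer depends on the Unicode version bundled with the interpreter (reported by unicodedata.unidata_version), so no particular output is claimed here.

```python
import unicodedata


def unassigned_in(start: int, end: int) -> list[int]:
    """Return the code points in start..end (inclusive) with category Cn.

    Cn is the General_Category value for unassigned code points (it also
    covers noncharacters, none of which fall in the ranges used below).
    """
    return [
        cp for cp in range(start, end + 1)
        if unicodedata.category(chr(cp)) == "Cn"
    ]


if __name__ == "__main__":
    gap = unassigned_in(0x2FE0, 0x2FEF)
    print("Unicode data version:", unicodedata.unidata_version)
    print("Unassigned in U+2FE0..U+2FEF:",
          " ".join(f"U+{cp:04X}" for cp in gap) or "(none)")
```

The same helper can be pointed at the neighbouring Ideographic Description Characters block (starting at U+2FF0) to compare an occupied column with the open one.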
From 4mm4adbfrm4 at tonton-pixel.com Sat Oct 23 12:40:54 2021 From: 4mm4adbfrm4 at tonton-pixel.com (Michel Mariani) Date: Sat, 23 Oct 2021 19:40:54 +0200 Subject: Gap at U+2FE0 In-Reply-To: <1153443827.836868.1634959063585@mail.yahoo.com> References: <1153443827.836868.1634959063585.ref@mail.yahoo.com> <1153443827.836868.1634959063585@mail.yahoo.com> Message-ID: <34075049-DC55-4280-BBD0-AA858264D1C4@tonton-pixel.com> According to the recent document "Preliminary proposal to add a new provisional kIDS property (Unihan)", the U+2FE0 block is still being considered for receiving extra IDCs: > There is currently an unassigned block of 16 code points immediately before the Ideographic Description Characters block, specifically the range U+2FE0 through U+2FEF, which could be used to encode the fifth IDC. BTW, the existence of this still empty "modest" block of sixteen characters would have been a good opportunity to efficiently encode a set of CJK-specific variation selectors which could have been used to represent region-specific CJK character glyphs, inspired by the clever (unofficial) scheme proposed in the PanCJKV IVD Collection, which IMHO would be far more acceptable if the specific variation selectors were all as short as possible. After all, one of the original aims of Han Unification was the possibility to "pack" all CJK characters in 16 bits... This is of course no longer relevant now that 93,867 Unihan characters have been assigned in Unicode 14.0, with more to come later... For the record, here are the eleven CJK glyph sources referenced so far through the various PDF code charts: > On 23 Oct 2021, at 05:17, stas via Unicode wrote: > > It bothers me that there is still empty space at U+2FE0 (see https://www.unicode.org/roadmaps/bmp/). > I find it weird considering there are already two one-column blocks in SMP which are extensions of scripts encoded in BMP: > Lisu and UCAS (see https://www.unicode.org/roadmaps/smp/).
They would be perfect fit for that spot. > > This message: https://www.unicode.org/mail-arch/unicode-ml/y2007-m12/0035.html > mentions proposal for additional Ideographic Description Characters (I guess it is https://www.unicode.org/L2/L2002/02221-cdp-idc.pdf), > but almost 20 years passed and it's still not even mentioned on the roadmap, so I guess > it is rejected for good. > > This document: https://www.unicode.org/L2/L2021/21016r-script-adhoc-rept.pdf states: > A strong case would need to be made to place characters on the BMP, and in our view, the single open > column at U+2FE0..U+2FEF should be used for characters with a valid case for encoding on the > BMP. The Kanbun Extended block does not, in our opinion, fit this criterium. Ken Lunde also agrees with > this view. > > What is a stronger case than extension of a block already encoded in BMP? It won't get any better than this. > Looks like this spot became psychological golden place and no new proposal would be good enough for it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Sources Table.png Type: image/png Size: 95653 bytes Desc: not available URL: From doug at ewellic.org Sat Oct 23 14:46:01 2021 From: doug at ewellic.org (Doug Ewell) Date: Sat, 23 Oct 2021 13:46:01 -0600 Subject: Gap at U+2FE0 In-Reply-To: <34075049-DC55-4280-BBD0-AA858264D1C4@tonton-pixel.com> References: <1153443827.836868.1634959063585.ref@mail.yahoo.com> <1153443827.836868.1634959063585@mail.yahoo.com> <34075049-DC55-4280-BBD0-AA858264D1C4@tonton-pixel.com> Message-ID: <002f01d7c846$9c1de5f0$d459b1d0$@ewellic.org> Michel Mariani wrote: > BTW, the existence of this still empty "modest" block of sixteen > characters would have been a good opportunity to efficiently encode a > set of CJK-specific variation selectors which could have been used to > represent region-specific CJK character glyphs, inspired by the the > clever (unofficial) scheme proposed in the > https://github.com/adobe-type-tools/pancjkv-ivd-collection, which IMHO > would be far more acceptable if the specific variation selectors were > all as short as possible. After all, one of the original aims of Han > Unification was the possibility to "pack" all CJK characters in 16 > bits... What is the intended use case for variation selectors to represent region-specific CJK character glyphs? 
-- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From jameskass at code2001.com Sat Oct 23 16:44:13 2021 From: jameskass at code2001.com (James Kass) Date: Sat, 23 Oct 2021 21:44:13 +0000 Subject: Breaking barriers In-Reply-To: References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> Message-ID: On 2021-10-22 9:04 PM, David Starner via Unicode wrote: > Project Gutenberg had a Swedish bible translation > in an unknown encoding (a variant of the DOS encoding that doesn't > seem to have corresponded to anything documented); getting it to > display correctly was basically the same challenge as translating it > to Unicode, which was eventually done by figuring out what the unknown > codepoints (obviously quotes) must have been. Editors for DOS fonts enabled users to create all manner of alternate "encodings" for anything which could fit into the grid. Newly created/modified fonts could be saved under different file names. A DOS command then enabled users to swap the font-in-use. Here's an example of such an editor written by Adam Twardoch in 1994: https://dos-font-utils-wiki.readthedocs.io/en/latest/POLFED/ The Swedish text data David Starner encountered, which didn't match up with any known code page, must have originally been displayed with such a modified font. There's probably similar legacy data still out there which will be challenging to anyone trying to preserve it by converting it to Unicode. From asmusf at ix.netcom.com Sat Oct 23 17:59:00 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Sat, 23 Oct 2021 15:59:00 -0700 Subject: Breaking barriers In-Reply-To: <40abc91a-7f9c-ecf1-5a8d-0283f96a0ba4@ix.netcom.com> References: <40abc91a-7f9c-ecf1-5a8d-0283f96a0ba4@ix.netcom.com> Message-ID: An HTML attachment was scrubbed...
URL: From jameskass at code2001.com Sat Oct 23 18:02:37 2021 From: jameskass at code2001.com (James Kass) Date: Sat, 23 Oct 2021 23:02:37 +0000 Subject: Breaking barriers In-Reply-To: References: <40abc91a-7f9c-ecf1-5a8d-0283f96a0ba4@ix.netcom.com> Message-ID: On 2021-10-23 10:59 PM, Asmus Freytag via Unicode wrote: > If you know the language, you can play with frequency data and try to use guess > mapping tables. You'll probably get most of the singleton to singleton mappings > correct, and then you could use various forms of trial and error, such as > genetic algorithms to locate and assign n:m mappings. > > If the language is not known, but among a set of known languages for which there > is existing data, I wouldn't be surprised to learn that you could adopt simple > language recognition algorithms to be independent of encoding details, and > either identify the actual language, or sharply limit the candidates. > > After that, you'd re-run the recognition algorithm with each candidate > transcoding table. > > I'm not an expert on this, but I did cobble together my own toy language > recognition code at one time, including using some genetic algorithm to improve > its sensitivity. Fun stuff and I was surprised how well that worked with only a > few hours of effort. That's a sophisticated approach. For anyone lacking that level of expertise or not having quick access to language frequency/identification data, it might be more practical to locate the modified font, open it in one of those font editors which displays all the glyphs in the font on a grid, open up the Unicode charts, and start cross-mapping away. Or the font editor step could be skipped with a program that simply displays all of the font-in-use glyphs and their corresponding mappings. Here's a simple program that runs in dBASE III which does that:

* asctoo.prg
* Displays character codes 1-255 next to their glyphs in the font-in-use.
CLEA ALL
SET TALK OFF
SET ECHO OFF
SET BELL OFF
clea
aa = 1
ab = 1
co = 1
ac = 0
* aa = character code, ab = screen row, ac = screen column
     DO WHIL aa < 256
* ca holds the code as a string without leading blanks
     ca = STR(aa,3)
          IF aa < 10
          ca = SUBSTR(ca,3,1)
          ENDI aa
          IF aa < 100 .AND. aa > 9
          ca = SUBSTR(ca,2,2)
          ENDI
* after 22 rows, start a new column further to the right
          IF ab = 23
          ab = 1
               IF ac < 21
               ac = ac + 6
                 ELSE
               ac = ac + 7
               ENDI ac
          ENDI ab
* skip code 7 (the bell character)
          IF aa # 7
          @ ab,ac SAY STR(aa,3) + "-" + CHR(&ca)
          ENDI aa
     aa = aa + 1
     ab = ab + 1
     ENDD aa
CLEA ALL
@ 21,74 SAY "Press"
@ 22,75 SAY "Any"
@ 23,75 SAY "Key"
SET CONS OFF
WAIT
SET CONS ON
* EOF() asctoo.prg
............................

From asmusf at ix.netcom.com Sat Oct 23 18:32:29 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Sat, 23 Oct 2021 16:32:29 -0700 Subject: Breaking barriers In-Reply-To: References: <40abc91a-7f9c-ecf1-5a8d-0283f96a0ba4@ix.netcom.com> Message-ID: <0416bf73-d05e-abcb-6128-26c03eed961a@ix.netcom.com> An HTML attachment was scrubbed... URL: From mark at kli.org Sat Oct 23 21:11:44 2021 From: mark at kli.org (Mark E. Shoulson) Date: Sat, 23 Oct 2021 22:11:44 -0400 Subject: Breaking barriers In-Reply-To: References: <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> Message-ID: <5faddcad-3102-9aa1-b6f7-02e60dac2241@shoulson.com> Seriously, no ulterior motive. Just that it is my (limited) understanding that they are used by people with various cognitive issues that make it difficult for them to communicate via more usual language. So they're a way of talking that _sort of_ isn't associated with a language... but really all that means is they comprise a "language" of their own, just one that perhaps is unlike most others. I accept that I might be mistaken about how they are used. ~mark On 10/22/21 17:43, Michael Everson via Unicode wrote: > I wonder why you have brought up Blissymbols. > > Michael Everson > http://evertype.com > >> On 22 Oct 2021, at 21:52, Mark E.
Shoulson via Unicode wrote:
>>
>> On 10/22/21 12:17, Asmus Freytag via Unicode wrote:
>>>> On 10/21/2021 3:40 PM, Mark E. Shoulson via Unicode wrote:
>>>> If I recall correctly, someone has proved that "fully automatic high-quality translation" is AI-hard. Meaning that it's basically the same as making a fully aware, human-intelligence AI. Now, that probably depends a lot on the details of "high-quality." There are probably sentences and texts one could cook up that a would-be translator would need arbitrarily good understanding of the context, situation, shared cultural memories and references, etc etc for, and I guess that would be what the "proof" was about.
>>> Sentences that require some understanding of the meaning for a successful translation, even if you only consider factual accuracy, are not hard to come by: they do pop up regularly.
>>>
>> Yeah, you're right. I was wrong to imply (or think) that it only mattered in rarefied corner cases. You give some fun examples of languages that don't mesh because they encode different information, and I'm sure a lot of us could come up with more. That makes any kind of language-independent representation difficult or impossible, if used or envisioned as a translation intermediate or codes "equivalent" to some sentence (because sentences may not be capable of being equivalent.) You can use it on its own to express concepts in its own way, but at that point it isn't a translation intermediate, nor even language-independent, but is a language in its own way (see Blissymbolics, which, fair warning, I really hardly know anything about, so maybe you shouldn't see them.)
>>
>>>
>>>> It seems to me that that does have farther to take us, and we'll probably see a lot more improvement, but it can only take us so far. Then again, "so far" might be far enough.
If you have a translator whose results are semantically satisfactory, say, 97% of the time, and sound only a little awkwardnessful to a native speaker in the target language... well, customers' standards may be willing to duck a little.
>>> There's a level of "quality" that equates to "a human looking at the translation can guess what might have been in the original".
>>>
>> And it is over-optimistic to expect the level I expressed any time soon, yes.
>>
>> Thanks!
>>
>>

From doug at ewellic.org Sun Oct 24 00:04:30 2021
From: doug at ewellic.org (Doug Ewell)
Date: Sat, 23 Oct 2021 23:04:30 -0600
Subject: AW: Breaking barriers
In-Reply-To:
References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org>
Message-ID: <003801d7c894$a13132b0$e3939810$@ewellic.org>

Asmus Freytag wrote:

> Your argument that humans can't get translations correct or make them
> unambiguous should lead you to the conclusion that this lowers the bar
> for AI translation.

"Can't get translations correct or make them unambiguous 100% of the time, in every situation" would be a more accurate summary of my argument. But yes, that does lower the bar a little for machine translation.

> But that's OK. We may stipulate that the highest achievable quality
> can only approach 100% but must fall short.

Agreed.

> I've seen examples where they switched "left" and "right". A very human
> mistake, but then, current AIs will get "he" and "she" wrong at times.

As will some native speakers of Mandarin. (他 (he), 她 (she), and 它 (it) are all pronounced "tā" in Mandarin, which blurs the distinction between these three pronouns when they switch to English.)
--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From asmusf at ix.netcom.com Sun Oct 24 00:59:36 2021
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 23 Oct 2021 22:59:36 -0700
Subject: AW: Breaking barriers
In-Reply-To: <003801d7c894$a13132b0$e3939810$@ewellic.org>
References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> <003801d7c894$a13132b0$e3939810$@ewellic.org>
Message-ID: <80489ff6-a61e-8ea5-1a63-9d1105fbe9d1@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From xfq.free at gmail.com Sun Oct 24 20:19:57 2021
From: xfq.free at gmail.com (Fuqiao Xue)
Date: Mon, 25 Oct 2021 09:19:57 +0800
Subject: AW: Breaking barriers
In-Reply-To: <80489ff6-a61e-8ea5-1a63-9d1105fbe9d1@ix.netcom.com>
References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> <003801d7c894$a13132b0$e3939810$@ewellic.org> <80489ff6-a61e-8ea5-1a63-9d1105fbe9d1@ix.netcom.com>
Message-ID:

2021年10月24日(日) 14:03 Asmus Freytag via Unicode :
> (**) someone mentioned that human translators don't always agree on the best translations. And, sometimes authors who are not bilingual feel emboldened to insist on a more "literal" translation, because to their limited understanding of the target language, a more natural phrasing may sound unusual and therefore different from what they think it should have been. Therefore, what constitutes an "accurate" translation is very much a moving target.

Indeed. Faithfulness to the source text is the most important principle of translation, but different translators have different understandings and practices of this principle.
If the original text has a lyrical style, the translator should likewise choose beautiful, emotional words to express the original author's emotions. If the style of the original text is plain, the translator should use plain words. Translation can be thought of as constantly asking one question: if the original author were, for example, a great writer in the English language, how would he/she write this book in English?

Translation should be based on concepts rather than sentences/phrases/words (even words/phrases with the same meaning in different languages might be used in different contexts, and sometimes there is no word corresponding to the word in the original text), and every concept that has appeared in the original text should be clearly reflected in the translation.

From 747.neutron at gmail.com Mon Oct 25 02:03:47 2021
From: 747.neutron at gmail.com (Wáng Yifán)
Date: Mon, 25 Oct 2021 16:03:47 +0900
Subject: Gap at U+2FE0
In-Reply-To: <002f01d7c846$9c1de5f0$d459b1d0$@ewellic.org>
References: <1153443827.836868.1634959063585.ref@mail.yahoo.com> <1153443827.836868.1634959063585@mail.yahoo.com> <34075049-DC55-4280-BBD0-AA858264D1C4@tonton-pixel.com> <002f01d7c846$9c1de5f0$d459b1d0$@ewellic.org>
Message-ID:

> What is the intended use case for variation selectors to represent region-specific CJK character glyphs?

It's not that I fall in with turning that section into variation selectors, but it would sometimes be a lot easier to convince (non-CJK) developers, especially game developers, to include one pan-CJK font than three or four, each of which is already darn huge for them.

2021年10月24日(日)
4:49 Doug Ewell via Unicode :
>
> Michel Mariani wrote:
>
> > BTW, the existence of this still empty "modest" block of sixteen
> > characters would have been a good opportunity to efficiently encode a
> > set of CJK-specific variation selectors which could have been used to
> > represent region-specific CJK character glyphs, inspired by the
> > clever (unofficial) scheme proposed in the
> > https://github.com/adobe-type-tools/pancjkv-ivd-collection, which IMHO
> > would be far more acceptable if the specific variation selectors were
> > all as short as possible. After all, one of the original aims of Han
> > Unification was the possibility to "pack" all CJK characters in 16
> > bits...
>
> What is the intended use case for variation selectors to represent region-specific CJK character glyphs?
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
>
>
>

From pgcon6 at msn.com Mon Oct 25 12:02:39 2021
From: pgcon6 at msn.com (Peter Constable)
Date: Mon, 25 Oct 2021 17:02:39 +0000
Subject: "DOS fonts" (was RE: Breaking barriers)
Message-ID:

> A DOS command then enabled users to swap the font-in-use.

As I recall, DOS had no such command. Rather, one needed a utility that would load the font data into specific memory. I dealt with that while working on my MA in linguistics: I had a Hercules graphics card (pre-VGA, but better than EGA) and a utility specific to the Hercules to load font data into memory on the Hercules card. And Word for DOS had a graphics mode that would display using whatever font was provided by the Hercules card. So, I could edit Word documents with "special" characters.
Peter

-----Original Message-----
From: Unicode On Behalf Of James Kass via Unicode
Sent: October 23, 2021 2:44 PM
To: unicode at corp.unicode.org
Subject: Re: Breaking barriers

On 2021-10-22 9:04 PM, David Starner via Unicode wrote:
> Project Gutenberg had a Swedish bible translation in an unknown
> encoding (a variant of the DOS encoding that doesn't seem to have
> corresponded to anything documented); getting it to display correctly
> was basically the same challenge as translating it to Unicode, which
> was eventually done by figuring out what the unknown codepoints
> (obviously quotes) must have been.

Editors for DOS fonts enabled users to create all manner of alternate "encodings" for anything which could fit into the grid. Newly created/modified fonts could be saved under different file names. A DOS command then enabled users to swap the font-in-use.

Here's an example of such an editor written by Adam Twardoch in 1994:
https://dos-font-utils-wiki.readthedocs.io/en/latest/POLFED/

The Swedish text data which didn't match up with any known code page that David Starner encountered must have originally been displayed with such a modified font. There's probably similar legacy data still out there which will be challenging to anyone trying to preserve it by converting it to Unicode.

From doug at ewellic.org Mon Oct 25 12:23:23 2021
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 25 Oct 2021 11:23:23 -0600
Subject: "DOS fonts" (was RE: Breaking barriers)
In-Reply-To:
References:
Message-ID: <001201d7c9c5$042bd240$0c8376c0$@ewellic.org>

Peter Constable wrote:

>> A DOS command then enabled users to swap the font-in-use.
>
> As I recall, DOS had no such command. Rather, one needed a utility
> that would load the font data into specific memory.

I suspect James was thinking of the MODE CON CP SELECT=x command, where 'x' was the code page ID of the desired character set. You also had to "prepare" the code page, using one or more settings in CONFIG.SYS, but once that was done, you could use MODE to switch back and forth to your heart's content.

If you created a custom .CPI file with the code page ID and glyphs you wanted, and added it to the right place, and IIRC chanted the right spell, you could display any simple characters you liked (no combining, no RTL, had to fit in 8 pixels wide) on your PC screen.

It was a lot of fun. Interoperability was horrible.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From jameskass at code2001.com Mon Oct 25 15:03:50 2021
From: jameskass at code2001.com (James Kass)
Date: Mon, 25 Oct 2021 20:03:50 +0000
Subject: "DOS fonts" (was RE: Breaking barriers)
In-Reply-To: <001201d7c9c5$042bd240$0c8376c0$@ewellic.org>
References: <001201d7c9c5$042bd240$0c8376c0$@ewellic.org>
Message-ID:

On 2021-10-25 5:23 PM, Doug Ewell via Unicode wrote:
> Peter Constable wrote:
>
>>> A DOS command then enabled users to swap the font-in-use.
>> As I recall, DOS had no such command. Rather, one needed a utility
>> that would load the font data into specific memory.
> I suspect James was thinking of the MODE CON CP SELECT=x command, where 'x' was the code page ID of the desired character set.

My post was poorly phrased. "A command entered at the DOS prompt" would have been better. It wasn't a native DOS command. An internet search revealed that typical extensions for the modified/newly created fonts included "*.F11" or "*.F12". I couldn't locate the "*.COM" file which swapped the font-in-use in my archives, and I can't remember the file name. I did find "8859-5.f16" in a directory, which appears to be one I made back in the day.
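
For anyone who wants to look inside a font file of that kind without a DOS machine, the glyph grid can be dumped as text. What follows is only a minimal sketch in Python, under a loudly stated assumption: that a file such as "8859-5.f16" is a headerless dump of 256 glyphs, 16 bytes per glyph, one byte per 8-pixel row with the most significant bit as the leftmost pixel. Nothing in this thread confirms that layout for any particular file, and the names render_glyph and dump_font are made up for illustration.

```python
GLYPHS = 256  # one glyph per byte value, as in an 8-bit code page


def render_glyph(font: bytes, code: int, height: int = 16) -> list[str]:
    """Return glyph `code` as `height` strings of '#' (set) and '.' (clear).

    Assumes a raw, headerless bitmap: glyph n starts at offset n * height,
    and each byte is one 8-pixel row, most significant bit leftmost.
    """
    if len(font) != GLYPHS * height:
        raise ValueError(f"expected {GLYPHS * height} bytes, got {len(font)}")
    if not 0 <= code < GLYPHS:
        raise ValueError("code must be in range 0..255")
    rows = font[code * height:(code + 1) * height]
    return ["".join("#" if row & (0x80 >> bit) else "." for bit in range(8))
            for row in rows]


def dump_font(font: bytes, height: int = 16) -> str:
    """Return a text listing of every glyph, labeled with its byte value."""
    blocks = []
    for code in range(GLYPHS):
        lines = [f"{code:3d} (0x{code:02X})"] + render_glyph(font, code, height)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)
```

With the file contents read in binary mode, print(dump_font(data)) lists each byte value next to its picture, which is the same information the font-editor grid gives when cross-mapping against the Unicode charts. A file whose size is not 256 times the glyph height is rejected rather than guessed at, since it probably has a header or a different layout.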
From marius.spix at web.de Tue Oct 26 02:34:48 2021 From: marius.spix at web.de (Marius Spix) Date: Tue, 26 Oct 2021 09:34:48 +0200 Subject: Aw: Re: AW: Breaking barriers In-Reply-To: <80489ff6-a61e-8ea5-1a63-9d1105fbe9d1@ix.netcom.com> References: <59a437c3-83c8-0a49-4618-dab11003cd8b@code2001.com> <247f65e7-d6c2-5fa5-1f26-5c713e261cf5@shoulson.com> <71574cce-f6c2-b0ef-ef11-5a84fc635009@shoulson.com> <000001d7c78c$168e1c80$43aa5580$@ewellic.org> <003801d7c894$a13132b0$e3939810$@ewellic.org> <80489ff6-a61e-8ea5-1a63-9d1105fbe9d1@ix.netcom.com> Message-ID: An HTML attachment was scrubbed... URL: