From richard.wordingham at ntlworld.com Mon Oct 10 14:00:01 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Mon, 10 Oct 2022 20:00:01 +0100
Subject: Terminology for competing Myanmar encodings
Message-ID: <20221010200001.495053af@JRWUBU2>

As I understand TUS Table 16-4, <[...] MYANMAR VOWEL SIGN AI, U+102F MYANMAR VOWEL SIGN U> would be a Unicode-compliant spelling of a word in the modern Burmese language and [second sequence not preserved in the archive] would be Unicode non-compliant. Is there a better terminology I should use instead?

Now, if the language of the word is not Burmese, am I right to say that TUS is silent on whether there is an 'incorrect' spelling? Does Unicode offer any defence for using the first encoding? Could I call it a 'Unicode Burmese-style encoding'?

I am aware that the major renderers decline to support the first encoding. That doesn't stop it being used outside Burmese.

Richard.

From jameskass at code2001.com Mon Oct 10 17:04:31 2022
From: jameskass at code2001.com (James Kass)
Date: Mon, 10 Oct 2022 22:04:31 +0000
Subject: Terminology for competing Myanmar encodings
In-Reply-To: <20221010200001.495053af@JRWUBU2>
References: <20221010200001.495053af@JRWUBU2>
Message-ID: <15570165-3544-ed04-397f-6dc89d4dc5f4@code2001.com>

On 2022-10-10 7:00 PM, Richard Wordingham via Unicode wrote:
> As I understand TUS Table 16-4, <[...] MYANMAR VOWEL SIGN AI, U+102F
> MYANMAR VOWEL SIGN U> would be a Unicode-compliant spelling of a word
> in the modern Burmese language and [second sequence not preserved in
> the archive] would be Unicode non-compliant. Is there a better
> terminology I should use instead?

https://www.unicode.org/versions/Unicode15.0.0/ch16.pdf

The table is on pages 672 and 673.

The table shows that above-base vowel signs (i, ii, ai) should be entered before below-base vowel signs (u, uu).

The attached graphic shows the display using Windows 7 Uniscribe. LibreOffice/HarfBuzz display is similar on the three fonts tested.

Here's the sample text used to make the graphic for anyone who likes to try things out for themselves:

[Myanmar sample text not preserved in the archive]
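The ordering rule James reads from Table 16-4 can be sketched as a small check. This is only an illustration of the rule as stated in the thread, not a validator for the full table: the two sets below cover just the vowel signs named above, and the function names and the sample base letter KA are made up for the example.

```python
import unicodedata

# Assumption: only the vowel signs named in the thread are classified here.
ABOVE_BASE = {"\u102D", "\u102E", "\u1032"}  # VOWEL SIGN I, II, AI
BELOW_BASE = {"\u102F", "\u1030"}            # VOWEL SIGN U, UU


def misordered_vowel_signs(text):
    """Return indexes where a below-base sign directly precedes an above-base sign."""
    return [
        i
        for i in range(len(text) - 1)
        if text[i] in BELOW_BASE and text[i + 1] in ABOVE_BASE
    ]


def describe(text):
    """List code points with their Unicode names, for inspecting a spelling."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


KA = "\u1000"  # MYANMAR LETTER KA, used only as a sample base consonant
table_order = KA + "\u1032" + "\u102F"  # <KA, AI, U>: above-base sign first
other_order = KA + "\u102F" + "\u1032"  # <KA, U, AI>: below-base sign first

print(misordered_vowel_signs(table_order))
print(misordered_vowel_signs(other_order))
print(describe(other_order))
```

Both strings contain the same characters, so a plain code point comparison treats them as different spellings; whether a renderer displays them alike is a separate question, which is what the attached graphic tests.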
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 20221010_Capture.JPG
Type: image/jpeg
Size: 25569 bytes
Desc: not available

From richard.wordingham at ntlworld.com Tue Oct 11 23:53:25 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Wed, 12 Oct 2022 05:53:25 +0100
Subject: Terminology for competing Myanmar encodings
In-Reply-To: <20221010200001.495053af@JRWUBU2>
References: <20221010200001.495053af@JRWUBU2>
Message-ID: <20221012055325.40c84106@JRWUBU2>

On Mon, 10 Oct 2022 20:00:01 +0100 Richard Wordingham via Unicode wrote:
> I am aware that the major renderers decline to support the first
> encoding. That doesn't stop it being used outside Burmese.

CORRECTION: CoreText (or whatever is in iOS) supports the first encoding, but not the second.

The theoretical issue is whether SIGN AI is truly a vowel above, or a coda consonant / vowel modifier like anusvara.

Richard.

From haberg-1 at telia.com Thu Oct 13 03:40:11 2022
From: haberg-1 at telia.com (Hans Åberg)
Date: Thu, 13 Oct 2022 10:40:11 +0200
Subject: Glasses emoji
Message-ID: <9BF6DC88-D8BC-499A-81B6-780834636E93@telia.com>

For some reason, one cannot add glasses to emoji, even though one can add skin color. This is a concern of young people, as in the video below. There is 🤓 NERD FACE U+1F913, but they do not feel it is representative of all young people today. (Not speaking about sunglasses here, as in 😎 Unicode U+1F60E.)

https://www.bbc.com/news/av/uk-england-nottinghamshire-63229464

From marius.spix at web.de Thu Oct 13 08:45:34 2022
From: marius.spix at web.de (Marius Spix)
Date: Thu, 13 Oct 2022 15:45:34 +0200
Subject: Glasses emoji
In-Reply-To: <9BF6DC88-D8BC-499A-81B6-780834636E93@telia.com>
References: <9BF6DC88-D8BC-499A-81B6-780834636E93@telia.com>
Message-ID: <20221013154534.279cb540@spixxi>

There is already EYEGLASSES U+1F453 in Unicode. Vendors like Apple, Google, Twitter, Meta etc. should agree to use ZWJ sequences to add glasses to existing emojis.

From wjgo_10009 at btinternet.com Thu Oct 13 08:22:13 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Thu, 13 Oct 2022 14:22:13 +0100 (BST)
Subject: Glasses emoji
Message-ID: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

Thank you for posting about this.

Could one use variation selectors with this too, so as to have a default style of glasses and various styles of glasses available?

Or would one need to have separate styles of glasses each encoded separately?

If both approaches are possible, which one would be better?

If it is to be encoded, and I hope it will be, it would be good to go for the lot all at once. Lots of styles, as glasses come in lots of styles.
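The ZWJ approach Marius suggests above could be sketched as follows. This is a hypothetical illustration only: the sequence built here is not a standardized emoji sequence, the choice of base face is arbitrary, and the helper names are invented for the example.

```python
import unicodedata

ZWJ = "\u200D"             # ZERO WIDTH JOINER
EYEGLASSES = "\U0001F453"  # EYEGLASSES, the character cited in the thread


def with_glasses(base_emoji):
    """Build a hypothetical <base, ZWJ, EYEGLASSES> sequence (not standardized)."""
    return base_emoji + ZWJ + EYEGLASSES


def code_points(sequence):
    """List each code point in the sequence with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


smiling = "\U0001F642"  # SLIGHTLY SMILING FACE, an arbitrary base for the sketch
candidate = with_glasses(smiling)

for line in code_points(candidate):
    print(line)
```

A renderer with no glyph for the joined sequence would typically show the face and the eyeglasses as two separate emoji, so the text stays readable even where vendors have not agreed on a combined image.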
In my opinion it is no use just doing one and leaving the rest for some future time, as that is often a recipe for the rest never getting done.

If the lot is done as one grand forward leap then that is the way to keep Unicode thriving.

William Overington

Thursday 13 October 2022

From mark at kli.org Thu Oct 13 18:38:06 2022
From: mark at kli.org (Mark E. Shoulson)
Date: Thu, 13 Oct 2022 19:38:06 -0400
Subject: Glasses emoji
In-Reply-To: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>
Message-ID:

Again, this way lieth madness. People aren't satisfied with an emoji for "female teacher with dark hair"; they want "TALL, THIN, female PHYSICS teacher with dark hair IN PRINCESS-LEIA BUNS AND A PIERCED EYEBROW (GOLD RING)." And if you give in on "welllllll, okay, we'll give in on the tall/short...," you're only encouraging them to beg for the rest. ("How about only a _little_ tall? How about broad-shouldered? small-breasted?")

(Though my opinion isn't actually quite what that sounds like: even I admit that there probably *are* things that are appropriate to give in on, and I know we all can argue all the day long about them.)

~mark

From asmusf at ix.netcom.com Thu Oct 13 23:54:54 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Thu, 13 Oct 2022 21:54:54 -0700
Subject: Glasses emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>
Message-ID:

People that grew up on games are used to character editors that allow any avatar to be assembled from building blocks. Short of a common "avatar engine" shared across all platforms, a limited set of emoji-legos isn't that unreasonable.

We have skin tones, male/female, some limited use of color (black + cat).

Because of their small size, emoji faces would support more customization; it's hard to create a full character emoji on the level of detail of a game character. So you'd be limited to less detail than you can implement with real lego blocks. (And yes, the ones for the heads of the little figures have removable hair (and head gear), plus a variety of faces (pirate) painted on.)

If that can be done in the physical world, there's no reason a subset of that couldn't be supported in emoji rendering.

People will intuitively sense that that should be possible and thus the pressure to innovate in that direction won't stop.

Just my $1/50.

A./

From wjgo_10009 at btinternet.com Fri Oct 14 10:20:04 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 14 Oct 2022 16:20:04 +0100 (BST)
Subject: Glasses emoji
Message-ID: <3becbc9f.18dbf.183d714c05d.Webtop.99@btinternet.com>

Following some thought and experimentation here in England this morning, I wrote and posted the following.

https://lists.aau.at/pipermail/mpeg-otspec/2022-October/002863.html

Could readers consider and discuss this please?

If this is regarded as a valid way to do things both for OpenType and for Unicode then this could be the way forward for glasses emoji and for other emoji as well.

William Overington

Friday 14 October 2022

From mark at kli.org Fri Oct 14 16:43:25 2022
From: mark at kli.org (Mark E. Shoulson)
Date: Fri, 14 Oct 2022 17:43:25 -0400
Subject: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>
Message-ID:

There are really two distinct animi(?) behind the push for ever-more-detailed emoji, or rather, two animi for using emoji, and they pull in different directions, almost opposite.

On one hand, people want an emoji that looks JUST like they want it to look. Maybe not even only for people! But of course we see it most with people. People want an emoji that looks like _they_ look (do they really use it? or do they just feel left out if it isn't available? I'm not a heavy emoji-user, so I'm no judge, but I bet the second motivation is non-trivial as well.) So they really do want "tall ectomorph female physics teacher with dark hair in Princess Leia buns, an eyebrow piercing (right), and a birthmark on the left side of the chin." This motivates the various suggestions of somehow encoding a teeny image and pretending it's somehow "plain text", or encoding a link to it or something. I had what I thought were helpful ideas about the recent notions being floated at the last Unicode meeting I attended, and maybe they're even being thought about... But although I've thought that maybe sending images like this was the best of the suggested solutions, it's still awful.

The other problem is the other animus involved. Because it isn't just a picture that people want when they use emoji. It isn't enough that there's a little picture of a person wearing a mortarboard hat or something, there's actual semantic information embedded in the encoded text as well. It isn't just a picture, it's a codepoint(-sequence) that means something, a bit-sequence that _means_ "TEACHER." Just like U+0065 means something more than the ink used to draw it in whatever font. People want some snippet of "text" that not only looks like them, but also *means* them.
Hence in my example above, expressing some of the various physical traits is one thing, and you can represent TEACHER with some cultural convention like a mortarboard hat (or THIEF with a mask), but how do you get across "PHYSICS teacher"? Or a dozen other subjects, arbitrarily finely divided?

When I think about it, I don't know that people would really be satisfied with image-sticker emoji. After all, not everyone has the skill to draw them (which is why we rely on emoji-font artists in the first place), or make them Just So, and I really do think that people would feel the lack of semantic meaning. A lot of messaging services already let you include little graphics images, but I don't think the people using them feel this desire for new emoji any less. How many homedrawn emoji do you really think will be made? How many used more than twice? How many used by more than one person? How many will even be understood by the recipient?

Asmus' point about comparing it to swapping out lego bits is well-taken. There have long been these kinds of "avatar engines" that let you swap around features to get something kind of like you (I remember one on the Wii way back when.) And maybe there is some reasonable limit to how much customization can be provided (though I'd bet anything we'd take forever agreeing on it.) And even within specific limits like hair-style and hats, there'll always be one more that we're lacking, one more Mr Potato Head piece people will push for.

(Actually, thinking about Asmus' line about "faces drawn on," how's this for an idea, combining raster with standard? Instead of drawing some random picture yourself for an emoji, you have an image of, say, facial features that's to be projected onto a blank emoji-face, which can be any "standard" emoji or whatever. Could have other distinct ways of specifying headgear images, etc.
The renderer would be smart enough to scale or transform the image appropriately for different kinds of emoji with faces in different places and orientations etc. Probably a beast to implement, but I'm just floating ideas. This one actually has signs of MAYBE bridging the gap between the two drives for emoji.)

Perhaps the best "generalized emoji" implementation is something along the lines of the Emoji Kitchen, where you can combine arbitrary emoji in arbitrary numbers and orders and the system does its level best to figure out SOME way for the resulting image to make sense. This gives you some genericness, and you can express all kinds of shades of meaning by combining enough emoji, but retains the semantic meaning of them as well. Of course, it puts you at the mercy of how the system chooses to combine things, how good the designers are, etc., etc. You have no control over what really emerges at the end. (I suppose if this ever became something widespread there would develop conventions for combining with a little more control (like Egyptian hieroglyph combiner marks? Probably not, but with some semantic similarities.))

Anyway. Wanted to rant a bit on this "two desires" notion that I was thinking about since the last meeting. I think it's important to remember the second one, which gets missed out on when people focus on controlling the picture just so (though it is what's behind the idea of using Wikidata codes.) And the "images of features" notion occurred to me while typing this, and I think it's interesting.

I'm not really trying to suggest answers here (though I did remark on some things favorably); this is more asking the questions. There's always going to be these two conflicting needs, and there'll always be people who want ever-finer distinctions in emoji, and there may simply not be any really good answers.
Emoji probably never should have been part of Unicode (not "plain text"), but that ship sailed long ago, and even there it's not cut and dried (webdings? map symbols?)

Thoughts?

~mark

From abehjat at apple.com Fri Oct 14 17:28:55 2022
From: abehjat at apple.com (Adib Behjat)
Date: Fri, 14 Oct 2022 15:28:55 -0700
Subject: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>
Message-ID:

I second Mark's sentiment and really like the suggestion. I also think it would be wiser if this process were handled and managed by OSs/apps rather than by Unicode.

After reading Mark's suggestion, I was reminded of a game where you combine basic/generic elements to create more complex elements (e.g. https://littlealchemy.com/ ). For example, if someone wants to generate "Physics Teacher", a user can type in their device:

[emoji sequence not preserved in the archive]

And based on this combination, the OS/app can give the user the option to generate a custom avatar to represent Physics Teacher. With regard to rendering, tools like DALL-E (or other similar diffusion models) can enable this capability. In addition, this process will help encourage the introduction of more generic emoji characters to help expand the foundational building blocks for these tools.

From wjgo_10009 at btinternet.com Sat Oct 15 10:22:53 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Sat, 15 Oct 2022 16:22:53 +0100 (BST)
Subject: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>
Message-ID:

Mark E. Shoulson wrote:

> Thoughts?

A way to solve this would, in my opinion, be to produce a system based on the following.

http://www.users.globalnet.co.uk/~ngo/14560000.htm

It would be redesigned and updated to take a Unicode text stream as input, and extended so that, alongside data types such as Integer, Boolean, Complex and Quaternion, there would also be data types Point, Contour and Glyph. There could also be preset shapes for eyes, mouths, and so on built into the middleware, accessible through software methods in the middleware, with those methods callable in a straightforward manner from the "plain text" stream.

Software methods to scale and move glyphs would then be included in the 1456 middleware virtual machine software running in the rendering system.

That way the message sent as "plain text" would often, even usually, include calls with data parameters to preset software routines in the 1456 middleware virtual machine running in the renderer. The "plain text" message would also be capable of containing direct software for the 1456 middleware virtual machine where that was needed to go beyond what the preset methods could do.

The 1456 virtual machine would be properly sandboxed so that there is no possibility of a virus getting into the host computer.

For the avoidance of doubt I emphasise that the accumulator register of the 1456 virtual machine is a sandboxed software construct and is not the accumulator register of the host computer upon which the virtual machine is running.

I appreciate that such a 1456 style system cannot become part of a future version of Unicode unless "plain text" is redefined to some extent to allow such a system to be included.

William Overington

Saturday 15 October 2022

From marius.spix at web.de Sat Oct 15 11:49:05 2022
From: marius.spix at web.de (Marius Spix)
Date: Sat, 15 Oct 2022 18:49:05 +0200
Subject: Aw: Re: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>
Message-ID:

What is the need for your 1456 middleware system when there is already software like HarfBuzz doing exactly the same thing?

From wjgo_10009 at btinternet.com Mon Oct 17 08:08:34 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Mon, 17 Oct 2022 14:08:34 +0100 (BST)
Subject: Aw: Re: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

Message-ID: <54c9483e.5d3fd.183e60f6d6a.Webtop.105@btinternet.com>

Marius Spix wrote as follows.

> What is the need for your 1456 middleware system when there is already software like HarfBuzz doing exactly the same thing?

When I read your post I knew nothing about HarfBuzz though I think I remember having seen the name somewhere.

I have found the following page.

https://harfbuzz.github.io/what-does-harfbuzz-do.html

However, even if HarfBuzz can already do all that I am suggesting that a "not yet available and maybe never available" 1456 middleware system could do, neither of them can do it today. To do so, the Unicode Technical Committee would first need to have agreed a format so that an end user of Unicode could write a Unicode "plain text" message that would provide the instructions to whichever of those two systems were running in the rendering system. Those would be instructions that would be regarded by the rendering system as software to run in the virtual machine that is simulated in the rendering software so as to draw a custom character.

If the encoding using 7-bit ASCII printing characters that I used in 2000 (as adapted and extended for this situation) were implemented using tag characters, that could be one way to achieve progress.

Which character to use as a base character for the sequence of tag characters? Preferably one that already exists so as to get things going more quickly. One that makes it clear that such a sequence of instructions is in a text stream.

Could U+FFFC OBJECT REPLACEMENT CHARACTER be used as the base character for such a sequence of tag characters that are providing instructions to a virtual machine as to how to draw a custom character?

William Overington

Monday 17 October 2022
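To make the suggestion above concrete, here is a minimal sketch of how printable 7-bit ASCII could be carried as tag characters after U+FFFC. The tag characters U+E0020 to U+E007E mirror ASCII 0x20 to 0x7E, and U+E007F CANCEL TAG terminates emoji tag sequences. Everything else is an assumption made for illustration only: Unicode does not define U+FFFC as a tag base, and the function names and the sample payload are hypothetical.

```python
# Sketch only: U+FFFC followed by tag characters is NOT a defined Unicode
# mechanism; it is the hypothetical format discussed in the message above.
OBJECT_REPLACEMENT = "\uFFFC"   # proposed (hypothetical) base character
TAG_BASE = 0xE0000              # tag character = TAG_BASE + ASCII code
CANCEL_TAG = "\U000E007F"       # terminator, as used in emoji tag sequences


def encode_tagged_object(payload: str) -> str:
    """Wrap printable ASCII `payload` as U+FFFC + tag characters + CANCEL TAG."""
    tags = []
    for ch in payload:
        cp = ord(ch)
        if not 0x20 <= cp <= 0x7E:
            raise ValueError(f"not a printable ASCII character: {ch!r}")
        tags.append(chr(TAG_BASE + cp))
    return OBJECT_REPLACEMENT + "".join(tags) + CANCEL_TAG


def decode_tagged_object(text: str) -> str:
    """Recover the ASCII payload from a sequence built by encode_tagged_object."""
    if len(text) < 2 or text[0] != OBJECT_REPLACEMENT or text[-1] != CANCEL_TAG:
        raise ValueError("expected U+FFFC ... U+E007F")
    chars = []
    for ch in text[1:-1]:
        cp = ord(ch) - TAG_BASE
        if not 0x20 <= cp <= 0x7E:
            raise ValueError("non-tag character inside the sequence")
        chars.append(chr(cp))
    return "".join(chars)


# Hypothetical payload; a real format would need to be agreed first.
sequence = encode_tagged_object("draw 1456")
print(decode_tagged_object(sequence))
```

A renderer that does not know the convention would be expected to ignore the tag characters and show only the U+FFFC fallback, which is also the weak point: the payload is invisible and meaningless without an agreed interpretation.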
From wjgo_10009 at btinternet.com Wed Oct 19 10:27:22 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Wed, 19 Oct 2022 16:27:22 +0100 (BST)
Subject: U+FFFC Object Replacement Character
In-Reply-To: <54c9483e.5d3fd.183e60f6d6a.Webtop.105@btinternet.com>
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com> <54c9483e.5d3fd.183e60f6d6a.Webtop.105@btinternet.com>
Message-ID: <61ddb973.353c.183f0db3a2e.Webtop.105@btinternet.com>

On page 944 of The Unicode Standard, page 33 in the https://www.unicode.org/versions/Unicode15.0.0/ch23.pdf document, is text about U+FFFC OBJECT REPLACEMENT CHARACTER.

Could this text be modified such that U+FFFC could be used either as described in that text or could be followed by a sequence of tag characters, that sequence of tag characters containing information relating to the object?

In another thread I wrote as follows.

> Could U+FFFC OBJECT REPLACEMENT CHARACTER be used as the base character for such a sequence of tag characters that are providing instructions to a virtual machine as to how to draw a custom character?

Yet upon thinking about this further, now that tag characters are used for sequences that are not (only) about languages, I am wondering if a sequence of tag characters immediately following a U+FFFC character could be defined by The Unicode Technical Committee to refer to one of various types of object, which one depending upon, say, the first character of the sequence of tag characters.

For example, a full Uniform Resource Locator so as to indicate where on the web to find an image file; a local file name to indicate an attached image in an email or a file in the same folder as the file where the message is stored; a full Uniform Resource Locator of where to find the font data for an emoji character, and so on.

Would this be possible please or would it cause problems with documents where the existing definition of the meaning of U+FFFC is being applied?

William Overington

Wednesday 19 October 2022

From mark at kli.org Wed Oct 19 16:58:10 2022
From: mark at kli.org (Mark E. Shoulson)
Date: Wed, 19 Oct 2022 17:58:10 -0400
Subject: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

Message-ID: <6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com>

On 10/14/22 18:28, Adib Behjat via Unicode wrote:

> I second Mark’s sentiment and really like the suggestion. I also do think it would be wiser if this process was handled and managed by OSs/Apps versus Unicode.
>
> After reading Mark’s suggestion, I was reminded of a game where you combine basic/generic elements to create more complex elements (e.g. https://littlealchemy.com/).
>
> For example, if someone wants to generate “Physics Teacher”, a user can type in their device:
> ?????

Yes, this is essentially the "emoji kitchen" approach. It has definite appeal and could work... and also distinct downsides: lack of control over what you actually mean, different people thinking of different formulæ, etc. No scheme is perfect. Those alchemy games are indeed a good example of this kind of thinking, but it has its ups and downs.

~mark

> And based on this combination, the OS/App can give the user the option to generate a custom avatar to represent Physics Teacher. With regards to rendering, tools like DALL-E (or other similar diffusion models) can enable this capability. In addition, this process will help encourage the introduction of more generic emoji characters to help expand the foundational building blocks for these tools.
>
>> On Oct 14, 2022, at 2:43 PM, Mark E. Shoulson via Unicode wrote:
>>
>> There are really two distinct animi(?) behind the push for ever-more-detailed emoji, or rather, two animi for using emoji, and they pull in different directions, almost opposite.
>>
>> On one hand, people want an emoji that looks JUST like they want it to look. Maybe not even only for people! But of course we see it most with people. People want an emoji that looks like _they_ look (do they really use it? or do they just feel left out if it isn't available? I'm not a heavy emoji-user, so I'm no judge, but I bet the second motivation is non-trivial as well.) So they really do want "tall ectomorph female physics teacher with dark hair in Princess Leia buns, an eyebrow piercing (right), and a birthmark on the left side of the chin." This motivates the various suggestions of somehow encoding a teeny image and pretending it's somehow "plain text", or encoding a link to it or something. I had what I thought were helpful ideas about the recent notions being floated at the last Unicode meeting I attended, and maybe they're even being thought about... But although I've thought that maybe sending images like this was the best of the suggested solutions, it's still awful.
>>
>> The other problem is the other animus involved. Because it isn't just a picture that people want when they use emoji. It isn't enough that there's a little picture of a person wearing a mortarboard hat or something, there's actual semantic information embedded in the encoded text as well. It isn't just a picture, it's a codepoint(-sequence) that means something, a bit-sequence that _means_ "TEACHER." Just like U+0065 means something more than the ink used to draw it in whatever font. People want some snippet of "text" that not only looks like them, but also *means* them. Hence in my example above, expressing some of the various physical traits is one thing, and you can represent TEACHER with some cultural convention like a mortarboard hat (or THIEF with a mask), but how do you get across "PHYSICS teacher"? Or a dozen other subjects, arbitrarily finely divided?
>>
>> When I think about it, I don't know that people would really be satisfied with image-sticker emoji. After all, not everyone has the skill to draw them (which is why we rely on emoji-font artists in the first place), or make them Just So, and I really do think that people would feel the lack of semantic meaning. A lot of messaging services already let you include little graphics images, but I don't think the people using them feel this desire for new emoji any less. How many homedrawn emoji do you really think will be made? How many used more than twice? How many used by more than one person? How many will even be understood by the recipient?
>>
>> Asmus' point about comparing it to swapping out lego bits is well-taken. There have long been these kinds of "avatar engines" that let you swap around features to get something kind of like you (I remember one on the Wii way back when.) And maybe there is some reasonable limit to how much customization can be provided (though I'd bet anything we'd take forever agreeing on it.) And even within specific limits like hair-style and hats, there'll always be one more that we're lacking, one more Mr Potato Head piece people will push for.
>>
>> (Actually, thinking about Asmus' line about "faces drawn on," how's this for an idea, combining raster with standard? Instead of drawing some random picture yourself for an emoji, you have an image of, say, facial features that's to be projected onto a blank emoji-face, which can be any "standard" emoji or whatever. Could have other distinct ways of specifying headgear images, etc. The renderer would be smart enough to scale or transform the image appropriately for different kinds of emoji with faces in different places and orientations etc. Probably a beast to implement, but I'm just floating ideas. This one actually has signs of MAYBE bridging the gap between the two drives for emoji.)
>>
>> Perhaps the best "generalized emoji" implementation is something along the lines of the Emoji Kitchen, where you can combine arbitrary emoji in arbitrary numbers and orders and the system does its level best to figure out SOME way for the resulting image to make sense. This gives you some genericness, and you can express all kinds of shades of meaning by combining enough emoji, but retains the semantic meaning of them as well. Of course, it puts you at the mercy of how the system chooses to combine things, how good the designers are, etc etc. You have no control over what really emerges at the end. (I suppose if this ever became something widespread there would develop conventions for combining with a little more control (like Egyptian hieroglyph combiner marks?? Probably not, but with some semantic similarities.))
>>
>> Anyway. Wanted to rant a bit on this "two desires" notion that I was thinking about since the last meeting. I think it's important to remember the second one, which gets missed out on when people focus on controlling the picture just so (though it is what's behind the idea of using Wikidata codes.) And the "images of features" notion occurred to me while typing this, and I think it's interesting.
>>
>> I'm not really trying to suggest answers here (though I did remark on some things favorably); this is more asking the questions. There's always going to be these two conflicting needs, and there'll always be people who want ever-finer distinctions in emoji, and there may simply not be any really good answers. Emoji probably never should have been part of Unicode (not "plain text"), but that ship sailed long ago, and even there it's not cut and dried (webdings? map symbols?)
>>
>> Thoughts?
>>
>> ~mark
>>
>> On 10/14/22 00:54, Asmus Freytag via Unicode wrote:
>>> People that grew up on games are used to character editors that allow any avatar to be assembled from building blocks. Short of a common "avatar engine" shared across all platforms, a limited set of emoji-legos isn't that unreasonable.
>>>
>>> We have skin tones, male/female, some limited use of color (black + cat).
>>>
>>> Because of their small size, emoji faces would support more customization; it's hard to create a full character emoji on the level of detail of a game character. So you'd be limited to less detail than you can implement with real lego blocks. (And yes, the ones for the heads of the little figure have removable hair (and head gear). Plus a variety of faces (pirate) painted on.)
>>>
>>> If that can be done in the physical world, there's no reason a subset of that couldn't be supported in emoji rendering.
>>>
>>> People will intuitively sense that that should be possible and thus the pressure to innovate in that direction won't stop.
>>>
>>> Just my $1/50.
>>>
>>> A./
>>>
>>> On 10/13/2022 4:38 PM, Mark E. Shoulson via Unicode wrote:
>>>> Again, this way lieth madness. People aren't satisfied with an emoji for "female teacher with dark hair"; they want "TALL, THIN, female PHYSICS teacher with dark hair IN PRINCESS-LEIA BUNS AND A PIERCED EYEBROW (GOLD RING)." And if you give in on "welllllll, okay, we'll give in on the tall/short...," you're only encouraging them to beg for the rest. ("How about only a _little_ tall? How about broad-shouldered? small-breasted?")
>>>>
>>>> (Though my opinion isn't actually quite what that sounds like: even I admit that there probably *are* things that are appropriate to give in on, and I know we all can argue all the day long about them.)
>>>>
>>>> ~mark
>>>>
>>>> On 10/13/22 09:22, William_J_G Overington via Unicode wrote:
>>>>> Thank you for posting about this.
>>>>>
>>>>> Could one use variation selectors with this too, so as to have a default style of glasses and various styles of glasses available?
>>>>>
>>>>> Or would one need to have separate styles of glasses each encoded separately?
>>>>>
>>>>> If both approaches are possible, which one would be better?
>>>>>
>>>>> If it is to be encoded, and I hope it will be, it would be good to go for the lot all at once. Lots of styles as glasses are in lots of styles.
>>>>>
>>>>> In my opinion it is no use just doing one and leaving the rest for some future time as that is often a recipe for the rest never getting done.
>>>>>
>>>>> If the lot is done as one grand forward leap then that is the way to keep Unicode thriving.
>>>>>
>>>>> William Overington
>>>>>
>>>>> Thursday 13 October 2022

From mark at kli.org Wed Oct 19 17:22:18 2022
From: mark at kli.org (Mark E. Shoulson)
Date: Wed, 19 Oct 2022 18:22:18 -0400
Subject: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

Message-ID:

On 10/14/22 17:43, Mark E. Shoulson via Unicode wrote:

> I had what I thought were helpful ideas about the recent notions being floated at the last Unicode meeting I attended, and maybe they're even being thought about... But although I've thought that maybe sending images like this was the best of the suggested solutions, it's still awful.

Just so it's on-record and for people to comment on it, let me try to reconstruct. The idea under consideration was I think https://www.unicode.org/L2/L2016/16105r-unicode-image-hash.pdf, or possibly something similar; it looks like it involves having a Base64-encoded image plus a secure hash of some kind. I was thinking that on the contrary, the thing to do is NOT have a secure hash, but just the image, and cache the image using a normal hash on the image data. That way, later references to the image in the same document could include only the hash for the system to fetch from its cache and not have to encode all the data again. And even better, as some private emoji become popular, maybe vendors will start shipping with them pre-cached, so the "short form" becomes something that can be used for interchange even without the long form (you hope.) And "emoji servers" could form, websites that serve up emoji-images from a large cache so an even larger set of emoji can be encoded just by their hashes and not by their data.

Doesn't this seem like a good idea? It tries to offload the creation and approval of emoji from Unicode, lets the users create their own, and even encourages the organic growth of infrastructure to support it. Maybe even shorter and shorter forms of the hash for very popular emoji etc. But always with the fallback of the full-data form so it isn't actually dependent on the servers.

And yet with all that, it's *still* a pretty crummy idea! It will always be a huge stretch to pass off anything with encoded images as "plain text," and hashes of images are even farther removed.
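For anyone who wants to experiment with the long-form/short-form scheme described above, here is a rough sketch. It is not taken from the L2/16-105 proposal: the `img:` and `ref:` token prefixes, the function names and the choice of an 8-byte BLAKE2b digest as the "normal hash" are all assumptions made purely for illustration.

```python
import base64
import hashlib


def content_key(image_bytes: bytes) -> str:
    """Short content hash used only as a cache key, not for security."""
    return hashlib.blake2b(image_bytes, digest_size=8).hexdigest()


def encode_long(image_bytes: bytes) -> str:
    """Hypothetical long form: the full image data, Base64-encoded."""
    return "img:" + base64.b64encode(image_bytes).decode("ascii")


def encode_short(image_bytes: bytes) -> str:
    """Hypothetical short form: only the hash of the image data."""
    return "ref:" + content_key(image_bytes)


class EmojiImageCache:
    """Remembers every long-form image seen so short forms can be resolved."""

    def __init__(self):
        self._store = {}

    def resolve(self, token: str):
        kind, _, payload = token.partition(":")
        if kind == "img":
            data = base64.b64decode(payload)
            self._store[content_key(data)] = data   # cache for later references
            return data
        if kind == "ref":
            return self._store.get(payload)         # None if never seen
        raise ValueError(f"unknown token kind: {kind!r}")


cache = EmojiImageCache()
picture = b"\x89PNG placeholder bytes"              # stand-in for real image data
print(cache.resolve(encode_short(picture)))         # not cached yet
cache.resolve(encode_long(picture))                 # first use carries the data
print(cache.resolve(encode_short(picture)) == picture)
```

A vendor shipping popular images pre-cached, or an "emoji server", would amount to pre-populating or backing `_store`; the long form remains the fallback whenever a short form cannot be resolved.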
And it doesn't address the "meaning" animus at all. And without that, when you're just doing images, not only do you wind up with potentially awful emoji that don't mean anything to most people (because people can't all draw well), you also wind up with a hundred variations on a theme, as a hundred different artists each give their interpretation of "CUTE COQUETTISH FACE WITH SMILING EYES AND COLD SWEAT" or whatever. Consider the simple smiley as available from all the various different vendors.

So, yeah, potentially a good implementation for a lousy solution.

I didn't say I had answers, I'm just exploring the questions more, in the context of some proposals. This is not an easy nut to crack, and there are conflicting desiderata, so probably any solution will be bad from one reasonable POV or another.

~mark

From marius.spix at web.de Wed Oct 19 17:26:34 2022
From: marius.spix at web.de (Marius Spix)
Date: Thu, 20 Oct 2022 00:26:34 +0200
Subject: The conflicting needs of emoji
In-Reply-To: <6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com>
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

<6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com>
Message-ID: <20221020002634.278451a6@spixxi>

There is actually a sequence of Unicode characters to clearly describe a “Physics Teacher” without the downsides you have mentioned:

U+0050 U+0068 U+0079 U+0073 U+0069 U+0063 U+0073 U+0020 U+0054 U+0065 U+0061 U+0063 U+0068 U+0065 U+0072

On Wed, 19 Oct 2022 17:58:10 -0400, "Mark E. Shoulson via Unicode" wrote:

> On 10/14/22 18:28, Adib Behjat via Unicode wrote:
> > I second Mark’s sentiment and really like the suggestion. I also do think it would be wiser if this process was handled and managed by OSs/Apps versus Unicode.
> >
> > After reading Mark’s suggestion, I was reminded of a game where you combine basic/generic elements to create more complex elements (e.g. https://littlealchemy.com/).
> >
> > For example, if someone wants to generate “Physics Teacher”, a user can type in their device:
> > ?????
>
> Yes, this is essentially the "emoji kitchen" approach. It has definite appeal and could work... and also distinct downsides: lack of control over what you actually mean, different people thinking of different formulæ, etc. No scheme is perfect. Those alchemy games are indeed a good example of this kind of thinking, but it has its ups and downs.
>
> ~mark

From lyratelle at gmx.de Thu Oct 20 04:07:40 2022
From: lyratelle at gmx.de (Dominikus Dittes Scherkl)
Date: Thu, 20 Oct 2022 11:07:40 +0200
Subject: The conflicting needs of emoji
In-Reply-To: <20221020002634.278451a6@spixxi>
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

<6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com> <20221020002634.278451a6@spixxi>
Message-ID: <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de>

On 20.10.22 at 00:26, Marius Spix via Unicode wrote:

> There is actually a sequence of Unicode characters to clearly describe a “Physics Teacher” without the downsides you have mentioned:
>
> U+0050 U+0068 U+0079 U+0073 U+0069 U+0063 U+0073 U+0020 U+0054 U+0065 U+0061 U+0063 U+0068 U+0065 U+0072

This has a different downside: You need to speak English to understand it. This is especially what emoji try to circumvent.

--
Dominikus Dittes Scherkl

From asmusf at ix.netcom.com Thu Oct 20 10:38:30 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Thu, 20 Oct 2022 08:38:30 -0700
Subject: The conflicting needs of emoji
In-Reply-To: <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de>
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

<6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com> <20221020002634.278451a6@spixxi> <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de> Message-ID: On 10/20/2022 2:07 AM, Dominikus Dittes Scherkl via Unicode wrote: > Am 20.10.22 um 00:26 schrieb Marius Spix via Unicode: >> There is actually a sequence of Unicode characters to clearly describe >> a ?Physics Teacher? without the downsides you have mentioned: >> >> U+0050 U+0068 U+0079 U+0073 U+0069 U+0063 U+0073 U+0020 U+0054 U+0065 >> U+0061 U+0063 U+0068 U+0065 U+0072 >> > This has a different downside: You need to speak english to understand > it. This is especially what emoji try to circumvent. > > -- No. Emoji weren't and aren't used primarily to be language independent. In fact, I bet there's much use of emoji that is based on puns and similar mechanisms: where the emoji is used to stand for a word in an expression in some language where another language (or culture) would employ a different word or expression, so that even translating the nominal meaning of the emoji wouldn't help you. Emoji, as opposed to emoticons, were first used widely in Japan, where they were used by Japanese communicating with other Japanese thinking in Japanese. So, no, that wasn't about circumventing having a shared language. More, perhaps, about having a shorthand, or also, perhaps a way to express yourself without the directness of using words explicitly. The combination of that with a certain cuteness factor, would seem sufficient to explain their explosive success in Japan. You must be thinking about different sets of symbols, like those used on laundry tags, or those that appear on car and other equipment controls; some have even made the jump to other user interfaces (like, Play, Pause and Stop symbols). For those you would be correct in saying that they try to be language independent. A./ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marius.spix at web.de Thu Oct 20 11:39:15 2022 From: marius.spix at web.de (Marius Spix) Date: Thu, 20 Oct 2022 18:39:15 +0200 Subject: Aw: Re: The conflicting needs of emoji In-Reply-To: <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de> References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

 <6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com>
 <20221020002634.278451a6@spixxi>
 <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de>
Message-ID:

So, is emoji meant to circumvent the requirement to know a language?
So, I assume that you are not Japanese. Which of the following
representations is most understandable for you:

a) 🍥
b) ????
c) ?????
d) naruto maki
e) fish cake
f) whirlpool-shaped fish cake
g) 🍥 fish cake

Like English words, emoji are not intuitive and you still have to learn
their meaning, just as you have to learn what the words in your mother
tongue mean. For the same reason, someone who has never seen a rocket
cannot figure out the meaning of the emoji 🚀.

Emojis are not designed to replace written text, but to support it. For
example, you could write "🍥 fish cake" to unambiguously tell someone
about a kind of fish cake with a special shape.

> Sent: Thursday, 20.10.2022 at 11:07
> From: "Dominikus Dittes Scherkl via Unicode"
> To: unicode at corp.unicode.org
> Cc: "Dominikus Dittes Scherkl"
> Subject: Re: The conflicting needs of emoji
>
> On 20.10.22 at 00:26, Marius Spix via Unicode wrote:
> > There is actually a sequence of Unicode characters to clearly describe
> > a "Physics Teacher" without the downsides you have mentioned:
> >
> > U+0050 U+0068 U+0079 U+0073 U+0069 U+0063 U+0073 U+0020 U+0054 U+0065
> > U+0061 U+0063 U+0068 U+0065 U+0072
> >
> This has a different downside: You need to speak english to understand
> it. This is especially what emoji try to circumvent.
>
> --
> Dominikus Dittes Scherkl
>
>

From wjgo_10009 at btinternet.com  Thu Oct 20 15:58:59 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Thu, 20 Oct 2022 21:58:59 +0100 (BST)
Subject: The conflicting needs of emoji
In-Reply-To:
References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

<6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com> <20221020002634.278451a6@spixxi> <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de> Message-ID: <1b2edac4.5cbb.183f7313018.Webtop.105@btinternet.com> MoMA, The Museum of Modern Art in New York, had an exhibition about the original emoji a few years ago. https://stories.moma.org/the-original-emoji-set-has-been-added-to-the-museum-of-modern-arts-collection-c6060e141f61 https://www.moma.org/calendar/exhibitions/3639 http://www.users.globalnet.co.uk/~ngo/emoji_installation_at_MoMA.htm William Overington Thursday 20 October 2020 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at kli.org Thu Oct 20 18:46:26 2022 From: mark at kli.org (Mark E. Shoulson) Date: Thu, 20 Oct 2022 19:46:26 -0400 Subject: The conflicting needs of emoji In-Reply-To: References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

 <6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com>
 <20221020002634.278451a6@spixxi>
 <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de>
Message-ID: <9d0c0bb7-ae00-370a-2976-5b7e339fdbdb@shoulson.com>

On 10/20/22 11:38, Asmus Freytag via Unicode wrote:
> On 10/20/2022 2:07 AM, Dominikus Dittes Scherkl via Unicode wrote:
>> On 20.10.22 at 00:26, Marius Spix via Unicode wrote:
>>> There is actually a sequence of Unicode characters to clearly describe
>>> a "Physics Teacher" without the downsides you have mentioned:
>>>
>>> U+0050 U+0068 U+0079 U+0073 U+0069 U+0063 U+0073 U+0020 U+0054 U+0065
>>> U+0061 U+0063 U+0068 U+0065 U+0072
>>>
>> This has a different downside: You need to speak english to understand
>> it. This is especially what emoji try to circumvent.
>>
>> --
>
> No. Emoji weren't and aren't used primarily to be language
> independent. In fact, I bet there's much use of emoji that is based on
> puns and similar mechanisms: where the emoji is used to stand for a
> word in an expression in some language where another language (or
> culture) would employ a different word or expression, so that even
> translating the nominal meaning of the emoji wouldn't help you.
>

A few years ago, I bought The Emoji Haggadah
(https://www.amazon.com/Emoji-Haggadah-Martin-Bodek/dp/0359159370),
which has essentially the whole text of the Haggadah in emoji. In
*English* in emoji, mind you. So for example I think it tended to use
? to mean "rabbi".

The truly disturbing thing about it was that I found I could read it!!

Emoji are very definitely culture-centric. Are they language-centric
like the string of letters? Probably not. I think I have to agree that
the string of Latin letters is not an acceptable substitute for an
emoji, but that doesn't mean emoji are a language-free neutral zone of
graphics.

~mark From gwalla at gmail.com Thu Oct 20 19:49:27 2022 From: gwalla at gmail.com (Garth Wallace) Date: Thu, 20 Oct 2022 17:49:27 -0700 Subject: The conflicting needs of emoji In-Reply-To: <9d0c0bb7-ae00-370a-2976-5b7e339fdbdb@shoulson.com> References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

<6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com> <20221020002634.278451a6@spixxi> <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de> <9d0c0bb7-ae00-370a-2976-5b7e339fdbdb@shoulson.com> Message-ID: Pretty much every example I?ve seen of using emoji for a text of significant length or complexity has relied heavily on rebuses. On Thu, Oct 20, 2022 at 4:48 PM Mark E. Shoulson via Unicode < unicode at corp.unicode.org> wrote: > On 10/20/22 11:38, Asmus Freytag via Unicode wrote: > > On 10/20/2022 2:07 AM, Dominikus Dittes Scherkl via Unicode wrote: > >> Am 20.10.22 um 00:26 schrieb Marius Spix via Unicode: > >>> There is actually a sequence of Unicode characters to clearly describe > >>> a ?Physics Teacher? without the downsides you have mentioned: > >>> > >>> U+0050 U+0068 U+0079 U+0073 U+0069 U+0063 U+0073 U+0020 U+0054 U+0065 > >>> U+0061 U+0063 U+0068 U+0065 U+0072 > >>> > >> This has a different downside: You need to speak english to understand > >> it. This is especially what emoji try to circumvent. > >> > >> -- > > > > No. Emoji weren't and aren't used primarily to be language > > independent. In fact, I bet there's much use of emoji that is based on > > puns and similar mechanisms: where the emoji is used to stand for a > > word in an expression in some language where another language (or > > culture) would employ a different word or expression, so that even > > translating the nominal meaning of the emoji wouldn't help you. > > > A few years ago, I bought The Emoji Haggadah > (https://www.amazon.com/Emoji-Haggadah-Martin-Bodek/dp/0359159370), > which has essentially the whole text of the Haggadah in emoji. In > *English* in emoji, mind you. So for example I think it tended to use > ? to mean "rabbi". > > The truly disturbing thing about it was that I found I could read it!! > > Emoji are very definitely culture-centric. Are they language-centric > like the string of letters? Probably not. 
I think I have to agree that > the string of Latin letters is not an acceptable substitute for an > emoji, but that doesn't mean emoji are a language-free neutral zone of > graphics. > > ~mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmusf at ix.netcom.com Thu Oct 20 21:51:20 2022 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Thu, 20 Oct 2022 19:51:20 -0700 Subject: The conflicting needs of emoji In-Reply-To: <9d0c0bb7-ae00-370a-2976-5b7e339fdbdb@shoulson.com> References: <14dc2730.5775e.183d1827d67.Webtop.105@btinternet.com>

<6afd4a1b-2fd3-8fcb-f882-fafe5a429b90@shoulson.com> <20221020002634.278451a6@spixxi> <4624cca1-65c6-2b64-607c-71853d045f94@gmx.de> <9d0c0bb7-ae00-370a-2976-5b7e339fdbdb@shoulson.com> Message-ID: <68c33d74-b61d-61a4-f7b9-3a9f4641b205@ix.netcom.com> On 10/20/2022 4:46 PM, Mark E. Shoulson via Unicode wrote: > On 10/20/22 11:38, Asmus Freytag via Unicode wrote: >> On 10/20/2022 2:07 AM, Dominikus Dittes Scherkl via Unicode wrote: >>> Am 20.10.22 um 00:26 schrieb Marius Spix via Unicode: >>>> There is actually a sequence of Unicode characters to clearly describe >>>> a ?Physics Teacher? without the downsides you have mentioned: >>>> >>>> U+0050 U+0068 U+0079 U+0073 U+0069 U+0063 U+0073 U+0020 U+0054 U+0065 >>>> U+0061 U+0063 U+0068 U+0065 U+0072 >>>> >>> This has a different downside: You need to speak english to understand >>> it. This is especially what emoji try to circumvent. >>> >>> -- >> >> No. Emoji weren't and aren't used primarily to be language >> independent. In fact, I bet there's much use of emoji that is based >> on puns and similar mechanisms: where the emoji is used to stand for >> a word in an expression in some language where another language (or >> culture) would employ a different word or expression, so that even >> translating the nominal meaning of the emoji wouldn't help you. >> > A few years ago, I bought The Emoji Haggadah > (https://www.amazon.com/Emoji-Haggadah-Martin-Bodek/dp/0359159370), > which has essentially the whole text of the Haggadah in emoji.? In > *English* in emoji, mind you.? So for example I think it tended to use > ? to mean "rabbi". > > The truly disturbing thing about it was that I found I could read it!! > > Emoji are very definitely culture-centric.? Are they language-centric > like the string of letters?? Probably not.? I think I have to agree > that the string of Latin letters is not an acceptable substitute for > an emoji, but that doesn't mean emoji are a language-free neutral zone > of graphics. 
>
> ~mark

There are lots of expressions that would lend themselves to being
emojified. Like "pear shaped". I can easily imagine a conversation where
you could use a single PEAR emoji to express that something might turn
out badly (or has done so). Unlike STAR used in a way derived from movie
star, the concept of something going "pear shaped" has not crossed over
widely into other languages and cultures.

Your example of rabbi(t) is also a good one, because such pun-like uses
of emoji are common. All of them are intricately bound up with a
language or culture or both. Whereas the two vertical rectangles, or the
right-pointing triangle, are truly language-independent means of
conveying "pause" and "play".

One more example: for an English-speaking user, combining the apple with
an eye emoji might convey "apple of my eye", or something precious. A
German-speaking user would more likely read that as a very literal
attempt at rendering "eyeball" (Augapfel).

The whole idea that emoji as a system have anything to do with
language-independence is simply a red herring. It doesn't match their
origin story, doesn't match their usage history when they first became
popular and doesn't match how they are used today.

That remains the case, even if there are ways you can try to use them
(or a subset of them) to get your point across when you don't share a
language with someone. But such usages are probably hit or miss. Lucky
coincidences if they work.

A./

From richard.wordingham at ntlworld.com  Tue Oct 25 21:02:49 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Wed, 26 Oct 2022 03:02:49 +0100
Subject: Encoding of Text in the Myanmar Script
Message-ID: <20221026030249.6807ed65@JRWUBU2>

Has the government of Myanmar promulgated a single method of encoding
text in the Myanmar script? If so, is it readily available? Are iPhones
expected to come to support it?
I'm asking because the differences between the TUS encoding for Burmese
and UTN-11 are causing problems, including acrimony, and I don't know if
there is a multilingual standard. Or are different encodings prescribed
for different languages?

From what little I can find, it is entirely conceivable that the
government has merely required that text be compatible with the Unicode
Standard.

Richard.

From aprilop at fn.de  Wed Oct 26 00:52:08 2022
From: aprilop at fn.de (Andreas Prilop)
Date: Wed, 26 Oct 2022 05:52:08 +0000
Subject: Encoding of Text in the Myanmar Script
In-Reply-To: <20221026030249.6807ed65@JRWUBU2>
References: <20221026030249.6807ed65@JRWUBU2>
Message-ID:

On 26 October 2022, Richard Wordingham wrote:

> Myanmar

When you write "Myanmar" instead of "Burma",
you should also write "Espanyar" instead of "Spain".

From richard.wordingham at ntlworld.com  Wed Oct 26 02:47:36 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Wed, 26 Oct 2022 08:47:36 +0100
Subject: Encoding of Text in the Myanmar Script
In-Reply-To:
References: <20221026030249.6807ed65@JRWUBU2>
Message-ID: <20221026084736.1979cd2f@JRWUBU2>

On Wed, 26 Oct 2022 05:52:08 +0000
Andreas Prilop via Unicode wrote:

> On 26 October 2022, Richard Wordingham wrote:
>
> > Myanmar
>
> When you write "Myanmar" instead of "Burma",
> you should also write "Espanyar" instead of "Spain".
>

I thought 'Burmese Empire' might be deliberately misunderstood.

Richard.

From johannes at bergerhausen.com  Thu Oct 27 04:33:22 2022
From: johannes at bergerhausen.com (Johannes Bergerhausen)
Date: Thu, 27 Oct 2022 11:33:22 +0200
Subject: WWS update 2022
Message-ID: <75FD9BB1-7D39-4EF1-B872-0ED97CD4E96A@bergerhausen.com>

Dear List,

the update to Unicode 15.0 is now online:
www.worldswritingsystems.org

Two new scripts and a few corrections integrated.
Our count: 128 scripts not yet encoded (click on "Unicode"
and scroll down)

Third revised edition of the poster:
https://kd.hs-mainz.de/shop/the-worlds-writing-systems-poster/

All the best,
Deborah, Thomas, Johannes

--
Helmig Bergerhausen
Gladbacher Straße 40, D-50672 Köln, Germany
www.helmigbergerhausen.de

--
Prof. Bergerhausen
School of Design, Hochschule Mainz
Holzstraße 36, D-55116 Mainz, Germany
https://kd.hs-mainz.de
www.designlabor-gutenberg.de
www.decodeunicode.org
www.worldswritingsystems.org

From gtbot2007 at gmail.com  Fri Oct 28 10:53:54 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Fri, 28 Oct 2022 11:53:54 -0400
Subject: Negative/Negation Sign
Message-ID:

The negative sign (also known as the negation sign) found on graphing
calculator sets (such as the TI-83/84 sets) appears to be absent from
Unicode. Is there a reason for this?

From marius.spix at web.de  Fri Oct 28 14:24:23 2022
From: marius.spix at web.de (Marius Spix)
Date: Fri, 28 Oct 2022 21:24:23 +0200
Subject: Negative/Negation Sign
In-Reply-To:
References:
Message-ID: <20221028212423.5d5fa9a4@spixxi>

The minus sign IS in Unicode (U+2212 MINUS SIGN). The parentheses on the
calculator key are for differentiation of the unary minus (negation)
and the binary minus (subtraction). There is also a logical negation
sign for propositional calculus (U+00AC NOT SIGN).

On Fri, 28 Oct 2022 11:53:54 -0400,
Gabriel Tellez via Unicode wrote:

> The negative sign (also known as the negation sign) found on graphing
> calculator sets (such as the Ti-83/84 sets) appears to be absent from
> Unicode. Is there a reason for this?

From gtbot2007 at gmail.com  Fri Oct 28 14:39:29 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Fri, 28 Oct 2022 15:39:29 -0400
Subject: Negative/Negation Sign
In-Reply-To: <20221028212423.5d5fa9a4@spixxi>
References: <20221028212423.5d5fa9a4@spixxi>
Message-ID:

It's not the minus sign tho. The calculator button might say (-) but the
calculator itself has a different symbol that is basically a raised
minus sign.

On Fri, Oct 28, 2022 at 3:24 PM Marius Spix wrote:

> The minus sign IS in Unicode (U+2212 MINUS SIGN). The braces on the
> calculator key are for differentiation of the unary minus (negation)
> and the binary minus (subtraction). There is also a logical negation
> sign for propositional calculus (U+00AC NOT SIGN).
>
> On Fri, 28 Oct 2022 11:53:54 -0400,
> Gabriel Tellez via Unicode wrote:
>
> > The negative sign (also known as the negation sign) found on graphing
> > calculator sets (such as the Ti-83/84 sets) appears to be absent from
> > Unicode. Is there a reason for this?
>
>

From harjitmoe at outlook.com  Fri Oct 28 15:08:17 2022
From: harjitmoe at outlook.com (Harriet Riddle)
Date: Fri, 28 Oct 2022 21:08:17 +0100
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>
Message-ID:

Gabriel Tellez via Unicode wrote:
> It's not the minus sign tho. The calculator button might say (-) but
> the calculator it self has a different symbol that basically a raised
> minus sign.

As in U+207B ⁻ SUPERSCRIPT MINUS? Or less raised than that?

I've also heard of styles where U+2212 MINUS SIGN is used as the unary
operator and U+002D HYPHEN-MINUS (usually the shorter of the two) is
used as the binary operator.

Another codepoint that might be useful, if what you're really concerned
about is converting to Unicode without loss of information (round-trip,
essentially), is U+FE63 ﹣, which is actually called SMALL HYPHEN-MINUS,
although it really exists so that Big5 and CNS 11643 round-trip. The
rubric attached to Apple's Big5 mapping[1] actually describes the Big5
character in question (0xA1DF) as one of a set of "alternate (centered)
forms for paired punctuation; UTC table maps these to small forms",
although to the best of my knowledge the Small Form Variants exist
solely to round-trip the Big5 and CNS 11643 characters.

[1] https://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CHINTRAD.TXT

From gtbot2007 at gmail.com  Fri Oct 28 15:10:36 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Fri, 28 Oct 2022 16:10:36 -0400
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>
Message-ID:

Is superscript minus used for this?

On Fri, Oct 28, 2022 at 4:08 PM Harriet Riddle wrote:

> Gabriel Tellez via Unicode wrote:
> > It's not the minus sign tho. The calculator button might say (-) but
> > the calculator it self has a different symbol that basically a raised
> > minus sign.
>
>
> As in U+207B ⁻ SUPERSCRIPT MINUS? Or less raised than that?
>
> I've also heard of styles where U+2212 MINUS SIGN is used as the unary
> operactor and U+002D HYPHEN-MINUS (usually the shorter of the two) is
> used as the binary operator.
>
> Another codepoint that might be useful, if what you're really concerned
> about is converting to Unicode without loss of information (round-trip,
> essentially) is U+FE63 ﹣, which is actually called SMALL HYPHEN-MINUS,
> although it really exists so that Big5 and CNS 11643 round-trip.
> The rubric attached to Apple's Big5 mapping[1] actually describes the Big5
> character in question (0xA1DF) as one of a set of "alternate (centered)
> forms for paired punctuation; UTC table maps these to small forms",
> although to the best of my knowledge the Small Form Variants exist
> solely to round-trip the Big5 and CNS 11643 characters.
>
> [1] https://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CHINTRAD.TXT
>

From sosipiuk at gmail.com  Fri Oct 28 15:35:21 2022
From: sosipiuk at gmail.com (Sławomir Osipiuk)
Date: Fri, 28 Oct 2022 20:35:21 +0000
Subject: Negative/Negation Sign
In-Reply-To:
References:
Message-ID: <1666988720914.2480867530.839539952@gmail.com>

What about U+02D7 (modifier letter minus sign)?

If the TI-83 charset were to be mapped to Unicode (maybe it already
is?), I think it would simply take advantage of the existing options:

˗23.45 (U+02D7 modifier letter minus)
-23.45 (U+002D hyphen minus)
−23.45 (U+2212 minus sign)
⁻23.45 (U+207B superscript minus)
﹣23.45 (U+FE63 small hyphen minus)

These all look distinct with my default font.

Sławomir Osipiuk

From jukkakk at gmail.com  Fri Oct 28 15:56:21 2022
From: jukkakk at gmail.com (Jukka K. Korpela)
Date: Fri, 28 Oct 2022 23:56:21 +0300
Subject: Negative/Negation Sign
In-Reply-To:
References:
Message-ID:

The appearance of a symbol in some device does not really mean something
that should be relevant to Unicode. It is not an element of text but a
graphic symbol in a particular technical environment.

It is not even clear which symbol you are referring to. In any case, if
you wanted to suggest that some symbol be encoded as a Unicode
character, you would need to demonstrate its use in texts (or,
stretching a lot, the need for using it in texts) and its essence as a
separate character rather than a typographic or stylistic variant of an
existing character.

Jukka, https://jkorpela.fi/

On Fri, 28 Oct 2022 at 18:57, Gabriel Tellez via Unicode
(unicode at corp.unicode.org) wrote:

> The negative sign (also known as the negation sign) found on graphing
> calculator sets (such as the Ti-83/84 sets) appears to be absent from
> Unicode. Is there a reason for this?

From doug at ewellic.org  Fri Oct 28 17:10:15 2022
From: doug at ewellic.org (Doug Ewell)
Date: Fri, 28 Oct 2022 22:10:15 +0000
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>
Message-ID:

Gabriel Tellez wrote:

> Is superscript minus use for this?

Is *anything* used for this, outside of the TI-83 and TI-84 machines,
other than an ordinary minus sign or hyphen-minus?

There are actual mathematics experts on this list, but my understanding
is that normal mathematical notation, as used both by experts and the
general public, uses the same symbol for both unary and binary minus.
The TI calculators may have distinguished between the two to make input
or internal parsing easier.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

From richard.wordingham at ntlworld.com  Fri Oct 28 17:10:48 2022
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Fri, 28 Oct 2022 23:10:48 +0100
Subject: Encoding of Text in the Myanmar Script
In-Reply-To:
References: <20221026030249.6807ed65@JRWUBU2>
Message-ID: <20221028231048.6db1bf66@JRWUBU2>

On Fri, 28 Oct 2022 12:54:52 -0400
Wunna Ko wrote:

> Richard,
>
> There is no official Standard in Myanmar. There is no official
> encoding as well. But the Burmese users generally accepted Unicode as
> a standard.
>
> Not sure what you are referring to when you mention TUS encoding.

The TUS encoding for Modern Burmese is Table 16-4 in the Unicode
Standard, currently (Version 15.0) on p673, in Chapter 16.

Now, if one applies those rules to the _Mon_ word generally
transliterated as 'to choose', one gets .
However if one applies the Microsoft Typography rules, one gets . One
renders properly, while the other gets a dashed circle inserted. Safari
on iPhone and Edge on Windows 10 render different ones properly. My key
question is, has the Government ruled which is correct?

There's a similar issue with the sequence v. , but slightly different -
Apple rejects the first, HarfBuzz accepts both (or a font corrects the
renderer), and I think UTN-11 Version 4 is meant to reject the second as
it accepts the first, but it is not mentioned in Footnote 3 to the table
on pp6-7. This is also a nasty one, as it is not beyond the wit of man
to ascribe different renderings and even meanings to the two sequences.

Richard.

From gtbot2007 at gmail.com  Fri Oct 28 17:41:30 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Fri, 28 Oct 2022 18:41:30 -0400
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>
Message-ID:

Ok but compatibility or something. Also I would love to have a separate
Unicode for negative.

On Fri, Oct 28, 2022 at 6:10 PM Doug Ewell wrote:

> Gabriel Tellez wrote:
>
> > Is superscript minus use for this?
>
> Is *anything* used for this, outside of the TI-83 and TI-84 machines,
> other than an ordinary minus sign or hyphen-minus?
>
> There are actual mathematics experts on this list, but my understanding is
> that normal mathematical notation, as used both by experts and the general
> public, uses the same symbol for both unary and binary minus. The TI
> calculators may have distinguished between the two to make input or
> internal parsing easier.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
>
>

From doug at ewellic.org  Fri Oct 28 17:52:52 2022
From: doug at ewellic.org (Doug Ewell)
Date: Fri, 28 Oct 2022 22:52:52 +0000
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>

Message-ID: Gabriel Tellez wrote: > Ok but compatibility or something. Is there a need to interchange data between the TI calculators and other systems? That's what compatibility in Unicode is about. > Also I would love to have a separate Unicode for negative. Personal desire to reform math notation is probably not a justification. -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org From gtbot2007 at gmail.com Sat Oct 29 09:47:02 2022 From: gtbot2007 at gmail.com (Gabriel Tellez) Date: Sat, 29 Oct 2022 10:47:02 -0400 Subject: Negative/Negation Sign In-Reply-To: References: <20221028212423.5d5fa9a4@spixxi>

Message-ID:

It's not my personal reform, it was already done by these character
sets.

On Fri, Oct 28, 2022 at 6:52 PM Doug Ewell wrote:

> Gabriel Tellez wrote:
>
> > Ok but compatibility or something.
>
> Is there a need to interchange data between the TI calculators and other
> systems? That's what compatibility in Unicode is about.
>
> > Also I would love to have a separate Unicode for negative.
>
> Personal desire to reform math notation is probably not a justification.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
>

From doug at ewellic.org  Sat Oct 29 12:29:27 2022
From: doug at ewellic.org (Doug Ewell)
Date: Sat, 29 Oct 2022 17:29:27 +0000
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>

Message-ID:

Gabriel Tellez wrote:

> It's not my personal reform, it was already done by these character
> sets.

"I would love to have a separate Unicode for negative" did make it sound
like a personal wish. Other people would have loved to have separate
Unicode characters for . as an abbreviation marker, a sentence-ending
punctuation mark, and a decimal point.

Do you have any examples of how the TI calculators represent the two
discrete symbols when interchanging data with a computer, using TI-GRAPH
LINK? Does this interface exchange plain-text data, or is it in a
proprietary binary format? Can the two characters leak into the outside
world in any other way? Remember the bit about interoperability: with
what external systems would this character interoperate?

Otherwise, the solution of U+207B SUPERSCRIPT MINUS seems to fill the
need.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | http://ewellic.org

From asmusf at ix.netcom.com  Sat Oct 29 14:43:03 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 29 Oct 2022 12:43:03 -0700
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>

Message-ID: <2473ea28-4f6c-eaec-93be-9f76cc7c835d@ix.netcom.com>

According to https://en.wikipedia.org/wiki/TI_calculator_character_sets
the "negation" is mapped to U+207B SUPERSCRIPT MINUS in TI Character
sets. Unless that information is definitely incorrect, this should be
the end of discussion.

A./

On 10/29/2022 10:29 AM, Doug Ewell via Unicode wrote:
> Gabriel Tellez wrote:
>
>> It's not my personal reform, it was already done by these character
>> sets.
> "I would love to have a separate Unicode for negative" did make it sound like a personal wish. Other people would have loved to have separate Unicode characters for . as an abbreviation marker, a sentence-ending punctuation mark, and a decimal point.
>
> Do you have any examples of how the TI calculators represent the two discrete symbols when interchanging data with a computer, using TI-GRAPH LINK? Does this interface exchange plain-text data, or is it in a proprietary binary format? Can the two characters leak into the outside world in any other way? Remember the bit about interoperability: with what external systems would this character interoperate?
>
> Otherwise, the solution of U+207B SUPERSCRIPT MINUS seems to fill the need.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | http://ewellic.org
>

From sosipiuk at gmail.com  Sat Oct 29 15:18:46 2022
From: sosipiuk at gmail.com (Sławomir Osipiuk)
Date: Sat, 29 Oct 2022 20:18:46 +0000
Subject: Negative/Negation Sign
In-Reply-To: <2473ea28-4f6c-eaec-93be-9f76cc7c835d@ix.netcom.com>
References: <2473ea28-4f6c-eaec-93be-9f76cc7c835d@ix.netcom.com>
Message-ID: <1667073947992.3769056767.4121947635@gmail.com>

On Saturday, 29 October 2022, 15:43:03 (-04:00), Asmus Freytag via
Unicode wrote:

According to https://en.wikipedia.org/wiki/TI_calculator_character_sets
the "negation" is mapped to U+207B SUPERSCRIPT MINUS in TI Character sets.
Unless that information is definitely incorrect, this should be the end
of discussion.

A./

I tried to look through the sources for that page but found no
definitive mapping. The Unicode values seem to have simply been matched
by sight by the editor. The sources contain only bitmaps of the
characters and their TI-internal byte values. Just another reminder that
Wikipedia is not always reliable.

From gtbot2007 at gmail.com  Sat Oct 29 15:31:41 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Sat, 29 Oct 2022 16:31:41 -0400
Subject: Negative/Negation Sign
In-Reply-To: <1667073947992.3769056767.4121947635@gmail.com>
References: <2473ea28-4f6c-eaec-93be-9f76cc7c835d@ix.netcom.com>
 <1667073947992.3769056767.4121947635@gmail.com>
Message-ID:

Looks like the real answer (hopefully) would be if we could find out
what U+207B SUPERSCRIPT MINUS was originally added for.

On Sat, Oct 29, 2022 at 4:21 PM Sławomir Osipiuk via Unicode <
unicode at corp.unicode.org> wrote:

> On Saturday, 29 October 2022, 15:43:03 (-04:00), Asmus Freytag via Unicode
> wrote:
>
> According to https://en.wikipedia.org/wiki/TI_calculator_character_sets
> the "negation" is mapped to U+207B SUPERSCRIPT MINUS in TI Character sets.
> Unless that information is definitely incorrect, this should be the end of
> discussion.
>
> A./
>
>
> I tried to look through the sources for that page but found no definitive
> mapping. The Unicode values seem to have simply been matched by sight by
> the editor. The sources contain only bitmaps of the characters and their
> TI-internal byte values. Just another reminder that Wikipedia is not always
> reliable.
>

From asmusf at ix.netcom.com  Sat Oct 29 16:42:22 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 29 Oct 2022 14:42:22 -0700
Subject: Negative/Negation Sign
In-Reply-To: <1667073947992.3769056767.4121947635@gmail.com>
References: <2473ea28-4f6c-eaec-93be-9f76cc7c835d@ix.netcom.com>
 <1667073947992.3769056767.4121947635@gmail.com>
Message-ID: <2226b003-475f-8823-daa4-6ad8f469eb8e@ix.netcom.com>

On 10/29/2022 1:18 PM, Sławomir Osipiuk wrote:
> On Saturday, 29 October 2022, 15:43:03 (-04:00), Asmus Freytag via
> Unicode wrote:
>
> According to
> https://en.wikipedia.org/wiki/TI_calculator_character_sets the
> "negation" is mapped to U+207B SUPERSCRIPT MINUS in TI Character
> sets. Unless that information is definitely incorrect, this should
> be the end of discussion.
>
> A./
>
> I tried to look through the sources for that page but found no
> definitive mapping. The Unicode values seem to have simply been
> matched by sight by the editor. The sources contain only bitmaps of
> the characters and their TI-internal byte values. Just another
> reminder that Wikipedia is not always reliable.

The Wikipedia article does show a mapping. And, no matter its origin,
that mapping appears uncontested. (I haven't looked through the page
history, but that's where you would find any disagreement on the issue;
unless you can point us to something in there, I'll assume it's
uncontested; let me know what you find).

Because it's a mapping and out there, there's now a published choice for
how to represent that character in Unicode. That fact alone changes the
question from a completely open one to one where there's a de-facto
"proposed solution". If you (or anyone) disagree, you would have to
demonstrate why that choice is incorrect or insufficient.

And, "matching by sight" isn't necessarily an incorrect approach.
Unicode distinguishes between the identity of a character and the thing
that it denotes in a certain context --- with very deliberate exceptions.
For '.', for example, the precedent is very strong: The identity is the "period" whether used as a full stop or decimal point, or delimiter in internet addresses or abbreviation marker. For ':' we don't code a different character for the use of abbreviation marker in Swedish, and so on.

For letters, on the other hand, membership in a certain script, or having a particular case mapping can contribute to the defining characteristics of a character's identity, leading to disunification of otherwise identical shapes.

For dashes, Unicode considers that differences in length, and position relative to baseline or centerline, are characteristics that make two dashes distinct symbols. However, that means that when two dashes have identical appearance, they should not be disunified based simply on how they are used. (The issue is a bit more complex than that, because ASCII unifies two of them into 002D, but that's a historical one-off, not a precedent).

So, if you disagree with this mapping, you'll have to demonstrate that there's a consistent visual difference to the "actual" character, such that it would render SUPERSCRIPT MINUS distinct from the unary negation.

Otherwise, the conclusion stands that there is one known convention (TI character set) that uses SUPERSCRIPT MINUS to indicate unary negation.

A./

PS: interestingly enough, one of the sources cited for the Wikipedia article actually has a mapping to U+203E (spacing overline). You now have two choices of "de-facto" mappings; however, I think we can agree that U+203E seems a much poorer match for the glyph given for negation than U+207B; the former is at caps height, the latter between centerline and caps. The dot matrix glyph image has the negation 1 pixel above center. The resolution severely limits the available positions; like the position of SUPERSCRIPT MINUS in Cambria math, the TI negation sits just between the centerline of superscripted digits and their (raised) baseline.
I think whoever came up with that mapping did a better job than whoever mapped this to U+203E.

From asmusf at ix.netcom.com Sat Oct 29 16:56:20 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 29 Oct 2022 14:56:20 -0700
Subject: Negative/Negation Sign
In-Reply-To:
References: <2473ea28-4f6c-eaec-93be-9f76cc7c835d@ix.netcom.com> <1667073947992.3769056767.4121947635@gmail.com>
Message-ID: <579dca3f-389d-cebf-a83f-f69e6da9a056@ix.netcom.com>

On 10/29/2022 1:31 PM, Gabriel Tellez via Unicode wrote:
> Looks like the real answer (hopefully) would be if we could find out
> what U+207B SUPERSCRIPT MINUS was originally added for.

These characters were in Unicode from very early on. Unlike some of the later additions there is no link to a particular citation "in the wild". Instead, the original repertoire collected a superset of then existing character sets in reasonably wide usage. If any of their members violated Unicode encoding principles, they were added as compatibility characters (to facilitate round trip), otherwise as ordinary characters.

However, the question implicitly supposes that symbols and punctuation are encoded by function. That is not generally correct. They are encoded based on distinct (contrasting) shape compared to other symbols (noting that for dashes and similar symbols, shape is not only defined by the "ink" but also where that "ink" is placed). If a symbol was reused for something else without a change in appearance, it would not therefore qualify for being re-encoded.

In this case, the appearance of SUPERSCRIPT MINUS in a modern math font shows a relative positioning to superscript digits, full sized digits and relative length to the standard minus sign that matches to the TI character for negation (within the constraints imposed by limited resolution raster images).
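For anyone who wants to check the properties of the characters named so far, here is a minimal sketch, assuming Python 3 and its standard unicodedata module. The candidate list and the helper name describe() are chosen here purely for illustration; the script only reports what the Unicode Character Database records for each code point and says nothing about glyph shape or position, which is the criterion argued above.

```python
import unicodedata

# Code points named in this thread as candidate mappings for the TI
# "negation" glyph, plus the ordinary minus sign and hyphen for comparison.
# The selection is illustrative, not an official mapping table.
CANDIDATES = [0x207B, 0x203E, 0x00AF, 0x02D7, 0x002D, 0x2212, 0x2010]


def describe(cp):
    """Return (label, character name, decomposition mapping) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.decomposition(ch))


for cp in CANDIDATES:
    label, name, decomp = describe(cp)
    # An empty decomposition means the character has no decomposition mapping.
    print(f"{label}  {name:<28} {decomp or '(none)'}")
```

In the output, U+207B SUPERSCRIPT MINUS carries a <super> compatibility decomposition to U+2212 MINUS SIGN, which fits the description above of characters taken over as compatibility characters.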
I can see no indication that the TI engineers had some other symbol in mind, that is, had they had the choice of a Unicode-encoded outline font, they would have chosen something with an appearance very distinct from SUPERSCRIPT MINUS. Unless and until someone can come up with a very cogent argument that they were really trying to model something that is visually distinct from a superscript minus sign, there is no reason to reject that mapping.

However, as I pointed out in another message, we should reject a mapping to 203E even though some sources have it: the visuals simply do not match.

From sosipiuk at gmail.com Sat Oct 29 17:42:30 2022
From: sosipiuk at gmail.com (Sławomir Osipiuk)
Date: Sat, 29 Oct 2022 22:42:30 +0000
Subject: Negative/Negation Sign
In-Reply-To: <579dca3f-389d-cebf-a83f-f69e6da9a056@ix.netcom.com>
References: <579dca3f-389d-cebf-a83f-f69e6da9a056@ix.netcom.com>
Message-ID: <1667082249887.4172968439.3675259820@gmail.com>

On Saturday, 29 October 2022, 17:56:20 (-04:00), Asmus Freytag via Unicode wrote:

I can see no indication that the TI engineers had some other symbol in mind, that is, had they had the choice of a Unicode-encoded outline font, they would have chosen something with an appearance very distinct from SUPERSCRIPT MINUS. Unless and until someone can come up with a very cogent argument that they were really trying to model something that is visually distinct from a superscript minus sign, there is no reason to reject that mapping.

The argument is simple enough: a minus sign as part of the exponent should be visually distinct from a negation sign in the base.

The TI engineers were trying to visually separate subtraction and negation. To the extent we can try to deduce their reasoning, they would not have wanted to immediately confuse negation with negative exponents, which is what the superscript does.
Someone with the appropriate calculator can confirm: What does 3⁻¹−-3 ("three to the power of negative one, then subtract negative three") look like? If the negative symbol in the exponent and the one preceding the three are the same, I'll admit the superscript is fine in this case.

My view is that the modifier letter minus (U+02D7) is the best option to respect the intended semantics, while the plain hyphen-minus (U+002D) would be my second choice.

As for Wikipedia, it's ridiculous to say that one person's opinion on an extremely esoteric detail, left uncontested (or more likely unnoticed and unquestioned) is enough to form some kind of de facto standard. But, if we are going by that logic, I suggest you check the page again. ;-)

From asmusf at ix.netcom.com Sat Oct 29 18:29:00 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 29 Oct 2022 16:29:00 -0700
Subject: Negative/Negation Sign
In-Reply-To: <1667082249887.4172968439.3675259820@gmail.com>
References: <579dca3f-389d-cebf-a83f-f69e6da9a056@ix.netcom.com> <1667082249887.4172968439.3675259820@gmail.com>
Message-ID:

On 10/29/2022 3:42 PM, Sławomir Osipiuk wrote:
> On Saturday, 29 October 2022, 17:56:20 (-04:00), Asmus Freytag via
> Unicode wrote:
>
> I can see no indication that the TI engineers had some other
> symbol in mind, that is had they had the choice of a
> Unicode-encoded outline font, they would have chosen something
> with an appearance very distinct from SUPERSCRIPT MINUS. Unless
> and until someone can come up with a very cogent argument that
> they were really trying to model something that is visually
> distinct from a superscript minus sign, there is no reason to
> reject that mapping.
>
>
> The argument is simple enough: a minus sign as part of the exponent
> should be visually distinct from a negation sign in the base.

And they are. One comes before the base, the other one after the base.
And since negation is unary, it's never preceded by anything other than an operator, delimiter or a space.

>
> The TI engineers were trying to visually separate subtraction and
> negation. To the extent we can try to deduce their reasoning, they
> would not have wanted to immediately confuse negation with negative
> exponents, which is what the superscript does.
>
> Someone with the appropriate calculator can confirm: What does 3⁻¹−-3
> ("three to the power of negative one, then subtract negative three")
> look like? If the negative symbol in the exponent and the one
> preceding the three are the same, I'll admit the superscript is fine
> in this case.

You don't need the calculator - you can look up the 7x5 bitmaps for the fonts.

The result looks like

3⁻¹−⁻3

which is clear and unambiguous. The second ⁻ cannot ever be part of an exponent.

BTW, the same engineers have provided a precomposed symbol for ⁻¹. The Wikipedia article suggests mapping that to <207B 00B9>, while the other source maps that to <203E 00B9>, which uses the clearly inappropriate overline.

>
> My view is that the modifier letter minus (U+02D7) is the best option
> to respect the intended semantics, while the plain hyphen-minus
> (U+002D) would be my second choice.
>
> As for Wikipedia, it's ridiculous to say that one person's opinion on
> an extremely esoteric detail, left uncontested (or more likely
> unnoticed and unquestioned) is enough to form some kind of de facto
> standard. But, if we are going by that logic, I suggest you check the
> page again. ;-)

No more ridiculous than your personal choice, immediately contested here. :)

The fact is that a centerline glyph, no matter whether shorter than the minus sign or not, does not match the conventions used by TI. You are, of course, free to suggest that your notation is superior; it's just not a better *mapping* for what is available on those calculators. However, nothing prevents you from using it in your own documents.
Neither makes an argument for encoding a new character - which is what had started this thread.

A./

PS: incidentally, the TI font has both a "DASH" and "HYPHEN" (names given in one of the listings for the characters). The former has the width of what Unicode encodes as 2212 (same as the while the latter is shorter. The location in the original set near "+" and "=" makes clear that "DASH" is meant for the minus sign and the mapping in both source and Wikipedia therefore has 2212 for it, while the "HYPHEN" is mapped to 2010.

From mark at kli.org Sat Oct 29 19:15:34 2022
From: mark at kli.org (Mark E. Shoulson)
Date: Sat, 29 Oct 2022 20:15:34 -0400
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>
Message-ID:

The APL language used a high-minus for negative numbers and a normal hyphen-minus for the operator, monadic or dyadic. The high-minus was syntactically part of the number, while the regular minus operated on a number (which would affect its precedence.) Come to think of it, when they were teaching us negative numbers in grade school I think my math book initially used a high-minus sign and then introduced the concept that negation was an operation that can be done to numbers and from then on used the regular minus sign.

Non-typographically, Lojban mathematical syntax (mekso) distinguishes {vu'u}, the subtraction operator, from {ni'u}, the negative-number indicator. The latter is syntactically considered a *digit*, while the former is an operator.

Unicode has a long history of tolerating the typographic weirdness of APL (all those APL symbols). That there isn't an APL high-minus sign already would indicate to me that APL contents itself with U+207B SUPERSCRIPT MINUS and that's Just Fine.

~mark

On 10/28/22 18:10, Doug Ewell via Unicode wrote:
> Gabriel Tellez wrote:
>
>> Is superscript minus use for this?
> Is *anything* used for this, outside of the TI-83 and TI-84 machines, other than an ordinary minus sign or hyphen-minus?
>
> There are actual mathematics experts on this list, but my understanding is that normal mathematical notation -- as used both by experts and the general public -- uses the same symbol for both unary and binary minus. The TI calculators may have distinguished between the two to make input or internal parsing easier.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
>

From gtbot2007 at gmail.com Sat Oct 29 21:18:16 2022
From: gtbot2007 at gmail.com (Gabriel Tellez)
Date: Sat, 29 Oct 2022 22:18:16 -0400
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>

Message-ID:

> interestingly enough, one of the sources cited for the Wikipedia article
> actually has a mapping to U+203E (spacing overline).

You mean... it's contested?!?!

> These characters were in Unicode from very early on. Unlike some of the
> later additions there is no link to a particular citation "in the wild".
> Instead, the original repertoire collected a superset of then existing
> character sets in reasonably wide usage. If any of their members violated
> Unicode encoding principles, they were added as compatibility characters
> (to facilitate round trip), otherwise as ordinary characters.

Compatibility characters from what set?

> APL contents itself with U+207B SUPERSCRIPT MINUS

No? Other than on one Wikipedia page, most places I looked (including the APL wiki) used ¯ U+00AF MACRON.

From pandey at umich.edu Sun Oct 30 00:20:48 2022
From: pandey at umich.edu (Anshuman Pandey)
Date: Sun, 30 Oct 2022 00:20:48 -0500
Subject: Negative/Negation Sign
In-Reply-To:
References:
Message-ID: <55855D06-5C02-4FDB-BC4B-B5D38D16C960@umich.edu>

An HTML attachment was scrubbed...

From asmusf at ix.netcom.com Sun Oct 30 02:55:56 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 30 Oct 2022 00:55:56 -0700
Subject: Negative/Negation Sign
In-Reply-To: <55855D06-5C02-4FDB-BC4B-B5D38D16C960@umich.edu>
References: <55855D06-5C02-4FDB-BC4B-B5D38D16C960@umich.edu>
Message-ID: <2333e2fd-117c-8222-da87-fea89ccf6e07@ix.netcom.com>

On 10/29/2022 10:20 PM, Anshuman Pandey via Unicode wrote:
> While we're on this topic, I'd like to interfere with the need to
> encode a distinctive negation sign used in the Bakhshali manuscript, a
> mathematical treatise written in the Sharada script:
>
> https://www.unicode.org/L2/L2013/13080-sharada-bakhshali-minus.pdf
>
> Obviously, using the common "+" for indicating negation in plain text
> does not capture the semantic intent of the Sharada "+".
>

I see that the proposal is from 2013 and in the intervening 9 years hasn't been encoded. And that's a good thing.
As usual, it's impossible with a simple search to locate the result of any UTC deliberation or decision on this to verify the status.

A./

From asmusf at ix.netcom.com Sun Oct 30 03:12:01 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 30 Oct 2022 01:12:01 -0700
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>

Message-ID:

On 10/29/2022 7:18 PM, Gabriel Tellez via Unicode wrote:
>
> interestingly enough, one of the sources cited for the Wikipedia
> article actually has a mapping to U+203E (spacing overline).
>
> You mean... it's contested?!?!

Meaning whatever you like.

In the source document the comments give what is effectively a numeric character entity that supposedly can be used to achieve the same appearance. However, the appearance of the glyph in the bitmap font clearly does not match that of U+203E (the latter is both longer and placed higher in the character cell). So, I would call it a mistake.

If it had been the case of some other code point that also looks similar to what is actually displayed on the calculator, I would be more comfortable with considering it the result of a reasonable difference in opinion of how to best map the character.

> These characters were in Unicode from very early on. Unlike some
> of the later additions there is no link to a particular citation
> "in the wild". Instead, the original repertoire collected a
> superset of then existing character sets in reasonably wide usage.
> If any of their members violated Unicode encoding principles, they
> were added as compatibility characters (to facilitate round trip),
> otherwise as ordinary characters.
>
>
> Compatibility characters from what set?
>
> APL contents itself with U+207B SUPERSCRIPT MINUS
>
>
> No? Other than on one Wikipedia page, most places I looked (including
> the APL wiki) used ¯ U+00AF MACRON.

Because that was available before Unicode. There's nothing wrong with it, if that's a notation you like, but it doesn't match the look of the symbol in the TI bitmap fonts.

So, for example, U+203E and U+00AF are much closer in appearance than either is with U+207B. It might be that the suggested character entity was chosen to make the notation look more like APL. That's not unreasonable, per se.
However, for a plain-text mapping (unlike the display effect of using a character entity in a suitable environment) it would have been better to either stick with something that matches the original appearance or something that matches some other plain text format that uses a comparable notation.

It all depends. Are you simply interested in round tripping data through Unicode? Are you interested in passing text off to some existing mathematical interpreter? Or are you interested in creating a text stream that somewhat looks like what the calculator is showing, perhaps for discussing results or programming techniques?

All three may conceivably have different mappings, but none of these cinches the case for adding something new to Unicode at this point.

A./

From beckiergb at gmail.com Sun Oct 30 12:42:04 2022
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Sun, 30 Oct 2022 10:42:04 -0700
Subject: Negative/Negation Sign
In-Reply-To:
References: <20221028212423.5d5fa9a4@spixxi>

Message-ID:

Smalltalk also uses a distinct symbol for unary minus. I mapped it to U+207B, because that's what it looks like, and they have the same semantics. I would do the same for the TI-89 character set, because I had a TI-89, and I remember that's what it looks like, and they have the same semantics.
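On the round-tripping question raised earlier in the thread, one practical point about a U+207B mapping can be checked directly. The following is a minimal sketch, assuming Python 3 and its standard unicodedata module; the sample string and variable names are illustrative only. It encodes "three to the power of negative one, minus negative three" with U+207B for negation and U+2212 for subtraction, then compares canonical (NFC) and compatibility (NFKC) normalization.

```python
import unicodedata

# U+207B SUPERSCRIPT MINUS for negation, U+00B9 SUPERSCRIPT ONE for the
# exponent digit, U+2212 MINUS SIGN for subtraction. Illustrative string only.
expr = "3\u207B\u00B9\u2212\u207B3"

nfc = unicodedata.normalize("NFC", expr)
nfkc = unicodedata.normalize("NFKC", expr)

# Canonical normalization leaves the string as it is, so the distinction
# between negation and subtraction survives.
print("NFC unchanged:", nfc == expr)

# Compatibility normalization replaces the superscript forms with their
# ordinary counterparts, so negation and subtraction become the same character.
print("NFKC:", [f"U+{ord(c):04X}" for c in nfkc])
```

Under these assumptions, the distinction is preserved as long as the text is only ever normalized canonically; a process that applies NFKC folds both signs into U+2212.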