From unicode at unicode.org Sat Nov 9 17:18:13 2019
From: unicode at unicode.org (Peter Constable via Unicode)
Date: Sat, 9 Nov 2019 23:18:13 +0000
Subject: New Public Review on QID emoji
In-Reply-To: <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net>
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net>
Message-ID:

> Yet if QID emoji are implemented by Unicode Inc. without also being implemented by ISO/IEC 10646 then that could lead to future problems,

Neither Unicode Inc. nor ISO/IEC 10646 would _implement_ QID emoji. Unicode would provide a specification for QID emoji that software vendors could implement, while ISO/IEC 10646 would not define that specification. As Ken mentions, there are already many emoji in use inter-operably based on specifications provided by Unicode but not by ISO/IEC 10646.

Ken's other point is also worth stressing: there are inter-op issues inherent to the architecture of the QID mechanism. If adopted as a Unicode spec, any software vendor could choose to implement anything they might want as a QID emoji sequence, and there would be no guarantee that any other software would interoperate with that beyond a bare minimum: if other software supported the QID mechanism, it would recognize the sequence as a QID sequence, and might handle it as a unit for segmentation purposes (selection, line breaking), but only render the base emoji character in a legible way and nothing more.

Moreover, it's possible that two vendors might implement the same QID sequence with significantly different appearances, enough to connote different meanings. (Think of issues from recent years in which the major vendors had significantly different appearances for the same emoji character. Then extend that possibility to every QID sequence that any vendor implements.)

Also, it's entirely possible that two different vendors might implement _different_ QID sequences with similar appearances and semantic intent.
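
The "very minimum" level of interoperation described above (recognizing a tag sequence as one unit without being able to render it) can be illustrated with a rough Python sketch. It does not reproduce the QID format from the PRI document; it assumes only the general emoji tag sequence shape from UTS #51 (a base, then tag characters, then CANCEL TAG) and uses the existing England flag sequence as a stand-in. The function name `split_tag_sequence` and the single-code-point base are simplifying assumptions made for illustration.

```python
# Tag characters U+E0020..U+E007E mirror ASCII 0x20..0x7E; U+E007F is CANCEL TAG.
TAG_FIRST, TAG_LAST, CANCEL_TAG = 0xE0020, 0xE007E, 0xE007F


def split_tag_sequence(text):
    """Return (base, tag_spec_as_ascii) if text looks like base + tags + CANCEL TAG.

    Hypothetical helper for illustration only. Simplification: the base is
    assumed to be a single code point, which is narrower than UTS #51 allows.
    """
    if len(text) < 3 or ord(text[-1]) != CANCEL_TAG:
        return None
    base, tags = text[0], text[1:-1]
    if not all(TAG_FIRST <= ord(ch) <= TAG_LAST for ch in tags):
        return None
    # Map each tag character back to the ASCII character it mirrors.
    spec = "".join(chr(ord(ch) - 0xE0000) for ch in tags)
    return base, spec


# England flag: WAVING BLACK FLAG + tag letters "gbeng" + CANCEL TAG.
england = "\U0001F3F4" + "".join(chr(0xE0000 + ord(c)) for c in "gbeng") + "\U000E007F"

base, spec = split_tag_sequence(england)
print(hex(ord(base)), spec)  # 0x1f3f4 gbeng
print(len(england))          # 7 code points forming one user-perceived unit
```

Software that knows only this general shape can keep the seven code points together for selection or line breaking and fall back to drawing the base character, which is roughly the fallback behaviour the paragraph above anticipates for unsupported sequences.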
The PRI doc mentions the possibility of a registry for QID sequences; a key benefit of a registry is that it may mitigate against these non-interop risks. But the current proposal does not in fact provide any mitigations for these issues other than the possibility that a QID sequence might at some point become an RGI sequence.

Peter

-----Original Message-----
From: Unicode On Behalf Of Ken Whistler via Unicode
Sent: Wednesday, October 30, 2019 12:19 PM
To: wjgo_10009 at btinternet.com
Cc: unicode at unicode.org
Subject: Re: New Public Review on QID emoji

On 10/30/2019 10:41 AM, wjgo_10009 at btinternet.com via Unicode wrote:
> At present I have a question to which I cannot find the answer.
>
> Is the QID emoji format, if approved by the Unicode Technical Committee going to be sent to the ISO/IEC 10646 committee for consideration by that committee?

No.

> As the QID emoji format is in a Unicode Technical Standard and does not include the encoding of any new _atomic_ characters, I am concerned that the answer to the above question may well be along the lines of "No" maybe with some reasoning as to why not.

As you surmised.

> Yet will a QID emoji essentially be _de facto_ a character even if not _de jure_ a character?

That distinction is effectively meaningless. There are any number of entities that end users perceive as "characters", which are not represented by a single code point in the Unicode Standard (or 10646) -- and this has been the case now for decades.

> Yet if QID emoji are implemented by Unicode Inc. without also being implemented by ISO/IEC 10646 then that could lead to future problems, notwithstanding any _de jure_ situation that QID emoji are not characters, because they will be much more than Private Use characters yet less than characters that are in ISO/IEC 10646.

What you are missing is that *many* emoji are already represented by sequences of characters.
See emoji modifier sequences, emoji flag sequences, emoji ZWJ sequences. *None* of those are specified in 10646, have not been for years now, and never will be. And yet, there is no de jure standardization crisis here, or any interoperability issue for emoji arising from that situation.

> I am in favour of the encoding of the QID emoji mechanism and its practical application. However I wonder about what are the consequences for interoperability and communication if QID emoji become used - maybe quite widely - and yet the tag sequences are not discernable in meaning from ISO/IEC 10646 or any related ISO/IEC documents.

There may well be interoperability concerns specifically for the QID emoji mechanism, but that would be an issue pertaining to the architecture of that mechanism specifically. It isn't anything to do with the relationship between the Unicode Standard (and UTS #51) and ISO/IEC 10646.

--Ken

From unicode at unicode.org Sat Nov 9 21:08:57 2019
From: unicode at unicode.org (Asmus Freytag via Unicode)
Date: Sat, 9 Nov 2019 19:08:57 -0800
Subject: New Public Review on QID emoji
In-Reply-To:
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net>
Message-ID: <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Mon Nov 11 05:56:54 2019
From: unicode at unicode.org (Philippe Verdy via Unicode)
Date: Mon, 11 Nov 2019 12:56:54 +0100
Subject: Encoding the Nsibidi script (African) for writing the Igbo language
Message-ID:

Encoding the Nsibidi script (African) for writing the Efik, Ekoi, Ibibio, and Igbo languages. See this site as an example of use, with links to published educational books.
http://blog.nsibiri.org/

Also this online dictionary: https://fr.scribd.com/doc/281219778/Ikpokwu

Other links: https://en.wikipedia.org/wiki/Nsibidi

But first there's still no code in ISO 15924 (first step easy to complete before encoding in the UCS).

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Mon Nov 11 10:30:53 2019
From: unicode at unicode.org (Markus Scherer via Unicode)
Date: Mon, 11 Nov 2019 08:30:53 -0800
Subject: Encoding the Nsibidi script (African) for writing the Igbo language
In-Reply-To:
References:
Message-ID:

On Mon, Nov 11, 2019 at 4:03 AM Philippe Verdy via Unicode <unicode at unicode.org> wrote:
> But first there's still no code in ISO 15924 (first step easy to complete before encoding in the UCS).

That's not first; it's nearly last.

The script code standard says "In general, script codes shall be added to ISO 15924 when the script has been coded in ISO/IEC 10646, and when the script is agreed, by experts in ISO 15924/RA-JAC to be unique and a *candidate for encoding in the UCS*."

We generally assign the script code when the script is in the pipeline for a near-future version of Unicode, which demonstrates that it's "a candidate for encoding". We also want the name of the script to be settled, so that the script code can be roughly mnemonic for the name.

markus

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Mon Nov 11 16:37:56 2019
From: unicode at unicode.org (Philippe Verdy via Unicode)
Date: Mon, 11 Nov 2019 23:37:56 +0100
Subject: Encoding the Nsibidi script (African) for writing the Igbo language
In-Reply-To:
References:
Message-ID:

Names of this script can vary a bit ("Nsibidi", "Nsibiri"), but not a lot (the d/r variation may be a phonetic romanization in one of the supported languages). It is stable across various sites.
Uniqueness is quite easy to assert: there are not many ideographic scripts, at least in modern use. But it is still not as complex as the Chinese scripts. The site speaks of an inventory of about 500 base characters (in the first educational books), probably double that. (In that case it compares to the modern use of sinograms by children in China, whereas adults use only about 2000 signs for almost everything; compare the similar average of 2000 common words in Indo-European languages, and in Afroasiatic or Nilo-Saharan languages. Igbo is still a minority language, and most of its speakers have low levels of literacy, even in the Latin or Arabic scripts, and due to the proliferation of vernacular languages they may well use about 500-1000 basic words to understand each other.)

Anyway, I suppose that you were already aware of that script, but were just looking for more evidence to support comparative research from a few more sources (there is a lack of interest or finances for linguistic projects in Africa, which prefer to place their efforts in major scripts that have official national support in educational and cultural programs: Latin, Arabic, Ethiopic, Tifinagh; other scripts are still of interest due to their important historic background and centuries of propagation across countries, whether caused by wars, invasions, diplomacy, or commercial interests).

On Mon, 11 Nov 2019 at 17:31, Markus Scherer wrote:
> On Mon, Nov 11, 2019 at 4:03 AM Philippe Verdy via Unicode <unicode at unicode.org> wrote:
>
>> But first there's still no code in ISO 15924 (first step easy to complete before encoding in the UCS).
>
> That's not first; it's nearly last.
>
> The script code standard says "In general, script codes shall be added to ISO 15924 when the script has been coded in ISO/IEC 10646, and when the script is agreed, by experts in ISO 15924/RA-JAC to be unique and a *candidate for encoding in the UCS*."
>
> We generally assign the script code when the script is in the pipeline for a near-future version of Unicode, which demonstrates that it's "a candidate for encoding". We also want the name of the script to be settled, so that the script code can be roughly mnemonic for the name.
>
> markus
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Mon Nov 11 16:47:32 2019
From: unicode at unicode.org (Philippe Verdy via Unicode)
Date: Mon, 11 Nov 2019 23:47:32 +0100
Subject: Encoding the Nsibidi script (African) for writing the Igbo language
In-Reply-To:
References:
Message-ID:

On Mon, 11 Nov 2019 at 17:31, Markus Scherer wrote:
> We generally assign the script code when the script is in the pipeline for a near-future version of Unicode, which demonstrates that it's "a candidate for encoding". We also want the name of the script to be settled, so that the script code can be roughly mnemonic for the name.

This is not true for some scripts that have long been encoded in ISO 15924, not all with a proposal candidate for encoding (notably Tolkien's various invented scripts, Cirth, Tengwar, ... and Klingon, which all have limited use and active supporters). Other scripts were added even without much evidence, or are not even deciphered (Mayan hieroglyphs, Linear A...).

There are also missing scripts in India which are still in contemporary use and important for the local cultures (but with limited support in specific states or smaller communities at subregional level only), in Myanmar/Burma, and in aboriginal communities on some southern Indonesian islands (I think there are also some aboriginal logographic scripts in Australia, and other Pre-Columbian scripts in Central and South America and very remote islands in the southern Pacific, and still in north-eastern Russia/Beringia).

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Tue Nov 12 09:54:34 2019
From: unicode at unicode.org (wjgo_10009@btinternet.com via Unicode)
Date: Tue, 12 Nov 2019 15:54:34 +0000 (GMT)
Subject: New Public Review on QID emoji
In-Reply-To:
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net>
Message-ID: <69c14ea2.b91.16e6052df1d.Webtop.214@btinternet.com>

WJGO
>> Yet if QID emoji are implemented by Unicode Inc. without also being implemented by ISO/IEC 10646 then that could lead to future problems, ...

Peter Constable wrote as follows.

> Neither Unicode Inc. nor ISO/IEC 10646 would _implement_ QID emoji.

That is correct. I should have made clear that I was referring to the specification for QID emoji rather than QID emoji. Quite how to express precisely and concisely the formal acceptance of the specification by Unicode Inc. to become a published Unicode Inc. document giving the go-ahead for implementation by anyone (not just software vendors) is somewhat difficult without using the word 'implement'.

Peter within his post also wrote as follows.

> The PRI doc mentions the possibility of a registry for QID sequences; a key benefit of a registry is that it may mitigate against these non-interop risks. But the current proposal does not in fact provide any mitigations for these issues other than the possibility that a QID sequence might at some point become an RGI sequence.

I put forward on Friday 8 November 2019 a suggestion that might help towards solving the problem.
https://www.unicode.org/review/pri408/

William Overington

Tuesday 12 November 2019

From unicode at unicode.org Tue Nov 12 10:41:32 2019
From: unicode at unicode.org (wjgo_10009@btinternet.com via Unicode)
Date: Tue, 12 Nov 2019 16:41:32 +0000 (GMT)
Subject: New Public Review on QID emoji
In-Reply-To: <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com>
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net> <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com>
Message-ID: <4b74c7d1.c72.16e607de06f.Webtop.214@btinternet.com>

Asmus Freytag wrote as follows.

> While I have a certain understanding for the underlying concerns, it still is the case that this proposal promises to be a bad example of "leading standardization": throwing out a spec in the hopes it may be taken up and take off, instead of something that meets an expressed need of the stakeholders and that they are eagerly awaiting.

I suppose that it could be called "leading standardization" but I think that that is a good thing. Unicode has traditionally been locked into the past. If a symbol could be found carved in stone years ago then that was fine, but anything for the future that could possibly become useful was a huge insuperable problem. Yet for me "could possibly become useful" is a good reason for encoding, and QID emoji opens up great futuristic possibilities.

For me the big problem with the proposal at present is the set of restrictions upon which QID items are valid to become encoded as QID emoji. So anything abstract is locked out. That to me is an unnecessary restriction, yet it could easily be removed. Yet abstract shapes are important in communication.

I regard QID emoji as a research project. The specification may need some alterations, maybe it is just the start of a whole new path of exploration in communication, much wider than emoji.
I am a researcher and I try to find what is good in an idea and focus on that and think where a new idea can lead, applying critical consideration of ideas, yet trying to move forward rather than seizing on problems found as a reason for dismissing the whole idea. So find the problems, try to think round them, try to go forward. Look for what could be done and if it is good, try to do it. Try to go forward rather than quash.

> That, then, finally undermines Unicode's implied guarantee as being the medium for unambiguous interchange. Giving up that guarantee seems a bad bargain.

Many recent emoji encoding proposals seem to delight, as if required, in providing multiple meanings for each newly proposed character.

There was a talk at the Internationalization and Unicode Conference a few years ago on what are the meanings of emoji. I was not there but there is a video available on YouTube.

https://www.youtube.com/watch?v=9ldSVbXbjl4

William Overington

Tuesday 12 November 2019

From unicode at unicode.org Tue Nov 12 11:57:14 2019
From: unicode at unicode.org (Asmus Freytag via Unicode)
Date: Tue, 12 Nov 2019 09:57:14 -0800
Subject: New Public Review on QID emoji
In-Reply-To: <4b74c7d1.c72.16e607de06f.Webtop.214@btinternet.com>
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net> <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com> <4b74c7d1.c72.16e607de06f.Webtop.214@btinternet.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Tue Nov 12 14:32:59 2019
From: unicode at unicode.org (wjgo_10009@btinternet.com via Unicode)
Date: Tue, 12 Nov 2019 20:32:59 +0000 (GMT)
Subject: New Public Review on QID emoji
In-Reply-To:
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net> <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com> <4b74c7d1.c72.16e607de06f.Webtop.214@btinternet.com>
Message-ID: <502b23f7.2293.16e6151c533.Webtop.55@btinternet.com>

Asmus Freytag wrote as follows.

> If leading standardization was such a good thing in communication, why don't we see more "dictionaries of words not yet in use"? After all, it would be a huge benefit for people coining new terms to have their definitions already worked out. Nothing inherent in the technology of dictionaries has directly prevented overtures in that direction, but it overwhelmingly remains a path not taken.
> One wonders why.

The comparison is not of like with like.

In 1974 I invented a new concept in broadcasting. I coined the word telesoftware to denote my invention. I was able to use the word immediately, because the format for introducing a new word into English was already established. In 1976 I sent a letter to the editor of a trade magazine using the word. A gentleman who read the letter replied and that reply was published in a later issue of the magazine. Eventually, some years later, the word was added to the Oxford English Dictionary: at first in a volume of the supplement to the first edition and then, when it was published, in the second edition.

If someone wants to coin a new word to do with character encoding then he or she can do so and just start using it, perhaps in a thread in this mailing list, and maybe other people will start using the new word too.

Yet if a new emoji or some other symbol is desired to be introduced then the symbol cannot just be included in plain text.
QID emoji can provide the capability to get something encoded promptly and used in plain text. I appreciate that there is then a font provision issue, yet with the way to encode the emoji or symbol available an attempt can be made to provide font support. Whether such font support is possible may well depend upon the platform.

I remember that when emoji were introduced into Unicode Doug Ewell predicted that the supporting of emoji on platforms would have the effect of providing support for other characters encoded in plane 1, when such support might have been much slower if emoji had not been encoded. Doug was right. Also colour font technology was developed and implemented and can today be used with any character, not just emoji. So introducing QID emoji could possibly lead to the introduction of advances for other things than emoji as well as for emoji.

> Just because you can write something that is a very detailed specification doesn't mean that it is, or ever should be, a standard.

Yes, but that does not mean that it should necessarily not become a standard.

For communication to take place one needs to start somewhere. The QID emoji proposal is a start. It has been considered at (at least) two Unicode Technical Committee meetings and now there is a public review taking place. Everyone has an opportunity to contribute comments and ideas to the public review and maybe progress will be made.
William Overington

Tuesday 12 November 2019

From unicode at unicode.org Tue Nov 12 21:00:48 2019
From: unicode at unicode.org (Asmus Freytag via Unicode)
Date: Tue, 12 Nov 2019 19:00:48 -0800
Subject: New Public Review on QID emoji
In-Reply-To: <502b23f7.2293.16e6151c533.Webtop.55@btinternet.com>
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net> <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com> <4b74c7d1.c72.16e607de06f.Webtop.214@btinternet.com> <502b23f7.2293.16e6151c533.Webtop.55@btinternet.com>
Message-ID: <1c356c4e-b66f-61da-f01c-5aa2e29ab3fb@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Tue Nov 12 21:37:00 2019
From: unicode at unicode.org (James Kass via Unicode)
Date: Wed, 13 Nov 2019 03:37:00 +0000
Subject: New Public Review on QID emoji
In-Reply-To: <1c356c4e-b66f-61da-f01c-5aa2e29ab3fb@ix.netcom.com>
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net> <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com> <4b74c7d1.c72.16e607de06f.Webtop.214@btinternet.com> <502b23f7.2293.16e6151c533.Webtop.55@btinternet.com> <1c356c4e-b66f-61da-f01c-5aa2e29ab3fb@ix.netcom.com>
Message-ID:

On 2019-11-13 3:00 AM, Asmus Freytag via Unicode wrote:
> The current effort starts from an unrelated problem (Unicode not wanting to administer emoji applications) and in my analysis, seriously puts the cart before the horse.

But it does solve the unrelated problem.

There's nothing stopping vendors from making software which recognizes tag character strings to reference in-line graphics. There's nothing stopping users from employing those in-line graphics as emoji images. It would be considered a higher level protocol which uses tag character strings in lieu of, for example, ASCII strings like . Either way, it's rich-text expressed with plain-text strings.
But for Unicode to provide this mechanism which "should be correctly parsed by all conformant implementations" as well as possibly maintaining a registry of "tag sequences known to be in use" suggests that Unicode now considers that random images (with no symbolic meaning other than they're pictures of something) should be exchanged as plain-text.

The QID Emoji in Unicode makes as much sense as the original emoji inclusion. It's a natural result of the slippery slope of emoji encoding.

Emoji are open-ended but Unicode currently has barriers erected. QID Emoji would eliminate limitations on what's supposed to be an open-ended set. That's the problem that the current effort would resolve. In my opinion it's better to open up a myriad of images and see which sequences actually get used than to have vendors/enthusiasts create images in the hope or expectation that anyone will actually use them.

From unicode at unicode.org Wed Nov 13 12:27:32 2019
From: unicode at unicode.org (wjgo_10009@btinternet.com via Unicode)
Date: Wed, 13 Nov 2019 18:27:32 +0000 (GMT)
Subject: New Public Review on QID emoji
In-Reply-To: <1c356c4e-b66f-61da-f01c-5aa2e29ab3fb@ix.netcom.com>
References: <5eff0ea4.a4d.16e1dc1e66b.Webtop.52@btinternet.com> <3d02402e-6ab0-2417-8c23-c958f3ae9092@sonic.net> <806afd78-b6f2-1a53-067d-79d8f36bd667@ix.netcom.com> <4b74c7d1.c72.16e607de06f.Webtop.214@btinternet.com> <502b23f7.2293.16e6151c533.Webtop.55@btinternet.com> <1c356c4e-b66f-61da-f01c-5aa2e29ab3fb@ix.netcom.com>
Message-ID: <426d0efa.bff.16e660546eb.Webtop.223@btinternet.com>

Asmus Freytag wrote as follows.

> Just because a select group of people engages in communication about the arcane details of a proposed specification it doesn't mean that the outcome will benefit some entirely different and larger group communicate better.

This is logically true.
However the same could have been said about people discussing the details of the then proposed Unicode specification over a quarter of a century ago, wanting to use 16 bits for each character used in ordinary English instead of just 8 bits. Yet Unicode has benefitted many many people around the world who may not know much about the underlying theory and technology.

I looked up the word 'arcane' and I opine that the details of the QID emoji proposal are not arcane. They are clear and available free to view, without registration, on the internet.

https://www.lexico.com/en/definition/arcane

https://www.unicode.org/review/pri408/

> There's too much of the "might possibly" about this;

It provides an opportunity for progress.

> ... and it is quite different from the early days of Unicode itself, where there was a groundswell of pent-up demand for a solution to the fragmented character encoding landscape; the discussions quickly became about the best way to do that, and about how to ensure that the result would be supported.

Yes, fine, and a good job was done and has benefitted many many people around the world. That was then and that was how things happened then for that situation. Now is now and this is a different approach for a different situation.

> The current effort starts from an unrelated problem (Unicode not wanting to administer emoji applications) and in my analysis, seriously puts the cart before the horse.

Well I was not aware of that purported reason, but then I am not part of the inner loop so you may well therefore have more information about the motivation than is accessible to me.

William Overington

Wednesday 13 November 2019

From unicode at unicode.org Tue Nov 19 12:59:36 2019
From: unicode at unicode.org (Costello, Roger L. via Unicode)
Date: Tue, 19 Nov 2019 18:59:36 +0000
Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"?
Message-ID:

Hi Folks,

Today I received an email from the Unicode organization. The email said this: (italics and yellow highlighting are mine)

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones-plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.).

That is a remarkable statement! But is it entirely true? Isn't it assuming that everything is text? What about binary information such as JPEG, GIF, MPEG, WAV; those are pretty core items to the Web, right? The Unicode Standard is silent about them, right? Isn't the above quote a bit misleading?

/Roger

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Tue Nov 19 14:02:55 2019
From: unicode at unicode.org (James Kass via Unicode)
Date: Tue, 19 Nov 2019 20:02:55 +0000
Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"?
In-Reply-To:
References:
Message-ID: <9eb66533-ebae-c7bc-3442-0461db7c5822@gmail.com>

On 2019-11-19 6:59 PM, Costello, Roger L. via Unicode wrote:
> Today I received an email from the Unicode organization. The email said this: (italics and yellow highlighting are mine)
>
> The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones-plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.).
>
> That is a remarkable statement! But is it entirely true? Isn't it assuming that everything is text? What about binary information such as JPEG, GIF, MPEG, WAV; those are pretty core items to the Web, right? The Unicode Standard is silent about them, right? Isn't the above quote a bit misleading?

A bit, perhaps. But think of it as a press release.

The statement smacks of hyperbole at first blush, but "foundation" can mean basis or starting point.
File names (and URLs) of *.WAV, *.MPG, etc. are stored and exchanged via Unicode. Likewise, the tags (metadata) for audio/video files are stored (and displayed) via Unicode. So fields such as Title, Artist, Comments/Notes, Release Date, Label, Composer, and so forth aren't limited to ASCII data.

From unicode at unicode.org Tue Nov 19 14:04:22 2019
From: unicode at unicode.org (Michael Everson via Unicode)
Date: Tue, 19 Nov 2019 20:04:22 +0000
Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"?
In-Reply-To:
References:
Message-ID:

Of course it's not "misleading". Human language is best conveyed by text.

Michael Everson

> On 19 Nov 2019, at 18:59, Costello, Roger L. via Unicode wrote:
>
> Hi Folks,
>
> Today I received an email from the Unicode organization. The email said this: (italics and yellow highlighting are mine)
>
> The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones-plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.).
>
> That is a remarkable statement! But is it entirely true? Isn't it assuming that everything is text? What about binary information such as JPEG, GIF, MPEG, WAV; those are pretty core items to the Web, right? The Unicode Standard is silent about them, right? Isn't the above quote a bit misleading?
>
> /Roger

From unicode at unicode.org Tue Nov 19 15:03:55 2019
From: unicode at unicode.org (Asmus Freytag via Unicode)
Date: Tue, 19 Nov 2019 13:03:55 -0800
Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"?
In-Reply-To:
References:
Message-ID:

An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Tue Nov 19 15:05:58 2019
From: unicode at unicode.org (Jonathan Rosenne via Unicode)
Date: Tue, 19 Nov 2019 21:05:58 +0000
Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"?
In-Reply-To: <9eb66533-ebae-c7bc-3442-0461db7c5822@gmail.com>
References: <9eb66533-ebae-c7bc-3442-0461db7c5822@gmail.com>
Message-ID:

As a user of bidirectional text, when I think of our world before Unicode and the situation today, I cannot but wholeheartedly agree. Without Unicode, few international vendors, major and in particular minor ones, would have considered implementing Hebrew in their products. Now we have everything (good things and not so good too).

Best Regards,

Jonathan Rosenne

-----Original Message-----
From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of James Kass via Unicode
Sent: Tuesday, November 19, 2019 10:03 PM
To: unicode at unicode.org
Subject: Re: Is the Unicode Standard "The foundation for all modern software and communications around the world"?

On 2019-11-19 6:59 PM, Costello, Roger L. via Unicode wrote:
> Today I received an email from the Unicode organization. The email said this: (italics and yellow highlighting are mine)
>
> The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones-plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.).
>
> That is a remarkable statement! But is it entirely true? Isn't it assuming that everything is text? What about binary information such as JPEG, GIF, MPEG, WAV; those are pretty core items to the Web, right? The Unicode Standard is silent about them, right? Isn't the above quote a bit misleading?

A bit, perhaps. But think of it as a press release.

The statement smacks of hyperbole at first blush, but "foundation" can mean basis or starting point. File names (and URLs) of *.WAV, *.MPG, etc.
are stored and exchanged via Unicode. Likewise, the tags (metadata) for audio/video files are stored (and displayed) via Unicode. So fields such as Title, Artist, Comments/Notes, Release Date, Label, Composer, and so forth aren't limited to ASCII data.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From unicode at unicode.org Tue Nov 19 17:00:21 2019
From: unicode at unicode.org (Mark E. Shoulson via Unicode)
Date: Tue, 19 Nov 2019 18:00:21 -0500
Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"?
In-Reply-To:
References:
Message-ID:

It says "foundation", not "sum total, all there is." I don't think this is much overreach. MAYBE it counts as "enthusiastic", but not misleading.

Why so concerned with these minutiae? Were you in fact misled? (Doesn't sound like it.) Do you know someone who was, or whom you fear would be? What incorrect conclusions might they draw from that misunderstanding, and how serious would they be? Doesn't sound like this is really anything serious even if you were right.

~mark

On 11/19/19 1:59 PM, Costello, Roger L. via Unicode wrote:
>
> Hi Folks,
>
> Today I received an email from the Unicode organization. The email said this: (italics and yellow highlighting are mine)
>
> /The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones-plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.)./
>
> That is a remarkable statement! But is it entirely true? Isn't it assuming that everything is text? What about binary information such as JPEG, GIF, MPEG, WAV; those are pretty core items to the Web, right? The Unicode Standard is silent about them, right? Isn't the above quote a bit misleading?
>
> /Roger
>
-------------- next part --------------
An HTML attachment was scrubbed...
From unicode at unicode.org Tue Nov 19 17:19:52 2019 From: unicode at unicode.org (Asmus Freytag via Unicode) Date: Tue, 19 Nov 2019 15:19:52 -0800 Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"? In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... From unicode at unicode.org Tue Nov 19 18:06:53 2019 From: unicode at unicode.org (James Kass via Unicode) Date: Wed, 20 Nov 2019 00:06:53 +0000 Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"? In-Reply-To: References: Message-ID: <95501005-8962-1453-ae0e-8f4d867b710f@gmail.com> On 2019-11-19 11:00 PM, Mark E. Shoulson via Unicode wrote: > Why so concerned with these minutiae? Were you in fact misled? > (Doesn't sound like it.) Do you know someone who was, or whom you > fear would be? What incorrect conclusions might they draw from that > misunderstanding, and how serious would they be? Doesn't sound like > this is really anything serious even if you were right. Anyone unfamiliar with our timeline, such as a millennial, might be led to believe that Unicode was in place before personal computers existed. A bit of research would have dispelled that notion. But thereafter any assertion from Unicode would be suspect. Limiting the claims to text, as Asmus Freytag suggests, might be too limiting. Many people may not realize how prevalent textual data really is in our exchanges of information. Imagine producing a video offering closed captioning/subtitling in French, Italian, and Hebrew without the underlying foundation of Unicode. Rather than limiting this to text, why not substitute something for the word "foundation"? For example: The Unicode Standard is the lodestar for all modern software and communications around the world, ...
From unicode at unicode.org Wed Nov 20 16:48:52 2019 From: unicode at unicode.org (Richard Wordingham via Unicode) Date: Wed, 20 Nov 2019 22:48:52 +0000 Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"? In-Reply-To: <9eb66533-ebae-c7bc-3442-0461db7c5822@gmail.com> References: <9eb66533-ebae-c7bc-3442-0461db7c5822@gmail.com> Message-ID: <20191120224852.21b3f1f4@JRWUBU2> On Tue, 19 Nov 2019 20:02:55 +0000 James Kass via Unicode wrote: > On 2019-11-19 6:59 PM, Costello, Roger L. via Unicode wrote: > > Today I received an email from the Unicode organization. The email > > said this: (italics and yellow highlighting are mine) > > > > The Unicode Standard is the foundation for all modern software and > > communications around the world, including all modern operating > > systems, browsers, laptops, and smart phones-plus the Internet and > > Web (URLs, HTML, XML, CSS, JSON, etc.). > > > > That is a remarkable statement! But is it entirely true? Isn't it > > assuming that everything is text? What about binary information > > such as JPEG, GIF, MPEG, WAV; those are pretty core items to the > > Web, right? The Unicode Standard is silent about them, right? Isn't > > the above quote a bit misleading? > A bit, perhaps. But think of it as a press release. > > The statement smacks of hyperbole at first blush, but "foundation" > can mean basis or starting point. File names (and URLs) of *.WAV, > *.MPG, etc. are stored and exchanged via Unicode. Likewise, the tags > (metadata) for audio/video files are stored (and displayed) via > Unicode. So fields such as Title, Artist, Comments/Notes, Release > Date, Label, Composer, and so forth aren't limited to ASCII data. But file names, URLs and syntax tags are still mostly in ASCII. It's only when you come to text data that you get to Unicode; the usual unreliable assumption is that the recipient has the means to display that text.
Now, a feature of a *modern* system is that file names and (sometimes) syntax tags can be in Unicode. But have the nightmares of file names and canonical equivalence come to an end? And remember that canonical equivalence isn't just a matter of precomposed letters. Moving away from communications, I still find that if I use 'sort -u' to eliminate repeated lines in unordered lines of text, I have to ensure that I'm using binary identity for comparison - too many collations still treat unknown characters as identical. And this is with a distribution that has UTF-8 as its basic encoding. There's now a looming threat to passwords in truly complex scripts. Keyboards are coming that will prevent certain sequences of characters - Thais have long faced such constraints. Some people may discover that an upgrade of their keyboards renders them unable to type their passwords! Richard. From unicode at unicode.org Thu Nov 21 17:30:12 2019 From: unicode at unicode.org (Peter Constable via Unicode) Date: Thu, 21 Nov 2019 23:30:12 +0000 Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"? In-Reply-To: References: Message-ID: I suspect if you look at the JPEG and MPEG standards you'll find there is a normative reference to Unicode or ISO/IEC 10646. Same for standards specifying C, ECMAScript and other languages in which modern software is written. So, arguably the statement isn't much of a stretch at all. Peter From: Unicode On Behalf Of Costello, Roger L. via Unicode Sent: Tuesday, November 19, 2019 11:00 AM To: unicode at unicode.org Subject: Is the Unicode Standard "The foundation for all modern software and communications around the world"? Hi Folks, Today I received an email from the Unicode organization. 
The email said this: (italics and yellow highlighting are mine) The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones-plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). That is a remarkable statement! But is it entirely true? Isn't it assuming that everything is text? What about binary information such as JPEG, GIF, MPEG, WAV; those are pretty core items to the Web, right? The Unicode Standard is silent about them, right? Isn't the above quote a bit misleading? /Roger
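Richard Wordingham's remarks earlier in this thread about file names, canonical equivalence, and 'sort -u' can be illustrated with a small sketch. It is not taken from any message in the thread: the file names and the helper functions dedupe_binary and dedupe_canonical are hypothetical, and it only assumes Python's standard unicodedata module. It shows two canonically equivalent spellings comparing as different under binary identity, and collapsing only after normalization; the last lines show that equivalence also covers the order of combining marks, not just precomposed letters.

```python
import unicodedata

def dedupe_binary(lines):
    """Drop repeated lines using exact code point identity (hypothetical helper)."""
    return sorted(set(lines))

def dedupe_canonical(lines):
    """Drop repeated lines after NFC normalization, so canonically
    equivalent spellings collapse into one (hypothetical helper)."""
    return sorted({unicodedata.normalize("NFC", line) for line in lines})

# Two spellings of the same visible file name (illustrative values only).
precomposed = "caf\u00e9.txt"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301.txt"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

names = [precomposed, decomposed, "readme.txt"]

print(precomposed == decomposed)       # binary comparison of the two spellings
print(len(dedupe_binary(names)))       # count of entries under binary identity
print(len(dedupe_canonical(names)))    # count of entries after normalization

# Canonical equivalence beyond precomposed letters: the same two combining
# marks (dot below, acute) typed in a different order.
marks_a = "a\u0323\u0301"
marks_b = "a\u0301\u0323"
print(marks_a == marks_b)
print(unicodedata.normalize("NFC", marks_a) == unicodedata.normalize("NFC", marks_b))
```

Whether a given file system or collation normalizes in this way is outside what the thread states; the sketch only shows why binary comparison and canonical comparison can disagree.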