From unicode at unicode.org Sun Apr 1 01:20:50 2018
From: unicode at unicode.org (Nathan Galt via Unicode)
Date: Sat, 31 Mar 2018 23:20:50 -0700
Subject: Accessibility Emoji
In-Reply-To: <17804074.45213.1522083115554.JavaMail.defaultUser@defaultHost>
References: <8570587.39967.1522077150247.JavaMail.root@webmail02.bt.ext.cpcloud.co.uk>
 <17804074.45213.1522083115554.JavaMail.defaultUser@defaultHost>
Message-ID:

I predict that these emoji will be extraordinarily popular in insults between gamers on both Twitch and Discord. I'd wager, with suitable metrics available, that using these for insult purposes will be the majority of all accessibility-emoji use worldwide.

Expected meanings:

- PERSON WITH WHITE CANE: "the person under discussion didn't see that guy who killed him/his partner/his whole team"
- DEAF SIGN: "the person under discussion failed to notice an audio cue that would have prevented his/his partner's/his team's death(s)"
- PERSON IN MECHANIZED WHEELCHAIR: "the person under discussion failed to properly press keys and move his mouse as he should have and his mechanical failures caused his/his partner's/his team's death(s)"

I don't think the cultural impact of these will be as uniformly positive as Apple hopes.

> On Mar 26, 2018, at 9:51 AM, William_J_G Overington via Unicode wrote:
>
> I have been looking with interest at the following publication.
>
> Proposal For New Accessibility Emoji
>
> by Apple Inc.
>
> www.unicode.org/L2/L2018/18080-accessibility-emoji.pdf
>
> I am supportive of the proposal. Indeed please have more such emoji as well.
>
> [snip]
>
> How could the accessibility emoji in the proposal be used in practice?
>
> William Overington
>
> Monday 26 March 2018

From unicode at unicode.org Sun Apr 1 23:31:51 2018
From: unicode at unicode.org (Martin J. Dürst via Unicode)
Date: Mon, 2 Apr 2018 13:31:51 +0900
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
In-Reply-To: <20180401152900.E7D00B81E47@rfc-editor.org>
References: <20180401152900.E7D00B81E47@rfc-editor.org>
Message-ID: <00199602-3d42-24a7-1b86-584b719b8fa5@it.aoyama.ac.jp>

Please enjoy. Sorry for being late with forwarding, at least in some parts of the world.

Regards, Martin.

-------- Forwarded Message --------
Subject: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
Date: Sun, 1 Apr 2018 08:29:00 -0700 (PDT)
From: rfc-editor at rfc-editor.org
Reply-To: ietf at ietf.org
To: ietf-announce at ietf.org, rfc-dist at rfc-editor.org
CC: drafts-update-ref at iana.org, rfc-editor at rfc-editor.org

A new Request for Comments is now available in online RFC libraries.

    RFC 8369

    Title:      Internationalizing IPv6 Using 128-Bit Unicode
    Author:     H. Kaplan
    Status:     Informational
    Stream:     Independent
    Date:       1 April 2018
    Mailbox:    hadriel at 128technology.com
    Pages:      11
    Characters: 24429
    Updates/Obsoletes/SeeAlso: None

    I-D Tag:    draft-kaplan-unicode-ipv6-00.txt

    URL:        https://www.rfc-editor.org/info/rfc8369

    DOI:        10.17487/RFC8369

It is clear that Unicode will eventually exhaust its supply of code points, and more will be needed. Assuming ISO and the Unicode Consortium follow the practices of the IETF, the next Unicode code point size will be 128 bits. This document describes how this future 128-bit Unicode can be leveraged to improve IPv6 adoption and finally bring internationalization support to IPv6.

INFORMATIONAL: This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

This announcement is sent to the IETF-Announce and rfc-dist lists.
To subscribe or unsubscribe, see
  https://www.ietf.org/mailman/listinfo/ietf-announce
  https://mailman.rfc-editor.org/mailman/listinfo/rfc-dist

For searching the RFC series, see https://www.rfc-editor.org/search
For downloading RFCs, see https://www.rfc-editor.org/retrieve/bulk

Requests for special distribution should be addressed to either the author of the RFC in question, or to rfc-editor at rfc-editor.org. Unless specifically noted otherwise on the RFC itself, all RFCs are for unlimited distribution.

The RFC Editor Team
Association Management Solutions, LLC

From unicode at unicode.org Mon Apr 2 11:29:41 2018
From: unicode at unicode.org (Peter Constable via Unicode)
Date: Mon, 2 Apr 2018 16:29:41 +0000
Subject: Thai phintuu + sara u(u)
Message-ID:

Does anyone know of any attested cases in Thai script of a phintuu appearing together with either sara u or sara uu, _and_ with the phintuu positioned below the sara u(u)?

Thanks
Peter

From unicode at unicode.org Mon Apr 2 12:15:14 2018
From: unicode at unicode.org (Doug Ewell via Unicode)
Date: Mon, 02 Apr 2018 10:15:14 -0700
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
Message-ID: <20180402101514.665a7a7059d7ee80bb4d670165c8327d.3068454ef9.wbe@email03.godaddy.com>

Martin J. Dürst wrote:

> Please enjoy. Sorry for being late with forwarding, at least in some
> parts of the world.

Unfortunately, we know some folks will look past the humor and use this as a springboard for the recurring theme "Yes, what *will* we do when Unicode runs out of code points?"

I did appreciate the Acknowledgements section which lists the members of ABBA as a source of inspiration.
--
Doug Ewell | Thornton, CO, US | ewellic.org

From unicode at unicode.org Mon Apr 2 13:15:37 2018
From: unicode at unicode.org (William_J_G Overington via Unicode)
Date: Mon, 2 Apr 2018 19:15:37 +0100 (BST)
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
Message-ID: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>

Doug Ewell wrote:

> Martin J. Dürst wrote:

>> Please enjoy. Sorry for being late with forwarding, at least in some
>> parts of the world.

> Unfortunately, we know some folks will look past the humor and use this as a springboard for the recurring theme "Yes, what *will* we do when Unicode runs out of code points?"

An interesting thing about the document is that it suggests a Unicode code point for an individual item of a particular type, what the document terms an imoji.

This being beyond what Unicode encodes at present.

I wondered if this could link in some ways to the Internet of Things.

I had never heard of IPv6. Indeed I checked on the Internet to find whether that was real. So I have started reading and learning.

It would, in fact, be quite straightforward to encode what the document terms 128-bit Unicode characters.

For example, U+FFF8 could be used as a base character and then followed by a sequence of 32 tag characters, each of those 32 tag characters being from the range

U+E0030 TAG DIGIT ZERO .. U+E0039 TAG DIGIT NINE,
U+E0041 TAG LATIN CAPITAL LETTER A .. U+E0046 TAG LATIN CAPITAL LETTER F

That is, a newly-defined character from the Specials and then 32 tag characters encoding a hexadecimal code point.

Now, if that were called 128-bit Unicode then there could be problems of policy, but if it were given another name so that it sits upon a Unicode structure so as to provide an application platform that can be manipulated using Unicode tools, including existing Unicode interchange formats, and display formats for character glyphs, then maybe something useful can be produced.
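
As a rough illustration of the scheme just described, here is a minimal Python sketch. It is only a sketch under stated assumptions: U+FFF8 is currently unassigned, so its use as a base character is purely hypothetical, and the function names encode_128 and decode_128 are made up for this example. The one thing taken from the existing standard is that each tag character sits at U+E0000 plus the ASCII value of the corresponding character.

```python
# Hypothetical base character: U+FFF8 is unassigned today, so this is an assumption.
BASE = "\uFFF8"
# Tag characters mirror ASCII at an offset of U+E0000 (e.g. U+E0030 TAG DIGIT ZERO).
TAG_OFFSET = 0xE0000
HEX_DIGITS = "0123456789ABCDEF"
# Map each permitted tag character back to its plain hexadecimal digit.
TAG_TO_HEX = {chr(TAG_OFFSET + ord(d)): d for d in HEX_DIGITS}


def encode_128(value: int) -> str:
    """Encode a 128-bit integer as BASE followed by 32 tag hex digits."""
    if not 0 <= value < 2**128:
        raise ValueError("value must fit in 128 bits")
    hex_digits = f"{value:032X}"  # exactly 32 upper-case hexadecimal digits
    return BASE + "".join(chr(TAG_OFFSET + ord(d)) for d in hex_digits)


def decode_128(text: str) -> int:
    """Decode a sequence produced by encode_128 back into an integer."""
    if len(text) != 33 or text[0] != BASE:
        raise ValueError("expected the base character plus 32 tag characters")
    digits = [TAG_TO_HEX.get(c) for c in text[1:]]
    if None in digits:
        raise ValueError("only tag digits 0-9 and tag letters A-F are allowed")
    return int("".join(digits), 16)


if __name__ == "__main__":
    sample = 0x20010DB8000000000000000000000001
    encoded = encode_128(sample)
    print(len(encoded), "code points,", len(encoded.encode("utf-8")), "UTF-8 bytes")
    print(decode_128(encoded) == sample)
```

Each such sequence is 33 code points long, and since U+FFF8 takes 3 bytes in UTF-8 and every tag character takes 4, one 128-bit value occupies 131 bytes in interchange.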
That would mean using 128-bit binary numbers within a local computer system and existing Unicode characters for interchange of information between computer systems, converting from one format to the other as local processing or interchange requires.

Of particular significance is the concept of encoding individual items each with its own code point.

Could this be used to relate glyphs to the Internet of Things?

Could things like International Standard Book Numbers be included, with a code point for each book edition?

What about individual copies of a rare book?

What about museum items?

What about paintings and sculptures?

Could this tie up with serial numbers used in GS1-128 Barcodes?

Please note that the 128 in GS1-128 refers to the 128 characters of ASCII, not to 128 bits.

I am wondering whether U+FFF8 plus 32 tag characters could be handled directly by a GSUB glyph substitution within an OpenType font.

However, with such a large code space, there would need to be a way to access glyph information over the internet; maybe a one-glyph web font for each glyph would be possible in some way.

William Overington

Monday 2 April 2018

From unicode at unicode.org Mon Apr 2 14:04:15 2018
From: unicode at unicode.org (J Decker via Unicode)
Date: Mon, 2 Apr 2018 12:04:15 -0700
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
In-Reply-To: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
Message-ID:

I was really hoping this was a joke... it didn't hit me it was April 1...

https://en.wikipedia.org/wiki/Plane_(Unicode)

Plane     Allocated code points     Assigned characters
Totals    280,016                   136,755

almost 50% used now. Though that table omits 655,350 code points as 'unassigned', so it's really only about 16% (1/6) used, using only 4-byte UTF-8 or 2-byte UTF-16... and of those, that's only 20 (plus or minus a fraction of 1) bits? So a proposal of something a power of 6 larger than that, when even just 1 more bit gives another million characters....

https://en.wikipedia.org/wiki/List_of_dictionaries_by_number_of_words

I guess if every word were encoded as a single code point... that wouldn't be enough; it seems there are about 7,716,121 words... so... 24 bits, plus 1 to double it for good measure? *shrug*

From unicode at unicode.org Mon Apr 2 19:42:21 2018
From: unicode at unicode.org (Mark E. Shoulson via Unicode)
Date: Mon, 2 Apr 2018 20:42:21 -0400
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
In-Reply-To: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
Message-ID: <31793d21-4d49-022d-105a-af17f944da03@kli.org>

For unique identifiers for every person, place, thing, etc, consider https://en.wikipedia.org/wiki/Universally_unique_identifier which are indeed 128 bits.

What makes you think a single "glyph" that represents one of these 3.4e38 items could possibly be sensibly distinguishable at any sort of glance (including long stares) from all the others? I have an idea for that: we can show the actual *digits* of some encoding of the 128-bit number. Then just inspecting for a different digit will do.

Now, what about a registry for "important" (and not-necessarily-important) UUIDs for key things and people, which associates them with an image of some kind? Some sort of global icon? And indeed, perhaps used for Internet-of-Things-like things? Not necessarily a bad idea -- but decidedly outside of the scope of Unicode. (Maybe you could even assign your beloved sentences to some UUIDs and stick them in such a registry. Again, who knows, maybe a decent idea. But it ain't Unicode.)

~mark

On 04/02/2018 02:15 PM, William_J_G Overington via Unicode wrote:
> Doug Ewell wrote:
>
>> Martin J. Dürst wrote:
>
>>> Please enjoy. Sorry for being late with forwarding, at least in some
>>> parts of the world.
>
>> Unfortunately, we know some folks will look past the humor and use this
>> as a springboard for the recurring theme "Yes, what *will* we do when
>> Unicode runs out of code points?"
>
> An interesting thing about the document is that it suggests a Unicode code point for an individual item of a particular type, what the document terms an imoji.
>
> This being beyond what Unicode encodes at present.
>
> I wondered if this could link in some ways to the Internet of Things.

From unicode at unicode.org Mon Apr 2 19:52:03 2018
From: unicode at unicode.org (J Decker via Unicode)
Date: Mon, 2 Apr 2018 17:52:03 -0700
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
In-Reply-To: <31793d21-4d49-022d-105a-af17f944da03@kli.org>
References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
 <31793d21-4d49-022d-105a-af17f944da03@kli.org>
Message-ID:

On Mon, Apr 2, 2018 at 5:42 PM, Mark E. Shoulson via Unicode <unicode at unicode.org> wrote:

> For unique identifiers for every person, place, thing, etc, consider
> https://en.wikipedia.org/wiki/Universally_unique_identifier which are
> indeed 128 bits.
>
> What makes you think a single "glyph" that represents one of these 3.4e38
> items could possibly be sensibly distinguishable at any sort of glance
> (including long stares) from all the others? I have an idea for that: we
> can show the actual *digits* of some encoding of the 128-bit number. Then
> just inspecting for a different digit will do.

there's no restriction that it be one character cell in size... rendered glyphs could be thousands of pixels wide...
sorry to drag this on ;)

> Now, what about a registry for "important" (and not-necessarily-important)
> UUIDs for key things and people, which associates them with an image of
> some kind? Some sort of global icon? And indeed, perhaps used for
> Internet-of-Things-like things? Not necessarily a bad idea -- but decidedly
> outside of the scope of Unicode. (Maybe you could even assign your beloved
> sentences to some UUIDs and stick them in such a registry. Again, who
> knows, maybe a decent idea. But it ain't Unicode.)
>
> ~mark
>
> On 04/02/2018 02:15 PM, William_J_G Overington via Unicode wrote:
>
>> Doug Ewell wrote:
>>
>>> Martin J. Dürst wrote:
>>>
>>>> Please enjoy. Sorry for being late with forwarding, at least in some
>>>> parts of the world.
>>>
>>> Unfortunately, we know some folks will look past the humor and use this
>>> as a springboard for the recurring theme "Yes, what *will* we do when
>>> Unicode runs out of code points?"
>>
>> An interesting thing about the document is that it suggests a Unicode
>> code point for an individual item of a particular type, what the document
>> terms an imoji.
>>
>> This being beyond what Unicode encodes at present.
>>
>> I wondered if this could link in some ways to the Internet of Things.

From unicode at unicode.org Mon Apr 2 20:06:51 2018
From: unicode at unicode.org (Mark E. Shoulson via Unicode)
Date: Mon, 2 Apr 2018 21:06:51 -0400
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
In-Reply-To:
References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
 <31793d21-4d49-022d-105a-af17f944da03@kli.org>
Message-ID: <35986306-7786-bf57-4e50-ccac2539fdea@kli.org>

On 04/02/2018 08:52 PM, J Decker via Unicode wrote:
> On Mon, Apr 2, 2018 at 5:42 PM, Mark E. Shoulson via Unicode wrote:
>
>> For unique identifiers for every person, place, thing, etc, consider
>> https://en.wikipedia.org/wiki/Universally_unique_identifier
>> which are indeed 128 bits.
>>
>> What makes you think a single "glyph" that represents one of these
>> 3.4e38 items could possibly be sensibly distinguishable at any
>> sort of glance (including long stares) from all the others? I
>> have an idea for that: we can show the actual *digits* of some
>> encoding of the 128-bit number. Then just inspecting for a
>> different digit will do.
>
> there's no restriction that it be one character cell in size...
> rendered glyphs could be thousands of pixels wide...

Yes, but at that point it becomes a huge stretch to call it a "character". It becomes more like a "picture" or "graphic" or something.
And even then, considering the tremendohunormous number of them we're dealing with, can we really be sure each one can be uniquely recognized as the one it's *supposed* to be, by everyone?

~mark

From unicode at unicode.org Mon Apr 2 20:49:51 2018
From: unicode at unicode.org (Philippe Verdy via Unicode)
Date: Tue, 3 Apr 2018 03:49:51 +0200
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
In-Reply-To: <35986306-7786-bf57-4e50-ccac2539fdea@kli.org>
References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
 <31793d21-4d49-022d-105a-af17f944da03@kli.org>
 <35986306-7786-bf57-4e50-ccac2539fdea@kli.org>
Message-ID:

It's fun to consider the introduction (after emojis) of imojis, amojis, umojis and omojis for individual people (or named pets), alien species (E.T. wants to be able to call home with his own language and script!), unknown things, and obfuscated entities. Also fun for new "trollface" characters. In fact you could represent every individual or even single atom in the universe that has ever existed since the Big Bang!

But unlike people and social entities, characters to encode don't grow exponentially but still linearly, at a slowing speed. Unicode characters are not exploding the way Internet addresses did (for organizations, users, computers, phones). The IPv4 space exploded only because the rate at which people acquired equipment accelerated; that rate is now slowing down, with a high rate of equipment replacement. Only the explosion of IoT continues to drive some growth, and it will rapidly reach a cap, so even the IPv6 address space will never be filled, even though it is much larger than the UCS encoding space.

I expect a maximum in the range of ~300 billion devices at most, given all the planet's resources, since the global population cannot grow exponentially and will necessarily reach a cap, plus ~100 million services/organizations. All the rest will die and be replaced, and even if we allow a delay of 100 years before reusing the IPv6 addresses of dead devices and people, this will leave a lot of space: we'll never reach even a small fraction of the number of entities in the universe.

We are also completely unable to make any physical measurement with so many digits of precision: even just 64 bits is really extremely large. 128 bits was chosen for IPv6 just to allow random allocation without needing excessive centralized management. An IPv6 address is even a good substitute for the whole DNS system and its overvalued black market of domain names: IPv6 is extremely economical!

From unicode at unicode.org Mon Apr 2 20:56:58 2018
From: unicode at unicode.org (Mark E. Shoulson via Unicode)
Date: Mon, 2 Apr 2018 21:56:58 -0400
Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode
In-Reply-To:
References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost>
 <31793d21-4d49-022d-105a-af17f944da03@kli.org>
 <35986306-7786-bf57-4e50-ccac2539fdea@kli.org>
Message-ID: <15cc6933-cd7b-4fcd-d281-50920f9c5bfc@kli.org>

Whew! Thanks for explaining the joke! Everyone here really thought they were serious. Maybe you should write to the authors of the RFC and explain to them that their growth-function is incorrect. I'm sure they'd be glad of the correction.

~mark

On 04/02/2018 09:49 PM, Philippe Verdy via Unicode wrote:
> It's fun to consider the introduction (after emojis) of imojis,
> amojis, umojis and omojis for individual people (or named pets), alien
> species (E.T. wants to be able to call home with his own language and
> script!), unknown things, and obfuscated entities. Also fun for new
> "trollface" characters. In fact you could represent every individual
> or even single atom in the universe that has ever existed since the
> Big Bang!
> > But unlike peoples and social entities, characters to encode don't > grow exponentially but still linearily at a slowing speed. .... From unicode at unicode.org Mon Apr 2 21:02:21 2018 From: unicode at unicode.org (Philippe Verdy via Unicode) Date: Tue, 3 Apr 2018 04:02:21 +0200 Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode In-Reply-To: References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost> <31793d21-4d49-022d-105a-af17f944da03@kli.org> <35986306-7786-bf57-4e50-ccac2539fdea@kli.org> Message-ID: Note: We're missing the definition of "ymojis", a safer alternatives of "umojis" (unknown), but that "you" can create yourself for use by yourself (i.e. private-use umojis) and whose meaning is not meant to be understood by anyone else than you, but which is warrantied to be understood by you, unlike umojis that are interchangeable and not so private (a common problem shared by PUAs in Unicode because of the "ConScript" registry which is a defacto standard for PUA!). 2018-04-03 3:49 GMT+02:00 Philippe Verdy : > It's fun to consider the introdroduction (after emojis) of imojis, amojis, > umojis and omojis for individual people (or named pets), alien species > (E.T. wants to be able to call home with his own language and script !), > unknown things, and obfuscated entities. Also fun for new "trollface" > characters. In fact you could represent every individual or even single > atom in the universe that has ever created since the BingBang ! > > But unlike peoples and social entities, characters to encode don't grow > exponentially but still linearily at a slowing speed. 
Unicode characters > are not exploding like Internet addresses (organizations, users, computers, > phones: the IPv4 space expoloded only because the equipement rate of people > accelerated but now it is slowing down with high equipment replacement > rate, and only the explosion of IoT continues to drive some growth but it > will rapidly reach a cap, so even the IPv6 address space will never be > filled even if it will be much larger that the UCS encoding space; I can > expect a maximum in the range of ~300 billions devices at most with all > planet resources as global population will not be able to grow > exponentially and will necessarily cap, plus ~100 milions > services/organizations; all the rest will die and will be replaced and even > if we give a delay of 100 years before reusing addresses of died devices > and people in IPv6, this will leave lot of space, we'll never reach a small > fraction of the number of entities in the universe; we are also completely > unable to make any physical measurement with so many digits of precision: > even just 64 bit bit is really extremely large, but 128 bit was chosen in > IPv6 just to allow random allocation without needeing excessive centralized > management, an IPv6 address is even a good subtitute to the whole DNS > system and its overvalued black market of domain names: IPv6 is extremely > economic !) > > > 2018-04-03 3:06 GMT+02:00 Mark E. Shoulson via Unicode < > unicode at unicode.org>: > >> On 04/02/2018 08:52 PM, J Decker via Unicode wrote: >> >> >> >> On Mon, Apr 2, 2018 at 5:42 PM, Mark E. Shoulson via Unicode < >> unicode at unicode.org> wrote: >> >>> For unique identifiers for every person, place, thing, etc, consider >>> https://en.wikipedia.org/wiki/Universally_unique_identifier which are >>> indeed 128 bits. 
>>> >>> What makes you think a single "glyph" that represents one of these >>> 3.4?38 items could possibly be sensibly distinguishable at any sort of >>> glance (including long stares) from all the others? I have an idea for >>> that: we can show the actual *digits* of some encoding of the 128-bit >>> number. Then just inspecting for a different digit will do. >>> >> >> there's no restirction that it be one character cell in size... rendered >> glyphs could be thousands of pixels wide... >> >> >> Yes, but at that point it becomes a huge stretch to call it a >> "character". It becomes more like a "picture" or "graphic" or something. >> And even then, considering the tremendohunormous number of them we're >> dealing with, can we really be sure each one can be uniquely recognized as >> the one it's *supposed* to be, by everyone? >> >> ~mark >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From unicode at unicode.org Mon Apr 2 21:07:47 2018 From: unicode at unicode.org (Philippe Verdy via Unicode) Date: Tue, 3 Apr 2018 04:07:47 +0200 Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode In-Reply-To: References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost> <31793d21-4d49-022d-105a-af17f944da03@kli.org> <35986306-7786-bf57-4e50-ccac2539fdea@kli.org> Message-ID: For "ymojis" to be really private, they would have to be very secure, and 128 bit would not be enough: "ymojis" should now be encoded with and least 512 bits and probably even 1024 bits (secure digital fingerprints), not 128 bits like MD5, or 160 bits like SHA1. So let's encode Unicode with 1024-bit codepoints to secure it ! 2018-04-03 4:02 GMT+02:00 Philippe Verdy : > Note: We're missing the definition of "ymojis", a safer alternatives of > "umojis" (unknown), but that "you" can create yourself for use by yourself > (i.e. 
private-use umojis) and whose meaning is not meant to be understood > by anyone else than you, but which is warrantied to be understood by you, > unlike umojis that are interchangeable and not so private (a common problem > shared by PUAs in Unicode because of the "ConScript" registry which is a > defacto standard for PUA!). > > > 2018-04-03 3:49 GMT+02:00 Philippe Verdy : > >> It's fun to consider the introdroduction (after emojis) of imojis, >> amojis, umojis and omojis for individual people (or named pets), alien >> species (E.T. wants to be able to call home with his own language and >> script !), unknown things, and obfuscated entities. Also fun for new >> "trollface" characters. In fact you could represent every individual or >> even single atom in the universe that has ever created since the BingBang ! >> >> But unlike peoples and social entities, characters to encode don't grow >> exponentially but still linearily at a slowing speed. Unicode characters >> are not exploding like Internet addresses (organizations, users, computers, >> phones: the IPv4 space expoloded only because the equipement rate of people >> accelerated but now it is slowing down with high equipment replacement >> rate, and only the explosion of IoT continues to drive some growth but it >> will rapidly reach a cap, so even the IPv6 address space will never be >> filled even if it will be much larger that the UCS encoding space; I can >> expect a maximum in the range of ~300 billions devices at most with all >> planet resources as global population will not be able to grow >> exponentially and will necessarily cap, plus ~100 milions >> services/organizations; all the rest will die and will be replaced and even >> if we give a delay of 100 years before reusing addresses of died devices >> and people in IPv6, this will leave lot of space, we'll never reach a small >> fraction of the number of entities in the universe; we are also completely >> unable to make any physical measurement with so 
many digits of precision: >> even just 64 bit bit is really extremely large, but 128 bit was chosen in >> IPv6 just to allow random allocation without needeing excessive centralized >> management, an IPv6 address is even a good subtitute to the whole DNS >> system and its overvalued black market of domain names: IPv6 is extremely >> economic !) >> >> >> 2018-04-03 3:06 GMT+02:00 Mark E. Shoulson via Unicode < >> unicode at unicode.org>: >> >>> On 04/02/2018 08:52 PM, J Decker via Unicode wrote: >>> >>> >>> >>> On Mon, Apr 2, 2018 at 5:42 PM, Mark E. Shoulson via Unicode < >>> unicode at unicode.org> wrote: >>> >>>> For unique identifiers for every person, place, thing, etc, consider >>>> https://en.wikipedia.org/wiki/Universally_unique_identifier which are >>>> indeed 128 bits. >>>> >>>> What makes you think a single "glyph" that represents one of these >>>> 3.4?38 items could possibly be sensibly distinguishable at any sort of >>>> glance (including long stares) from all the others? I have an idea for >>>> that: we can show the actual *digits* of some encoding of the 128-bit >>>> number. Then just inspecting for a different digit will do. >>>> >>> >>> there's no restirction that it be one character cell in size... rendered >>> glyphs could be thousands of pixels wide... >>> >>> >>> Yes, but at that point it becomes a huge stretch to call it a >>> "character". It becomes more like a "picture" or "graphic" or something. >>> And even then, considering the tremendohunormous number of them we're >>> dealing with, can we really be sure each one can be uniquely recognized as >>> the one it's *supposed* to be, by everyone? >>> >>> ~mark >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
From unicode at unicode.org Mon Apr 2 22:52:48 2018 From: unicode at unicode.org (Ken Whistler via Unicode) Date: Mon, 2 Apr 2018 20:52:48 -0700 Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode In-Reply-To: References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost> <31793d21-4d49-022d-105a-af17f944da03@kli.org> <35986306-7786-bf57-4e50-ccac2539fdea@kli.org> Message-ID: <870b3643-e678-b8f8-56df-2d0f7bc880bd@att.net> On 4/2/2018 7:02 PM, Philippe Verdy via Unicode wrote: > We're missing the definition of "ymojis", a safer alternative to > "umojis" (unknown), but that "you" can create yourself for use by > yourself Not to mention "?mojis", as in "Uh, Moe! Jeez, why are we still talking about this?!" --Ken From unicode at unicode.org Tue Apr 3 00:43:27 2018 From: unicode at unicode.org (=?UTF-8?Q?Martin_J._D=c3=bcrst?= via Unicode) Date: Tue, 3 Apr 2018 14:43:27 +0900 Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode In-Reply-To: <15cc6933-cd7b-4fcd-d281-50920f9c5bfc@kli.org> References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost> <31793d21-4d49-022d-105a-af17f944da03@kli.org> <35986306-7786-bf57-4e50-ccac2539fdea@kli.org> <15cc6933-cd7b-4fcd-d281-50920f9c5bfc@kli.org> Message-ID: On 2018/04/03 10:56, Mark E. Shoulson via Unicode wrote: > Whew! Thanks for explaining the joke! Everyone here really thought they > were serious. Maybe you should write to the authors of the RFC and > explain to them that their growth-function is incorrect. I'm sure > they'd be glad of the correction. I'm sure they know they exaggerated quite a bit. I'm also sure they trust the Unicode Consortium to know when they would have to enlarge the code space, if ever. Regards, Martin.
From unicode at unicode.org Tue Apr 3 07:29:08 2018 From: unicode at unicode.org (Philippe Verdy via Unicode) Date: Tue, 3 Apr 2018 14:29:08 +0200 Subject: Fwd: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode In-Reply-To: References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost> <31793d21-4d49-022d-105a-af17f944da03@kli.org> <35986306-7786-bf57-4e50-ccac2539fdea@kli.org> <15cc6933-cd7b-4fcd-d281-50920f9c5bfc@kli.org> Message-ID: No, this RFC does not require any correction, the "errors" are part of the April joke itself! But we can suggest enhancements to the joke. Well, this RFC is still very Latin-centric. I would have loved the introduction of "?mojis" and "?mojis" (Greek mojis), "?mojis" (yeah!), "?mojis" (not from me, meant only for "you"!), and "?mojis" (let's respect the origins of what we call "emojis", this kind of "mojis" would be used for characters with an authentic origin we can trace to their real author!)
From unicode at unicode.org Tue Apr 3 14:57:34 2018 From: unicode at unicode.org (Richard Wordingham via Unicode) Date: Tue, 3 Apr 2018 20:57:34 +0100 Subject: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode In-Reply-To: References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost> <31793d21-4d49-022d-105a-af17f944da03@kli.org> <35986306-7786-bf57-4e50-ccac2539fdea@kli.org> Message-ID: <20180403205734.501a1f11@JRWUBU2> On Tue, 3 Apr 2018 03:49:51 +0200 Philippe Verdy via Unicode wrote: > It's fun to consider the introduction (after emojis) of imojis, > amojis, umojis and omojis for individual people (or named pets), > alien species (E.T. wants to be able to call home with his own > language and script!), unknown things, and obfuscated entities. Also > fun for new "trollface" characters. In fact you could represent every > individual or even single atom in the universe that has ever been created > since the Big Bang! > > But unlike peoples and social entities, characters to encode don't > grow exponentially but still linearly at a slowing speed. Ultimately, the number of literate species will grow as the cube of time. It's just that so far, we are only addressing one species' scripts and are mostly clearing the backlog. It seems that most new characters, as opposed to newly encoded characters, are emoji. Richard.
From unicode at unicode.org Tue Apr 3 15:39:39 2018 From: unicode at unicode.org (Philippe Verdy via Unicode) Date: Tue, 3 Apr 2018 22:39:39 +0200 Subject: RFC 8369 on Internationalizing IPv6 Using 128-Bit Unicode In-Reply-To: <20180403205734.501a1f11@JRWUBU2> References: <4921330.36434.1522692937685.JavaMail.defaultUser@defaultHost> <31793d21-4d49-022d-105a-af17f944da03@kli.org> <35986306-7786-bf57-4e50-ccac2539fdea@kli.org> <20180403205734.501a1f11@JRWUBU2> Message-ID: 2018-04-03 21:57 GMT+02:00 Richard Wordingham via Unicode < unicode at unicode.org>: > Ultimately, the number of literate species will grow as the cube of > time. > Do you realize that the literacy level has already started to shrink (replaced by audio/images/videos), including in most developed countries, and that the areas where it is still growing a bit are in the poor developing world, within its rather small middle class that can access education, largely influenced by those that control the network media? The rest is connected now to the Internet via cheap mobile devices to just look at videos and send basic emojis over social networks as the only means of communication, but not as a means of learning... So I have serious doubts about your statement of growth as the cube of time; if this was ever true in the past, then the trend for de-literacy will also evolve at the same rate, and people will then no longer need Unicode at all as they will no longer need to interchange text they can no longer decipher. From unicode at unicode.org Wed Apr 4 11:29:12 2018 From: unicode at unicode.org (=?utf-8?Q?Daniel_B=C3=BCnzli?= via Unicode) Date: Wed, 4 Apr 2018 18:29:12 +0200 Subject: UAX #42 update for 11.0.0 & \p{Extended_Pictographic} Message-ID: Hello, Is there any ETA for an update to the ucdxml for 11.0.0?
Also while reviewing the proposed update to UAX #29, I noticed it refers to a property (\p{Extended_Pictographic}) that doesn't seem to be formally part of the UCD but to be found in UTS #51. Is there any chance for this property to be part of a possible update to UAX #42 for 11.0.0? That would significantly help implementers whose pipeline relies on the ucdxml to implement the standard. Best, Daniel From unicode at unicode.org Wed Apr 11 10:04:12 2018 From: unicode at unicode.org (Stephane Bortzmeyer via Unicode) Date: Wed, 11 Apr 2018 17:04:12 +0200 Subject: Evolution of writing : pictures -> letters -> emojis :-) Message-ID: <20180411150412.3la7pm7cxnmsjxse@nic.fr> https://x0r.be/@szbalint/99834795406169086 From unicode at unicode.org Sat Apr 14 19:50:27 2018 From: unicode at unicode.org (Marcel Schneider via Unicode) Date: Sun, 15 Apr 2018 02:50:27 +0200 (CEST) Subject: More scripts, not more emoji (Re: Accessibility Emoji) In-Reply-To: References: <8570587.39967.1522077150247.JavaMail.root@webmail02.bt.ext.cpcloud.co.uk> <17804074.45213.1522083115554.JavaMail.defaultUser@defaultHost> Message-ID: <327620702.12776.1523753427616.JavaMail.www@wwinf1m21> We need to get more scripts into Unicode, not more emoji. That is – somewhat inflated – the core message of a NYT article published six months ago, and never shared here (no more than so many articles about Unicode, scripts, and emoji). Some 100 scripts are missing in the Standard, affecting as many as 400 million people worldwide. https://www.nytimes.com/2017/10/18/magazine/how-the-appetite-for-emojis-complicates-the-effort-to-standardize-the-worlds-alphabets.html (Just found while searching for Hanifi Rohingya script, thanks to the Wikipedia entry [trying to find out whether to include Hanifi Rohingya in beta feedback {closing soon}]). From unicode at unicode.org Sat Apr 14 22:29:40 2018 From: unicode at unicode.org (Markus Scherer via Unicode) Date: Sat, 14 Apr 2018 20:29:40 -0700 Subject: More scripts, not more emoji (Re: Accessibility Emoji) In-Reply-To: <327620702.12776.1523753427616.JavaMail.www@wwinf1m21> References: <8570587.39967.1522077150247.JavaMail.root@webmail02.bt.ext.cpcloud.co.uk> <17804074.45213.1522083115554.JavaMail.defaultUser@defaultHost> <327620702.12776.1523753427616.JavaMail.www@wwinf1m21> Message-ID: On Sat, Apr 14, 2018 at 5:50 PM, Marcel Schneider via Unicode < unicode at unicode.org> wrote: > We need to get more scripts into Unicode, not more emoji. > > That is – somewhat inflated –
the core message of a NYT article published > six months ago, > and never shared here (no more than so many articles about Unicode, > scripts, and emoji). > Some 100 scripts are missing in the Standard, affecting as many as 400 > million people worldwide. > > https://www.nytimes.com/2017/10/18/magazine/how-the-appetite-for-emojis-complicates-the-effort-to-standardize-the-worlds-alphabets.html You are right. One good way that you can help make it happen is to support the Script Encoding Initiative which is mentioned in the article. Some of the AAC money goes there. And since the most popular adopted characters are emoji, their popularity is helping close the gap that you pointed out. They have also helped in other ways -- they really motivated developers to make their code work for supplementary code points, grapheme cluster boundaries, font ligatures, spurred development of color font technology, and got organizations to update to newer versions of Unicode faster than before. Several of these things are especially useful for recently added scripts. Best regards, markus From unicode at unicode.org Sun Apr 15 01:24:22 2018 From: unicode at unicode.org (Marcel Schneider via Unicode) Date: Sun, 15 Apr 2018 08:24:22 +0200 (CEST) Subject: More scripts, not more emoji (Re: Accessibility Emoji) In-Reply-To: References: <8570587.39967.1522077150247.JavaMail.root@webmail02.bt.ext.cpcloud.co.uk> <17804074.45213.1522083115554.JavaMail.defaultUser@defaultHost> <327620702.12776.1523753427616.JavaMail.www@wwinf1m21> Message-ID: <133083119.304.1523773463035.JavaMail.www@wwinf1d34> On Sat, 14 Apr 2018 20:29:40 -0700, Markus Scherer wrote: > > On Sat, Apr 14, 2018 at 5:50 PM, Marcel Schneider via Unicode wrote: > > > > We need to get more scripts into Unicode, not more emoji. > > > > That is – somewhat inflated –
the core message of a NYT article published six months ago, > > and never shared here (no more than so many articles about Unicode, scripts, and emoji). > > Some 100 scripts are missing in the Standard, affecting as many as 400 million people worldwide. > > > > https://www.nytimes.com/2017/10/18/magazine/how-the-appetite-for-emojis-complicates-the-effort-to-standardize-the-worlds-alphabets.html > > You are right. One good way that you can help make it happen is to support the Script Encoding Initiative which is mentioned in the article. > > Some of the AAC money goes there. And since the most popular adopted characters are emoji, their popularity is helping close the gap that you > pointed out. > > > They have also helped in other ways -- they really motivated developers to make their code work for supplementary code points, grapheme cluster > boundaries, font ligatures, spurred development of color font technology, and got organizations to update to newer versions of Unicode faster than > before. Several of these things are especially useful for recently added scripts. Thank you for the point. Indeed, the NYT article, too, is much more balanced than what I bounced to the List as an exaggerated takeaway. We send our thanks to the sponsors of the Adopt A Character program, to the SEI, and to the United States National Endowment for the Humanities, which funded the Universal Scripts Project. And last but not least, to the Unicode Consortium. I note, too, that the cited 400 million people do write in fewer than fifty as yet unsupported – but hopefully soon encoded – scripts.
Best regards, Marcel From unicode at unicode.org Thu Apr 19 05:51:04 2018 From: unicode at unicode.org (=?UTF-8?Q?Christoph_P=C3=A4per?= via Unicode) Date: Thu, 19 Apr 2018 12:51:04 +0200 (CEST) Subject: Submissions open for 2020 Emoji In-Reply-To: <5AD64336.5050202@unicode.org> References: <5AD64336.5050202@unicode.org> Message-ID: <1007136881.226749.1524135064282@ox.hosteurope.de> announcements at unicode.org: > > The emoji subcommittee has also produced a new page which shows the > Emoji Requests > submitted so far. You can look at what other people have proposed or > suggested. In many cases, people have made suggestions, but have not > followed through with complete submission forms, or have submitted > forms, but not followed through on requested modifications to the forms. This is good news! However, imagine I discover that someone has already proposed the emoji that I am interested in, but their formal proposal needs some work: From the public data I cannot see when this proposal has been received or whether it has been updated. Since I also cannot contact the author, either I have to hope they are still working on the proposal or I have to submit a separate proposal of my own, duplicating all the work. Also, there seems to be no systematic reason for which proposals get shelved as "Added to larger set" while related ones (e.g. random animals) progress to the UTC. The ESC should not have this power of gatekeeping. If an emoji proposal is well-formed and fits the general scope it should be forwarded to UTC, hence be published in the L2 repository. Alternatively, the ESC should collect *all* proposals that semantically belong to a larger set (e.g. animals) in a composite document and forward this annually, for instance. Some entries are also opaque or ambiguous, i.e.
not helpful, e.g.: 705 Six Chinese Styles Added to larger set Mixed 706 Six Chinese-style Emoji No proposal form Other Others are outdated, for instance because the larger set they have been added to has already been processed by UTC and they were declined. Some categories have only a single entry, others are clearly aliases of each other or subcategories. I would like to help clean up the data, e.g. by commenting on the Google Spreadsheet that is embedded on the Unicode page. How can I do that as an individual member? From unicode at unicode.org Thu Apr 19 07:32:15 2018 From: unicode at unicode.org (=?UTF-8?B?TWFyayBEYXZpcyDimJXvuI8=?= via Unicode) Date: Thu, 19 Apr 2018 14:32:15 +0200 Subject: Submissions open for 2020 Emoji In-Reply-To: <1007136881.226749.1524135064282@ox.hosteurope.de> References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> Message-ID: > imagine I discover that someone has already proposed the emoji that I am interested in In some cases we've contacted people to see if they want to engage with other proposers. But to handle larger numbers we'd need a simple, light-weight way to let people know, while maintaining people's privacy when they want it. > Also, there seems to be no systematic reason... The ESC periodically prioritizes some of the larger sets and forwards a list to the UTC. > If an emoji proposal is well-formed and fits the general scope it should be forwarded to UTC. Emoji are a relatively small part of the work of the consortium, and should remain that way. So the UTC depends on the ESC to evaluate the quality and priority of proposals, based on the factors described. > Others are outdated, for instance because the larger set they have been added to has already been processed by UTC and they were declined. Some categories have only a single entry, others are clearly aliases of each other or subcategories. > I would like to help clean up the data, e.g.
by commenting on the Google Spreadsheet that is embedded on the Unicode page. How can I do that as an individual member? That would be helpful, thanks. What I would suggest is taking a copy of the sheet, dumping it into a spreadsheet (Google or Excel) and adding a column for your suggestions. You can then submit that. Note that the numbers are just to provide a count; there is no binding connection between them and the rest of the line. Mark From unicode at unicode.org Thu Apr 19 11:22:15 2018 From: unicode at unicode.org (Asmus Freytag via Unicode) Date: Thu, 19 Apr 2018 09:22:15 -0700 Subject: Submissions open for 2020 Emoji In-Reply-To: References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> Message-ID: <91e1937e-ee1c-a041-17ef-a06629f28c46@ix.netcom.com> An HTML attachment was scrubbed... URL: From unicode at unicode.org Thu Apr 19 11:36:53 2018 From: unicode at unicode.org (=?UTF-8?B?TWFyayBEYXZpcyDimJXvuI8=?= via Unicode) Date: Thu, 19 Apr 2018 18:36:53 +0200 Subject: Submissions open for 2020 Emoji In-Reply-To: <91e1937e-ee1c-a041-17ef-a06629f28c46@ix.netcom.com> References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> <91e1937e-ee1c-a041-17ef-a06629f28c46@ix.netcom.com> Message-ID: The UTC didn't want to burden the doc registry with all the emoji proposals. Mark On Thu, Apr 19, 2018 at 6:22 PM, Asmus Freytag via Unicode < unicode at unicode.org> wrote: > On 4/19/2018 5:32 AM, Mark Davis ☕️ via Unicode wrote: > > > imagine I discover that someone has already proposed the emoji that I > am interested in > > In some cases we've contacted people to see if they want to engage > with other proposers.
But to handle larger numbers we'd need a simple, > light-weight way to let people know, while maintaining people's privacy > when they want it. > > > I would tend to think that actual proposals are a matter of public record. > Emoji should not be handled differently than other proposals for character > encoding in that regard. > > Why should there be an assumption that these are "proposals in private" in > this case? > > A./ From unicode at unicode.org Thu Apr 19 13:50:38 2018 From: unicode at unicode.org (Asmus Freytag (c) via Unicode) Date: Thu, 19 Apr 2018 11:50:38 -0700 Subject: Submissions open for 2020 Emoji In-Reply-To: References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> <91e1937e-ee1c-a041-17ef-a06629f28c46@ix.netcom.com> Message-ID: On 4/19/2018 9:36 AM, Mark Davis ☕️ wrote: > The UTC didn't want to burden the doc registry with all the emoji > proposals. The question of whether the registry should be divided is independent of whether proposals are public or private in nature. Proposals in private have no place in the context of a public standard. A./ From unicode at unicode.org Fri Apr 20 02:05:08 2018 From: unicode at unicode.org (=?UTF-8?B?TWFyayBEYXZpcyDimJXvuI8=?= via Unicode) Date: Fri, 20 Apr 2018 09:05:08 +0200 Subject: Submissions open for 2020 Emoji In-Reply-To: References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> <91e1937e-ee1c-a041-17ef-a06629f28c46@ix.netcom.com> Message-ID: If you want, you can make a proposal to the effect that all proposals made to Unicode be hosted publicly in a place accessible from the Unicode site. Then the UTC can consider your proposal. I think it would help the discussion to provide in your proposal links to policy statements from the W3C, ICANN, etc. that follow that policy. (I'm not sure exactly what you encompass in your term "public standard": for example, would you include ISO in that list, even though people have to pay for (most of) theirs?) Mark From unicode at unicode.org Fri Apr 20 03:15:21 2018 From: unicode at unicode.org (Henri Sivonen via Unicode) Date: Fri, 20 Apr 2018 11:15:21 +0300 Subject: Is the Editor's Draft public? Message-ID: Is the Editor's Draft of the Unicode Standard visible publicly? Use case: Checking if things that I might send feedback about have already been addressed since the publication of Unicode 10.0. -- Henri Sivonen hsivonen at hsivonen.fi https://hsivonen.fi/ From unicode at unicode.org Fri Apr 20 03:59:02 2018 From: unicode at unicode.org (=?UTF-8?B?TWFyayBEYXZpcyDimJXvuI8=?= via Unicode) Date: Fri, 20 Apr 2018 10:59:02 +0200 Subject: Submissions open for 2020 Emoji In-Reply-To: References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> Message-ID: BTW, Slide 23 on http://unicode.org/emoji/slides.html ("Unicode Resources: Specs, Data, and Code") shows one view of the relative sizes of Unicode Consortium projects, divided up by CLDR, ICU, encoding (e.g. UTC output), and also breaks out emoji. (It does need a bit of updating, since we have added emoji names to CLDR.) Mark From unicode at unicode.org Fri Apr 20 04:12:11 2018 From: unicode at unicode.org (=?UTF-8?Q?Martin_J._D=c3=bcrst?= via Unicode) Date: Fri, 20 Apr 2018 18:12:11 +0900 Subject: Is the Editor's Draft public? In-Reply-To: References: Message-ID: <4c44f4cf-f177-6379-52a9-41d3be6c1529@it.aoyama.ac.jp> Hello Henri, On 2018/04/20 17:15, Henri Sivonen via Unicode wrote: > Is the Editor's Draft of the Unicode Standard visible publicly? > > Use case: Checking if things that I might send feedback about have > already been addressed since the publication of Unicode 10.0.
There was an announcement for a public review period just recently. The review period is up to the 23rd of April. I'm not sure whether the announcement is up somewhere on the Web, but I'll forward it to you directly. Regards, Martin. From unicode at unicode.org Fri Apr 20 04:16:01 2018 From: unicode at unicode.org (=?UTF-8?Q?Martin_J._D=c3=bcrst?= via Unicode) Date: Fri, 20 Apr 2018 18:16:01 +0900 Subject: Is the Editor's Draft public? In-Reply-To: <4c44f4cf-f177-6379-52a9-41d3be6c1529@it.aoyama.ac.jp> References: <4c44f4cf-f177-6379-52a9-41d3be6c1529@it.aoyama.ac.jp> Message-ID: <48769755-d468-7c01-7dd1-99057cd13098@it.aoyama.ac.jp> On 2018/04/20 18:12, Martin J. Dürst wrote: > There was an announcement for a public review period just recently. The > review period is up to the 23rd of April. I'm not sure whether the > announcement is up somewhere on the Web, but I'll forward it to you > directly. Sorry, found the Web address of the announcement at the very bottom of the mail: http://blog.unicode.org/2018/04/last-call-on-unicode-110-review.html Regards, Martin. From unicode at unicode.org Fri Apr 20 05:14:56 2018 From: unicode at unicode.org (Henri Sivonen via Unicode) Date: Fri, 20 Apr 2018 13:14:56 +0300 Subject: Is the Editor's Draft public? In-Reply-To: <48769755-d468-7c01-7dd1-99057cd13098@it.aoyama.ac.jp> References: <4c44f4cf-f177-6379-52a9-41d3be6c1529@it.aoyama.ac.jp> <48769755-d468-7c01-7dd1-99057cd13098@it.aoyama.ac.jp> Message-ID: On Fri, Apr 20, 2018 at 12:16 PM, Martin J. Dürst wrote: > On 2018/04/20 18:12, Martin J. Dürst wrote: > >> There was an announcement for a public review period just recently. The >> review period is up to the 23rd of April. I'm not sure whether the >> announcement is up somewhere on the Web, but I'll forward it to you >> directly. > > Sorry, found the Web address of the announcement at the very bottom of the > mail: http://blog.unicode.org/2018/04/last-call-on-unicode-110-review.html Thank you.
I checked this review announcement (I should have said so in my email; sorry), but it leads me to https://unicode.org/versions/Unicode11.0.0/ which says the chapters will be "Available June 2018". But even if the 11.0 chapters were available, I'd expect there to exist an Editor's Draft that's now in a post-11.0 but pre-12.0 state. I guess I should just send my comments and take the risk of my concerns already having been addressed. -- Henri Sivonen hsivonen at hsivonen.fi https://hsivonen.fi/ From unicode at unicode.org Fri Apr 20 10:36:45 2018 From: unicode at unicode.org (Ken Whistler via Unicode) Date: Fri, 20 Apr 2018 08:36:45 -0700 Subject: Is the Editor's Draft public? In-Reply-To: References: <4c44f4cf-f177-6379-52a9-41d3be6c1529@it.aoyama.ac.jp> <48769755-d468-7c01-7dd1-99057cd13098@it.aoyama.ac.jp> Message-ID: <517aaa40-a03f-bebe-1583-1e064d401e15@att.net> Henri, There is no formal concept of a public "Editor's Draft" for the Unicode core specification. This is mostly the result of the tools used for editing the core specification, which is still structured more like a book than the usual online internet specification. Currently the Unicode editors are finishing up the 11.0 core specification editing -- and the chapters for that will be available in June, 2018, as noted on the current draft of the Unicode 11.0 page. There is no Version 12.0 "Editor's Draft" right now; instead, work on the 12.0 core specification will start once the 11.0 chapters have been frozen and published. If you have feedback on the core specification, the best thing to do is simply to submit it now as part of the current 11.0 beta review, referring to the published 10.0 core specification text. If it is a small item, such as a typo, there is always the possibility that it has already been reported and fixed, of course -- but it won't hurt to report and check. 
Suggestions for larger changes in the text will be added to the pile for future consideration by the UTC and the editors, and likely would be taken up for the 12.0 core specification. --Ken On 4/20/2018 3:14 AM, Henri Sivonen via Unicode wrote: > Thank you. I checked this review announcement (I should have said so > in my email; sorry), but it leads me to > https://unicode.org/versions/Unicode11.0.0/ which says the chapters > will be "Available June 2018". But even if the 11.0 chapters were > available, I'd expect there to exist an Editor's Draft that's now in a > post-11.0 but pre-12.0 state. > > I guess I should just send my comments and take the risk of my > concerns already having been addressed. From unicode at unicode.org Fri Apr 20 13:37:52 2018 From: unicode at unicode.org (Roman Chrenko via Unicode) Date: Fri, 20 Apr 2018 20:37:52 +0200 Subject: corrupted document with oracle bone script Message-ID: <001001d3d8d6$b170b170$14521450$@gmail.com> Hello. I found out that document http://www.unicode.org/L2/L2015/15280-n4687-oracle-bone.pdf is corrupted. It is a proposal for inclusion of Oracle Bone Script into ISO/IEC 10646 standard. It is corrupted from page 163. Could someone replace the document with the correct one? Roman From unicode at unicode.org Fri Apr 20 16:13:56 2018 From: unicode at unicode.org (Manish Goregaokar via Unicode) Date: Fri, 20 Apr 2018 14:13:56 -0700 Subject: Submissions open for 2020 Emoji In-Reply-To: References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> Message-ID: It would also be useful if "Added to larger set" mentioned which proposal it was added to. Last December I proposed emojification for U+1F58E LEFT WRITING HAND, and that's marked as merged but it's unclear which proposal it was merged with. (Also the document isn't on L2 yet, I'm not sure why) Thanks, -Manish On Fri, Apr 20, 2018 at 1:59 AM, Mark Davis
via Unicode < unicode at unicode.org> wrote: > BTW, Slide 23 on http://unicode.org/emoji/slides.html ("Unicode > Resources: Specs, Data, and Code") shows one view of the relative sizes of > Unicode Consortium projects, divided up by cldr, icu, encoding (eg UTC > output), and also breaks out emoji. > > (It does need a bit of updating, since we have added emoji names to cldr.) > > Mark > > On Thu, Apr 19, 2018 at 2:32 PM, Mark Davis wrote: > >> > imagine I discover that someone has already proposed the emoji that I >> am interested in >> >> In some cases we have contacted people to see if they want to engage >> with other proposers. But to handle larger numbers we'd need a simple, >> light-weight way to let people know, while maintaining people's privacy >> when they want it. >> >> > Also, there seems to be no systematic reason... >> >> The ESC periodically prioritizes some of the larger sets and forwards a >> list to the UTC. >> >> >If an emoji proposal is well-formed and fits the general scope it >> should be forwarded to UTC. >> >> Emoji are a relatively small part of the work of the consortium, and >> should remain that way. So the UTC depends on the ESC to evaluate the >> quality and priority of proposals, based on the factors described. >> >> > Others are outdated, for instance because the larger set they have >> been added to has already been processed by UTC and they were declined. >> Some categories have only a single entry, others are clearly aliases of >> each other or subcategories. >> > I would like to help clean up the data, e.g. by commenting on the >> Google Spreadsheet that is embedded on the Unicode page. How can I do that >> as an individual member? >> >> That would be helpful, thanks. What I would suggest is taking a copy of >> the sheet, dumping into a spreadsheet (Google or Excel) and adding a column >> for your suggestions. You can then submit that.
Note that the numbers are >> just to provide a count, there is no binding connection between them and >> the rest of the line. >> >> Mark >> >> Mark >> >> On Thu, Apr 19, 2018 at 12:51 PM, Christoph Päper via Unicode < >> unicode at unicode.org> wrote: >> >>> announcements at unicode.org: >>> > >>> > The emoji subcommittee has also produced a new page which shows the >>> > Emoji Requests >>> > submitted so far. You can look at what other people have proposed or >>> > suggested. In many cases, people have made suggestions, but have not >>> > followed through with complete submission forms, or have submitted >>> > forms, but not followed through on requested modifications to the >>> forms. >>> >>> This is good news! However, imagine I discover that someone has already >>> proposed the emoji that I am interested in, but their formal proposal needs >>> some work: From the public data I can not see when this proposal has been >>> received or whether it has been updated. Since I also cannot contact the >>> author, either I have to hope they are still working on the proposal or I >>> have to submit a separate proposal of my own, duplicating all the work. >>> >>> Also, there seems to be no systematic reason for which proposals get >>> shelved as "Added to larger set" while related ones (e.g. random animals) >>> progress to the UTC. The ESC should not have this power of gatekeeping. If >>> an emoji proposal is well-formed and fits the general scope it should be >>> forwarded to UTC, hence be published in the L2 repository. Alternatively, >>> the ESC should collect *all* proposals that semantically belong to a larger >>> set (e.g. animals) in a composite document and forward this annually, for >>> instance. >>> >>> Some entries are also opaque or ambiguous, i.e.
not helpful, e.g.: >>> >>> 705 Six Chinese Styles Added to larger set Mixed >>> 706 Six Chinese-style Emoji No proposal form Other >>> >>> Others are outdated, for instance because the larger set they have been >>> added to has already been processed by UTC and they were declined. Some >>> categories have only a single entry, others are clearly aliases of each >>> other or subcategories. I would like to help clean up the data, e.g. by >>> commenting on the Google Spreadsheet that is embedded on the Unicode page. >>> How can I do that as an individual member? >>> >> >> > From unicode at unicode.org Mon Apr 23 08:33:19 2018 From: unicode at unicode.org (=?UTF-8?Q?Christoph_P=C3=A4per?= via Unicode) Date: Mon, 23 Apr 2018 15:33:19 +0200 (CEST) Subject: Submissions open for 2020 Emoji In-Reply-To: References: <5AD64336.5050202@unicode.org> <1007136881.226749.1524135064282@ox.hosteurope.de> Message-ID: <1908606554.294920.1524490399137@ox.hosteurope.de> Mark Davis: > > In some cases we have contacted people to see if they want to engage > with other proposers. But to handle larger numbers we'd need a simple, > light-weight way to let people know, while maintaining people's privacy > when they want it. Collaborative editing of (proposal) documents is actually a thing in 2018 and can even be done with (semi-)anonymous contributors. > The ESC periodically prioritizes some of the larger sets and forwards a > list to the UTC. Like I said: no systematic reason. Some animals are put on hold, others are forwarded individually, for instance. The ESC should identify preexisting semantic sets of or criteria for pictograms, or accept collective proposals for such, instead of insisting on a single independent proposal form for each emoji.
For instance, a simple sufficient (but not mandatory) criterion for the relevance of animal pictograms would be whether they appear on official road signs anywhere in the world, and a preexisting cultural set for animals would be the [Five Animals] representing styles of kung-fu (i.e. a Crane and possibly a Mantis would be missing). [Five Animals]: https://en.wikipedia.org/wiki/Five_Animals >> If an emoji proposal is well-formed and fits the general scope it should >> be forwarded to UTC. > > Emoji are a relatively small part of the work of the consortium, and should > remain that way. The number of emojis is small compared to some scripts, but large compared to others. The importance of (new) emojis is small by some accounts, but large by others (like the Adopt-a-Character programme). If you accepted the principle of preexisting sets and criteria and agreed on some, a lot of emoji proposals would be simpler to assess, hence reducing the work required for emoji. > So the UTC depends on the ESC to evaluate the quality and > priority of proposals, based on the factors described. I don't really disagree about the control of formal quality the subcommittee provides, I disagree about it keeping the gate on priority. You would not delay the encoding of some characters of a script due to their perceived importance while advancing others; for example, a large chunk of the remaining rarely used CJKV logograms is added in each new version of TUS, but they are not filtered by importance. >> I would like to help clean up the data, e.g. by commenting on the Google >> Spreadsheet that is embedded on the Unicode page. How can I do that as an >> individual member? > > That would be helpful, thanks. What I would suggest is taking a copy of the > sheet, dumping into a spreadsheet (Google or Excel) and adding a column for > your suggestions. Um, what about comments on the actual spreadsheet since Google Docs already provides that feature?
I hate to reduplicate work for myself or anyone else.