From asmusf at ix.netcom.com Tue Dec 16 11:36:05 2014
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Tue, 16 Dec 2014 09:36:05 -0800
Subject: emoji are clearly the current meme fad
Message-ID: <54906D85.5090900@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From doug at ewellic.org Tue Dec 16 12:24:24 2014
From: doug at ewellic.org (Doug Ewell)
Date: Tue, 16 Dec 2014 11:24:24 -0700
Subject: emoji are clearly the current meme fad
Message-ID: <20141216112424.665a7a7059d7ee80bb4d670165c8327d.fb77424fa2.wbe@email03.secureserver.net>

Asmus Freytag wrote:

> emoji are clearly the current meme fad

Well, good. That's exactly the sort of thing where the Unicode Standard should be leading the way. I guess.

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org

From daniel.buenzli at erratique.ch Wed Dec 17 10:40:53 2014
From: daniel.buenzli at erratique.ch (Daniel Bünzli)
Date: Wed, 17 Dec 2014 17:40:53 +0100
Subject: UAX 29 on empty strings
Message-ID: <3E2DC690AFE940B385EF856858F84D79@erratique.ch>

Hello,

Does UAX 29 have something to say about empty strings?

My understanding is that the empty string is the sequence "sot eot", so there is a single boundary position on which to evaluate the rules, and given the rules {GB,WB,SB}1 we need to report a boundary at that position.

Does that feel right?

Best,

Daniel

From ritt.ks at gmail.com Wed Dec 17 10:55:11 2014
From: ritt.ks at gmail.com (Konstantin Ritt)
Date: Wed, 17 Dec 2014 20:55:11 +0400
Subject: UAX 29 on empty strings
In-Reply-To: <3E2DC690AFE940B385EF856858F84D79@erratique.ch>
References: <3E2DC690AFE940B385EF856858F84D79@erratique.ch>
Message-ID:

Correct.

Regards,
Konstantin

From daniel.buenzli at erratique.ch Wed Dec 17 11:01:32 2014
From: daniel.buenzli at erratique.ch (Daniel Bünzli)
Date: Wed, 17 Dec 2014 18:01:32 +0100
Subject: UAX 29 on empty strings
In-Reply-To:
References: <3E2DC690AFE940B385EF856858F84D79@erratique.ch>
Message-ID: <1A81BCC285934588A72AA87D5CDC3BC2@erratique.ch>

On Wednesday, 17 December 2014 at 17:55, Konstantin Ritt wrote:
> Correct.

Thanks!

Daniel

From emuller at adobe.com Wed Dec 17 12:26:52 2014
From: emuller at adobe.com (Eric Muller)
Date: Wed, 17 Dec 2014 10:26:52 -0800
Subject: Séminaire doctoral "Chemins des écritures" | Gripic
Message-ID: <5491CAEC.9060607@adobe.com>

This seminar may be of interest to those in France.

http://www.gripic.fr/evenement/seminaire-doctoral-chemins-ecritures

Eric.

From mark at macchiato.com Wed Dec 17 13:34:28 2014
From: mark at macchiato.com (Mark Davis ☕️)
Date: Wed, 17 Dec 2014 20:34:28 +0100
Subject: emoji are clearly the current meme fad
In-Reply-To: <54906D85.5090900@ix.netcom.com>
References: <54906D85.5090900@ix.netcom.com>
Message-ID:

We just had a new blog posting; we've moved the media list out of tr51, and the list already had that item on it. See: http://www.unicode.org/press/emoji.html#media

Separately, I keep a list of how the media refers to the Unicode consortium: my favorite is "shadowy emoji overlords".
Bonus points to the first person who can find the one that refers to us as "part of a shameful plot to destroy the institution of marriage"...

Mark

*Il meglio è l'inimico del bene*

On Tue, Dec 16, 2014 at 6:36 PM, Asmus Freytag wrote:
> Everybody wants in on the act:
>
> http://mashable.com/2014/12/12/bill-nye-evolution-emoji/
>
> A./

From mark at macchiato.com Wed Dec 17 14:29:45 2014
From: mark at macchiato.com (Mark Davis ☕️)
Date: Wed, 17 Dec 2014 21:29:45 +0100
Subject: emoji are clearly the current meme fad
In-Reply-To: <7e30243013b8497cb7aa65521fa2b8c8@DFM-TK5MBX15-06.exchange.corp.microsoft.com>
References: <54906D85.5090900@ix.netcom.com> <7e30243013b8497cb7aa65521fa2b8c8@DFM-TK5MBX15-06.exchange.corp.microsoft.com>
Message-ID:

On Wed, Dec 17, 2014 at 9:03 PM, Murray Sargent <murrays at exchange.microsoft.com> wrote:
> http://www.theguardian.com/commentisfree/2014/nov/28/the-problem-with-emojis

Bingo, Murray wins the prize!

[image: Inline image 1]

Not to open until Christmas...

From everson at evertype.com Wed Dec 17 15:49:48 2014
From: everson at evertype.com (Michael Everson)
Date: Wed, 17 Dec 2014 16:49:48 -0500
Subject: emoji are clearly the current meme fad
In-Reply-To: <54906D85.5090900@ix.netcom.com>
References: <54906D85.5090900@ix.netcom.com>
Message-ID: <1BE269FB-D03F-4B7D-B328-B7A1A633F7A1@evertype.com>

Clearly the plural of emoji is emojis.

Michael Everson * http://www.evertype.com/

From everson at evertype.com Wed Dec 17 20:54:52 2014
From: everson at evertype.com (Michael Everson)
Date: Wed, 17 Dec 2014 21:54:52 -0500
Subject: emoji are clearly the current meme fad
In-Reply-To:
References: <54906D85.5090900@ix.netcom.com>
Message-ID: <0DE7EEA3-A4AD-413B-A030-DDCEA663A3F1@evertype.com>

On 17 Dec 2014, at 14:34, Mark Davis ☕️ wrote:
> We just had a new blog posting; we've moved the media list out of tr51, and the list already had that item on it. Bonus points to the first person who can find the one that refers to us as "part of a shameful plot to destroy the institution of marriage"...

Surely that is me, who contributed to Irish ballot comments that "COUPLE" be replaced by what came to be three couples holding hands. :-)

Michael Everson * http://www.evertype.com/

From duerst at it.aoyama.ac.jp Wed Dec 17 23:41:46 2014
From: duerst at it.aoyama.ac.jp ("Martin J. Dürst")
Date: Thu, 18 Dec 2014 14:41:46 +0900
Subject: emoji are clearly the current meme fad
In-Reply-To: <1BE269FB-D03F-4B7D-B328-B7A1A633F7A1@evertype.com>
References: <54906D85.5090900@ix.netcom.com> <1BE269FB-D03F-4B7D-B328-B7A1A633F7A1@evertype.com>
Message-ID: <5492691A.1060703@it.aoyama.ac.jp>

On 2014/12/18 06:49, Michael Everson wrote:
> Clearly the plural of emoji is emojis.

Not in Japanese, where there are no plural forms.

The question of what it is/will be in English will be decided by usage, not by grammar. I'd use 'emoji', but then I'm too biased towards Japanese to be relevant to make any predictions.

Regards, Martin.

From andrea.giammarchi at gmail.com Thu Dec 18 04:31:37 2014
From: andrea.giammarchi at gmail.com (Andrea Giammarchi)
Date: Thu, 18 Dec 2014 10:31:37 +0000
Subject: (R), (c) and ™
Message-ID:

Hello there,

I wonder if it's by accident that 00AE, 00A9, and 2122 are not listed as standard variant sensitive chars.

OSX seems to treat them as such, so adding FE0F will force them to be an image, but I know there are a few quirks in this behavior and I wonder if there should be an exception.

Thanks for any clarification on this.

Best Regards

From mark at macchiato.com Thu Dec 18 05:03:05 2014
From: mark at macchiato.com (Mark Davis ☕️)
Date: Thu, 18 Dec 2014 12:03:05 +0100
Subject: Re: (R), (c) and ™
In-Reply-To:
References:
Message-ID:

On Thu, Dec 18, 2014 at 11:31 AM, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
> standard variant sensitive

It is not clear what you mean by "standard variant sensitive". Can you elaborate?

Mark

*Il meglio è l'inimico del bene*

From jkorpela at cs.tut.fi Thu Dec 18 05:06:51 2014
From: jkorpela at cs.tut.fi (Jukka K. Korpela)
Date: Thu, 18 Dec 2014 13:06:51 +0200
Subject: (R), (c) and ™
In-Reply-To:
References:
Message-ID: <5492B54B.8080801@cs.tut.fi>

2014-12-18, 12:31, Andrea Giammarchi wrote:

> I wonder if it's by accident that 00AE, 00A9, and 2122 are not listed
> as standard variant sensitive chars.

Why would that be an accident any more than not listing 100,000 other characters there? Or to put it more constructively, why should they be listed? What glyph variation needs to be expressible in plain text?

> OSX seems to treat them as such, so adding FE0F will force them to be
> an image,

That does not sound correct. Variation selectors should either affect the choice of a glyph or be ignored, and their effects should be limited to characters designated to be affected by them.

> but I know there are a few quirks in this behavior

To me, the behavior as such sounds like a quirk.

Yucca

From andrea.giammarchi at gmail.com Thu Dec 18 05:09:49 2014
From: andrea.giammarchi at gmail.com (Andrea Giammarchi)
Date: Thu, 18 Dec 2014 11:09:49 +0000
Subject: Re: (R), (c) and ™
In-Reply-To:
References:
Message-ID:

Thanks Mark, I mean not listed anywhere here: http://unicode.org/Public/UNIDATA/StandardizedVariants.txt

I'd expect to find the following there:

00A9 FE0E; text style; # COPY RIGHT MARK
00A9 FE0F; emoji style; # COPY RIGHT MARK

for the simple reason that 00A9 is listed as emoji: http://www.unicode.org/Public/UNIDATA/EmojiSources.txt

Apparently there's no place that says FE0F should affect 00A9, nor a place that states the opposite: 00A9 FE0E as text.

Are my expectations wrong, or should these chars be handled any differently from other emoji?

Thanks
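
The two entries Andrea expected follow a `code points; description; # comment` layout. A minimal Python sketch of reading lines in that layout, for instance to check whether a given sequence is listed, might look like the following. The names `VariationEntry`, `parse_variation_line`, and `is_listed` are hypothetical, and the sample lines are the ones proposed in the message above, not actual content of StandardizedVariants.txt.

```python
from typing import Iterable, NamedTuple, Optional


class VariationEntry(NamedTuple):
    base: int         # base code point, e.g. 0x00A9
    selector: int     # variation selector, e.g. 0xFE0E
    description: str  # free-text description field


def parse_variation_line(line: str) -> Optional[VariationEntry]:
    """Parse one 'BASE SELECTOR; description; # comment' line.

    Returns None for blank lines, comment-only lines, and lines whose
    first field is not exactly two hexadecimal code points.
    """
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) < 2:
        return None
    points = fields[0].split()
    if len(points) != 2:
        return None
    try:
        base, selector = (int(point, 16) for point in points)
    except ValueError:
        return None
    return VariationEntry(base, selector, fields[1])


def is_listed(lines: Iterable[str], base: int, selector: int) -> bool:
    """Report whether the (base, selector) pair appears in the given lines."""
    for line in lines:
        entry = parse_variation_line(line)
        if entry is not None and (entry.base, entry.selector) == (base, selector):
            return True
    return False


# Hypothetical sample: the entries proposed in the thread, not real file data.
SAMPLE = [
    "# comment line",
    "00A9 FE0E; text style; # COPY RIGHT MARK",
    "00A9 FE0F; emoji style; # COPY RIGHT MARK",
]

print(is_listed(SAMPLE, 0x00A9, 0xFE0F))
print(is_listed(SAMPLE, 0x2122, 0xFE0F))
```

With the sample above, the first check finds the proposed 00A9 FE0F entry and the second finds nothing for 2122, which mirrors the question being asked: a checker built this way can only report what the data file actually lists.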

From mark at macchiato.com Thu Dec 18 05:25:27 2014
From: mark at macchiato.com (Mark Davis ☕️)
Date: Thu, 18 Dec 2014 12:25:27 +0100
Subject: Re: (R), (c) and ™
In-Reply-To:
References:
Message-ID:

Note that emoji ≠ present in http://www.unicode.org/Public/UNIDATA/EmojiSources.txt

It would probably be useful to read through http://www.unicode.org/reports/tr51/, which is where we are working on various aspects of emoji, in your case especially

- http://www.unicode.org/reports/tr51/#Identification
- http://www.unicode.org/reports/tr51/#Presentation_Style

There are charts attached to the TR that can also be reviewed (and commented on), such as http://www.unicode.org/Public/emoji/1.0/text-style.html

If you have feedback on the data (either supporting what is there, or recommending changes), you can submit your feedback via a link to Feedback (found at the top, and in the review notes for each of the sections).

We haven't yet made firm recommendations on the variation selectors or the default emoji style, so what is there is a fairly raw draft (but we are making progress; see https://plus.google.com/+MarkDavis/posts/MLqEc79yN22).

Personally, I think that if a character is in the recommended list for emoji, then:

- if the default style is text, we must have variation selectors.
- if the default style is emoji, then we should have variation selectors if it is in common use with a non-emoji presentation (typical for characters that have been in Unicode for a long time).

Mark

*Il meglio è l'inimico del bene*

From andrea.giammarchi at gmail.com Thu Dec 18 05:42:04 2014
From: andrea.giammarchi at gmail.com (Andrea Giammarchi)
Date: Thu, 18 Dec 2014 11:42:04 +0000
Subject: Re: (R), (c) and ™
In-Reply-To:
References:
Message-ID:

Thanks Mark, I know that emoji != present in there, but I need these formatted files for static analysis, so I've just copied and pasted from a source file that I use to perform some validation checks.

(As a side note, if these are incomplete I have problems... is there a more up-to-date source in the same format?)

However, back to the topic, I agree with your last two points and this is exactly the case:

- the default style is text ... because
- these have been in Unicode forever (even standard ASCII for two of them)

so I'd expect them, since they are somehow part of the emoji family, to be compatible with variation selectors.

If this becomes the common agreement, how long could it take to be effectively in the tr51 document?

Thanks again for all links and details

Best Regards

From leoboiko at namakajiri.net Thu Dec 18 05:42:13 2014
From: leoboiko at namakajiri.net (Leonardo Boiko)
Date: Thu, 18 Dec 2014 09:42:13 -0200
Subject: Re: (R), (c) and ™
In-Reply-To:
References:
Message-ID:

For the record, the emoji selection issue is also affecting the Google Talk/Hangouts web client, where U+2122 (trademark, ™), U+00AE (registered, ®), U+00A9 (copyright, ©), and U+2194 (left right arrow, ↔) seem to be treated as emoji and displayed in funky blue:

http://namakajiri.net/pics/screenshots/gmail_emouni.png

There are probably more I haven't discovered.
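
The display choice discussed in this thread comes down to whether a base character is followed by U+FE0E (VARIATION SELECTOR-15, text presentation) or U+FE0F (VARIATION SELECTOR-16, emoji presentation). A minimal Python sketch of building both sequences for the four characters listed above might look like this. The helper names are hypothetical, and the sketch only constructs the strings: whether a renderer honours the selector is up to the platform, and according to the thread these sequences were not listed in StandardizedVariants.txt at the time.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation


def strip_selectors(text: str) -> str:
    """Remove any VS15/VS16 so that a selector is never doubled up."""
    return text.replace(VS15, "").replace(VS16, "")


def with_presentation(char: str, emoji: bool) -> str:
    """Return `char` followed by the selector for the requested style."""
    return strip_selectors(char) + (VS16 if emoji else VS15)


def code_points(text: str) -> str:
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


# Copyright, registered, trademark, left right arrow.
for base in ("\u00A9", "\u00AE", "\u2122", "\u2194"):
    text_form = with_presentation(base, emoji=False)
    emoji_form = with_presentation(base, emoji=True)
    print(code_points(text_form), "|", code_points(emoji_form))
```

Stripping existing selectors first is a design choice for the sketch: it makes `with_presentation` safe to call on input that already carries a selector, such as text copied from a client that appends U+FE0F.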

From jonathan.rosenne at gmail.com Thu Dec 18 06:14:07 2014
From: jonathan.rosenne at gmail.com (Jonathan Rosenne)
Date: Thu, 18 Dec 2014 14:14:07 +0200
Subject: RE: (R), (c) and ™
In-Reply-To:
References:
Message-ID: <002a01d01abc$20ee8220$62cb8660$@gmail.com>

To pick a nit, it should be COPYRIGHT rather than COPY RIGHT.

Best Regards,

Jonathan Rosenne

From andrea.giammarchi at gmail.com Thu Dec 18 06:28:33 2014
From: andrea.giammarchi at gmail.com (Andrea Giammarchi)
Date: Thu, 18 Dec 2014 12:28:33 +0000
Subject: Re: (R), (c) and ™
In-Reply-To: <002a01d01abc$20ee8220$62cb8660$@gmail.com>
References: <002a01d01abc$20ee8220$62cb8660$@gmail.com>
Message-ID:

yeahright :D

On Thu, Dec 18, 2014 at 12:14 PM, Jonathan Rosenne <jonathan.rosenne at gmail.com> wrote:
> To pick a nit, it should be COPYRIGHT rather than COPY RIGHT.

From richard.wordingham at ntlworld.com Thu Dec 18 06:54:24 2014
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Thu, 18 Dec 2014 12:54:24 +0000
Subject: (R), (c) and ™
In-Reply-To:
References:
Message-ID: <20141218125424.365bbc24@JRWUBU2>

On Thu, 18 Dec 2014 09:42:13 -0200 Leonardo Boiko wrote:

> For the record, the emoji selection issue is also affecting the Google
> Talk/Hangouts web client, where U+2122 (trademark, ™), U+00AE
> (registered, ®), U+00A9 (copyright, ©), and U+2194 (left right arrow,
> ↔) seem to be treated as emoji and displayed in funky blue:
>
> http://namakajiri.net/pics/screenshots/gmail_emouni.png

Is there any reason why one should wish to select between sober and funky displays of U+2122 TRADEMARK, U+00AE REGISTERED, and U+00A9 COPYRIGHT character instance by character instance? I confess I don't understand why U+2194 should be subjected to special treatment.

Richard.

From everson at evertype.com Thu Dec 18 09:19:21 2014
From: everson at evertype.com (Michael Everson)
Date: Thu, 18 Dec 2014 10:19:21 -0500
Subject: emoji are clearly the current meme fad
In-Reply-To: <5492691A.1060703@it.aoyama.ac.jp>
References: <54906D85.5090900@ix.netcom.com> <1BE269FB-D03F-4B7D-B328-B7A1A633F7A1@evertype.com> <5492691A.1060703@it.aoyama.ac.jp>
Message-ID:

On 18 Dec 2014, at 00:41, Martin J. Dürst wrote:

> On 2014/12/18 06:49, Michael Everson wrote:
>> Clearly the plural of emoji is emojis.
>
> Not in Japanese, where there are no plural forms.

I'm not concerned with the usage in Japanese.

> The question of what it is/will be in English will be decided by usage, not by grammar.

Bill Nye says emojis. I had coffee with someone yesterday who did the same. Seems usage has decided it, and "not by grammar" is meaningless in this context, as pluralization of neologisms in English by adding -s is in fact grammar.

Michael Everson * http://www.evertype.com/

From andrea.giammarchi at gmail.com Thu Dec 18 09:36:35 2014
From: andrea.giammarchi at gmail.com (Andrea Giammarchi)
Date: Thu, 18 Dec 2014 15:36:35 +0000
Subject: Re: (R), (c) and ™
In-Reply-To: <20141218125424.365bbc24@JRWUBU2>
References: <20141218125424.365bbc24@JRWUBU2>
Message-ID:

I'd say highly subjective, generally speaking, but the problem here is that there are graphic representations of those and it's not clear when these should be preferred over a plain text representation.

Hence my initial question whether it was by accident that those chars got in (and yeah, maybe an early mistake or something ... I hope we can fix them now in a more explicit way).

Best Regards

From doug at ewellic.org Thu Dec 18 14:36:55 2014
From: doug at ewellic.org (Doug Ewell)
Date: Thu, 18 Dec 2014 13:36:55 -0700
Subject: (R), (c) and ?
Message-ID: <20141218133654.665a7a7059d7ee80bb4d670165c8327d.d172fa449c.wbe@email03.secureserver.net>

Richard Wordingham wrote:

> Is there any reason why one should wish to select between sober and
> funky displays of U+2122 TRADEMARK, U+00AE REGISTERED, and U+00A9
> COPYRIGHT character instance by character instance?

I have to agree. If these characters need special cute emoji representations in arbitrary text, is there such a thing as a character with a visible glyph that does not?

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org

From wjgo_10009 at btinternet.com Tue Dec 23 11:57:33 2014
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Tue, 23 Dec 2014 17:57:33 +0000 (GMT)
Subject: Unicode encoding policy
Message-ID: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost>

Unicode encoding policy

There is a document.

http://www.unicode.org/L2/L2014/14250.htm

Within the document, the following are interesting items.

E.1.7 Emoji Additions: popular requests [Edberg, Davis, L2/14-272]

Discussion. UTC took no action at this time.

Later, in the same document is the following.

E.1.7 Emoji Additions: popular requests [Edberg, Davis, L2/14-272R]

[141-C6] Consensus: Add the block U+1F900..U+1F9FF Supplemental Symbols and Pictographs for Unicode version 8.0.

The referenced document contains links to various requests and petitions for additional emoji characters.

In the referenced document, within section C, is the following.

5. Are the proposed characters in current use by the user community?

No

----

This appears to be a major change in encoding policy.

This, in my opinion, is a welcome, progressive change in policy that allows new characters for use in a pure electronic technology to be added into regular Unicode without a requirement to first establish widespread use by using an encoding within a Unicode Private Use Area.

I feel that it is now therefore possible to seek encoding of symbols, perhaps in abstract emoji format and semi-abstract emoji format, so as to implement a system for communication through the language barrier by whole localizable sentences, with that system designed by interested people without the need to produce any legacy data that is encoded using an encoding within a Unicode Private Use Area.

A first draft petition could be produced and then later drafts developed by consensus and, when drafting has produced a document for an initial core system, a petition could be submitted to the Unicode Technical Committee.

Once in use, the system could have additional symbols added to it, gradually, so as to expand its capabilities as needs are identified.

So I am writing to ask if people on this mailing list would be interested in discussing and perhaps encouraging and participating in the development of this system please?

William Overington

23 December 2014

From doug at ewellic.org Tue Dec 23 15:51:12 2014
From: doug at ewellic.org (Doug Ewell)
Date: Tue, 23 Dec 2014 14:51:12 -0700
Subject: Unicode encoding policy
Message-ID: <20141223145112.665a7a7059d7ee80bb4d670165c8327d.ed57b6552e.wbe@email03.secureserver.net>

William_J_G Overington wrote:

> 5. Are the proposed characters in current use by the user community?
> No
> ----
> This appears to be a major change in encoding policy.
> This, in my opinion, is a welcome, progressive change in policy that > allows new characters for use in a pure electronic technology to be > added into regular Unicode without a requirement to first establish > widespread use by using an encoding within a Unicode Private Use Area. It is exactly the change I was worried about, the precedent I was afraid would be set. > I feel that it is now therefore possible to seek encoding of symbols, > perhaps in abstract emoji format and semi-abstract emoji format, so as > to implement a system for communication through the language barrier > by whole localizable sentences, with that system designed by > interested people without the need to produce any legacy data that is > encoded using an encoding within a Unicode Private Use Area. Sadly, I can no longer state with any confidence that such a proposal is out of scope for Unicode, as I tried to do for a decade or more. -- Doug Ewell | Thornton, CO, USA | http://ewellic.org From eik at iki.fi Tue Dec 23 16:02:06 2014 From: eik at iki.fi (Erkki I Kolehmainen) Date: Wed, 24 Dec 2014 00:02:06 +0200 Subject: VS: Unicode encoding policy In-Reply-To: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost> References: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost> Message-ID: <000001d01efc$179d96e0$46d8c4a0$@fi> Mr. Overington, The question of support for localizable sentences has been raised by you on several occasions. For a number of valid reasons, It has never received any noticeable support, let alone the kind of support that you are asking for now. Sincerely, Erkki I. Kolehmainen Tilkankatu 12 A 3, 00300 Helsinki, Finland Mob: +358400825943, Tel / Fax (by arr.): +358943682643 L?hett?j?: Unicode [mailto:unicode-bounces at unicode.org] Puolesta William_J_G Overington L?hetetty: 23. joulukuuta 2014 19:58 Vastaanottaja: unicode at unicode.org Aihe: Unicode encoding policy Unicode encoding policy There is a document. 
http://www.unicode.org/L2/L2014/14250.htm Within the document, the following are interesting items. E.1.7 Emoji Additions: popular requests [Edberg, Davis, L2/14-272] Discussion. UTC took no action at this time. Later, in the same document is the following. E.1.7 Emoji Additions: popular requests [Edberg, Davis, L2/14-272R] [141-C6] Consensus: Add the block U+1F900..U+1F9FF Supplemental Symbols and Pictographs for Unicode version 8.0. The referenced document contains links to various requests and petitions for additional emoji characters. In the referenced document, within section C, is the following. 5. Are the proposed characters in current use by the user community? No ---- This appears to be a major change in encoding policy. This, in my opinion, is a welcome, progressive change in policy that allows new characters for use in a pure electronic technology to be added into regular Unicode without a requirement to first establish widespread use by using an encoding within a Unicode Private Use Area. I feel that it is now therefore possible to seek encoding of symbols, perhaps in abstract emoji format and semi-abstract emoji format, so as to implement a system for communication through the language barrier by whole localizable sentences, with that system designed by interested people without the need to produce any legacy data that is encoded using an encoding within a Unicode Private Use Area. A first draft petition could be produced and then later drafts developed by consensus and, when drafting has produced a document for an initial core system then a petition could be submitted to the Unicode Technical Committee. Once in use, the system could have additional symbols added to it, gradually, so as to expand its capabilities as needs are identified. So I am writing to ask if people on this mailing list would be interested in discussing and perhaps encouraging and participating in the development of this system please? 
William Overington 23 December 2014 -------------- next part -------------- An HTML attachment was scrubbed... URL: From textexin at xencraft.com Tue Dec 23 18:50:25 2014 From: textexin at xencraft.com (Tex Texin) Date: Tue, 23 Dec 2014 16:50:25 -0800 Subject: Unicode encoding policy In-Reply-To: <000001d01efc$179d96e0$46d8c4a0$@fi> References: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost> <000001d01efc$179d96e0$46d8c4a0$@fi> Message-ID: <007301d01f13$9bb42ae0$d31c80a0$@xencraft.com> True, however as William points out, apparently the rules have changed, so it isn't unreasonable to ask again whether the rules now allow it, or if people that dismissed the idea in the past would now consider it. Personally, I think this is the wrong place for it, and as has been suggested numerous times, it makes sense to host the discussion elsewhere among interested parties. Although, I am not interested in the general case, there is a need for specialized cases. Just as some road sign symbols are near universal, there is a need for symbols for quick and universal communications in emergencies. Identifying places of safety or danger on a map, or for the injured to describe symptoms, pains, and the nature of their injury (or first aid workers to discuss victims' issues), or to describe the nature of a calamity (fire, landslide, bomb, attack, etc.), etc. William, You might consider identifying where there are needs for such universal text, and working with groups that would benefit, to get support for universal text symbols. tex From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Erkki I Kolehmainen Sent: Tuesday, December 23, 2014 2:02 PM To: wjgo_10009 at btinternet.com; unicode at unicode.org Subject: VS: Unicode encoding policy Mr. Overington, The question of support for localizable sentences has been raised by you on several occasions.
For a number of valid reasons, it has never received any noticeable support, let alone the kind of support that you are asking for now. Sincerely, Erkki I. Kolehmainen Tilkankatu 12 A 3, 00300 Helsinki, Finland Mob: +358400825943, Tel / Fax (by arr.): +358943682643 From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of William_J_G Overington Sent: 23 December 2014 19:58 To: unicode at unicode.org Subject: Unicode encoding policy Unicode encoding policy There is a document. http://www.unicode.org/L2/L2014/14250.htm Within the document, the following are interesting items. E.1.7 Emoji Additions: popular requests [Edberg, Davis, L2/14-272] Discussion. UTC took no action at this time. Later, in the same document is the following. E.1.7 Emoji Additions: popular requests [Edberg, Davis, L2/14-272R] [141-C6] Consensus: Add the block U+1F900..U+1F9FF Supplemental Symbols and Pictographs for Unicode version 8.0. The referenced document contains links to various requests and petitions for additional emoji characters. In the referenced document, within section C, is the following. 5. Are the proposed characters in current use by the user community? No ---- This appears to be a major change in encoding policy. This, in my opinion, is a welcome, progressive change in policy that allows new characters for use in a pure electronic technology to be added into regular Unicode without a requirement to first establish widespread use by using an encoding within a Unicode Private Use Area. I feel that it is now therefore possible to seek encoding of symbols, perhaps in abstract emoji format and semi-abstract emoji format, so as to implement a system for communication through the language barrier by whole localizable sentences, with that system designed by interested people without the need to produce any legacy data that is encoded using an encoding within a Unicode Private Use Area.
A first draft petition could be produced and then later drafts developed by consensus and, when drafting has produced a document for an initial core system then a petition could be submitted to the Unicode Technical Committee. Once in use, the system could have additional symbols added to it, gradually, so as to expand its capabilities as needs are identified. So I am writing to ask if people on this mailing list would be interested in discussing and perhaps encouraging and participating in the development of this system please? William Overington 23 December 2014 -------------- next part -------------- An HTML attachment was scrubbed... URL: From duerst at it.aoyama.ac.jp Tue Dec 23 20:55:10 2014 From: duerst at it.aoyama.ac.jp (=?UTF-8?B?Ik1hcnRpbiBKLiBEw7xyc3Qi?=) Date: Wed, 24 Dec 2014 11:55:10 +0900 Subject: Unicode encoding policy In-Reply-To: <007301d01f13$9bb42ae0$d31c80a0$@xencraft.com> References: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost> <000001d01efc$179d96e0$46d8c4a0$@fi> <007301d01f13$9bb42ae0$d31c80a0$@xencraft.com> Message-ID: <549A2B0E.7060909@it.aoyama.ac.jp> On 2014/12/24 09:50, Tex Texin wrote: > True, however as William points out, apparently the rules have changed, I hope the rules get clarified to clearly state that these are exceptions. > so it isn't unreasonable to ask again whether the rules now allow it, or if people that dismissed the idea in the past would now consider it. > > > > Personally, I think this is the wrong place for it, and as has been suggested numerous times, it makes sense to host the discussion elsewhere among interested parties. > > > > Although, I am not interested in the general case, there is a need for specialized cases. Just as some road sign symbols are near universal, Actually not. I have been driving (and taking drivers' licences tests) in Switzerland, Japan, and the US.
There are lots of similarities, but it'd be difficult for me to come up with an example where they are all identical (up to glyph/design differences). Please see for yourself e.g. at: https://en.wikipedia.org/wiki/Road_signs_in_Switzerland http://www.japandriverslicense.com/japanese-road-signs.asp https://en.wikipedia.org/wiki/Road_signs_in_the_United_States In the US, there are also differences by state. > there is a need for symbols for quick and universal communications in emergencies. Identifying places of safety or danger on a map, or for the injured to describe symptoms, pains, and the nature of their injury (or first aid workers to discuss victims? issues), or to describe the nature of a calamity (fire, landslide, bomb, attack, etc.), etc. Such symbols mostly already exist. For a quick and easy introduction, see e.g. http://www.iso.org/iso/graphical-symbols_booklet.pdf. If use of such symbols is found in running text, or if there is a strong need to use them in running text, some of these might be added to Unicode in the future. But they wouldn't be things invented out of the blue for marketing purposes, they would be well established already. > William, You might consider identifying where there are needs for such universal text, and working with groups that would benefit, to get support for universal text symbols. So the first order of business for William (or others) should be to investigate what's already around. Regards, Martin. From asmusf at ix.netcom.com Wed Dec 24 00:08:32 2014 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Tue, 23 Dec 2014 22:08:32 -0800 Subject: Unicode encoding policy In-Reply-To: <20141223145112.665a7a7059d7ee80bb4d670165c8327d.ed57b6552e.wbe@email03.secureserver.net> References: <20141223145112.665a7a7059d7ee80bb4d670165c8327d.ed57b6552e.wbe@email03.secureserver.net> Message-ID: <549A5860.90704@ix.netcom.com> On 12/23/2014 1:51 PM, Doug Ewell wrote: > William_J_G Overington > wrote: > >> 5. 
Are the proposed characters in current use by the user community? >> No >> ---- >> This appears to be a major change in encoding policy. >> This, in my opinion, is a welcome, progressive change in policy that >> allows new characters for use in a pure electronic technology to be >> added into regular Unicode without a requirement to first establish >> widespread use by using an encoding within a Unicode Private Use Area. > It is exactly the change I was worried about, the precedent I was afraid > would be set. Requiring long-term use of characters at an alternate code location always struck me as counter-productive, because it becomes disruptive at the point where some character finally has been established. In contrast to true "experimental" use. Therefore, recognizing that for some code points there can be a critical mass of implementation support straight from the moment of publication is useful. This is definitely not the same as saying that any idea, however half-baked, of a new symbol should be encoded 'on-spec' to see whether it garners usage. The "critical mass" of support is now assumed for currency symbols, some special symbols like emoji, and should be granted to additional types of symbols, punctuation and letters, whenever there is an "authority" that controls normative orthography or notation. Whether this is for an orthography reform in some country or addition to the standard math symbols supported by AMS journals, such external adoption can signify immediate "critical need" and "critical mass of adoption" for the relevant characters. In these cases, to require years of PUA code usage is, to repeat, counterproductive. It doesn't alter the fact that the codes will eventually be needed (unless one were to confidently expect failure of some reform) and only leads to the creation of data in the meantime that have to be converted or cannot be accessed reliably.
A clear-cut recognition by the UTC (and WG2) of this particular dynamic (beyond currency codes) would be helpful -- particularly as Unicode has matured to the point of being the only game in town. The current methodology of researching typeset data is well suited to the encoding of existing or historic practice, but ill-suited to dealing with ongoing development of scripts and symbol sets. Taking this new stance makes it easier to contrast it with hobbyists', enthusiasts' and individual tinkerers' attempts at inventing a better world through symbols or new letters. These latter cases lack both "critical need" as well as "critical mass" unless they are first adopted by much larger (and/or more authoritative) groups of users. There is an inherent risk that large groups of users can follow "fads" that require certain symbols that see huge usage for a while and then get abandoned. While this is hard to predict, it is not that different from historical changes in writing systems - even if the trends there played out over longer time frames. A./ > >> I feel that it is now therefore possible to seek encoding of symbols, >> perhaps in abstract emoji format and semi-abstract emoji format, so as >> to implement a system for communication through the language barrier >> by whole localizable sentences, with that system designed by >> interested people without the need to produce any legacy data that is >> encoded using an encoding within a Unicode Private Use Area. > Sadly, I can no longer state with any confidence that such a proposal is > out of scope for Unicode, as I tried to do for a decade or more.
> > -- > Doug Ewell | Thornton, CO, USA | http://ewellic.org > > > _______________________________________________ > Unicode mailing list > Unicode at unicode.org > http://unicode.org/mailman/listinfo/unicode > From asmusf at ix.netcom.com Wed Dec 24 00:46:49 2014 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Tue, 23 Dec 2014 22:46:49 -0800 Subject: Unicode encoding policy In-Reply-To: <007301d01f13$9bb42ae0$d31c80a0$@xencraft.com> References: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost> <000001d01efc$179d96e0$46d8c4a0$@fi> <007301d01f13$9bb42ae0$d31c80a0$@xencraft.com> Message-ID: <549A6159.4000902@ix.netcom.com> An HTML attachment was scrubbed... URL: From wjgo_10009 at btinternet.com Wed Dec 24 02:14:05 2014 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Wed, 24 Dec 2014 08:14:05 +0000 (GMT) Subject: Unicode encoding policy In-Reply-To: <007301d01f13$9bb42ae0$d31c80a0$@xencraft.com> References: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost> <000001d01efc$179d96e0$46d8c4a0$@fi> <007301d01f13$9bb42ae0$d31c80a0$@xencraft.com> Message-ID: <2682219.3123.1419408845180.JavaMail.defaultUser@defaultHost> Hi Tex Thank you for replying. The following four simulations are about seeking information, through the language barrier, about relatives and friends after a disaster. http://www.users.globalnet.co.uk/~ngo/locse027_four_simulations.pdf http://www.users.globalnet.co.uk/~ngo/library.htm I am now adapting the designs of the symbols to be more emoji-like in appearance, namely a 1:1 aspect ratio and designed for clear viewing on a mobile device.
William 24 December 2014 ----Original message---- From : textexin at xencraft.com Date : 24/12/2014 - 00:50 (GMTST) To : eik at iki.fi, wjgo_10009 at btinternet.com, unicode at unicode.org Subject : RE: Unicode encoding policy True, however as William points out, apparently the rules have changed, so it isn't unreasonable to ask again whether the rules now allow it, or if people that dismissed the idea in the past would now consider it. Personally, I think this is the wrong place for it, and as has been suggested numerous times, it makes sense to host the discussion elsewhere among interested parties. Although, I am not interested in the general case, there is a need for specialized cases. Just as some road sign symbols are near universal, there is a need for symbols for quick and universal communications in emergencies. Identifying places of safety or danger on a map, or for the injured to describe symptoms, pains, and the nature of their injury (or first aid workers to discuss victims' issues), or to describe the nature of a calamity (fire, landslide, bomb, attack, etc.), etc. William, You might consider identifying where there are needs for such universal text, and working with groups that would benefit, to get support for universal text symbols. tex -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil at tonal.clara.co.uk Thu Dec 25 08:14:27 2014 From: neil at tonal.clara.co.uk (Neil Harris) Date: Thu, 25 Dec 2014 14:14:27 +0000 Subject: Admuncher javascript on Unicode site Message-ID: <549C1BC3.3010706@tonal.clara.co.uk> I've just noticed that loading the web page http://www.unicode.org/L2/L2014/14250.htm loads a script from "interceptedby.admuncher.com" This seems pretty peculiar to me. Is this intended?
Neil Harris From rick at unicode.org Thu Dec 25 11:47:21 2014 From: rick at unicode.org (Rick McGowan) Date: Thu, 25 Dec 2014 09:47:21 -0800 Subject: Admuncher javascript on Unicode site In-Reply-To: <549C1BC3.3010706@tonal.clara.co.uk> References: <549C1BC3.3010706@tonal.clara.co.uk> Message-ID: <549C4DA9.6020006@unicode.org> Thank you for the report. This is an error on my part: saving an HTML file from a browser window on my own machine while running an ad blocker. I usually don't do that. I will correct this file and update it as soon as I have an opportunity. Regards, Rick On 12/25/2014 6:14 AM, Neil Harris wrote: > I've just noticed that loading the web page > > http://www.unicode.org/L2/L2014/14250.htm > > loads a script from "interceptedby.admuncher.com" > > This seems pretty peculiar to me. Is this intended? > > Neil Harris > From asmusf at ix.netcom.com Thu Dec 25 17:31:53 2014 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Thu, 25 Dec 2014 15:31:53 -0800 Subject: Admuncher javascript on Unicode site In-Reply-To: <549C1BC3.3010706@tonal.clara.co.uk> References: <549C1BC3.3010706@tonal.clara.co.uk> Message-ID: <549C9E69.30300@ix.netcom.com> Neil, I don't think that is true for anyone but you. The "ghostery" add-on for FF does not show anything being loaded, and the page source also does not included the string "admuncher". Is this an add-on that you are using? A./ On 12/25/2014 6:14 AM, Neil Harris wrote: > I've just noticed that loading the web page > > http://www.unicode.org/L2/L2014/14250.htm > > loads a script from "interceptedby.admuncher.com" > > This seems pretty peculiar to me. Is this intended? 
> > Neil Harris > > _______________________________________________ > Unicode mailing list > Unicode at unicode.org > http://unicode.org/mailman/listinfo/unicode > From shervinafshar at gmail.com Thu Dec 25 18:28:04 2014 From: shervinafshar at gmail.com (Shervin Afshar) Date: Thu, 25 Dec 2014 16:28:04 -0800 Subject: Admuncher javascript on Unicode site In-Reply-To: <549C9E69.30300@ix.netcom.com> References: <549C1BC3.3010706@tonal.clara.co.uk> <549C9E69.30300@ix.netcom.com> Message-ID: I think Rick fixed the issue in the meantime. ? Shervin On Thu, Dec 25, 2014 at 3:31 PM, Asmus Freytag wrote: > Neil, > > I don't think that is true for anyone but you. > > The "ghostery" add-on for FF does not show anything being loaded, and the > page source also does not included the string "admuncher". > > Is this an add-on that you are using? > > A./ > > On 12/25/2014 6:14 AM, Neil Harris wrote: > >> I've just noticed that loading the web page >> >> http://www.unicode.org/L2/L2014/14250.htm >> >> loads a script from "interceptedby.admuncher.com" >> >> This seems pretty peculiar to me. Is this intended? >> >> Neil Harris >> >> _______________________________________________ >> Unicode mailing list >> Unicode at unicode.org >> http://unicode.org/mailman/listinfo/unicode >> >> > _______________________________________________ > Unicode mailing list > Unicode at unicode.org > http://unicode.org/mailman/listinfo/unicode > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wjgo_10009 at btinternet.com Mon Dec 29 04:51:05 2014 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Mon, 29 Dec 2014 10:51:05 +0000 (GMT) Subject: Unicode encoding policy In-Reply-To: <000001d01efc$179d96e0$46d8c4a0$@fi> References: <27378088.42099.1419357453536.JavaMail.defaultUser@defaultHost> <000001d01efc$179d96e0$46d8c4a0$@fi> Message-ID: <7217620.7898.1419850265102.JavaMail.defaultUser@defaultHost> Erkki I. 
Kolehmainen wrote as follows: > The question of support for localizable sentences has been raised by you on several occasions. For a number of valid reasons, it has never received any noticeable support, let alone the kind of support that you are asking for now. Well, it is true that there has been little interest, though one man kindly translated the early sentences into Swedish, which has been of great help, and he also suggested an additional sentence, which is now part of the system. The lack of interest has always puzzled me; I had thought that with so many people on this mailing list who are interested in languages and communication, including many people who have a native language other than English, there would be great interest in trying to produce a useful system. Now that there is the precedent of the encoding of the unicorn, perhaps things will be different, as there is now the prospect of being able to develop and standardize a non-proprietary system that can be put into place for people to use without needing to first achieve either a widespread non-standardized implementation or a change in the rules just for this system. Regarding your claim about valid reasons. Could you possibly say what you consider to be the valid reasons please? William Overington 29 December 2014 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From doug at ewellic.org Mon Dec 29 12:32:59 2014 From: doug at ewellic.org (Doug Ewell) Date: Mon, 29 Dec 2014 11:32:59 -0700 Subject: Unicode encoding policy In-Reply-To: <549A5860.90704@ix.netcom.com> References: <20141223145112.665a7a7059d7ee80bb4d670165c8327d.ed57b6552e.wbe@email03.secureserver.net> <549A5860.90704@ix.netcom.com> Message-ID: <5D04559180924E12B9C02E93DB063BFD@DougEwell> Asmus Freytag wrote: > The "critical mass" of support is now assumed for currency symbols, > some special symbols like emoji, and should be granted to additional > types of symbols, punctuations and letters, whenever there is an > "authority" that controls normative orthography or notation. > > Whether this is for an orthography reform in some country or addition > to the standard math symbols supported by AMS journals, such external > adoption can signify immediate "critical need" and "critical mass of > option" for the relevant characters. To me, it is remarkable that the "critical mass of support" argument that is applied, entirely appropriately, to new currency symbols (however misguided the motives for such might be) and math symbols and characters for people's names, is now also applied to BURRITO and UNICORN FACE. But then, I remember when folks used to cite the WG2 "Principles and Procedures" document for examples of what was and was not a good candidate for encoding. That seems so long ago now. -- Doug Ewell | Thornton, CO, USA | http://ewellic.org ? 
From doug at ewellic.org Mon Dec 29 13:00:27 2014 From: doug at ewellic.org (Doug Ewell) Date: Mon, 29 Dec 2014 12:00:27 -0700 Subject: Unicode encoding policy In-Reply-To: References: Message-ID: William_J_G Overington wrote: > The lack of interest has always puzzled me, I had thought that with so > many people on this mailing list who are interested in languages and > communication, including many people who have a native language other > than English, that there would be great interest in trying to produce > a useful system. I had a similar discussion some time ago with a member of this list regarding encoding of flags. It's an interesting idea which I think deserves some thought, but it's not character encoding; and therefore it doesn't belong in Unicode, or so I would have supposed. I make no claim here about whether localizable sentences are interesting or deserving of thought. I only explain why I, interested in language and communication, don't believe Unicode is the proper venue for them. > Regarding your claim about valid reasons. > > Could you possibly say what you consider to be the valid reasons > please? I'm not Erkki, but what I would have said, with my old-fashioned view of character encoding, is: because it's not character encoding. -- Doug Ewell | Thornton, CO, USA | http://ewellic.org ? 
From asmusf at ix.netcom.com Mon Dec 29 13:46:34 2014 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Mon, 29 Dec 2014 11:46:34 -0800 Subject: Unicode encoding policy In-Reply-To: <5D04559180924E12B9C02E93DB063BFD@DougEwell> References: <20141223145112.665a7a7059d7ee80bb4d670165c8327d.ed57b6552e.wbe@email03.secureserver.net> <549A5860.90704@ix.netcom.com> <5D04559180924E12B9C02E93DB063BFD@DougEwell> Message-ID: <54A1AF9A.903@ix.netcom.com> On 12/29/2014 10:32 AM, Doug Ewell wrote: > Asmus Freytag wrote: > >> The "critical mass" of support is now assumed for currency symbols, >> some special symbols like emoji, and should be granted to additional >> types of symbols, punctuations and letters, whenever there is an >> "authority" that controls normative orthography or notation. >> >> Whether this is for an orthography reform in some country or addition >> to the standard math symbols supported by AMS journals, such external >> adoption can signify immediate "critical need" and "critical mass of >> option" for the relevant characters. > > To me, it is remarkable that the "critical mass of support" argument > that is applied, entirely appropriately, to new currency symbols > (however misguided the motives for such might be) and math symbols and > characters for people's names, is now also applied to BURRITO and > UNICORN FACE. > Does it - in principle - matter what a symbol is used for? If millions of happy users choose to communicate by peppering their messages with BURRITO and UNICORN FACE is that any less worthy of standardization than if thousands (or hundreds) of linguists use some arcane letterform to mark pronunciation differences between neighboring dialects on the Scandinavian peninsula? 
The "critical mass" argument does not (and should not) make value judgements, but instead focuses on whether the infrastructure exists to make a character code widely available pretty much directly after publication, and whether there is implicit or explicit demand that would guarantee that such code is actually widely used the minute it becomes available. For currency symbols, or for a new letter form demanded by a new or revised, but standard, orthography, the demand is created by some "authority" creating a requirement for conforming users. Because of that, the evaluation of the "critical mass" requirement is straightforward. Emoji lack an "authority", but they do not lack demand. For better or for worse, they have grabbed significant mind share; the number of news reports, blogs, social media posts, shared videos and what not that were devoted to Emoji simply dwarfs anything reported on currency symbols in a comparable time frame. With tracking applications devoted to them, anyone can convince themselves, in real time, that the entire repertoire is being used, even, as appropriate for such a collection, with a clear differentiation by frequency. Nevertheless, the indication is clear that any emoji that will be added by the relevant vendors is going to be used as soon as it becomes available. Further, as no vendor has a closed ecosystem, to be usable requires agreement on how they are coded. The critical question, and I fully understand that this gives you pause, is one of selection. There are hundreds, if not thousands, of potential additions to the emoji collection; some fear the set is, in principle, endless. Lacking an "authority", how does one come to a principled agreement on encoding any emoji now, rather than later?
One could run an experiment, which is to say, create an alternate environment where users can use non-standard emoji and then the Uni-scientists in white lab coats could count the frequency of usage and promote the cream off the top to standardized codes. Or one could run an experiment where one defines a small number of slots, say 40, and opens them up for public discussion, and proceeds on that basis. Yes, that would turn the UTC into the "authority". My personal take is that the former approach is inappropriate for something that is in high demand and actively supported; the latter I can accept, provisionally, as an experiment to try to deal with an evolving system. Because of the ability to track, in real time, the use or non-use of any of the new additions it would be a true experiment, the outcome of which can be accurately measured. If it should lead to the standardization of a few dozen symbols that prove not as popular as predicted, then we would conclude a failure of the experiment, and retire this process. Otherwise, I'd have no problem cautiously continuing with it. > But then, I remember when folks used to cite the WG2 "Principles and > Procedures" document for examples of what was and was not a good > candidate for encoding. That seems so long ago now. The P&P, like most by-laws and constitutions, are living documents. In this case, they try to capture best practice, without taking from the UTC (or WG2) the ability to deal with new or changed situations. The degree to which emoji have captured the popular imagination is unprecedented. It means the game has changed. Let's give the UTC the space to work out appropriate coping mechanisms. A./ PS: this does not mean that, for all other types of code points, the existing wording in the P&P can simply be disregarded. In fact, the end result will be to see them updated with additional criteria explicitly geared towards the kind of high-profile use case we are discussing here.
From verdy_p at wanadoo.fr Wed Dec 31 01:36:18 2014 From: verdy_p at wanadoo.fr (Philippe Verdy) Date: Wed, 31 Dec 2014 08:36:18 +0100 Subject: Unicode encoding policy In-Reply-To: <54A1AF9A.903@ix.netcom.com> References: <20141223145112.665a7a7059d7ee80bb4d670165c8327d.ed57b6552e.wbe@email03.secureserver.net> <549A5860.90704@ix.netcom.com> <5D04559180924E12B9C02E93DB063BFD@DougEwell> <54A1AF9A.903@ix.netcom.com> Message-ID: One important factor is also stability: some symbols may attract temporary interest and then be rapidly abandoned for a new flavor, hardly related to the previously encoded one. Stability also matters because UTC resources and work time are limited and need to focus on things that have already been awaited for a long time (even if there were some difficult discussions, notably when trying to deal with variants and different usage patterns, or in more complex situations discovered with difficulty, such as text layout, the creation of distinctive or contextual ligatures, critical character properties such as word boundaries, or expected specific alignments with other characters, including those of other scripts). For that the UTC has a useful tool: the roadmap, which attempts to organize the standardization work by topics and communities of interest, in order to avoid duplicate discussions and to create coherent proposals that will also withstand future additions. Emoji, however, exist only in relation to themselves, and their coherence really comes from their adoption on a range of devices, OSes and common applications.
Large vendors (like Google for Android, Apple for iOS, and Microsoft for Windows, but also some well-known websites connected to many others, like Yahoo, Twitter or Facebook and their supported applications running on various OSes and devices, or Baidu in China, or Mail.ru in Russia) also want to open up their own sets to offer support to users communicating from devices/OSes made by other vendors, including in other countries. There could also be other "killer" apps made available on various OSes and devices which could benefit from this standardization, such as keyboard extensions for smartphones/tablets, or sets of generic icons commonly needed for user interfaces (e.g. the icons that appear in Gmail for rich text editing or for managing emails and folders; people want to be able to use similar-looking icons even if their exact design changes, simply because websites and support services will frequently reference them and people will want to discuss their use in various contexts; the same is true of typical icons found on popular navigation maps). People will understand those icons/symbols and will use them because they understand clearly what they mean in similar kinds of usage.
Those symbols are good candidates for standardization independently of their site-specific or device-specific look, which can also evolve across versions, as with the symbols for the buttons at the bottom of Android displays. Having a standardized character for these evolving icons can also help application authors describe their own UI and how to use it on a larger range of devices and versions: users will see the appropriate icon for their own local device in its current brand and version, and support pages do not need to be rewritten or modified to show different screenshots. These visible icons will also work if users have installed a different UI theme, or if the icons are located somewhere other than where they appear in basic screenshots made on a few devices in some old versions of their specific UI. The need for these icons is the same across all these devices and versions for similar functions.

So we have icons and symbols with a similar "spirit" across a large range of devices for basic functions: a telephone handset to place a call, to reply, or to end a communication. In fact, this is the same kind of thing that has long been used in icons for controlling all audio devices: play, stop, rewind, forward, pause, power on, power off, enter sleep mode, wake up, mute, volume up/down, icons for activating or deactivating Wi-Fi or Bluetooth, icons for the headset or the radio, ejecting a medium, start recording... Look also at the wide range of TV remote controls.
Not all of them use distinctive glyphs; some are differentiated only by color, such as the red/yellow/green/blue buttons used on Teletext remote controls. (In my opinion color is not a requirement, and these could also be buttons with readable labels in a box, if accessibility requires it. This has recently been standardized for tinting facial emojis by human skin color, with an interesting proposed alternate representation in which the color can also be represented by a non-ligatured monochromatic glyph.) In all these cases, the demand for them and their use in various contexts, where they can be tuned locally to match user expectations, is an excellent reason for standardizing them without breaking their intended meaning in those specific tuning contexts.

Other interesting sets are those standardized on road signs, or the warning signs on various products. They are frequently international, either because their meaning is legally imperative (for road users) or because their informative meaning is about services found almost everywhere in the world (e.g. luggage storage, taxis, toilets, showers, hospitals, parking, places to eat, vehicle categories, tolls on motorways, TV sets...). Some of them are very country-specific (such as the "red carrot" used in France to signal tobacco resellers). Those symbols are not just present on roads or on maps but will also be found in published tourism guides, or as indicators on websites that show coherent sets of generic services, such as hotels or campsites listing their additional local services.

Those sets are initially inventions, but as soon as their usage expands and their meaning starts being widely understood in a country, they leak to other places with minor variants. They all have in common that they were initially not encoded as characters; experimentation with them and their use developed, and then came the time for standardizing these variants, more or less, nationally or internationally.
Then they also started being used in other contexts for which they were not initially meant (e.g. the STOP sign). They also already had several authorities regulating their use in specific contexts, in which they became mandatory or highly recommended. The UTC may now encode them: usage is demonstrated, there is stability, and there is already an authority supporting them and ready to accept their use by everyone.

2014-12-29 20:46 GMT+01:00 Asmus Freytag :

> On 12/29/2014 10:32 AM, Doug Ewell wrote:
>
>> Asmus Freytag wrote:
>>
>>> The "critical mass" of support is now assumed for currency symbols,
>>> some special symbols like emoji, and should be granted to additional
>>> types of symbols, punctuations and letters, whenever there is an
>>> "authority" that controls normative orthography or notation.
>>>
>>> Whether this is for an orthography reform in some country or addition
>>> to the standard math symbols supported by AMS journals, such external
>>> adoption can signify immediate "critical need" and "critical mass of
>>> option" for the relevant characters.
>>
>> To me, it is remarkable that the "critical mass of support" argument that
>> is applied, entirely appropriately, to new currency symbols (however
>> misguided the motives for such might be) and math symbols and characters
>> for people's names, is now also applied to BURRITO and UNICORN FACE.
>
> Does it - in principle - matter what a symbol is used for? If millions
> of happy users choose to communicate by peppering their messages with
> BURRITO and UNICORN FACE is that any less worthy of standardization than if
> thousands (or hundreds) of linguists use some arcane letterform to mark
> pronunciation differences between neighboring dialects on the Scandinavian
> peninsula?
>
> The "critical mass" argument does not (and should not) make value
> judgements, but instead focus on whether the infrastructure exists to make
> a character code widely available pretty much directly after publication,
> and whether there is implicit or explicit demand that would guarantee that
> such code is actually widely used the minute it comes available.
>
> For currency symbols, or for a new letter form demanded by a new or
> revised, but standard, orthography, the demand is created by some
> "authority" creating a requirement for conforming users. Because of that,
> the evaluation of the "critical mass" requirement is straightforward.
>
> Emoji lack an "authority", but they do not lack demand. For better or for
> worse, they have grabbed significant mind share; the number of news
> reports, blogs, social media posts, shared videos and what not that were
> devoted to Emoji simply dwarfs anything reported on currency symbols in a
> comparative time frame. With tracking applications devoted to them, anyone
> can convince themselves, in real time, that the entire repertoire is being
> used, even, as appropriate for such a collection, with a clear
> differentiation by frequency.
>
> Nevertheless, the indication is clear that any emoji that will be added by
> the relevant vendors is going to be used as soon as it comes available.
> Further, as no vendor has a closed ecosystem, to be usable requires
> agreement on how they are coded.
>
> The critical question, and I fully understand that this gives you pause,
> is one of selection. There are hundreds, if not thousands of potential
> additions to the emoji collection, some fear the set is, in principle,
> endless. Lacking an "authority" how does one come to a principled agreement
> on encoding any emoji now, rather than later.
>
> One would run an experiment, which is to say, create an alternate
> environment where users can use non-standard emoji and then the
> Uni-scientists in white lab coats could count the frequency of usage and
> promote the cream off the top to standardized codes.
>
> Or one could run an experiment where one defines a small number of slots,
> say 40, and opens them up for public discussion, and proceeds on that
> basis. Yes, that would turn the UTC into the "authority".
>
> My personal take is that the former approach is inappropriate for
> something that is in high demand and actively supported; the latter I can
> accept, provisionally, as an experiment to try to deal with an evolving
> system. Because of the ability to track, in real time, the use or non-use
> of any of the new additions it would be a true experiment, the outcome of
> which can be accurately measured. If it should lead to the standardization
> of few dozen symbols that prove not as popular as predicted, then we would
> conclude a failure of the experiment, and retire this process. Otherwise,
> I'd have no problem cautiously continuing with it.
>
>> But then, I remember when folks used to cite the WG2 "Principles and
>> Procedures" document for examples of what was and was not a good candidate
>> for encoding. That seems so long ago now.
>
> The P&P, like most by-laws and constitutions, are living documents. In
> this case, they try to capture best practice, without taking from the UTC
> (or WG2) the ability to deal with new or changed situations.
>
> The degree to which emoji have captured the popular imagination is
> unprecedented. It means the game has changed. Let's give the UTC the space
> to work out appropriate coping mechanisms.
>
> A./
>
> PS: this does not mean that, for all other types of code points, the
> existing wording on the P&P can simply be disregarded.
> In fact, the end
> result will be to see them updated with additional criteria explicitly
> geared towards the kind of high-profile use case we are discussing here.

From mark at macchiato.com Wed Dec 31 04:53:02 2014
From: mark at macchiato.com (=?UTF-8?B?TWFyayBEYXZpcyDimJXvuI8=?=)
Date: Wed, 31 Dec 2014 11:53:02 +0100
Subject: Unicode encoding policy
In-Reply-To: <54A1AF9A.903@ix.netcom.com>
References: <20141223145112.665a7a7059d7ee80bb4d670165c8327d.ed57b6552e.wbe@email03.secureserver.net> <549A5860.90704@ix.netcom.com> <5D04559180924E12B9C02E93DB063BFD@DougEwell> <54A1AF9A.903@ix.netcom.com>
Message-ID:

Nicely put, Asmus!

Mark

*Il meglio è l'inimico del bene*