From davidj_faulks at yahoo.ca Sat Feb 6 08:11:27 2016
From: davidj_faulks at yahoo.ca (David Faulks)
Date: Sat, 6 Feb 2016 14:11:27 +0000 (UTC)
Subject: Uranian Astrology Symbols
References: <1705355011.185822.1454767887083.JavaMail.yahoo.ref@mail.yahoo.com>
Message-ID: <1705355011.185822.1454767887083.JavaMail.yahoo@mail.yahoo.com>

Hello,

I'm investigating the possibility of adding more astrology symbols to Unicode. There is a branch of Western astrology known as “Uranian Astrology”, or the “Hamburg School”, which among other things uses a set of 8 “astrological planets” (Cupido, Hades, Zeus, Kronos, Apollon, Admetos, Vulcanus, Poseidon). These “planets” have well-defined symbols.

Here are some sites on Uranian Astrology:
http://theuranianastrologer.com/
http://uraniansociety.com/
http://arlenekramer.net/uranian.asp
http://www.uranian-institute.org/
http://eastrologer.net/uranian-astrology/
https://uranianastrologybooks.com/

The last one reveals that there are many published books on this type of astrology. However, blindly buying books just on the chance they might contain in-text examples of these symbols (to use as examples in the proposal) is not something I feel inclined to do.

Therefore, I am hoping I can use PDF examples found on the internet, such as ...
http://uraniansociety.com/USIG_articles/article_history_of_uranian_astrology_michael_feist.pdf (page 13)
http://www.astrology-x-files.com/report/Johnny%20Carson-Asteroids.pdf (scattered use of symbols)
http://www.witte-verlag.com/media/djcatalog/TNE_1870-2070-Pages_4_5_7_199.pdf (not a very good example)
http://www.tonybonin.de/IQ-Jauch.PDF (page 10)
http://holestoheavens.com/wp-content/uploads/2012/02/astro-copy.pdf (page 2 lists a bunch of symbols)
http://www.rojn-info.com/images/1172753867/urephem2004.pdf (tabular data)
http://ridoux.fr/spip/IMG/pdf/-31.pdf (tables inside charts, not a very good example)
http://ridoux.fr/spip/IMG/pdf/-33.pdf (better tables, like on page 7)

In particular, I have found this page: http://www.astrax.de/download.html, which contains many downloads for what seems to be a German astrology magazine. A quick check reveals that at least most of them contain in-text examples of the Uranian planet symbols. This magazine might even have been printed; I can't really tell, since I don't speak German.

Also, I've found a description of a TeX package which has them:
http://ansuz.sooke.bc.ca/astrology/starfont/starfont.pdf

I would like some advice and input on whether this is okay, and if so, which block should receive these symbols.

Also, Eris:
http://www.moreplutos.com/AstroJournal-SeptOct2014_Eris-corr-opt.pdf
(The symbol used there can be unified with U+29EC)

From asmusf at ix.netcom.com Sat Feb 6 15:54:11 2016
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sat, 6 Feb 2016 13:54:11 -0800
Subject: Uranian Astrology Symbols
In-Reply-To: <1705355011.185822.1454767887083.JavaMail.yahoo@mail.yahoo.com>
References: <1705355011.185822.1454767887083.JavaMail.yahoo.ref@mail.yahoo.com> <1705355011.185822.1454767887083.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <56B66B83.1020007@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From davidj_faulks at yahoo.ca Sun Feb 7 14:20:03 2016
From: davidj_faulks at yahoo.ca (David Faulks)
Date: Sun, 7 Feb 2016 20:20:03 +0000 (UTC)
Subject: Uranian Astrology Symbols
References: <425299532.510655.1454876403563.JavaMail.yahoo.ref@mail.yahoo.com>
Message-ID: <425299532.510655.1454876403563.JavaMail.yahoo@mail.yahoo.com>

> On Sun, 2/7/16, Asmus Freytag wrote:

>>On 2/7/2016 4:02 AM, David Faulks wrote:
>> 29EC ⧬ WHITE CIRCLE WITH DOWN ARROW is in
>> *Miscellaneous Mathematical Symbols-B* and has the
>> category Sm, and all of the fonts I have which display it use
>> a glyph identical to the Unicode code charts. The Eris
>> symbol (the one people are using) has a relatively
>> smaller circle, but I thought that that was not considered a good
>> enough reason to justify a new codepoint.

> Yes and no. For mathematical fonts, it's often important
> that different circles relate in size. How does Eris relate to
> Earth in astrological fonts? Is there a clear relation, whether
> same size or one always being smaller? Imagine what would
> happen for a font that covers both mathematical use and
> astrology?
> Would a designer be forced to choose which user
> community to accommodate?

This is somewhat difficult to judge. I don't think astrologers would find the current glyph for U+29EC unacceptable, but the glyph typically being used has the circle smaller than Mars, Venus, and either of the two Earth symbols. Sometimes, an oval is used instead of a circle. However, some styles for astrology symbols have very large circles.

> So, depending on the facts of how this symbol is used,
> there may well be good reasons to not equate it with the
> mathematical character - but that also means you'll need to
> understand what the latter was encoded for (which you can
> find by searching the document register).

I've found the proposals (from 2000), but many symbols there have no explained use, including U+29EC.
If the members of this mailing list think a proposal including a separate Eris symbol is acceptable, I will include it in my proposal. Along with, perhaps, some additional symbols...

>A./

David

From chris.jacobs at xs4all.nl Sun Feb 7 15:02:54 2016
From: chris.jacobs at xs4all.nl (Chris Jacobs)
Date: Sun, 07 Feb 2016 22:02:54 +0100
Subject: Uranian Astrology Symbols
In-Reply-To: <425299532.510655.1454876403563.JavaMail.yahoo@mail.yahoo.com>
References: <425299532.510655.1454876403563.JavaMail.yahoo.ref@mail.yahoo.com> <425299532.510655.1454876403563.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <104b51c4753a1fd5920912bfd32f8661@xs4all.nl>

David Faulks wrote on 2016-02-07 21:20:

>
> If the members of this mailing list think a proposal including a
> separate Eris symbol is acceptable, I will include it in my proposal.
>
> Along with, perhaps, some additional symbols...
>
>> A./
>
> David

Seems there is no agreement on what the Eris symbol should look like. This website gives four different shapes, not counting the Discordian one.
http://www.zanestein.com/Trans-pluto.htm#UB313

Chris

From davidj_faulks at yahoo.ca Sun Feb 7 15:51:48 2016
From: davidj_faulks at yahoo.ca (David Faulks)
Date: Sun, 7 Feb 2016 21:51:48 +0000 (UTC)
Subject: Uranian Astrology Symbols
References: <67654298.510268.1454881908960.JavaMail.yahoo.ref@mail.yahoo.com>
Message-ID: <67654298.510268.1454881908960.JavaMail.yahoo@mail.yahoo.com>

(making sure my response goes to the mailing list)

> On Sun, 2/7/16, Chris Jacobs wrote:

> Seems there is no agreement what the Eris symbol should
> look like. This website gives four different shapes, not counting
> the Discordian one.
> http://www.zanestein.com/Trans-pluto.htm#UB313
> Chris

The situation seems to have settled down now. I have looked at plenty of astrological charts, and the circle/oval with downwards arrow is all over the place, including the covers of books.
None of the charts used Zane Stein's symbol, one early chart used the “Hand of Eris”, and one Polish chart used the “Polish Symbol”. All of the others used the circle with downwards arrow, and I have read it described as “now-standardized”.

David

From frederic.grosshans at gmail.com Sun Feb 7 17:09:33 2016
From: frederic.grosshans at gmail.com (Frédéric Grosshans)
Date: Mon, 8 Feb 2016 00:09:33 +0100
Subject: Shouldn’t the proposed U+23FF OBSERVER EYE SYMBOL be an emoji ?
Message-ID: <56B7CEAD.2040107@gmail.com>

Dear Unicode list readers (cc Simon Griffee, Rick McGowan),

I have some problems with the proposed *U+23FF OBSERVER EYE SYMBOL (named so in the pipeline http://www.unicode.org/alloc/Pipeline.html and in the Draft additional repertoire for ISO/IEC 10646:2016 (5th edition) CD.2 http://www.unicode.org/L2/L2015/15339-n4705.pdf).

As far as I understand, this character is intended to be added to Unicode for the eye that is frequently drawn in optics schematics to represent the observer. Simon Griffee has proposed this symbol in L2/15-031R (http://www.unicode.org/L2/L2015/15031r-observer.pdf), with some more examples provided by Rick McGowan in L2/15-095 (http://www.unicode.org/L2/L2015/15095-observer-examples.pdf).

In a few words (more details below), I think this character is actually used beyond optics and should be encoded as an emoji with properties (and aspect) similar to 👁 U+1F441 EYE, with a name like EYE SIDE VIEW. I also think it would be better if it were moved to an emoji block (1F900–1F9FF Supplemental Symbols and Pictographs?).

I intend to write and submit a formal document later, and I am writing this mail to gather advice on the best way to proceed.

Frédéric

=== The details of my objections ===

I agree with Simon Griffee that this is a standard symbol used in optics and related fields, and that it is attested from the 16th to the 21st century.
It is clearly needed, and I have seen other characters, such as ∢ U+2222 SPHERICAL ANGLE, used on diagrams to replace it. However, I have not seen it intermixed with plain text, and I don’t find the example of L2/15-031 convincing, but I’m not sure whether this kind of criterion is relevant in the current “emoji era”.

I think this symbol is better represented by an emoji named EYE SIDE VIEW (or a similar name).

I. Other common representations of the observer in optical contexts are encoded as emojis. Three examples are 🎥 👁 ☺:
🎥 U+1F3A5 MOVIE CAMERA (e.g. fig 8 of http://arxiv.org/abs/1502.03809 , http://alexrodgers.co.uk/wp-content/uploads/2014/08/raytracing.png )
👁 U+1F441 EYE (e.g. fig K page 124 of http://www.e-rara.ch/zut/content/titleinfo/290294, or http://653fb62b3a129d296422-3019ba142970aa3e5db9c4ca20cb2da4.r64.cf1.rackcdn.com/images/Nioo9nlDeaYP.878x0.Z-Z96KYq.jpg)
☺ U+263A WHITE SMILING FACE (e.g. http://hevi.info/img/dissertation-images/MSc_Dissertation_Umut_ERTURK_0703851_Ray_Tracing_On_Cell_html_7af52a91.png)

II. This symbol is often used together with other emoji-like symbols on schematics, for example 🌣 U+1F323 WHITE SUN, 💡 U+1F4A1 ELECTRIC LIGHT BULB, but also 🌴 U+1F334 PALM TREE:
http://sciences-physiques.ac-dijon.fr/archives/astronomie/Mirages/images/Mirage_chaud1.gif

III. In centuries-old printed publications as well as on recent websites, it appears both in a really schematic way (like ?) and as a detailed graphical drawing, including lashes and brows. This is reminiscent of the text vs. emoji presentation variants selected by VS15 and VS16 on emojis.

IV. The symbol itself is an eye seen from the side, hence the name EYE SIDE VIEW that I propose. I think the name OBSERVER SYMBOL is not well suited, since this symbol is sometimes used with other semantics, and a disunification is probably not worth the trouble.
Some examples of other meanings include

a) The eye itself
http://thumb1.shutterstock.com/display_pic_with_logo/169/169,1197707547,8/stock-vector-eye-drops-symbol-7816558.jpg
http://www.bigstockphoto.com/fr/image-21796349/stock-vector-contact-lens-and-eye-symbol-sign-and-button

b) Ophthalmology
https://en.wikisource.org/wiki/Portal:Medicine

c) Sight
http://johncrowhurst.me/wp-content/uploads/2011/04/istockphoto_5622861-five-senses-icons11.jpg
https://ehumanbiofield.wikispaces.com/file/view/istockphoto_2307885_senses.jpg/32748849/istockphoto_2307885_senses.jpg

d) The “eye of the mind”, as in this 14th century book reproduced here
https://twitter.com/Jean_no/status/613387284356964352

e) If another eye-looking symbol is encoded, one can be almost sure it will be used as emoji, and it is probably safer to anticipate this use.

From asmusf at ix.netcom.com Sun Feb 7 17:37:49 2016
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Sun, 7 Feb 2016 15:37:49 -0800
Subject: Uranian Astrology Symbols
In-Reply-To: <104b51c4753a1fd5920912bfd32f8661@xs4all.nl>
References: <425299532.510655.1454876403563.JavaMail.yahoo.ref@mail.yahoo.com> <425299532.510655.1454876403563.JavaMail.yahoo@mail.yahoo.com> <104b51c4753a1fd5920912bfd32f8661@xs4all.nl>
Message-ID: <56B7D54D.6070703@ix.netcom.com>

On 2/7/2016 1:02 PM, Chris Jacobs wrote:
>
>
> David Faulks wrote on 2016-02-07 21:20:
>>
>
>> If the members of this mailing list think a proposal including a
>> separate Eris symbol is acceptable, I will include it in my proposal.
>>
>> Along with, perhaps, some additional symbols...
>>
>>> A./
>>
>> David
>
> Seems there is no agreement what the Eris symbol should look like.
> This website gives four different shapes, not counting the Discordian
> one.
> http://www.zanestein.com/Trans-pluto.htm#UB313
>
> Chris
>

Unicode does not so much encode concepts. Neither does it (normally) attempt to encode for precise shapes.
What it tries to do is to encode elements that are sufficient to represent text.

If there are many different conventions for representing a concept, that's similar to different spellings. Unicode normally supplies all the elements and lets the users choose the spelling. The big exception is when the unit of spelling (for example, in normal text, that would be a letter) itself has a range of appearances.

So, the question here would be: are these different shapes of the same symbol, or different symbols used for the same purpose? From the website I would think we have at least 4 distinct symbols. Two of the shapes look like they might be alternate representations of the same symbol.

A./
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asmus-inc at ix.netcom.com Sun Feb 7 18:57:15 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Sun, 7 Feb 2016 16:57:15 -0800
Subject: Re: Shouldn’t the proposed U+23FF OBSERVER EYE SYMBOL be an emoji ?
In-Reply-To: <56B7CEAD.2040107@gmail.com>
References: <56B7CEAD.2040107@gmail.com>
Message-ID: <56B7E7EB.8090607@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From jtauber at jtauber.com Mon Feb 8 11:10:55 2016
From: jtauber at jtauber.com (James Tauber)
Date: Mon, 8 Feb 2016 11:10:55 -0600
Subject: precomposed polytonic Greek characters with macrons and other diacritics
Message-ID:

The Greek Extended block includes precomposed characters for vowels with all known combinations of accents, breathing and iota subscript. It also includes precomposed characters for the vowels alpha, iota and upsilon with macron. (Those three vowels are ambiguously short or long, hence the need to mark length in some contexts.)

However, there is no precomposition of vowels with accents and/or breathing PLUS macron. (Vowels with iota subscript are always long, so they don't need a macron to indicate length.)
This isn't normally an issue in running polytonic Greek text, where vowel length is rarely shown, but it does occur in lexicons, grammars, etc.

I'm wondering what potential objections / problems I should be aware of before trying to put together a proposal for these extra precomposed characters to be included.

I wrote a blog post about this issue more broadly (not all of which has to do with Unicode) but which still might be of interest:
http://jktauber.com/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/

James
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From otto.stolz at uni-konstanz.de Mon Feb 8 11:26:44 2016
From: otto.stolz at uni-konstanz.de (Otto Stolz)
Date: Mon, 8 Feb 2016 18:26:44 +0100
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
Message-ID: <56B8CFD4.1070105@uni-konstanz.de>

Hello,

I am wondering how U+02B9 MODIFIER LETTER PRIME made its way into the Unicode repertoire, and how it acquired its comment “transliteration of mjagkij znak (Cyrillic soft sign: palatalization)”.

ISO/R 9:1954 through ISO 9:1986 map the mjagkij znak “ь” to the apostrophe, and so does DIN 1460:1982. The latter clearly depicts the apostrophe that later became U+02BC, while I am not sure whether ISO/R 9 does so as well or rather depicts a glyph like U+0027. (All of these standards predate Unicode, so they just depict glyphs.)

ISO 9:1995 maps the mjagkij znak “ь” to the prime, particularly to the modifier letter U+02B9, in accordance with the comment in the Unicode charts.

Unicode archeologists, can you shed some light on the history of both U+02B9 and the mjagkij znak? And linguists, can you tell me how the mjagkij znak is transliterated normally, as an apostrophe or as a prime?
Thanks for any comments,
Otto

From doug at ewellic.org Mon Feb 8 12:30:49 2016
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 08 Feb 2016 11:30:49 -0700
Subject: precomposed polytonic Greek characters with macrons and other diacritics
Message-ID: <20160208113049.665a7a7059d7ee80bb4d670165c8327d.c746940bc1.wbe@email03.secureserver.net>

James Tauber wrote:

> I'm wondering what potential objections / problems I should be aware
> of before trying to put together a proposal for these extra
> precomposed characters to be included.

It sounds from the blog post that the basic rationale for adding precomposed characters is that existing fonts, input methods, and other tools don't always work correctly with the combining sequences.

I suppose one potential challenge you might face is to explain why the following FAQ items, though phrased in terms of Latin base letters, don't apply equally to Greek:

http://www.unicode.org/faq/char_combmark.html#11
http://www.unicode.org/faq/char_combmark.html#12b

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸

From jtauber at jtauber.com Mon Feb 8 12:47:30 2016
From: jtauber at jtauber.com (James Tauber)
Date: Mon, 8 Feb 2016 12:47:30 -0600
Subject: precomposed polytonic Greek characters with macrons and other diacritics
In-Reply-To: <20160208113049.665a7a7059d7ee80bb4d670165c8327d.c746940bc1.wbe@email03.secureserver.net>
References: <20160208113049.665a7a7059d7ee80bb4d670165c8327d.c746940bc1.wbe@email03.secureserver.net>
Message-ID:

On Mon, Feb 8, 2016 at 12:30 PM, Doug Ewell wrote:

> James Tauber wrote:
>
> > I'm wondering what potential objections / problems I should be aware
> > of before trying to put together a proposal for these extra
> > precomposed characters to be included.
>
> It sounds from the blog post that the basic rationale for adding
> precomposed characters is that existing fonts, input methods, and other
> tools don't always work correctly with the combining sequences.
>
> I suppose one potential challenge you might face is to explain why the
> following FAQ items, though phrased in terms of Latin base letters,
> don't apply equally to Greek:
>
> http://www.unicode.org/faq/char_combmark.html#11
> http://www.unicode.org/faq/char_combmark.html#12b
>

Yes, I read those FAQs and hesitated before even posting because of them. The Greek Extended block already somewhat contradicts that by having the precomposed characters it does but I presume that was largely for legacy reasons and existing font encodings.

There's no doubt the font and input methods can be improved right now regardless of any change to Unicode.

That said, I still have questions around relative ordering of combining characters and also interaction of combining characters and precomposed characters. At the very least I'd like to put together some best practices for those dealing with polytonic Greek, even before I go to font foundries and keyboard software developers.

Even with all this, though, my own work includes accentuation and syllabification algorithms, all of which are made more cumbersome by the lack of precomposed characters indicating vowel length. I'm currently leaning towards adding a layer of "character" processing on top of Python 3's otherwise decent support that effectively treats the relevant character sequences as single characters even if they aren't (and can't be precomposed).

I'd be interested if others have tackled similar issues outside of Greek.

James
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From markus.icu at gmail.com Mon Feb 8 13:10:20 2016
From: markus.icu at gmail.com (Markus Scherer)
Date: Mon, 8 Feb 2016 11:10:20 -0800
Subject: precomposed polytonic Greek characters with macrons and other diacritics
In-Reply-To:
References: <20160208113049.665a7a7059d7ee80bb4d670165c8327d.c746940bc1.wbe@email03.secureserver.net>
Message-ID:

On Mon, Feb 8, 2016 at 10:47 AM, James Tauber wrote:

> Even with all this, though, my own work includes accentuation and
> syllabification algorithms, all of which are made more cumbersome by the
> lack of precomposed characters indicating vowel length. I'm currently
> leaning towards adding a layer of "character" processing on top of Python
> 3's otherwise decent support that effectively treats the relevant character
> sequences as single characters even if they aren't (and can't be
> precomposed).
>

I suggest you normalize the text (NFC or NFD), and then look for "grapheme clusters".
http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries

In C++ and Java, you could use an ICU BreakIterator for the latter.

markus
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From leob at mailcom.com Mon Feb 8 13:34:14 2016
From: leob at mailcom.com (Leo Broukhis)
Date: Mon, 8 Feb 2016 11:34:14 -0800
Subject: Enclosing BANKNOTE emoji?
Message-ID:

There are

💴 U+01F4B4 Banknote With Yen Sign
💵 U+01F4B5 Banknote With Dollar Sign
💶 U+01F4B6 Banknote With Euro Sign
💷 U+01F4B7 Banknote With Pound Sign

This is clearly an incomplete set. It makes sense to have a generic "enclosing banknote" emoji character which, when combined with a currency sign, would produce the corresponding banknote, to forestall requests for individual emoji for banknotes with remaining currency signs.
Leo

From davidj_faulks at yahoo.ca Mon Feb 8 13:44:27 2016
From: davidj_faulks at yahoo.ca (David Faulks)
Date: Mon, 8 Feb 2016 19:44:27 +0000 (UTC)
Subject: Uranian Astrology Symbols
References: <1475789143.886571.1454960667619.JavaMail.yahoo.ref@mail.yahoo.com>
Message-ID: <1475789143.886571.1454960667619.JavaMail.yahoo@mail.yahoo.com>

> On Sun, 2/7/16, Asmus Freytag wrote:

> Subject: Re: Uranian Astrology Symbols
> To: "Chris Jacobs" , "David Faulks"
> Cc: "Unicode Mailing List"
>Received: Sunday, February 7, 2016, 6:37 PM

>>On 2/7/2016 1:02 PM, Chris Jacobs wrote:
[ text cut ]
>> This website gives four different shapes, not counting
>> the Discordian one.
>> http://www.zanestein.com/Trans-pluto.htm#UB313
>>
>> Chris
[ text cut ]
> So, the question here would be: are these different shapes of the
> same symbol, or different symbols used for the same purpose.
> From the website I would think we have at least 4
> distinct symbols. Two of the shapes look like they might be
> alternate representations of the same symbol.

In addition to Eris, there is also a related issue for Pluto. The encoding of U+26E2 ⛢, separate from U+2645 ♅, for Uranus, seems to set a precedent, and there are at least 3 extra symbols for Pluto that are in use. This has been discussed before. Should these (or at least the most common one) be encoded as well?
> A./

David

From liz at dijkmat.nl Mon Feb 8 13:29:35 2016
From: liz at dijkmat.nl (Elizabeth Mattijsen)
Date: Mon, 8 Feb 2016 20:29:35 +0100
Subject: precomposed polytonic Greek characters with macrons and other diacritics
In-Reply-To:
References: <20160208113049.665a7a7059d7ee80bb4d670165c8327d.c746940bc1.wbe@email03.secureserver.net>
Message-ID:

> On 08 Feb 2016, at 20:10, Markus Scherer wrote:
>
> On Mon, Feb 8, 2016 at 10:47 AM, James Tauber wrote:
> Even with all this, though, my own work includes accentuation and syllabification algorithms, all of which are made more cumbersome by the lack of precomposed characters indicating vowel length. I'm currently leaning towards adding a layer of "character" processing on top of Python 3's otherwise decent support that effectively treats the relevant character sequences as single characters even if they aren't (and can't be precomposed).
>
> I suggest you normalize the text (NFC or NFD), and then look for "grapheme clusters". http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
>
> In C++ and Java, you could use an ICU BreakIterator for the latter.

Might I suggest looking at Rakudo Perl 6’s implementation of NFG (Normalization Form Grapheme), which will generate synthetic codepoints on the fly under the hood.

For an introduction, see http://jnthn.net/papers/2015-spw-nfg.pdf

Liz

From leob at mailcom.com Mon Feb 8 17:33:50 2016
From: leob at mailcom.com (Leo Broukhis)
Date: Mon, 8 Feb 2016 15:33:50 -0800
Subject: Enclosing BANKNOTE emoji?
In-Reply-To:
References:

Message-ID:

I don't see why it is an "emoji exception", and I don't see any implementation issues, given that replacing pairs of regional indicator symbols with the corresponding flags already works on many platforms.

The rationale for COMBINING BANKNOTE is specifically to avoid the need for individual banknote emoji for every extant or future currency.

On Mon, Feb 8, 2016 at 12:29 PM, Roozbeh Pournader wrote:

> What's usually ignored in these discussions is how hard it is to actually
> implement such "new" mechanisms in software. I would be against
> such a new mechanism, simply because it's yet another emoji "exception".
>
> If you think there's a need for such emojis (banknote with other
> currencies), please write a proposal for the UTC. I for one would consider
> a proposal with an atomic BANKNOTE WITH RIAL SIGN in a much more positive
> light than one for a COMBINING BANKNOTE.
>
> On Mon, Feb 8, 2016 at 11:34 AM, Leo Broukhis wrote:
>
>> There are
>>
>> 💴 U+01F4B4 Banknote With Yen Sign
>> 💵 U+01F4B5 Banknote With Dollar Sign
>> 💶 U+01F4B6 Banknote With Euro Sign
>> 💷 U+01F4B7 Banknote With Pound Sign
>>
>> This is clearly an incomplete set. It makes sense to have a generic
>> "enclosing banknote" emoji character which, when combined with a
>> currency sign, would produce the corresponding banknote, to forestall
>> requests for individual emoji for banknotes with remaining currency
>> signs.
>>
>> Leo
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jtauber at jtauber.com Mon Feb 8 17:59:10 2016
From: jtauber at jtauber.com (James Tauber)
Date: Mon, 8 Feb 2016 17:59:10 -0600
Subject: precomposed polytonic Greek characters with macrons and other diacritics
In-Reply-To:
References: <20160208113049.665a7a7059d7ee80bb4d670165c8327d.c746940bc1.wbe@email03.secureserver.net>

Message-ID:

On Mon, Feb 8, 2016 at 1:29 PM, Elizabeth Mattijsen wrote:

> > On 08 Feb 2016, at 20:10, Markus Scherer wrote:
> >
> > On Mon, Feb 8, 2016 at 10:47 AM, James Tauber
> wrote:
> > Even with all this, though, my own work includes accentuation and
> syllabification algorithms, all of which are made more cumbersome by the
> lack of precomposed characters indicating vowel length. I'm currently
> leaning towards adding a layer of "character" processing on top of Python
> 3's otherwise decent support that effectively treats the relevant character
> sequences as single characters even if they aren't (and can't be
> precomposed).
> >
> > I suggest you normalize the text (NFC or NFD), and then look for
> "grapheme clusters".
> http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
> >
> > In C++ and Java, you could use an ICU BreakIterator for the latter.
>
> Might I suggest looking at Rakudo Perl 6’s implementation of NFG
> (Normalization Form Grapheme) which will generate synthetic codepoints on
> the fly under the hood.
>
> For an introduction, see http://jnthn.net/papers/2015-spw-nfg.pdf
>

Thanks very much, I'll look into this. Having done a Python implementation of the UCA, I'm quite looking forward to doing more Unicode tools for Python.

James
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mark at kli.org Mon Feb 8 18:26:00 2016
From: mark at kli.org (Mark E. Shoulson)
Date: Mon, 8 Feb 2016 19:26:00 -0500
Subject: precomposed polytonic Greek characters with macrons and other diacritics
In-Reply-To:
References: <20160208113049.665a7a7059d7ee80bb4d670165c8327d.c746940bc1.wbe@email03.secureserver.net>
Message-ID: <56B93218.5050401@kli.org>

On 02/08/2016 01:47 PM, James Tauber wrote:
>
> I'd be interested if others have tackled similar issues outside of Greek.
>
> James
>
>

Keep in mind that in pointed Hebrew (or Arabic (or for that matter Devanagari)), practically every letter is like this, since each vowel is a diacritic from a typographical point of view, though these are perhaps not considered in the same way that Greek considers its accented letters.

~mark

From everson at evertype.com Mon Feb 8 19:47:17 2016
From: everson at evertype.com (Michael Everson)
Date: Tue, 9 Feb 2016 01:47:17 +0000
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To: <56B8CFD4.1070105@uni-konstanz.de>
References: <56B8CFD4.1070105@uni-konstanz.de>
Message-ID: <8E675D4C-0F35-4FBC-8AD6-3FEE8197472E@evertype.com>

It’s what I was taught as the scientific romanization for Russian and Slavic in general.

Michael Everson * http://www.evertype.com/

From asmus-inc at ix.netcom.com Mon Feb 8 19:59:55 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Mon, 8 Feb 2016 17:59:55 -0800
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To: <8E675D4C-0F35-4FBC-8AD6-3FEE8197472E@evertype.com>
References: <56B8CFD4.1070105@uni-konstanz.de> <8E675D4C-0F35-4FBC-8AD6-3FEE8197472E@evertype.com>
Message-ID: <56B9481B.2030109@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From duerst at it.aoyama.ac.jp Mon Feb 8 20:26:36 2016
From: duerst at it.aoyama.ac.jp (Martin J. Dürst)
Date: Tue, 9 Feb 2016 11:26:36 +0900
Subject: precomposed polytonic Greek characters with macrons and other diacritics
In-Reply-To:
References:
Message-ID: <56B94E5C.7020101@it.aoyama.ac.jp>

On 2016/02/09 02:10, James Tauber wrote:

> http://jktauber.com/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/

Hello James,

I read your article. I just wanted to point out that in your problem 3, the two sequences aren't normalized to the same form because, if the acute accent comes first, the result would be considered a different character, namely one with the macron *on top of* the accent.

Regards, Martin.
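
For readers following the normalization subthread, here is a minimal Python 3 sketch of the two points raised above: Martin's observation that macron+acute and acute+macron are not canonically equivalent, and Markus's suggestion to normalize first and then work on clusters. It uses only the standard `unicodedata` module. The helper names `code_points` and `clusters` are made up for this illustration, and `clusters` is a deliberately simplified stand-in for the UAX #29 grapheme cluster rules (it only attaches characters of general category M to the preceding character), not an implementation of them.

```python
import unicodedata

ALPHA = "\u03b1"   # GREEK SMALL LETTER ALPHA
MACRON = "\u0304"  # COMBINING MACRON
ACUTE = "\u0301"   # COMBINING ACUTE ACCENT

macron_first = ALPHA + MACRON + ACUTE
acute_first = ALPHA + ACUTE + MACRON


def code_points(text):
    """Render a string as a list of U+XXXX code points (hypothetical helper)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def clusters(text):
    """Group each character with the combining marks that follow it.

    Simplified assumption: any character whose general category starts
    with "M" belongs to the preceding cluster. Full UAX #29 segmentation
    has more rules (Hangul, ZWJ sequences, regional indicators, ...).
    """
    result = []
    for ch in unicodedata.normalize("NFD", text):
        if result and unicodedata.category(ch).startswith("M"):
            result[-1] += ch
        else:
            result.append(ch)
    return result


# Both marks share the same canonical combining class, so canonical
# reordering never swaps them and the two spellings stay distinct.
print(unicodedata.combining(MACRON), unicodedata.combining(ACUTE))

for form in ("NFC", "NFD"):
    a = unicodedata.normalize(form, macron_first)
    b = unicodedata.normalize(form, acute_first)
    print(form, code_points(a), "|", code_points(b), "| equal:", a == b)

# Alpha with macron (precomposed) + combining acute, followed by nu:
# the first cluster carries both marks, the second is the bare nu.
print([code_points(c) for c in clusters("\u1fb1\u0301\u03bd")])
```

Treating each returned cluster as one "character" is roughly the extra processing layer James describes; an application would still have to pick one mark order as its own convention, since normalization will not do that for it.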
From ruland at luckymail.com Mon Feb 8 20:39:38 2016
From: ruland at luckymail.com (Charlie Ruland)
Date: Tue, 9 Feb 2016 03:39:38 +0100
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To: <56B9481B.2030109@ix.netcom.com>
References: <56B8CFD4.1070105@uni-konstanz.de> <8E675D4C-0F35-4FBC-8AD6-3FEE8197472E@evertype.com> <56B9481B.2030109@ix.netcom.com>
Message-ID: <56B9516A.8090607@luckymail.com>

On 09.02.2016, Asmus Freytag (t) wrote:

> On 2/8/2016 5:47 PM, Michael Everson wrote:
>> It’s what I was taught as the scientific romanization for Russian and Slavic in general.
>>
>> Michael Everson *http://www.evertype.com/
>>
>>
>>
> Source?
>
> A./

Look at tables 27.1 (p. 348) and 27.2 (p. 351) of Paul Cubberley’s /The Slavic Alphabets/ (=Peter T. Daniels and William Bright (eds.): /The World’s Writing Systems/, pp. 346–355). Obviously the soft sign is transliterated as a prime <ʹ>, and the hard sign as a double prime <ʺ>. Also note that [gʲ] is Romanized as <ǵ>, which can hardly be considered an apostrophe above <g>.

Charlie
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asmus-inc at ix.netcom.com Mon Feb 8 23:31:13 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Mon, 8 Feb 2016 21:31:13 -0800
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To: <56B9516A.8090607@luckymail.com>
References: <56B8CFD4.1070105@uni-konstanz.de> <8E675D4C-0F35-4FBC-8AD6-3FEE8197472E@evertype.com> <56B9481B.2030109@ix.netcom.com> <56B9516A.8090607@luckymail.com>
Message-ID: <56B979A1.6060700@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From mark at macchiato.com Tue Feb 9 00:25:59 2016
From: mark at macchiato.com (Mark Davis ☕️)
Date: Tue, 9 Feb 2016 07:25:59 +0100
Subject: Enclosing BANKNOTE emoji?
In-Reply-To: References: Message-ID: I would suggest that you first gather statistics and present statistics on how often the current combinations are used compared to other emoji, eg by consulting sources such as: http://www.emojixpress.com/stats/ or http://emojitracker.com/ Mark On Mon, Feb 8, 2016 at 8:34 PM, Leo Broukhis wrote: > There are > > 💴 U+01F4B4 Banknote With Yen Sign > 💵 U+01F4B5 Banknote With Dollar Sign > 💶 U+01F4B6 Banknote With Euro Sign > 💷 U+01F4B7 Banknote With Pound Sign > > This is clearly an incomplete set. It makes sense to have a generic > "enclosing banknote" emoji character which, when combined with a > currency sign, would produce the corresponding banknote, to forestall > requests for individual emoji for banknotes with remaining currency > signs. > > Leo > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leob at mailcom.com Tue Feb 9 04:00:55 2016 From: leob at mailcom.com (Leo Broukhis) Date: Tue, 9 Feb 2016 02:00:55 -0800 Subject: Enclosing BANKNOTE emoji? In-Reply-To: References: Message-ID: Thank you for the links, quite mesmerizing! On emojitracker.com (cumulative counts, but only on twitter, AFAICS), U+1F4B5 ($) had quite a respectable count of 2932622 (well above the middle of the page, around 70%ile), U+1F4B7 (pound) had 514536 (around 30%ile), and U+1F4B4 and U+1F4B6 had around 353K and 388K resp. (around 20%ile, but 10x more than the lowest counts, and about the same frequency as various individual clock faces). It is quite evident that the dollar banknote emoji serves as a stand-in for at least half a dozen of various currencies. On Mon, Feb 8, 2016 at 10:25 PM, Mark Davis ☕️ 
wrote: > I would suggest that you first gather statistics and present statistics on > how often the current combinations are used compared to other emoji, eg by > consulting sources such as: > > http://www.emojixpress.com/stats/ > or > http://emojitracker.com/ > > Mark > > On Mon, Feb 8, 2016 at 8:34 PM, Leo Broukhis wrote: > >> There are >> >> 💴 U+01F4B4 Banknote With Yen Sign >> 💵 U+01F4B5 Banknote With Dollar Sign >> 💶 U+01F4B6 Banknote With Euro Sign >> 💷 U+01F4B7 Banknote With Pound Sign >> >> This is clearly an incomplete set. It makes sense to have a generic >> "enclosing banknote" emoji character which, when combined with a >> currency sign, would produce the corresponding banknote, to forestall >> requests for individual emoji for banknotes with remaining currency >> signs. >> >> Leo >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidj_faulks at yahoo.ca Tue Feb 9 07:27:05 2016 From: davidj_faulks at yahoo.ca (David Faulks) Date: Tue, 9 Feb 2016 13:27:05 +0000 (UTC) Subject: More Astrology Symbols References: <404271703.1156845.1455024425861.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <404271703.1156845.1455024425861.JavaMail.yahoo@mail.yahoo.com> I feel pretty confident in proposing the Uranian Planet symbols, but I am now wondering how far I can go. Astrological symbols are mostly used in charts. Rarely, you will also see a tabular listing of aspects, positions, or midpoints accompanying the chart, and these will have symbols. Even more rarely, astrologers will discuss or mention aspects using symbols instead of words. However, many astrology programs used to produce charts nowadays can also produce tables (in image format because of the symbols) automatically, so any symbol appearing in charts can potentially appear in tables (text). 
These tables can rarely be found in PDFs ( http://www.tonybonin.de/IQ-Jauch.PDF has a good example on pages 9 and 10 ), but you can also find somewhat similar tables embedded inside images on the internet (easy to find using Google image search). There are plenty of extra symbols I've seen in charts, but for which I otherwise lack text examples (except for one or two) “in use”, as opposed to merely showing what they are. Transpluto : An “Astrological Planet” invented in 1972, also called “Isis”, “Bacchus”, and so on. Has a well defined symbol. I do have one example from a table, but the other examples are just for showing the symbol or from charts. Vulcan : This hypothetical intra-Mercurian planet may have been disproved by General Relativity, but that has not stopped some astrologers from using it to this day. The symbol is simple enough, but I haven't found anything to unify it with. Sedna: The only trans-Neptunian object other than Pluto and Eris that has a symbol that astrologers commonly use. People have devised symbols for the other Dwarf Planets and some of the smaller TNOs, but I have not seen them in charts (even when I looked). The following images : https://wegoastrology.files.wordpress.com/2014/10/sfpage.jpg http://www.the-dreamweaver.net/portal/images/Lunar%20eclipse%202013%20apr%2025.png have some info outside the chart proper that include the Sedna symbol. Extra Asteroids: Astrologers have devised symbols for asteroids other than Ceres, Pallas, Juno, and Vesta, but the only ones I've seen in charts are Hygeia, Astraea, Lilith, and Sappho. The Sappho symbol is usually identical to U+26A2 DOUBLED FEMALE SIGN (unless you replace the circles with hearts, which is probably just a stylistic variation), but the others are not in Unicode. “Waldemath’s Moon” aka “Dark Moon Lilith”: Not to be confused with Black Moon Lilith. This is an “Astrological Moon” of Earth. 
There is no need for a separate symbol, since it looks like U+2205 EMPTY SET or U+2300 DIAMETER SIGN. Centaurs : Small Planetoids that orbit between the orbits of Jupiter and Neptune. Chiron (⚷ U+26B7) is one of them, so when other such objects began to be discovered in the 90's, some astrologers started using them. The only ones I have actually come across in charts are symbols for Pholus and Nessus. Finally, there is some confusion caused by the orbit of the Moon. Astrology uses virtual points calculated from this orbit ( ☊ ☋ ⚸ ). Thanks to the sun and the barycentre, the orbit of the moon is rather wobbly, and before the 90's, astrologers typically (but not always) used an approximation. With the advent of astrology software and downloadable NASA/JPL information, accurate virtual points became easy to compute. Versions of ☊ and ☋ with “T” inside them can be used to indicate the “true” nodes. Also, there is ⚸; a reversed glyph is sometimes used to indicate the “True” Black Moon Lilith. I have seen charts with both the regular (mean) and reversed (true) Liliths. David From everson at evertype.com Tue Feb 9 07:43:25 2016 From: everson at evertype.com (Michael Everson) Date: Tue, 9 Feb 2016 13:43:25 +0000 Subject: transliteration of mjagkij znak (Cyrillic soft sign) In-Reply-To: <56B979A1.6060700@ix.netcom.com> References: <56B8CFD4.1070105@uni-konstanz.de> <8E675D4C-0F35-4FBC-8AD6-3FEE8197472E@evertype.com> <56B9481B.2030109@ix.netcom.com> <56B9516A.8090607@luckymail.com> <56B979A1.6060700@ix.netcom.com> Message-ID: On 9 Feb 2016, at 05:31, Asmus Freytag (t) wrote: > Without scouring the book I don't know whether there's another place in it where something's unquestionably the prime. In that case we could figure out whether its appearance is simply the way that font does it. 
Alternatively, if making double prime look different from two single primes, perhaps that's common enough across fonts, and would help to lay any doubts to rest - but so far, what I see is a spacing acute. Well, Asmus, it isn’t one. We linguists have been taught it’s the prime. https://en.wikipedia.org/wiki/Prime_(symbol)#Use_in_linguistics Michael Everson * http://www.evertype.com/ From mheijdra at Princeton.EDU Tue Feb 9 08:14:51 2016 From: mheijdra at Princeton.EDU (Martin Heijdra) Date: Tue, 9 Feb 2016 14:14:51 +0000 Subject: transliteration of mjagkij znak (Cyrillic soft sign) In-Reply-To: References: <56B8CFD4.1070105@uni-konstanz.de> <8E675D4C-0F35-4FBC-8AD6-3FEE8197472E@evertype.com> <56B9481B.2030109@ix.netcom.com> <56B9516A.8090607@luckymail.com> <56B979A1.6060700@ix.netcom.com> Message-ID: <0001012FBBD4FE40857959B0B65DE95B6E5EC59E@CSGMBX202W.pu.win.princeton.edu> And so it is, also in the library world both before and after Unicode: for miagkii znak the prime is prescribed. The prime is also prescribed for some uses for standard transliteration in Tibetan and Hebrew/Arabic/Persian/Pushto. See, e.g., the relevant tables on https://www.loc.gov/catdir/cpso/roman.html: Tibetan: When two full forms of letters are stacked, as in Sanskritized Tibetan, there is no need to indicate the stacking. However, in the two cases noted here a modified letter prime should be inserted between the two consonants for the purpose of disambiguation. ??? tʹsa ?? tsa ??? nʹya ?? nya Hebrew: A single prime ( ʹ ) is placed between two letters representing two distinct consonantal sounds when the combination might otherwise be read as a digraph. hisʹhid Persian: When the affix and the word with which it is connected grammatically are written separately in Persian, the two are separated in romanization by a single prime ( ʹ ). khānahʹhā 
Martin Heijdra -----Original Message----- From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Michael Everson Sent: Tuesday, February 09, 2016 8:43 AM To: Unicode Discussion Subject: Re: transliteration of mjagkij znak (Cyrillic soft sign) On 9 Feb 2016, at 05:31, Asmus Freytag (t) wrote: > Without scouring the book I don't know whether there's another place in it where something's unquestionably the prime. In that case we could figure out whether its appearance is simply the way that font does it. Alternatively, if making double prime look different from two single primes, perhaps that's common enough across fonts, and would help to lay any doubts to rest - but so far, what I see is a spacing acute. Well, Asmus, it isn’t one. We linguists have been taught it’s the prime. https://en.wikipedia.org/wiki/Prime_(symbol)#Use_in_linguistics Michael Everson * http://www.evertype.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtauber at jtauber.com Tue Feb 9 09:13:04 2016 From: jtauber at jtauber.com (James Tauber) Date: Tue, 9 Feb 2016 09:13:04 -0600 Subject: precomposed polytonic Greek characters with macrons and other diacritics In-Reply-To: <56B94E5C.7020101@it.aoyama.ac.jp> References: <56B94E5C.7020101@it.aoyama.ac.jp> Message-ID: On Mon, Feb 8, 2016 at 8:26 PM, Martin J. Dürst wrote: > On 2016/02/09 02:10, James Tauber wrote: > >> >> http://jktauber.com/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/ >> > > Hello James, > > I read your article. I just wanted to point out that in your problem 3, > the two sequences aren't normalized because if the acute accent is first, > that would be considered as a different character, namely with the macron > *on top of* the accent. Thanks. I've updated the post to clarify it's not a problem with Unicode per se. James -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From unicode at acjs.net Tue Feb 9 05:18:33 2016 From: unicode at acjs.net (ACJ Unicode) Date: Tue, 9 Feb 2016 12:18:33 +0100 Subject: Case for letters j and J with acute Message-ID: <56B9CB09.8060906@acjs.net> Hello, This is my first time posting here, so please forgive me if I don’t get all the ethics right. I would like to make a case for an aspect of my native language (Dutch) that has always been problematic in the digital realm. Some context: I’m a (typo)graphic designer with a background in interaction design. In the Dutch language, acute accents are used to indicate stressed vowels. [1] Also in the Dutch language, the digraph IJ (lowercase ij) is considered a separate letter and a vowel. [2] Hence, when putting emphasis on a word that contains ij, one would put acute accents over the i and the j. [3] This is taught in writing in primary school in the Netherlands (or at least it was 30 years ago), but this practice is often abandoned soon afterwards, probably because of the technical difficulty. The only way to achieve this digitally appears to be to have LATIN SMALL LETTER I WITH ACUTE (U+00ED) be followed by LATIN SMALL LETTER DOTLESS J (U+0237) /and/ COMBINING ACUTE ACCENT (U+0301). This poses several problems: * It makes casual user input highly impractical; * it adds complexity to automating the process of adding emphasis to vowels; * technical support is understandably lacking; * it makes it virtually impossible for type designers to address properly and consistently. To me, the obvious solution to these problems would be to at least add the following characters to the Unicode standard: * LATIN SMALL LETTER J WITH ACUTE; * LATIN CAPITAL LETTER J WITH ACUTE. For completeness’ sake, one could also make a case for the following: * LATIN SMALL LIGATURE IJ WITH ACUTES; * LATIN CAPITAL LIGATURE IJ WITH ACUTES... but since the use of the original Unicode ligatures is already discouraged, we could probably go without those. 
Sincerely, Alexander Dekker deidee [1] https://en.wikipedia.org/wiki/Acute_accent#Stress [2] https://en.wikipedia.org/wiki/IJ_(digraph) [3] https://en.wikipedia.org/wiki/IJ_(digraph)#Stress -------------- next part -------------- An HTML attachment was scrubbed... URL: From everson at evertype.com Tue Feb 9 09:58:52 2016 From: everson at evertype.com (Michael Everson) Date: Tue, 9 Feb 2016 15:58:52 +0000 Subject: Case for letters j and J with acute In-Reply-To: <56B9CB09.8060906@acjs.net> References: <56B9CB09.8060906@acjs.net> Message-ID: <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> On 9 Feb 2016, at 11:18, ACJ Unicode wrote: > This is taught in writing in primary school in the Netherlands (or at least it was 30 years ago), but this practice is often abandoned soon afterwards, probably because of the technical difficulty. The only way to achieve this digitally appears to have LATIN SMALL LETTER I WITH ACUTE (U+00ED) be followed by LATIN SMALL LETTER DOTLESS J (U+0237) and COMBINING ACUTE ACCENT (U+0301). It is a font rendering issue. A pre-composed j́ will not be added to the standard. > * It makes casual user input highly impractical; This is dependent on the keyboard layout, not the encoding. > * it adds complexity to automating the process of adding emphasis to vowels; > * technical support is understandably lacking; True, but for technical reasons pre-composed characters will NOT be added to the standard. > * LATIN SMALL LETTER J WITH ACUTE; > * LATIN CAPITAL LETTER J WITH ACUTE. This just won’t ever happen. > * it makes it virtually impossible for type designers to address properly and consistently. Well, the specification should be í (or i + combining acute) + j + combining acute. Neither dotless i nor dotless j would be correct. > For completeness sake, one could also make a case for the following: > > * LATIN SMALL LIGATURE IJ WITH ACUTES; > * LATIN CAPITAL LIGATURE IJ WITH ACUTES. Or Ĳ (or ĳ) + combining double acute. 
Michael Everson * http://www.evertype.com/ From markus.icu at gmail.com Tue Feb 9 10:05:40 2016 From: markus.icu at gmail.com (Markus Scherer) Date: Tue, 9 Feb 2016 08:05:40 -0800 Subject: Case for letters j and J with acute In-Reply-To: <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> References: <56B9CB09.8060906@acjs.net> <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> Message-ID: On Tue, Feb 9, 2016 at 7:58 AM, Michael Everson wrote: > On 9 Feb 2016, at 11:18, ACJ Unicode wrote: > > > This is taught in writing in primary school in the Netherlands (or at > least it was 30 years ago), but this practice is often abandoned soon > afterwards, probably because of the technical difficulty. The only way to > achieve this digitally appears to have LATIN SMALL LETTER I WITH ACUTE > (U+00ED) be followed by LATIN SMALL LETTER DOTLESS J (U+0237) and COMBINING > ACUTE ACCENT (U+0301). > > It is a font rendering issue. A pre-composed j́ will not be added to the > standard. > The regular 'j' has the Soft_Dotted property, which means that when you add a diacritic-above, the dot should go away. http://www.unicode.org/reports/tr44/#Soft_Dotted When the dot does not disappear, please submit an error report for the platform/browser you are using. > * it adds complexity to automating the process of adding emphasis > to vowels; > > * technical support is understandably lacking; > > True, but for technical reasons pre-composed characters will NOT be added > to the standard. > > > * LATIN SMALL LETTER J WITH ACUTE; > > * LATIN CAPITAL LETTER J WITH ACUTE. > > This just won’t ever happen. > Technical reasons include http://unicode.org/policies/stability_policy.html#Normalization markus -------------- next part -------------- An HTML attachment was scrubbed... 
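[Illustrative aside, not part of the original thread.] The normalization side of Markus's point can be sketched as follows: NFC composes i + U+0301 into the existing precomposed U+00ED, but leaves j + U+0301 as a two-code-point sequence, because no precomposed j with acute exists. Python's standard unicodedata module does not expose the Soft_Dotted property, so the dot removal itself, which is up to the font and renderer, is not checked here. The helper name is hypothetical.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def nfc_code_points(text):
    """Normalize `text` to NFC and return its code points as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


i_acute = nfc_code_points("i" + ACUTE)  # composes to a single code point
j_acute = nfc_code_points("j" + ACUTE)  # stays a base letter plus combining mark
capital_j_acute = nfc_code_points("J" + ACUTE)

if __name__ == "__main__":
    print(i_acute, j_acute, capital_j_acute)
```

The combining sequence is therefore already the normalized spelling of j with acute, which is why the remaining work falls on fonts and input methods rather than on the encoding.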
URL: From frederic.grosshans at gmail.com Tue Feb 9 10:05:46 2016 From: frederic.grosshans at gmail.com (Frédéric Grosshans) Date: Tue, 9 Feb 2016 17:05:46 +0100 Subject: Case for letters j and J with acute In-Reply-To: <56B9CB09.8060906@acjs.net> References: <56B9CB09.8060906@acjs.net> Message-ID: <56BA0E5A.2020402@gmail.com> On 09/02/2016 12:18, ACJ Unicode wrote: > [...] > To me, the obvious solution to these problems would be to at least add > the following characters to the Unicode standard: > > * LATIN SMALL LETTER J WITH ACUTE; > * LATIN CAPITAL LETTER J WITH ACUTE. > > [...] Adding new compositions of existing characters to Unicode is not done anymore since the introduction of NFC and NFD in the 1990s. You should read http://www.unicode.org/faq/char_combmark.html#11 and following. Cheers, Frédéric From frederic.grosshans at gmail.com Tue Feb 9 10:16:10 2016 From: frederic.grosshans at gmail.com (Frédéric Grosshans) Date: Tue, 9 Feb 2016 17:16:10 +0100 Subject: Case for letters j and J with acute In-Reply-To: <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> References: <56B9CB09.8060906@acjs.net> <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> Message-ID: <56BA10CA.7080507@gmail.com> On 09/02/2016 16:58, Michael Everson wrote: >> For completeness sake, one could also make a case for the following: >> > >> > * LATIN SMALL LIGATURE IJ WITH ACUTES; >> > * LATIN CAPITAL LIGATURE IJ WITH ACUTES. > Or Ĳ (or ĳ) + combining double acute. The rendering of these in a standard font (Ĳ̋ĳ̋) is usually quite bad, while non-ligated characters should render correctly (ÍJ́íj́). Frédéric From leob at mailcom.com Tue Feb 9 10:19:48 2016 From: leob at mailcom.com (Leo Broukhis) Date: Tue, 9 Feb 2016 08:19:48 -0800 Subject: Enclosing BANKNOTE emoji? In-Reply-To: References:

Message-ID: A caveat about using emojitracker.com : it doesn't count newer emoji yet (e.g. U+1F37E bottle with popping cork is absent), thus, when they are added, their counts will be skewed. Leo On Tue, Feb 9, 2016 at 2:00 AM, Leo Broukhis wrote: > Thank you for the links, quite mesmerizing! > > On emojitracker.com (cumulative counts, but only on twitter, AFAICS), > U+1F4B5 ($) had quite a respectable count of 2932622 (well above the middle > of the page, around 70%ile), U+1F4B7 (pound) had 514536 (around 30%ile), > and U+1F4B4 and U+1F4B6 had around 353K and 388K resp. (around 20%ile, but > 10x more than the lowest counts, and about the same frequency as various > individual clock faces). > > It is quite evident that the dollar banknote emoji serves as a stand-in > for at least half a dozen of various currencies. > > On Mon, Feb 8, 2016 at 10:25 PM, Mark Davis ☕️ wrote: > >> I would suggest that you first gather statistics and present statistics >> on how often the current combinations are used compared to other emoji, eg >> by consulting sources such as: >> >> http://www.emojixpress.com/stats/ >> or >> http://emojitracker.com/ >> >> Mark >> >> On Mon, Feb 8, 2016 at 8:34 PM, Leo Broukhis wrote: >> >>> There are >>> >>> 💴 U+01F4B4 Banknote With Yen Sign >>> 💵 U+01F4B5 Banknote With Dollar Sign >>> 💶 U+01F4B6 Banknote With Euro Sign >>> 💷 U+01F4B7 Banknote With Pound Sign >>> >>> This is clearly an incomplete set. It makes sense to have a generic >>> "enclosing banknote" emoji character which, when combined with a >>> currency sign, would produce the corresponding banknote, to forestall >>> requests for individual emoji for banknotes with remaining currency >>> signs. >>> >>> Leo >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at macchiato.com Tue Feb 9 10:51:04 2016 From: mark at macchiato.com (Mark Davis ☕️) Date: Tue, 9 Feb 2016 17:51:04 +0100 Subject: Enclosing BANKNOTE emoji? 
In-Reply-To: References:

Message-ID: Look at http://www.emojixpress.com/stats/. The stats are different, since they collect data from keyboards not twitter posts, but they have a nice button to view only the new emoji. (The numbers on the new ones will be smaller, just because it takes time for systems to support them, and people to start using them. However, they bear out my prediction that the most popular would be the eyes-rolling face). Mark On Tue, Feb 9, 2016 at 5:19 PM, Leo Broukhis wrote: > A caveat about using emojitracker.com : it doesn't count newer emoji yet > (e.g. U+1F37E bottle with popping cork is absent), thus, when they are > added, their counts will be skewed. > > Leo > > On Tue, Feb 9, 2016 at 2:00 AM, Leo Broukhis wrote: > >> Thank you for the links, quite mesmerizing! >> >> On emojitracker.com (cumulative counts, but only on twitter, AFAICS), >> U+1F4B5 ($) had quite a respectable count of 2932622 (well above the middle >> of the page, around 70%ile), U+1F4B7 (pound) had 514536 (around 30%ile), >> and U+1F4B4 and U+1F4B6 had around 353K and 388K resp. (around 20%ile, but >> 10x more than the lowest counts, and about the same frequency as various >> individual clock faces). >> >> It is quite evident that the dollar banknote emoji serves as a stand-in >> for at least half a dozen of various currencies. >> >> On Mon, Feb 8, 2016 at 10:25 PM, Mark Davis ☕️ >> wrote: >> >>> I would suggest that you first gather statistics and present statistics >>> on how often the current combinations are used compared to other emoji, eg >>> by consulting sources such as: >>> >>> http://www.emojixpress.com/stats/ >>> or >>> http://emojitracker.com/ >>> >>> Mark >>> >>> On Mon, Feb 8, 2016 at 8:34 PM, Leo Broukhis wrote: >>> >>>> There are >>>> >>>> 💴 U+01F4B4 Banknote With Yen Sign >>>> 💵 U+01F4B5 Banknote With Dollar Sign >>>> 💶 U+01F4B6 Banknote With Euro Sign >>>> 💷 U+01F4B7 Banknote With Pound Sign >>>> >>>> This is clearly an incomplete set. 
It makes sense to have a generic >>>> "enclosing banknote" emoji character which, when combined with a >>>> currency sign, would produce the corresponding banknote, to forestall >>>> requests for individual emoji for banknotes with remaining currency >>>> signs. >>>> >>>> Leo >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From markus.icu at gmail.com Tue Feb 9 13:29:38 2016 From: markus.icu at gmail.com (Markus Scherer) Date: Tue, 9 Feb 2016 11:29:38 -0800 Subject: Case for letters j and J with acute In-Reply-To: <56B9CB09.8060906@acjs.net> References: <56B9CB09.8060906@acjs.net> Message-ID: On Tue, Feb 9, 2016 at 3:18 AM, ACJ Unicode wrote: > [3] https://en.wikipedia.org/wiki/IJ_(digraph)#Stress > This says "in Unicode it is possible to combine characters into a *j* with an acute accent – "bíȷ́na" – though this might not be supported or rendered correctly by some fonts or systems. This *ȷ́* is the result of the combination of the dotless *ȷ* (U+0237) and the combining acute accent ◌́ (U+0301)." which I am pretty sure is wrong. It should read "in Unicode it is possible to combine characters into a *j* with an acute accent – "bíj́na" – though this might not be supported or rendered correctly by some fonts or systems. This *j́* is the result of the combination of the regular *j* and the combining acute accent ◌́ (U+0301)." Could someone with Wikipedia edit experience please fix this? (3 edits in the sentence) markus -------------- next part -------------- An HTML attachment was scrubbed... 
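[Illustrative aside, not part of the original thread.] The difference between the two spellings Markus contrasts can be made concrete with a small sketch. It builds the stressed form from the regular j plus COMBINING ACUTE ACCENT and compares it with the dotless-j spelling that the Wikipedia text described; the two are not canonically equivalent, so they do not match under normalization. The function name `stress_ij` is hypothetical, and it only handles a lowercase "ij".

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT
DOTLESS_J = "\u0237"  # LATIN SMALL LETTER DOTLESS J


def stress_ij(word):
    """Put an acute on both letters of the first lowercase 'ij' in `word`."""
    stressed = word.replace("ij", "i" + ACUTE + "j" + ACUTE, 1)
    # NFC turns i + U+0301 into U+00ED; j + U+0301 has no precomposed form.
    return unicodedata.normalize("NFC", stressed)


regular = stress_ij("bijna")                    # U+00ED, then j + U+0301
dotless = "b\u00ed" + DOTLESS_J + ACUTE + "na"  # spelling with U+0237 instead of j

# Dotless j has no decomposition, so the two spellings never normalize together.
same_after_nfd = (unicodedata.normalize("NFD", regular)
                  == unicodedata.normalize("NFD", dotless))

if __name__ == "__main__":
    print([f"U+{ord(ch):04X}" for ch in regular])
    print([f"U+{ord(ch):04X}" for ch in dotless])
    print(same_after_nfd)
```

One practical consequence is that text using the dotless-j spelling would not match a plain search for the regular-j spelling, even after normalization.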
URL: From verdy_p at wanadoo.fr Tue Feb 9 13:38:15 2016 From: verdy_p at wanadoo.fr (Philippe Verdy) Date: Tue, 9 Feb 2016 20:38:15 +0100 Subject: Case for letters j and J with acute In-Reply-To: <56BA10CA.7080507@gmail.com> References: <56B9CB09.8060906@acjs.net> <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> <56BA10CA.7080507@gmail.com> Message-ID: 2016-02-09 17:16 GMT+01:00 Frédéric Grosshans : > On 09/02/2016 16:58, Michael Everson wrote: > >> For completeness sake, one could also make a case for the following: >>> > >>> > * LATIN SMALL LIGATURE IJ WITH ACUTES; >>> > * LATIN CAPITAL LIGATURE IJ WITH ACUTES. >>> >> Or Ĳ (or ĳ) + combining double acute. >> > The rendering of these in a standard font (Ĳ̋ĳ̋) is usually quite bad. > While non ligated character should render correctly (ÍJ́íj́). This is only a font problem, not a Unicode problem. For me the IJ (or ij) with combining double accent is correct. Tell this to font authors so they fix their common fonts in later versions (here Microsoft, Adobe, Apple and Google, possibly others, should be hearing your issue for popular OSes and applications). -------------- next part -------------- An HTML attachment was scrubbed... URL: From verdy_p at wanadoo.fr Tue Feb 9 13:48:32 2016 From: verdy_p at wanadoo.fr (Philippe Verdy) Date: Tue, 9 Feb 2016 20:48:32 +0100 Subject: Case for letters j and J with acute In-Reply-To: References: <56B9CB09.8060906@acjs.net> Message-ID: Fixed it in Wikipedia (I used "canonically equivalent" and linked it to the relevant article, instead of the imprecise expression "the result of"). 2016-02-09 20:29 GMT+01:00 Markus Scherer : > On Tue, Feb 9, 2016 at 3:18 AM, ACJ Unicode wrote: > >> [3] https://en.wikipedia.org/wiki/IJ_(digraph)#Stress >> > > This says "in Unicode it is > possible to combine characters > into a *j* with an > acute accent – "bíȷ́na" – though this might not be supported or rendered > correctly by some fonts or > systems. 
This *ȷ́* is the result of the combination of the dotless *ȷ* (U+0237) > and the combining acute accent ◌́ (U+0301)." > > which I am pretty sure is wrong. It should read "in Unicode > it is possible to combine > characters into a *j* with > an acute accent – "bíj́na" – though this might not be supported or > rendered correctly by some fonts > or systems. This *j́* is > the result of the combination of the regular *j* and the combining acute > accent ◌́ (U+0301)." > > Could someone with Wikipedia edit experience please fix this? (3 edits in > the sentence) > > markus > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mheijdra at Princeton.EDU Tue Feb 9 14:47:05 2016 From: mheijdra at Princeton.EDU (Martin Heijdra) Date: Tue, 9 Feb 2016 20:47:05 +0000 Subject: Case for letters j and J with acute In-Reply-To: References: <56B9CB09.8060906@acjs.net> <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> <56BA10CA.7080507@gmail.com> Message-ID: <0001012FBBD4FE40857959B0B65DE95B6E5ED23F@CSGMBX202W.pu.win.princeton.edu> Actually, current use (e.g. the Brill font made by John Hudson) says: [cid:image001.png at 01D16351.1F1BC730] The double acute is for languages such as Hungarian etc. Martin Heijdra From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Philippe Verdy Sent: Tuesday, February 09, 2016 2:38 PM To: Frédéric Grosshans Cc: unicode Unicode Discussion Subject: Re: Case for letters j and J with acute 2016-02-09 17:16 GMT+01:00 Frédéric Grosshans : On 09/02/2016 16:58, Michael Everson wrote: For completeness sake, one could also make a case for the following: > > * LATIN SMALL LIGATURE IJ WITH ACUTES; > * LATIN CAPITAL LIGATURE IJ WITH ACUTES. Or Ĳ (or ĳ) + combining double acute. The rendering of these in a standard font (Ĳ̋ĳ̋) is usually quite bad. While non ligated character should render correctly (ÍJ́íj́). This is only a font problem, not a Unicode problem. 
For me the IJ (or ij) with combining double accent is correct. Tell this to font authors so they fix their common fonts in later versions (here Microsoft, Adobe, Apple and Google, possibly others, should be hearing your issue for popular OSes and applications). -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 12976 bytes Desc: image001.png URL: From davidj_faulks at yahoo.ca Tue Feb 9 15:23:36 2016 From: davidj_faulks at yahoo.ca (David Faulks) Date: Tue, 9 Feb 2016 21:23:36 +0000 (UTC) Subject: Case for letters j and J with acute References: <1700995758.1356675.1455053016022.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <1700995758.1356675.1455053016022.JavaMail.yahoo@mail.yahoo.com> >On Tue, 2/9/16, Philippe Verdy wrote: > This is only a font problem, not a Unicode problem. For > me the IJ (or ij) with combining double accent is correct. > Tell this to font authors so they fix their common fonts in > later versions (here Microsoft, Adobe, Apple and Google, > possibly others, should be hearing your issue for popular > OSes and applications). Perhaps Unicode could create a “default position” property for combining characters, and encourage OpenType and other font engines to adopt it for automatic use when no other font information is provided. Adoption would take a while, but I cannot help but think that otherwise, this issue will never go away. David From leob at mailcom.com Tue Feb 9 15:33:58 2016 From: leob at mailcom.com (Leo Broukhis) Date: Tue, 9 Feb 2016 13:33:58 -0800 Subject: Case for letters j and J with acute In-Reply-To: <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> References: <56B9CB09.8060906@acjs.net> <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> Message-ID: It isn't just a font rendering issue. 
U+0133 LATIN SMALL LIGATURE IJ doesn't have the Soft_Dotted property according to http://www.unicode.org/Public/UCD/latest/ucd/PropList.txt On Tue, Feb 9, 2016 at 7:58 AM, Michael Everson wrote: > On 9 Feb 2016, at 11:18, ACJ Unicode wrote: > >> This is taught in writing in primary school in the Netherlands (or at least it was 30 years ago), but this practice is often abandoned soon afterwards, probably because of the technical difficulty. The only way to achieve this digitally appears to have LATIN SMALL LETTER I WITH ACUTE (U+00ED) be followed by LATIN SMALL LETTER DOTLESS J (U+0237) and COMBINING ACUTE ACCENT (U+0301). > > It is a font rendering issue. A pre-composed j́ will not be added to the standard. > >> * It makes casual user input highly impractical; > > This is dependent on the keyboard layout, not the encoding. > >> * it adds complexity to automating the process of adding emphasis to vowels; >> * technical support is understandably lacking; > > True, but for technical reasons pre-composed characters will NOT be added to the standard. > >> * LATIN SMALL LETTER J WITH ACUTE; >> * LATIN CAPITAL LETTER J WITH ACUTE. > > This just won’t ever happen. > >> * it makes it virtually impossible for type designers to address properly and consistently. > > Well, the specification should be í (or i + combining acute) + j + combining acute. Neither dotless i nor dotless j would be correct. > >> For completeness sake, one could also make a case for the following: >> >> * LATIN SMALL LIGATURE IJ WITH ACUTES; >> * LATIN CAPITAL LIGATURE IJ WITH ACUTES. > > Or Ĳ (or ĳ) + combining double acute. 
> > Michael Everson * http://www.evertype.com/ > > From kent.karlsson14 at telia.com Tue Feb 9 15:34:03 2016 From: kent.karlsson14 at telia.com (Kent Karlsson) Date: Tue, 09 Feb 2016 22:34:03 +0100 Subject: Case for letters j and J with acute In-Reply-To: <9BC33AA8-A390-4A5D-8C65-EC6DD7681372@evertype.com> Message-ID: On 2016-02-09 16:58, "Michael Everson" wrote: > Well, the specification should be í (or i + combining acute) + j + > combining acute. Neither dotless i nor dotless j would be correct. While true, using the latter (the dotless ones) tends to render better than the dotted ones. (I.e., the Soft_Dotted property is still not well supported.) > Or IJ (or ij) + combining double acute. While I agree that that maybe SHOULD be fine, the ij character has not been given the Soft_Dotted property. Although, as a different matter, using the ij character tends to make automatic case mapping work better for the ij in Dutch... /Kent K From kenwhistler at att.net Tue Feb 9 15:36:10 2016 From: kenwhistler at att.net (Ken Whistler) Date: Tue, 9 Feb 2016 13:36:10 -0800 Subject: Case for letters j and J with acute In-Reply-To: <1700995758.1356675.1455053016022.JavaMail.yahoo@mail.yahoo.com> References: <1700995758.1356675.1455053016022.JavaMail.yahoo.ref@mail.yahoo.com> <1700995758.1356675.1455053016022.JavaMail.yahoo@mail.yahoo.com> Message-ID: <56BA5BCA.7060509@att.net> On 2/9/2016 1:23 PM, David Faulks wrote: > Perhaps Unicode could create a “default position” property for combining characters, and encourage OpenType and other font engines to adopt it for automatic use when no other font information is provided. Adoption would take a while, but I cannot help but think that otherwise, this issue will never go away. > > It does. General_Category=Mn and ccc=230 indicate that a character is a non-spacing mark positioned *above* its base. Attempting to get more precise than that with a *character* property would be a mistake. 
Such interaction in detail between a mark and its base is an attribute of glyphs and their design, and properly belongs to the realm of rendering and fonts.

--Ken

From asmus-inc at ix.netcom.com Tue Feb 9 16:19:34 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Tue, 9 Feb 2016 14:19:34 -0800
Subject: Case for letters j and J with acute
In-Reply-To: <56BA5BCA.7060509@att.net>
References: <1700995758.1356675.1455053016022.JavaMail.yahoo.ref@mail.yahoo.com> <1700995758.1356675.1455053016022.JavaMail.yahoo@mail.yahoo.com> <56BA5BCA.7060509@att.net>
Message-ID: <56BA65F6.10808@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From leob at mailcom.com Tue Feb 9 16:46:51 2016
From: leob at mailcom.com (Leo Broukhis)
Date: Tue, 9 Feb 2016 14:46:51 -0800
Subject: Enclosing BANKNOTE emoji?
In-Reply-To:
References:

Message-ID:

The emojixpress.com site is useful to check which new emoji or combinations people actually use, but the stats are likely skewed by only measuring input from one platform.

Another way to look at the emojitracker.com stats:

339M people in the Eurozone : 389K uses of Euro emoji
126M people in Japan : 354K uses of Yen emoji
140M people in UK + Turkey (likely users of the Pound emoji as a stand-in for Lira) : 515K uses of pound emoji

The total is 605M people : 1258K uses of non-dollar emoji

Assuming the same average frequency of use, 2933K uses of the dollar emoji would be produced by 1411M people, out of which US + Canada + Mexico + Australia (500M) + other countries using $ as (part of) the sign for their currency are way less than a half.

This means that substantially more than 500M people are using the dollar emoji by default, instead of emoji of their national currencies.

Assuming a lesser frequency of use will result in a greater estimate of the affected population.

Leo

On Tue, Feb 9, 2016 at 8:51 AM, Mark Davis wrote:

> Look at http://www.emojixpress.com/stats/. The stats are different, since
> they collect data from keyboards not Twitter posts, but they have a nice
> button to view only the new emoji.
>
> (The numbers on the new ones will be smaller, just because it takes time
> for systems to support them, and people to start using them. However, they
> bear out my prediction that the most popular would be the eyes-rolling
> face).
>
> Mark
>
> On Tue, Feb 9, 2016 at 5:19 PM, Leo Broukhis wrote:
>
>> A caveat about using emojitracker.com : it doesn't count newer emoji yet
>> (e.g. U+1F37E bottle with popping cork is absent), thus, when they are
>> added, their counts will be skewed.
>>
>> Leo
>>
>> On Tue, Feb 9, 2016 at 2:00 AM, Leo Broukhis wrote:
>>
>>> Thank you for the links, quite mesmerizing!
>>>
>>> On emojitracker.com (cumulative counts, but only on Twitter, AFAICS),
>>> U+1F4B5 ($) had quite a respectable count of 2932622 (well above the middle
>>> of the page, around 70%ile), U+1F4B7 (pound) had 514536 (around 30%ile),
>>> and U+1F4B4 and U+1F4B6 had around 353K and 388K resp. (around 20%ile, but
>>> 10x more than the lowest counts, and about the same frequency as various
>>> individual clock faces).
>>>
>>> It is quite evident that the dollar banknote emoji serves as a stand-in
>>> for at least half a dozen of various currencies.
>>>
>>> On Mon, Feb 8, 2016 at 10:25 PM, Mark Davis wrote:
>>>
>>>> I would suggest that you first gather statistics and present statistics
>>>> on how often the current combinations are used compared to other emoji, eg
>>>> by consulting sources such as:
>>>>
>>>> http://www.emojixpress.com/stats/
>>>> or
>>>> http://emojitracker.com/
>>>>
>>>> Mark
>>>>
>>>> On Mon, Feb 8, 2016 at 8:34 PM, Leo Broukhis wrote:
>>>>
>>>>> There are
>>>>>
>>>>> 💴 U+01F4B4 Banknote With Yen Sign
>>>>> 💵 U+01F4B5 Banknote With Dollar Sign
>>>>> 💶 U+01F4B6 Banknote With Euro Sign
>>>>> 💷 U+01F4B7 Banknote With Pound Sign
>>>>>
>>>>> This is clearly an incomplete set. It makes sense to have a generic
>>>>> "enclosing banknote" emoji character which, when combined with a
>>>>> currency sign, would produce the corresponding banknote, to forestall
>>>>> requests for individual emoji for banknotes with remaining currency
>>>>> signs.
>>>>>
>>>>> Leo
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
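
The population estimate in Leo Broukhis's message above (605M people, 1258K uses, 1411M implied dollar-emoji users) can be rechecked with a few lines of arithmetic. This is only a sketch: the figures are the ones quoted in the thread, while the variable names and the "same average frequency of use" scaling are spelled out here as assumptions taken from that message, not as independent data.

```python
# Figures quoted in the thread: (population in millions, emoji uses in thousands).
regions = {
    "Eurozone (euro banknote)": (339, 389),
    "Japan (yen banknote)": (126, 354),
    "UK + Turkey (pound banknote)": (140, 515),
}
dollar_uses_k = 2933  # uses of the dollar banknote emoji, in thousands

total_people_m = sum(people for people, _ in regions.values())
total_uses_k = sum(uses for _, uses in regions.values())

# Assumption from the message: dollar-emoji users post at the same average
# rate as the euro/yen/pound populations combined.
uses_per_million_people = total_uses_k / total_people_m
implied_dollar_users_m = dollar_uses_k / uses_per_million_people

print(total_people_m, total_uses_k)   # 605 1258
print(round(implied_dollar_users_m))  # 1411
```

Lowering the assumed rate of use makes the implied population larger, which is the point made at the end of that message.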
URL:

From kenwhistler at att.net Tue Feb 9 17:01:08 2016
From: kenwhistler at att.net (Ken Whistler)
Date: Tue, 9 Feb 2016 15:01:08 -0800
Subject: Case for letters j and J with acute
In-Reply-To: <56BA65F6.10808@ix.netcom.com>
References: <1700995758.1356675.1455053016022.JavaMail.yahoo.ref@mail.yahoo.com> <1700995758.1356675.1455053016022.JavaMail.yahoo@mail.yahoo.com> <56BA5BCA.7060509@att.net> <56BA65F6.10808@ix.netcom.com>
Message-ID: <56BA6FB4.3020402@att.net>

Asmus,

On 2/9/2016 2:19 PM, Asmus Freytag (t) wrote:

> On 2/9/2016 1:36 PM, Ken Whistler wrote:
>>
>> On 2/9/2016 1:23 PM, David Faulks wrote:
>>> Perhaps Unicode could create a "default position" property for
>>> combining characters, and encourage OpenType and other font engines
>>> to adopt it for automatic use when no other font information is
>>> provided. Adoption would take a while, but I cannot help but think
>>> that otherwise, this issue will never go away.
>>>
>>
>> It does. General_Category=Mn and ccc=230 indicate that a character is
>> a non-spacing mark positioned *above* its base.
>>
>> Attempting to get more precise than that with a *character* property would
>> be a mistake. Such interaction in detail between a mark and its base is
>> an attribute of glyphs and their design, and properly belongs to the realm
>> of rendering and fonts.
>
> What about GC=Mn and ccc=0?

The *overwhelming* majority of those are for Indic scripts.

>
> For those, an actual positional property would make sense.

And, ta da!, we have one:

http://www.unicode.org/Public/8.0.0/ucd/IndicPositionalCategory.txt

That also encompasses the positional classes for gc=Mc, as well as gc=Mn.

>
> It wouldn't need to be overly specific.

It isn't -- it is designed (and being used) for Indic rendering engines.

The outliers which are gc=Mn and ccc=0 but which are not covered by IndicPositionalCategory.txt include:

CGJ and variation selectors and one shorthand control: irrelevant, because these aren't displayable marks.
Thaana vowels

Miao tone marks: irrelevant, because Miao has a very idiosyncratic encoding.

Signwriting marks: Irrelevant, because Signwriting has a very idiosyncratic encoding.

And I don't think adding a new positional property just to keep track of the fact that two Thaana vowels display below their consonant instead of on top makes sense. If it came to that, Thaana could just be added to IndicPositionalCategory.txt, instead.

>
> For example, for Unibook, I allow a convention to supply this
> information to place a glyph in relation to the dotted circle; it's
> described in the help file. There are some special wrinkles there,
> because the values are tweaks that get applied to known fonts (that
> just happen to not do the right thing when combined with the
> standard dotted circle in the charts).

Just adapt IndicPositionalCategory.txt for Unibook, and you've got what you need.

--Ken

>
> However, this approach would seem to indicate that such a scheme is
> possible and with just a few values sufficiently differentiated to be
> of practical use (= immensely improve on the fallback).
>
> A./
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asmus-inc at ix.netcom.com Tue Feb 9 17:45:20 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Tue, 9 Feb 2016 15:45:20 -0800
Subject: Case for letters j and J with acute
In-Reply-To: <56BA6FB4.3020402@att.net>
References: <1700995758.1356675.1455053016022.JavaMail.yahoo.ref@mail.yahoo.com> <1700995758.1356675.1455053016022.JavaMail.yahoo@mail.yahoo.com> <56BA5BCA.7060509@att.net> <56BA65F6.10808@ix.netcom.com> <56BA6FB4.3020402@att.net>
Message-ID: <56BA7A10.1000604@ix.netcom.com>

On 2/9/2016 3:01 PM, Ken Whistler wrote:

> Just adapt IndicPositionalCategory.txt for Unibook, and you've got
> what you need.

I see. Not quite as simple; Unibook needs overrides that are specifically able to correct bad fonts, not just "dumb" ones.
We may want to honor some part of the positioning.

But it would be interesting to see whether we ended up duplicating the IPC values more or less. Next chance I get.

A./
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From petercon at microsoft.com Wed Feb 10 00:26:31 2016
From: petercon at microsoft.com (Peter Constable)
Date: Wed, 10 Feb 2016 06:26:31 +0000
Subject: Enclosing BANKNOTE emoji?
In-Reply-To:
References:

Message-ID:

I wish emojitracker had an option to see cumulative stats spanning only the last (say) 7 days, rather than (I assume) all time. This would be more representative of current usage, fixing the problem of recent introductions. Also, comparing the recent and long-term stats would highlight shifting trends.

Peter

From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Leo Broukhis
Sent: Tuesday, February 9, 2016 2:47 PM
To: Mark Davis
Cc: unicode Unicode Discussion
Subject: Re: Enclosing BANKNOTE emoji?

The emojixpress.com site is useful to check which new emoji or combinations people actually use, but the stats are likely skewed by only measuring input from one platform.

Another way to look at the emojitracker.com stats:

339M people in the Eurozone : 389K uses of Euro emoji
126M people in Japan : 354K uses of Yen emoji
140M people in UK + Turkey (likely users of the Pound emoji as a stand-in for Lira) : 515K uses of pound emoji

The total is 605M people : 1258K uses of non-dollar emoji

Assuming the same average frequency of use, 2933K uses of the dollar emoji would be produced by 1411M people, out of which US + Canada + Mexico + Australia (500M) + other countries using $ as (part of) the sign for their currency are way less than a half.

This means that substantially more than 500M people are using the dollar emoji by default, instead of emoji of their national currencies.

Assuming a lesser frequency of use will result in a greater estimate of the affected population.

Leo

On Tue, Feb 9, 2016 at 8:51 AM, Mark Davis wrote:

Look at http://www.emojixpress.com/stats/. The stats are different, since they collect data from keyboards not Twitter posts, but they have a nice button to view only the new emoji.

(The numbers on the new ones will be smaller, just because it takes time for systems to support them, and people to start using them. However, they bear out my prediction that the most popular would be the eyes-rolling face).
Mark

On Tue, Feb 9, 2016 at 5:19 PM, Leo Broukhis wrote:

A caveat about using emojitracker.com : it doesn't count newer emoji yet (e.g. U+1F37E bottle with popping cork is absent), thus, when they are added, their counts will be skewed.

Leo

On Tue, Feb 9, 2016 at 2:00 AM, Leo Broukhis wrote:

Thank you for the links, quite mesmerizing!

On emojitracker.com (cumulative counts, but only on Twitter, AFAICS), U+1F4B5 ($) had quite a respectable count of 2932622 (well above the middle of the page, around 70%ile), U+1F4B7 (pound) had 514536 (around 30%ile), and U+1F4B4 and U+1F4B6 had around 353K and 388K resp. (around 20%ile, but 10x more than the lowest counts, and about the same frequency as various individual clock faces).

It is quite evident that the dollar banknote emoji serves as a stand-in for at least half a dozen of various currencies.

On Mon, Feb 8, 2016 at 10:25 PM, Mark Davis wrote:

I would suggest that you first gather statistics and present statistics on how often the current combinations are used compared to other emoji, eg by consulting sources such as:

http://www.emojixpress.com/stats/
or
http://emojitracker.com/

Mark

On Mon, Feb 8, 2016 at 8:34 PM, Leo Broukhis wrote:

There are

💴 U+01F4B4 Banknote With Yen Sign
💵 U+01F4B5 Banknote With Dollar Sign
💶 U+01F4B6 Banknote With Euro Sign
💷 U+01F4B7 Banknote With Pound Sign

This is clearly an incomplete set. It makes sense to have a generic "enclosing banknote" emoji character which, when combined with a currency sign, would produce the corresponding banknote, to forestall requests for individual emoji for banknotes with remaining currency signs.

Leo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jknappen at web.de Wed Feb 10 03:38:04 2016
From: jknappen at web.de ("Jörg Knappen")
Date: Wed, 10 Feb 2016 10:38:04 +0100
Subject: Aw: Re: Enclosing BANKNOTE emoji?
In-Reply-To: References:

, Message-ID:

An HTML attachment was scrubbed...
URL:

From tim at shilohmediainc.com Wed Feb 10 03:58:08 2016
From: tim at shilohmediainc.com (Tim)
Date: Wed, 10 Feb 2016 20:58:08 +1100
Subject: Unicode line break issue
Message-ID:

I have a problem with Unicode in RTF. The Syriac Unicode characters that I am using seem to be treated as breaking characters. If I follow the Unicode character word group with \~ as a non-breaking space, the word will still break on the last Unicode character.

Is there a way that I can stop this from happening? Can I do this by using some character before or after? Can I change this by using some placeholder character other than "?"?

I have tried to use placeholder characters other than "?" after the Unicode number "\u1808?", but it seems that I haven't found the correct placeholder character to stop the line break at this point.

Here is a sample of the data string that I need to keep together:

\u1823?\u1836?\u1810?\u1808?\~{\cf11\~S10762}\~{\cf2\~Book}

The line breaks after \u1808? even with the non-breaking space \~; the letters within the Syriac word, however, do not break.

What placeholder character can I use (or other characters) to prevent a line break after \u1808?, or is there another way that I can code this so that it will stay together?

Any thoughts and help that you may be able to offer are appreciated,
-------------- next part --------------
An HTML attachment was scrubbed...
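
As a first step in diagnosing the RTF sample in Tim's message above, it can help to confirm which characters the \uN escapes actually encode. The sketch below relies on the fact that RTF's \uN takes a signed 16-bit decimal value followed by a fallback ("placeholder") character; the function names `decode_rtf_unicode` and `encode_rtf_unicode` are hypothetical helpers written for this illustration, not part of any RTF library.

```python
import re
import unicodedata

# The sample string from the message, kept verbatim as a raw string.
SAMPLE = r"\u1823?\u1836?\u1810?\u1808?\~{\cf11\~S10762}\~{\cf2\~Book}"


def decode_rtf_unicode(rtf):
    """Return the characters encoded by \\uN escapes, in order.

    N is a signed 16-bit decimal, so negative values are wrapped.
    Surrogate pairs are not recombined in this sketch.
    """
    chars = []
    for match in re.finditer(r"\\u(-?\d+)", rtf):
        value = int(match.group(1))
        if value < 0:
            value += 0x10000
        chars.append(chr(value))
    return "".join(chars)


def encode_rtf_unicode(text, placeholder="?"):
    """Encode every UTF-16 code unit of text as a \\uN escape."""
    out = []
    data = text.encode("utf-16-be")
    for i in range(0, len(data), 2):
        value = int.from_bytes(data[i:i + 2], "big")
        if value > 0x7FFF:
            value -= 0x10000
        out.append("\\u%d%s" % (value, placeholder))
    return "".join(out)


word = decode_rtf_unicode(SAMPLE)
for ch in word:
    # Code points are U+071F, U+072C, U+0712, U+0710 (Syriac block).
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>")))
```

The encoder makes it easy to experiment with other characters next to the word; for example, `encode_rtf_unicode("\u2060")` gives the escape for U+2060 WORD JOINER. Whether a particular RTF reader honors such a character at this position is not something this sketch can confirm.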
URL:

From qsjn4ukr at gmail.com Thu Feb 11 08:05:30 2016
From: qsjn4ukr at gmail.com (QSJN 4 UKR)
Date: Thu, 11 Feb 2016 16:05:30 +0200
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To: <56B8CFD4.1070105@uni-konstanz.de>
References: <56B8CFD4.1070105@uni-konstanz.de>
Message-ID:

I can show an example of the use of both prime (as soft sign) and apostrophe (hemisoft) in Cyrillic-based phonetic transcription (Orthoepic Dictionary of Ukrainian, http://padaread.com/?book=84816&pg=6 http://padaread.com/?book=84816&pg=7)

From ritt.ks at gmail.com Thu Feb 11 08:36:51 2016
From: ritt.ks at gmail.com (Konstantin Ritt)
Date: Thu, 11 Feb 2016 18:36:51 +0400
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To:
References: <56B8CFD4.1070105@uni-konstanz.de>
Message-ID:

In Ukrainian, for example, both "ь" and "`" are used. "ь" is used for softer pronunciation of the preceding consonant ( ???????? ), whilst "`" is used for splitting them, as if they were the first letter in a word, even when the next vowel sounds soft otherwise ( ???`??????? -- the last ??? sounds softer than the former one ).

Regards,
Konstantin

2016-02-11 18:05 GMT+04:00 QSJN 4 UKR :

> I can show an example of use both, prime (as soft sign) and apostroph
> (hemisoft) in Cyrilic-based phonetic transcription (Orthoepic
> Dictionary of Ukrainian, http://padaread.com/?book=84816&pg=6
> http://padaread.com/?book=84816&pg=7)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From qsjn4ukr at gmail.com Thu Feb 11 08:38:45 2016
From: qsjn4ukr at gmail.com (QSJN 4 UKR)
Date: Thu, 11 Feb 2016 16:38:45 +0200
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To: <56B8CFD4.1070105@uni-konstanz.de>
References: <56B8CFD4.1070105@uni-konstanz.de>
Message-ID:

Prime is used for soft sign transliteration to avoid ambiguity: the apostrophe is used for the apostrophe itself, a common sign in Ukrainian or Belarusian.
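
The ambiguity described in the message above can be made concrete with a toy transliterator. This is only an illustrative sketch: the three-letter table is hypothetical and covers just the example word, and the choice of U+02B9 MODIFIER LETTER PRIME for the soft sign is an assumption made for the demonstration, not something the thread specifies.

```python
PRIME = "\u02b9"  # MODIFIER LETTER PRIME, assumed here as the soft-sign output

# Hypothetical minimal table: only the letters needed for the example word.
BASE_TABLE = {"п": "p", "я": "ja", "т": "t"}


def transliterate(word, soft_sign):
    """Transliterate word, mapping the soft sign to the given character.

    Characters missing from the table (such as the apostrophe) pass through.
    """
    table = dict(BASE_TABLE)
    table["ь"] = soft_sign
    return "".join(table.get(ch, ch) for ch in word)


word = "п'ять"  # Ukrainian "five": contains an apostrophe and a soft sign

with_prime = transliterate(word, PRIME)
with_apostrophe = transliterate(word, "'")

print(with_prime)       # the apostrophe and the prime stay distinct
print(with_apostrophe)  # two apostrophes, no longer distinguishable
```

With the prime, the original apostrophe and the soft sign remain separate characters in the output; with an apostrophe standing in for the soft sign, the two can no longer be told apart when converting back.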
From asmus-inc at ix.netcom.com Thu Feb 11 09:59:25 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Thu, 11 Feb 2016 07:59:25 -0800
Subject: transliteration of mjagkij znak (Cyrillic soft sign)
In-Reply-To:
References: <56B8CFD4.1070105@uni-konstanz.de>
Message-ID: <56BCAFDD.3030307@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From davidj_faulks at yahoo.ca Sun Feb 14 17:36:37 2016
From: davidj_faulks at yahoo.ca (David Faulks)
Date: Sun, 14 Feb 2016 23:36:37 +0000 (UTC)
Subject: Copyleft Symbol
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com>
Message-ID: <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com>

Hello,

This subject has been discussed before, but I am somewhat uncertain about something:

If the copyleft (reversed ©) symbol was proposed for encoding, with examples (from PDF files) showing it being used in a similar way to the copyright © symbol, is it likely to be accepted for encoding?

Thanks for any opinions.

David

From asmus-inc at ix.netcom.com Sun Feb 14 18:53:33 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Sun, 14 Feb 2016 16:53:33 -0800
Subject: Copyleft Symbol
In-Reply-To: <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com>
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com> <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <56C1218D.3080605@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From everson at evertype.com Sun Feb 14 19:18:04 2016
From: everson at evertype.com (Michael Everson)
Date: Mon, 15 Feb 2016 01:18:04 +0000
Subject: Copyleft Symbol
In-Reply-To: <56C1218D.3080605@ix.netcom.com>
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com> <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com> <56C1218D.3080605@ix.netcom.com>
Message-ID: <69FE572B-7286-4D18-994E-E5EFAE469871@evertype.com>

On 15 Feb 2016, at 00:53, Asmus Freytag (t) wrote:

>
> On 2/14/2016 3:36 PM, David Faulks wrote:
>> Hello,
>>
>> This subject has been discussed before, but I am somewhat uncertain about something:
>>
>> If the copyleft (reversed ©) symbol was proposed for encoding, with examples (from PDF files) showing it being used in a similar way to the copyright © symbol, is it likely to be accepted for encoding?
>>
>
> The key issue is whether this usage is "established".
>
> Showing that it has been used a few times is less useful than a good estimate of how widely it is used.

No emoji for bacon was ever shown in use. People just wanted it.

Michael Everson * http://www.evertype.com/

From asmus-inc at ix.netcom.com Sun Feb 14 21:02:52 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Sun, 14 Feb 2016 19:02:52 -0800
Subject: Copyleft Symbol
In-Reply-To: <69FE572B-7286-4D18-994E-E5EFAE469871@evertype.com>
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com> <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com> <56C1218D.3080605@ix.netcom.com> <69FE572B-7286-4D18-994E-E5EFAE469871@evertype.com>
Message-ID: <56C13FDC.7050809@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From tuvalkin at gmail.com Sun Feb 14 21:42:52 2016
From: tuvalkin at gmail.com (António Martins-Tuválkin)
Date: Mon, 15 Feb 2016 03:42:52 +0000
Subject: Copyleft Symbol
In-Reply-To: <56C1218D.3080605@ix.netcom.com>
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com> <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com> <56C1218D.3080605@ix.netcom.com>
Message-ID: <56C1493C.9040405@gmail.com>

On 2016.02.15 00:53, Asmus Freytag (t) wrote:

> The key issue is whether this usage is "established".

You can always make the case that whatever need is felt/expressed by a community is not enough.

While it would be useless to point out that copyleft is more needed (i.e., if encoded would be used way more often) than 99% of the whole repertoire of Unicode (like U+A66E, which is used in one single word, a weird one, too, and only optionally?), its usage is less massive than the symbols of the Creative Commons licences: the cc-ring symbol itself, and the symbols for its clauses: "share alike", "non-commercial", "attribution", and "no derivative works".

See: http://en.wikipedia.org/wiki/Creative_Commons_license#Types_of_licenses

I don't miss these symbols terribly, but then again I never cared for the disunification (or non-unification) of "?" and "?", "?" and "?", and "?" and "?", so I calmly use instead "??" (copyleft), "??" (creative commons), "??" (share alike), "$?" (non-commercial), "???" (attribution), and "?" (no derivative works), in spite of the inadequate semantics.

--
António MARTINS-Tuválkin
PT-2695-010 Bobadela LRS
+351 934 821 700, +351 212 463 477
facebook.com/profile.php?id=744658416

Não me invejo de quem tem
carros, parelhas e montes
só me invejo de quem bebe
a água em todas as fontes

De sable uma fonte e bordadura escaqueada de jalde e goles por timbre bandeira por mote o 1º verso acima e por grito de guerra "Mi rajtas!"

From asmus-inc at ix.netcom.com Sun Feb 14 23:33:22 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Sun, 14 Feb 2016 21:33:22 -0800
Subject: Copyleft Symbol
In-Reply-To: <56C1493C.9040405@gmail.com>
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com> <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com> <56C1218D.3080605@ix.netcom.com> <56C1493C.9040405@gmail.com>
Message-ID: <56C16322.1040709@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From davidj_faulks at yahoo.ca Mon Feb 15 05:18:25 2016
From: davidj_faulks at yahoo.ca (David Faulks)
Date: Mon, 15 Feb 2016 11:18:25 +0000 (UTC)
Subject: Copyleft Symbol
References: <254948940.3422556.1455535105174.JavaMail.yahoo.ref@mail.yahoo.com>
Message-ID: <254948940.3422556.1455535105174.JavaMail.yahoo@mail.yahoo.com>

> Sun, 2/14/16, Asmus Freytag (t) wrote:
> Subject: Re: Copyleft Symbol
> To: unicode at unicode.org
> Received: Sunday, February 14, 2016, 7:53 PM
>> On 2/14/2016 3:36 PM, David Faulks wrote:
<text cut>
>> If the copyleft (reversed ©) symbol was proposed
>> for encoding, with examples (from PDF files)
>> showing it being used in a similar way to the
>> copyright © symbol, is it likely to be accepted
>> for encoding?
> The key issue is whether this usage is "established".
>
> Showing that it has been used a few times is less
> useful than a good estimate of how widely it is
> used.
>
> A./

An estimate is difficult, other than usage being rare. The symbol itself is widely known and here to stay (there was actually a discussion about encoding it back in 2000 on this mailing list). A Google search for "copyleft symbol" reveals many results (such as "[ubuntu] Using copyleft symbol in text - Ubuntu Forums"), so I would say there is demand for this.
The samples I have seem to be from people who want to make a statement via an anti-copyright message, are familiar with the term "copyleft" and the associated symbol, are playful enough to want to use the symbol instead of a more formal message like Creative Commons (after all, the copyleft symbol has no legal standing), but are willing to go to the extra effort of using a non-standard symbol for a small message that might not even be noticed.

David

From chris.fynn at gmail.com Mon Feb 15 06:04:42 2016
From: chris.fynn at gmail.com (Christopher Fynn)
Date: Mon, 15 Feb 2016 17:49:42 +0545
Subject: Copyleft Symbol
In-Reply-To: <254948940.3422556.1455535105174.JavaMail.yahoo@mail.yahoo.com>
References: <254948940.3422556.1455535105174.JavaMail.yahoo.ref@mail.yahoo.com> <254948940.3422556.1455535105174.JavaMail.yahoo@mail.yahoo.com>
Message-ID:

On 15/02/2016, David Faulks wrote:

> .....(there was actually a discussion about encoding it back in 2000 on this mailing list).

Presumably that indicates at least 15 years of usage - far longer than most emoji.

- Chris

From johannes at bergerhausen.com Mon Feb 15 06:07:37 2016
From: johannes at bergerhausen.com (Johannes Bergerhausen)
Date: Mon, 15 Feb 2016 13:07:37 +0100
Subject: Copyleft Symbol
In-Reply-To: <56C1218D.3080605@ix.netcom.com>
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com> <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com> <56C1218D.3080605@ix.netcom.com>
Message-ID:

On 15.02.2016 at 01:53, Asmus Freytag wrote:

> The key issue is whether this usage is "established".

It is established as soon as it is part of Unicode :)

> Showing that it has been used a few times is less useful than a good estimate of how widely it is used.

iOS has about 1 billion active products; Android has more than that, so I guess that makes 2 billion possible users.
Johannes

From ken.shirriff at gmail.com Mon Feb 15 10:15:57 2016
From: ken.shirriff at gmail.com (Ken Shirriff)
Date: Mon, 15 Feb 2016 08:15:57 -0800
Subject: Copyleft Symbol
In-Reply-To: <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com>
References: <2061096046.3293196.1455492997304.JavaMail.yahoo.ref@mail.yahoo.com> <2061096046.3293196.1455492997304.JavaMail.yahoo@mail.yahoo.com>
Message-ID:

My advice: The most important thing is to have enough examples of the symbol in use in running text (i.e. not an icon or logo). Real published documents that demonstrate a user community are important.

I recommend studying Unicode's Criteria for Encoding Symbols carefully. The rules for emoji are totally different, so saying "but emoji..." is meaningless. The proposal to add the power symbol to Unicode is a good example that you can use as a model.

As for the copyleft symbol, it's well-defined (has a Wikipedia page) and a web search shows demand for the symbol. It is used in running text and has semantic meaning. You found it goes back to 2000, so it's not a transient fad. I think a proposal would have a good chance of success if you can find a number of good examples of usage.

This is my personal advice - I don't speak for anyone - but I've had a couple of symbols accepted, so these guidelines work for me.

Ken

On Sun, Feb 14, 2016 at 3:36 PM, David Faulks wrote:

> Hello,
>
> This subject has been discussed before, but I am somewhat uncertain about
> something:
>
> If the copyleft (reversed ©) symbol was proposed for encoding, with
> examples (from PDF files) showing it being used in a similar way to the
> copyright © symbol, is it likely to be accepted for encoding?
>
> Thanks for any opinions.
>
> David
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From doug at ewellic.org Mon Feb 15 11:32:01 2016
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 15 Feb 2016 10:32:01 -0700
Subject: Copyleft Symbol
Message-ID: <20160215103201.665a7a7059d7ee80bb4d670165c8327d.2257360314.wbe@email03.secureserver.net>

Asmus Freytag wrote:

> with the non-standard symbols like the copyleft, there's the desire to
> not encode stuff based on "passing activism".

David Faulks wrote:

> The samples I have seem to be from people who want to make a statement
> via an anti-copyright message

The lengthy thread from 2000, and the shorter one from 2012, show that the objections at those times fell into three main categories:

(1) Lack of (sufficient) evidence of use as an element of running text, as opposed to a logo.

There's an interesting passage on the FSF page "What is Copyleft?" about this symbol:

"It is a legal mistake to use a backwards C in a circle instead of a copyright symbol. Copyleft is based legally on copyright, so the work should have a copyright notice. A copyright notice requires either the copyright symbol (a C in a circle) or the word 'Copyright'. [ ... ] A backwards C in a circle has no special legal significance, so it doesn't make a copyright notice."

(2) Concern that the symbol was a passing fad. Christopher and Ken noted that the fact we are talking about it again 15 years later probably answers that concern.

(3) The social-statement aspect.

António wrote in 2012, referring to the copyleft symbol plus the others he just cited (e.g. Creative Commons): "I am convinced that they were not accepted for encoding (if they were ever even formally proposed) due purely to ideological reasons." However, I checked the UTC document register going back to 2000 and could not find a proposal with the word "copyleft" in its title, so perhaps these have not been proposed after all.
The recent acceptance by UTC of BITCOIN SIGN, which is also often perceived as a logo and also sometimes associated with a social movement, might indicate greater willingness of UTC to encode the copyleft symbol, even discounting the effects of the Emoji Revolution.

But as always, at least for non-emoji characters, a formal proposal is probably mandatory.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸

From asmus-inc at ix.netcom.com Mon Feb 15 13:29:25 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Mon, 15 Feb 2016 11:29:25 -0800
Subject: Copyleft Symbol
In-Reply-To: <20160215103201.665a7a7059d7ee80bb4d670165c8327d.2257360314.wbe@email03.secureserver.net>
References: <20160215103201.665a7a7059d7ee80bb4d670165c8327d.2257360314.wbe@email03.secureserver.net>
Message-ID: <56C22715.7060406@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From rwhlk142 at gmail.com Mon Feb 15 19:08:01 2016
From: rwhlk142 at gmail.com (Robert Wheelock)
Date: Mon, 15 Feb 2016 20:08:01 -0500
Subject: Copyleft Symbol
In-Reply-To: <56C22715.7060406@ix.netcom.com>
References: <20160215103201.665a7a7059d7ee80bb4d670165c8327d.2257360314.wbe@email03.secureserver.net> <56C22715.7060406@ix.netcom.com>
Message-ID:

Hi!

Shouldn't the COPYLEFT SIGN be a small circled L?! It's something to think about...

Thank You!

On Mon, Feb 15, 2016 at 2:29 PM, Asmus Freytag (t) wrote:

> On 2/15/2016 9:32 AM, Doug Ewell wrote:
>
> Asmus Freytag wrote:
>
>
> with the non-standard symbols like the copyleft, there's the desire to
> not encode stuff based on "passing activism".
>
> David Faulks wrote:
>
>
> The samples I have seem to be from people who want to make a statement
> via an anti-copyright message
>
> The lengthy thread from 2000, and the shorter one from 2012, show that
> the objections at those times fell into three main categories:
>
> (1) Lack of (sufficient) evidence of use as an element of running text,
> as opposed to a logo.
>
> I take it that this has been addressed (modulo the usual difficulties
> about proving that for unencoded symbols).
>
> There's an interesting passage on the FSF page "What is Copyleft?" about
> this symbol:
>
> "It is a legal mistake to use a backwards C in a circle instead of a
> copyright symbol. Copyleft is based legally on copyright, so the work
> should have a copyright notice. A copyright notice requires either the
> copyright symbol (a C in a circle) or the word 'Copyright'. [ ... ] A
> backwards C in a circle has no special legal significance, so it doesn't
> make a copyright notice."
>
>
> Unicode has always recognized usage over official status. So this should
> not be an issue.
>
>
> (2) Concern that the symbol was a passing fad. Christopher and Ken noted
> that the fact we are talking about it again 15 years later probably
> answers that concern.
>
> Very good point.
>
>
> (3) The social-statement aspect.
>
> António wrote in 2012, referring to the copyleft symbol plus the others
> he just cited (e.g. Creative Commons): "I am convinced that they were
> not accepted for encoding (if they were ever even formally proposed) due
> purely to ideological reasons." However, I checked the UTC document
> register going back to 2000 and could not find a proposal with the word
> "copyleft" in its title, so perhaps these have not been proposed after
> all.
>
> A proposal is needed, discussion on this list is useful only as far as a
> proposer wants to get some suggestions on how to proceed.
>
>
> The recent acceptance by UTC of BITCOIN SIGN, which is also often
> perceived as a logo and also sometimes associated with a social
> movement, might indicate greater willingness of UTC to encode the
> copyleft symbol, even discounting the effects of the Emoji Revolution.
>
> But as always, at least for non-emoji characters, a formal proposal is
> probably mandatory.
>
> Delete "probably".
>
> A./
>
>
> --
> Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asmus-inc at ix.netcom.com Mon Feb 15 20:29:04 2016
From: asmus-inc at ix.netcom.com (Asmus Freytag (t))
Date: Mon, 15 Feb 2016 18:29:04 -0800
Subject: Copyleft Symbol
In-Reply-To:
References: <20160215103201.665a7a7059d7ee80bb4d670165c8327d.2257360314.wbe@email03.secureserver.net> <56C22715.7060406@ix.netcom.com>
Message-ID: <56C28970.4000902@ix.netcom.com>

An HTML attachment was scrubbed...
URL:

From mats.gbproject at gmail.com Mon Feb 15 17:32:49 2016
From: mats.gbproject at gmail.com (Mats Blakstad)
Date: Tue, 16 Feb 2016 00:32:49 +0100
Subject: Possible to add new precomposed characters for local language in Togo?
Message-ID:

I've worked to upload a keyboard for local languages in Togo to the XKB
project. It is a combination keyboard based on the French keyboard and
extended to make it possible to write all the local languages in Togo.

However, many of the languages have several tones and even use combined
tones. When I tried to update the composer to make it work, it seems
that the composer can only give back a precomposed character and not a
string with combining characters.

I now wonder, generally: is it best to add new precomposed characters to
Unicode? Should there be a Unicode character for each combination used?
What is best practice? I ask because I see that some Unicode characters
are precomposed. I'm not sure why they are useful, but if they are,
maybe we should also add these?

For reference, here are the combinations needed; as you can see, there
are many! I've tried to check, and I don't think precomposed characters
exist for any of them.

ɛ / epsilon = U025B
: "ɛ́" LATIN SMALL LETTER EPSILON WITH ACUTE
: "ɛ̀" LATIN SMALL LETTER EPSILON WITH GRAVE
: "ɛ̂" LATIN SMALL LETTER EPSILON WITH CIRCUMFLEX
: "ɛ̌" LATIN SMALL LETTER EPSILON WITH CARON
: "ɛ̄" LATIN SMALL LETTER EPSILON WITH MACRON
: "ɛ̃" LATIN SMALL LETTER EPSILON WITH TILDE
: "ɛ̃́" LATIN SMALL LETTER EPSILON WITH TILDE AND ACUTE
: "ɛ̃̀" LATIN SMALL LETTER EPSILON WITH TILDE AND GRAVE

Ɛ / EPSILON = U0190
: "Ɛ́" LATIN CAPITAL LETTER EPSILON WITH ACUTE
: "Ɛ̀" LATIN CAPITAL LETTER EPSILON WITH GRAVE
: "Ɛ̂" LATIN CAPITAL LETTER EPSILON WITH CIRCUMFLEX
: "Ɛ̌" LATIN CAPITAL LETTER EPSILON WITH CARON
: "Ɛ̄" LATIN CAPITAL LETTER EPSILON WITH MACRON
: "Ɛ̃" LATIN CAPITAL LETTER EPSILON WITH TILDE
: "Ɛ̃́" LATIN CAPITAL LETTER EPSILON WITH TILDE AND ACUTE
: "Ɛ̃̀" LATIN CAPITAL LETTER EPSILON WITH TILDE AND GRAVE

ɩ / iota = U0269
: "ɩ́" LATIN SMALL LETTER IOTA WITH ACUTE
: "ɩ̀" LATIN SMALL LETTER IOTA WITH GRAVE
: "ɩ̂" LATIN SMALL LETTER IOTA WITH CIRCUMFLEX
: "ɩ̌" LATIN SMALL LETTER IOTA WITH CARON
: "ɩ̄" LATIN SMALL LETTER IOTA WITH MACRON

Ɩ / IOTA = U0196
: "Ɩ́" LATIN CAPITAL LETTER IOTA WITH ACUTE
: "Ɩ̀" LATIN CAPITAL LETTER IOTA WITH GRAVE
: "Ɩ̂" LATIN CAPITAL LETTER IOTA WITH CIRCUMFLEX
: "Ɩ̌" LATIN CAPITAL LETTER IOTA WITH CARON
: "Ɩ̄" LATIN CAPITAL LETTER IOTA WITH MACRON

ɔ / open o = U0254
: "ɔ́" LATIN SMALL LETTER OPEN O WITH ACUTE
: "ɔ̀" LATIN SMALL LETTER OPEN O WITH GRAVE
: "ɔ̂" LATIN SMALL LETTER OPEN O WITH CIRCUMFLEX
: "ɔ̌" LATIN SMALL LETTER OPEN O WITH CARON
: "ɔ̄" LATIN SMALL LETTER OPEN O WITH MACRON
: "ɔ̃" LATIN SMALL LETTER OPEN O WITH TILDE
: "ɔ̃́" LATIN SMALL LETTER OPEN O WITH TILDE AND ACUTE
: "ɔ̃̀" LATIN SMALL LETTER OPEN O WITH TILDE AND GRAVE

Ɔ / OPEN O = U0186
: "Ɔ́" LATIN CAPITAL LETTER OPEN O WITH ACUTE
: "Ɔ̀" LATIN CAPITAL LETTER OPEN O WITH GRAVE
: "Ɔ̂" LATIN CAPITAL LETTER OPEN O WITH CIRCUMFLEX
: "Ɔ̌" LATIN CAPITAL LETTER OPEN O WITH CARON
: "Ɔ̄" LATIN CAPITAL LETTER OPEN O WITH MACRON
: "Ɔ̃" LATIN CAPITAL LETTER OPEN O WITH TILDE
: "Ɔ̃́" LATIN CAPITAL LETTER OPEN O WITH TILDE AND ACUTE
: "Ɔ̃̀" LATIN CAPITAL LETTER OPEN O WITH TILDE AND GRAVE

ǝ / turned e = U01DD
: "ǝ́" LATIN SMALL LETTER TURNED E WITH ACUTE
: "ǝ̀" LATIN SMALL LETTER TURNED E WITH GRAVE
: "ǝ̂" LATIN SMALL LETTER TURNED E WITH CIRCUMFLEX
: "ǝ̌" LATIN SMALL LETTER TURNED E WITH CARON
: "ǝ̄" LATIN SMALL LETTER TURNED E WITH MACRON
: "ǝ̃" LATIN SMALL LETTER TURNED E WITH TILDE
: "ǝ̃́" LATIN SMALL LETTER TURNED E WITH TILDE AND ACUTE
: "ǝ̃̀" LATIN SMALL LETTER TURNED E WITH TILDE AND GRAVE

Ǝ / TURNED E = U018E
: "Ǝ́" LATIN CAPITAL LETTER TURNED E WITH ACUTE
: "Ǝ̀" LATIN CAPITAL LETTER TURNED E WITH GRAVE
: "Ǝ̂" LATIN CAPITAL LETTER TURNED E WITH CIRCUMFLEX
: "Ǝ̌" LATIN CAPITAL LETTER TURNED E WITH CARON
: "Ǝ̄" LATIN CAPITAL LETTER TURNED E WITH MACRON
: "Ǝ̃" LATIN CAPITAL LETTER TURNED E WITH TILDE
: "Ǝ̃́" LATIN CAPITAL LETTER TURNED E WITH TILDE AND ACUTE
: "Ǝ̃̀" LATIN CAPITAL LETTER TURNED E WITH TILDE AND GRAVE

ʋ / v with hook = U028B
: "ʋ́" LATIN SMALL LETTER V WITH HOOK WITH ACUTE
: "ʋ̀" LATIN SMALL LETTER V WITH HOOK WITH GRAVE
: "ʋ̂" LATIN SMALL LETTER V WITH HOOK WITH CIRCUMFLEX
: "ʋ̌" LATIN SMALL LETTER V WITH HOOK WITH CARON
: "ʋ̄" LATIN SMALL LETTER V WITH HOOK WITH MACRON

Ʋ / V WITH HOOK = U01B2
: "Ʋ́" LATIN CAPITAL LETTER V WITH HOOK WITH ACUTE
: "Ʋ̀" LATIN CAPITAL LETTER V WITH HOOK WITH GRAVE
: "Ʋ̂" LATIN CAPITAL LETTER V WITH HOOK WITH CIRCUMFLEX
: "Ʋ̌" LATIN CAPITAL LETTER V WITH HOOK WITH CARON
: "Ʋ̄" LATIN CAPITAL LETTER V WITH HOOK WITH MACRON

ʊ / upsilon = U028A
: "ʊ́" LATIN SMALL LETTER UPSILON WITH ACUTE
: "ʊ̀" LATIN SMALL LETTER UPSILON WITH GRAVE
: "ʊ̂" LATIN SMALL LETTER UPSILON WITH CIRCUMFLEX
: "ʊ̌" LATIN SMALL LETTER UPSILON WITH CARON
: "ʊ̄" LATIN SMALL LETTER UPSILON WITH MACRON

Ʊ / UPSILON = U01B1
: "Ʊ́" LATIN CAPITAL LETTER UPSILON WITH ACUTE
: "Ʊ̀" LATIN CAPITAL LETTER UPSILON WITH GRAVE
: "Ʊ̂" LATIN CAPITAL LETTER UPSILON WITH CIRCUMFLEX
: "Ʊ̌" LATIN CAPITAL LETTER UPSILON WITH CARON
: "Ʊ̄" LATIN CAPITAL LETTER UPSILON WITH MACRON

a
: "ã́" LATIN SMALL LETTER A WITH TILDE AND ACUTE
: "ã̀" LATIN SMALL LETTER A WITH TILDE AND GRAVE

A
: "Ã́" LATIN CAPITAL LETTER A WITH TILDE AND ACUTE
: "Ã̀" LATIN CAPITAL LETTER A WITH TILDE AND GRAVE

e
: "ẽ́" LATIN SMALL LETTER E WITH TILDE AND ACUTE
: "ẽ̀" LATIN SMALL LETTER E WITH TILDE AND GRAVE

E
: "Ẽ́" LATIN CAPITAL LETTER E WITH TILDE AND ACUTE
: "Ẽ̀" LATIN CAPITAL LETTER E WITH TILDE AND GRAVE

i