From wjgo_10009 at btinternet.com Mon May 3 09:50:51 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Mon, 3 May 2021 15:50:51 +0100 (BST) Subject: Designs for language-independent emoji of correlative words Message-ID: <5449bf12.38c69.17932b5e2e2.Webtop.88@btinternet.com> I have started designing language-independent emoji of correlative words. The forum posts about this start with the fourth post on page 5 of the thread on Artwork for greetings cards. https://forum.affinity.serif.com/index.php?/topic/138654-artwork-for-greetings-cards/page/5/ William Overington Monday 3 May 2021 From jonathan.coxhead at gmail.com Tue May 4 00:07:12 2021 From: jonathan.coxhead at gmail.com (Jonathan Coxhead) Date: Mon, 3 May 2021 22:07:12 -0700 Subject: Designs for language-independent emoji of correlative words In-Reply-To: <5449bf12.38c69.17932b5e2e2.Webtop.88@btinternet.com> References: <5449bf12.38c69.17932b5e2e2.Webtop.88@btinternet.com> Message-ID: Isn't this the Unicode mailing list? > On May 3, 2021, at 8:05 AM, William_J_G Overington via Unicode wrote: > > I have started designing language-independent emoji of correlative words. > > The forum posts about this start with the fourth post on page 5 of the thread on Artwork for greetings cards. > > https://forum.affinity.serif.com/index.php?/topic/138654-artwork-for-greetings-cards/page/5/ > > William Overington > > Monday 3 May 2021 From indolering at gmail.com Tue May 11 18:02:00 2021 From: indolering at gmail.com (Zach Lym) Date: Tue, 11 May 2021 16:02:00 -0700 Subject: The Bestest Unicode API Message-ID: I have been encouraged to apply for funding to implement support for Unicode in Dafny. Dafny natively supports expressing statements about sets and contract programming, and a toy implementation turned out to be a fairly rote translation of the Unicode spec. 
Dafny is also transpilation-focused, so the primary interface must be highly functional and encoding-neutral. I found Swift's Unicode implementation to be my favorite thus far. I would love pointers to other thoughtful designs and any feedback the list would like to share. Thank you, -Zach Lym From haberg-1 at telia.com Wed May 12 03:24:31 2021 From: haberg-1 at telia.com (Hans Åberg) Date: Wed, 12 May 2021 10:24:31 +0200 Subject: Change of email address Message-ID: <15E90966-6600-49F2-8917-2C38B07547EF@telia.com> Mail from the list now is marked From instead of . Is that official? Sorting filters may fail, so it is good to know. From richard.wordingham at ntlworld.com Sun May 16 11:42:40 2021 From: richard.wordingham at ntlworld.com (Richard Wordingham) Date: Sun, 16 May 2021 17:42:40 +0100 Subject: MON NGA Message-ID: <20210516174240.15eeaf65@JRWUBU2> I'm having trouble with the character identity of U+105A MYANMAR LETTER MON NGA, in particular when followed by U+103A MYANMAR SIGN ASAT. Is there any guide to when it should be encoded differently to U+1004 MYANMAR LETTER NGA? Are there any rules on when the tail below is mandatory, optional or prohibited? Where shaping is at work, the Unicode charts do not attempt to help one to identify characters; rather they assume that the Platonic identities of the characters are obvious. Unsurprisingly, this helpful assumption is sometimes false. Current practice on the Mon Wikipedia includes the rule that Mon text should have <1004, 103A> rather than <105A, 103A> so that the tail of MON NGA will not appear. The Thai Mon text in L2/20-163 also appears to have the rule that ASAT suppresses the tail. Is this rule of the Mon Wikipedia simply fighting a bad font, or does Mon text naturally mix U+1004 and U+105A? Richard. 
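[Editorial illustration, not part of Richard's message.] The two spellings under discussion can be made concrete with a small Python sketch. It only names the three characters and locates <105A, 103A> sequences, as one might do when auditing text against the Mon Wikipedia convention described above. The function name `find_mon_nga_asat` and the sample string are invented for this example, and the sketch says nothing about which spelling is correct.

```python
import unicodedata

NGA = "\u1004"      # MYANMAR LETTER NGA
MON_NGA = "\u105A"  # MYANMAR LETTER MON NGA
ASAT = "\u103A"     # MYANMAR SIGN ASAT

def find_mon_nga_asat(text):
    """Return the indices at which the sequence <105A, 103A> starts."""
    target = MON_NGA + ASAT
    hits = []
    start = text.find(target)
    while start != -1:
        hits.append(start)
        start = text.find(target, start + 1)
    return hits

# Print code point and character name for each character involved.
for ch in (NGA, MON_NGA, ASAT):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Hypothetical sample: <1004, 103A>, a space, then <105A, 103A>.
sample = NGA + ASAT + " " + MON_NGA + ASAT
print(find_mon_nga_asat(sample))
```

Replacing each hit with <1004, 103A> would be a one-line `str.replace`, but whether that is desirable is exactly the open question in the message.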
From 747.neutron at gmail.com Mon May 17 03:32:50 2021 From: 747.neutron at gmail.com (Wáng Yifán) Date: Mon, 17 May 2021 17:32:50 +0900 Subject: Change of email address In-Reply-To: <15E90966-6600-49F2-8917-2C38B07547EF@telia.com> References: <15E90966-6600-49F2-8917-2C38B07547EF@telia.com> Message-ID: I also would like to know because I found new emails from the list delivered into my spam folder. Currently http://www.unicode.org/consortium/distlist.html still lists unicode at unicode.org while https://corp.unicode.org/mailman/listinfo/unicode prints unicode at corp.unicode.org. Did I miss an announcement? 2021-05-12 (Wed) 17:28 Hans Åberg via Unicode : > > Mail from the list now is marked From instead of . Is that official? Sorting filters may fail, so it is good to know. > > From rick at corp.unicode.org Tue May 18 16:45:43 2021 From: rick at corp.unicode.org (Rick McGowan) Date: Tue, 18 May 2021 14:45:43 -0700 Subject: Unicode.org mail system maintenance In-Reply-To: <6089D2AE.30209@unicode.org> References: <6089D2AE.30209@unicode.org> Message-ID: Hello everyone, On May 3 the Unicode Consortium did maintenance and reconfiguration of our e-mail service. Since that time, as some have noticed, the public lists such as unicode at unicode.org have been configured to display and externalize the underlying server name (corp.unicode.org). The actual server for these lists has not changed, but the subdomain is now explicitly shown in mail list configuration and list traffic. This is intended to continue into the future. Thank you for your patience, and I apologize for any recent confusion. Regards, From beebe at math.utah.edu Tue May 18 18:20:38 2021 From: beebe at math.utah.edu (Nelson H. F. 
Beebe) Date: Tue, 18 May 2021 17:20:38 -0600 Subject: Fast UTF-8 sequence validation Message-ID: I recently recorded a BibTeX entry in http://www.math.utah.edu/pub/tex/bib/unicode.html#Keiser:2021:VUL for a new paper that has just been published in a Wiley journal: Validating UTF-8 in less than one instruction per byte Software --- Practice and Experience 51(5) 950--964 May 2021 https://doi.org/10.1002/spe.2920 A preprint is available at https://arxiv.org/abs/2010.03090 The authors exploit vector instructions in recent AMD/Intel x86_64 and ARM v7 NEON processors to achieve high throughput that in some cases exceeds that of the Standard C library function memcpy() for mostly ASCII sequences, and for random UTF-8 sequences, runs at 1/4 to 1/2 the speed of memcpy(). C++ code implementing their work is freely available at https://github.com/lemire/validateutf8-experiments and the paper's references contain links to earlier papers on fast validation and transformation of Unicode character sequences. ------------------------------------------------------------------------------- - Nelson H. F. Beebe Tel: +1 801 581 5254 - - University of Utah FAX: +1 801 581 4148 - - Department of Mathematics, 110 LCB Internet e-mail: beebe at math.utah.edu - - 155 S 1400 E RM 233 beebe at acm.org beebe at computer.org - - Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ - ------------------------------------------------------------------------------- From everson at evertype.com Tue May 18 18:20:51 2021 From: everson at evertype.com (Michael Everson) Date: Wed, 19 May 2021 00:20:51 +0100 Subject: Unicode.org mail system maintenance In-Reply-To: References: <6089D2AE.30209@unicode.org> Message-ID: <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> After decades of stability, this cosmetic change seems to be extraordinarily unfriendly to users. Is there a reason for it, or is it just propaganda underscoring the corporate nature of the Consortium? 
Michael Everson > On 18 May 2021, at 22:45, Rick McGowan via Unicode wrote: > > Hello everyone, > > On May 3 the Unicode Consortium did maintenance and reconfiguration of our e-mail service. Since that time, as some have noticed, the public lists such as unicode at unicode.org have been configured to display and externalize the underlying server name (corp.unicode.org). The actual server for these lists has not changed, but the subdomain is now explicitly shown in mail list configuration and list traffic. This is intended to continue into the future. > > Thank you for your patience, and I apologize for any recent confusion. > > Regards, > From kenwhistler at sonic.net Tue May 18 19:06:46 2021 From: kenwhistler at sonic.net (Ken Whistler) Date: Tue, 18 May 2021 17:06:46 -0700 Subject: Unicode.org mail system maintenance In-Reply-To: <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> Message-ID: <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> On 5/18/2021 4:20 PM, Michael Everson via Unicode wrote: > After decades of stability, this cosmetic change seems to be extraordinarily unfriendly to users. First of all, the change is *not* cosmetic. > > > Is there a reason for it, Yes. It is very long and very technical, having to do with OAuth 2.0 and its interaction with our mail service, and involves seeking a solution that doesn't end up with Gmail users having mail from unicode.org not being swallowed up and disappeared with no notification. > or is it just propaganda underscoring the corporate nature of the Consortium? The name of the particular subdomain that the mail service is running on predated any of this most recent set of required reconfigurations. The name of that subdomain was deliberately suppressed earlier, to keep the list delivery addresses consistent with earlier practice. That is no longer possible, for the technical reasons cited above. 
Please take your conspiracy mongering elsewhere. --Ken From 747.neutron at gmail.com Tue May 18 23:33:01 2021 From: 747.neutron at gmail.com (Wáng Yifán) Date: Wed, 19 May 2021 13:33:01 +0900 Subject: Unicode.org mail system maintenance In-Reply-To: References: <6089D2AE.30209@unicode.org> Message-ID: Hi, > [...] the subdomain is now > explicitly shown in mail list configuration and list traffic. This is > intended to continue into the future. If so, would you mind updating this page http://www.unicode.org/consortium/distlist.html with new list addresses? Best regards, Wang Yifan 2021-05-19 (Wed) 6:50 Rick McGowan via Unicode : > > Hello everyone, > > On May 3 the Unicode Consortium did maintenance and reconfiguration of > our e-mail service. Since that time, as some have noticed, the public > lists such as unicode at unicode.org have been configured to display and > externalize the underlying server name (corp.unicode.org). The actual > server for these lists has not changed, but the subdomain is now > explicitly shown in mail list configuration and list traffic. This is > intended to continue into the future. > > Thank you for your patience, and I apologize for any recent confusion. > > Regards, > From everson at evertype.com Wed May 19 16:12:02 2021 From: everson at evertype.com (Michael Everson) Date: Wed, 19 May 2021 22:12:02 +0100 Subject: Unicode.org mail system maintenance In-Reply-To: <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> Message-ID: <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> On 19 May 2021, at 01:06, Ken Whistler via Unicode wrote: > > On 5/18/2021 4:20 PM, Michael Everson via Unicode wrote: >> After decades of stability, this cosmetic change seems to be extraordinarily unfriendly to users. > First of all, the change is *not* cosmetic. If you say so. 
>> Is there a reason for it, > Yes. It is very long and very technical, having to do with OAuth 2.0 and its interaction with our mail service, and involves seeking a solution that doesn't end up with Gmail users having mail from unicode.org not being swallowed up and disappeared with no notification. Well, again, if you say so. I've been using the internet and many different kinds of discussion forum mail services for decades and this is the first time I have heard of an organization being forced to change the mailing address in this way. >> or is it just propaganda underscoring the corporate nature of the Consortium? > > The name of the particular subdomain that the mail service is running on predated any of this most recent set of required reconfigurations. The name of that subdomain was deliberately suppressed earlier, to keep the list delivery addresses consistent with earlier practice. That is no longer possible, for the technical reasons cited above. If you say so. Again, I've used lots of mail software and haven't heard that Gmail has some sort of bug or feature or unintended consequence that forces anybody to change their addresses. Sounds like a problem on the Google side rather than the Unicode side, but of course I couldn't possibly be right about that. > Please take your conspiracy mongering elsewhere. There's no need to be caustic about it. The new e-mail address isn't @mail.unicode.org or @lists.unicode.org; it's @corp.unicode.org, which I've never seen before even if it has been suppressed for decades. And it's not unreasonable for a person to read some sort of meaning into "corp" without thinking it a "conspiracy". 
Michael From asmusf at ix.netcom.com Wed May 19 16:28:25 2021 From: asmusf at ix.netcom.com (Asmus Freytag) Date: Wed, 19 May 2021 14:28:25 -0700 Subject: Unicode.org mail system maintenance In-Reply-To: <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> Message-ID: <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> An HTML attachment was scrubbed... From jameskass at code2001.com Wed May 19 23:58:39 2021 From: jameskass at code2001.com (James Kass) Date: Thu, 20 May 2021 04:58:39 +0000 Subject: Unicode.org mail system maintenance In-Reply-To: <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> Message-ID: <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> On 2021-05-19 9:28 PM, Asmus Freytag via Unicode wrote: > About the only thing that could have been done better is to announce such > changes before implementing them. And in enough detail, so those of us with > custom mail filters can update them. > > Other than that, life means change. Unicode stability policies don't extend to subdomain names. OAuth 2.0 appears to be about enhancing security. Beefing up security is probably wise, given these interesting times in which we're living. 
https://oauth.net/2/ From stephan.stiller at gmail.com Thu May 20 01:52:28 2021 From: stephan.stiller at gmail.com (Stephan Stiller) Date: Thu, 20 May 2021 14:52:28 +0800 Subject: Unicode.org mail system maintenance In-Reply-To: <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> Message-ID: Dear all (and, long time no talk) The "Program Announced for IUC 45!" (20 May 2021, 04:25 Hong Kong time) message from Unicode (with the "corp") just ended up in my Gmail spam folder ... Best, Stephan Stiller On Thu, 20 May 2021, 13:02 James Kass via Unicode, wrote: > > > On 2021-05-19 9:28 PM, Asmus Freytag via Unicode wrote: > > About the only thing that could have been done better is to announce such > > changes before implementing them. And in enough detail, so those of us > with > > custom mail filters can update them. > > > > Other than that, life means change. > Unicode stability policies don't extend to subdomain names. > > OAuth 2.0 appears to be about enhancing security. Beefing up security > is probably wise, given these interesting times in which we're living. 
> > https://oauth.net/2/ > >
From duerst at it.aoyama.ac.jp Thu May 20 02:40:35 2021 From: duerst at it.aoyama.ac.jp (Martin J. Dürst) Date: Thu, 20 May 2021 16:40:35 +0900 Subject: Unicode.org mail system maintenance In-Reply-To: <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> Message-ID: <5c0437fd-4981-a758-18a4-8242cb635f2b@it.aoyama.ac.jp> Hello everybody, On 2021-05-20 06:12, Michael Everson via Unicode wrote: > On 19 May 2021, at 01:06, Ken Whistler via Unicode wrote: >> >> On 5/18/2021 4:20 PM, Michael Everson via Unicode wrote: >>> After decades of stability, this cosmetic change seems to be extraordinarily unfriendly to users. >> First of all, the change is *not* cosmetic. > > If you say so. Given his past track record on all things Unicode, I have great faith in what Ken says. Given my own experience with email, I have to admit that Michael's arguments make a lot of sense to me. For some additional details, see below. >>> Is there a reason for it, >> Yes. It is very long and very technical, having to do with OAuth 2.0 and its interaction with our mail service, and involves seeking a solution that doesn't end up with Gmail users having mail from unicode.org not being swallowed up and disappeared with no notification. > > Well, again, if you say so. I've been using the internet and many different kinds of discussion forum mail services for decades and this is the first time I have heard of an organization being forced to change the mailing address in this way. > >>> or is it just propaganda underscoring the corporate nature of the Consortium? >> >> The name of the particular subdomain that the mail service is running on predated any of this most recent set of required reconfigurations. 
The name of that subdomain was deliberately suppressed earlier, to keep the list delivery addresses consistent with earlier practice. The MX record has been everybody's friend for this kind of situation for decades. Although I have heard many things about OAuth, I have my sincere doubts that it was designed to essentially eliminate the MX record, or that it accidentally led to such a result. Also, in many settings, multiple MX records allow several hosts to be responsible for mail, either to share the load or to serve as a backup. I am *very* sure that OAuth wouldn't in any way interfere with this feature, because it's too important especially for the big providers (which includes Google). > That is no longer possible, for the technical reasons cited above. > If you say so. Again, I've used lots of mail software and haven't heard that Gmail has some sort of bug or feature or unintended consequence that forces anybody to change their addresses. Sounds like a problem on the Google side rather than the Unicode side, but of course I couldn't possibly be right about that. I think it's not unheard of that big providers such as Gmail exhibit such problems. The reasons are twofold: 1) the bigger, the more they can get away with it, and 2) they have to take each and every measure possible to get a grip on spam, which may lead them to take extreme measures. >> Please take your conspiracy mongering elsewhere. > > There's no need to be caustic about it. The new e-mail address isn't @mail.unicode.org or @lists.unicode.org; it's @corp.unicode.org, which I've never seen before even if it has been suppressed for decades. And it's not unreasonable for a person to read some sort of meaning into "corp" without thinking it a "conspiracy". > > Michael Regards, Martin. 
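[Editorial illustration, not part of Martin's message.] The MX mechanism mentioned above can be sketched in a few lines of Python: a sending server orders a domain's MX records by preference value, lowest first, and tries the hosts in that order, so the host that runs the list software need not appear in the list address. The records below are made up for the example (they are not the real unicode.org zone), the function name `delivery_order` is hypothetical, and no DNS lookup is performed.

```python
def delivery_order(mx_records):
    """Order (preference, host) pairs the way a sending MTA would try them.

    Lower preference values are tried first. Real MTAs randomise among
    hosts with equal preference; this sketch simply sorts by host name.
    """
    return [host for _, host in sorted(mx_records)]

# Hypothetical MX records for example.org.
records = [(20, "backup.example.org"), (10, "corp.example.org")]
print(delivery_order(records))
```

With such records in place, mail addressed to `list@example.org` is handed to `corp.example.org` first and to `backup.example.org` only if the first host is unreachable.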
From c933103 at gmail.com Thu May 20 03:11:36 2021 From: c933103 at gmail.com (Phake Nick) Date: Thu, 20 May 2021 16:11:36 +0800 Subject: Unicode.org mail system maintenance In-Reply-To: <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> Message-ID: On 2021-05-19 Wed 08:09, Ken Whistler via Unicode wrote: > Yes. It is very long and very technical, having to do with OAuth 2.0 and > its interaction with our mail service, and involves seeking a solution > that doesn't end up with Gmail users having mail from unicode.org not > being swallowed up and disappeared with no notification. > Well then, it seems the effort has failed, as now almost all mail from corp.unicode.org is entering my Gmail spam folder until I manually release it and label it as forum. From steffen at sdaoden.eu Thu May 20 09:34:36 2021 From: steffen at sdaoden.eu (Steffen Nurpmeso) Date: Thu, 20 May 2021 16:34:36 +0200 Subject: Unicode.org mail system maintenance In-Reply-To: <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> Message-ID: <20210520143436.E-hUr%steffen@sdaoden.eu> James Kass via Unicode wrote in <9cf38e38-5833-3acf-f069-f2ad3467d493 at code2001.com>: |On 2021-05-19 9:28 PM, Asmus Freytag via Unicode wrote: |> About the only thing that could have been done better is to announce such |> changes before implementing them. And in enough detail, so those \ |> of us with |> custom mail filters can update them. |> |> Other than that, life means change. 
|Unicode stability policies don't extend to subdomain names. | |OAuth 2.0 appears to be about enhancing security. Beefing up security |is probably wise, given these interesting times in which we're living. | |https://oauth.net/2/ Yes, OAuth is a login mechanism (and a terrible one, if you ask me), and has nothing to do with sending mails to a mailing-list, but (maybe, luckily not here) for the first "hop" (your login to your mail provider / your own SMTP server, for example gmail.com). Unicode switched to using the mailman list manager years ago, which uses passwords -- for entering the per-user configuration, which i did once, i think, to avoid the monthly password reminder mail. (For such tasks i use a dedicated browser "account" that is separate from the normal browse account, and which lives on an encrypted partition, that is mounted only temporarily. (Mounting is an action in the Unix world of operating systems.)) I did not say anything at first because the mailing-list seems to be unloved for a long time, with (at least the free copies of) the archives shattered. In short: it is not an interactively scripted web forum that can be driven in the browser. (And unfortunately my daily work has nothing to do with making my own software Unicode aware.) The explanation that seems to have been given by administrators in response to a long-term Unicode worker was ridiculous. That is for sure. --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt) From wjgo_10009 at btinternet.com Thu May 20 10:26:35 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Thu, 20 May 2021 16:26:35 +0100 (BST) Subject: Technology developed for popular uses sometimes helps specialized uses Message-ID: <5bf6d262.13b0.1798a62b6dc.Webtop.107@btinternet.com> I find it interesting how technology developed for popular uses sometimes helps specialized uses. 
When emoji were first encoded in Unicode, Doug Ewell suggested that implementing support for emoji that are encoded in plane 1 would have the effect of helping all scripts encoded in plane 1 become better supported by the software developed to support emoji. That happened. Colour fonts were introduced due to emoji, yet they can be used for other applications too, for which colour fonts might never otherwise have been developed. I have now learned, from the agenda item Digitization Solutions for Indigenous Languages for the Internationalization & Unicode® Conference IUC 45, due to be held in October 2021, of the Language Digitization Initiative. I was wondering about keyboards and how the necessary keycaps would be produced. I found the following article. https://www.pcgamer.com/how-to-make-your-keyboard-beautiful-with-custom-keycaps/ I am wondering if the facilities established for making custom keycaps for people who want to design and have their own keycaps for games will also become an important facility for getting keycaps produced for the scripts of languages that will become encoded into Unicode. Also, will the auxiliary encoding space model being discussed for the encoding of QID emoji become used by The Unicode Technical Committee not only for emoji but also as a fast route for encoding languages into Unicode, using a sequence behind the scenes of existing characters to represent each glyph used for the new language? That would, in my opinion, be good. 
William Overington Thursday 20 May 2021 From jameskass at code2001.com Thu May 20 17:49:10 2021 From: jameskass at code2001.com (James Kass) Date: Thu, 20 May 2021 22:49:10 +0000 Subject: Unicode.org mail system maintenance In-Reply-To: <20210520143436.E-hUr%steffen@sdaoden.eu> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> <20210520143436.E-hUr%steffen@sdaoden.eu> Message-ID: <37947ff5-6bfd-5b2e-7008-09c8ae4e5764@code2001.com> On 2021-05-20 2:34 PM, Steffen Nurpmeso via Unicode wrote: > The explanation that seems to have been given by > administrators in response to a long-term Unicode worker was > ridiculous. https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics 4.1.3. Countermeasures The complexity of implementing and managing pattern matching correctly obviously causes security issues. This document therefore advises to simplify the required logic and configuration by using exact redirect URI matching. This means the authorization server MUST compare the two URIs using simple string comparison as defined in [RFC3986], Section 6.2.1. I'm no expert on OAuth, but it appears that current recommendations require an exact match for domain names. Wildcard or partial domain name strings leave openings for attackers to exploit. Aside from the injection of "conspiracy", the response given seems accurate. 
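[Editorial illustration, not part of James Kass's message.] The draft text quoted above contrasts exact redirect URI matching with pattern matching. A minimal Python sketch of the difference follows. It illustrates only the quoted recommendation and takes no position on whether that recommendation bears on mailing list addresses, which later messages dispute. All URIs, the wildcard pattern, and both function names are hypothetical, and `fnmatchcase` stands in for whatever wildcard logic a server might use.

```python
from fnmatch import fnmatchcase

def exact_match(registered, presented):
    """Simple string comparison against the set of registered redirect URIs."""
    return presented in registered

def pattern_match(patterns, presented):
    """Shell-style wildcard matching, shown only to illustrate the risk."""
    return any(fnmatchcase(presented, p) for p in patterns)

# Hypothetical client registration and a hostile redirect URI.
registered = {"https://client.example.org/cb"}
patterns = ["https://*.example.org/cb"]
attack = "https://evil.example/x.example.org/cb"

print(exact_match(registered, attack))   # the hostile URI is rejected
print(pattern_match(patterns, attack))   # the wildcard lets it through
```

The wildcard accepts the hostile URI because `*` also matches `/`, so a path on another host can imitate the expected domain suffix. Exact comparison has no such corner cases.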
From junicode at jcbradfield.org Fri May 21 07:50:04 2021 From: junicode at jcbradfield.org (Julian Bradfield) Date: Fri, 21 May 2021 13:50:04 +0100 (BST) Subject: Unicode.org mail system maintenance References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> <20210520143436.E-hUr%steffen@sdaoden.eu> <37947ff5-6bfd-5b2e-7008-09c8ae4e5764@code2001.com> Message-ID: On 2021-05-20, James Kass via Unicode wrote: > https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics > > > 4.1.3. Countermeasures > > The complexity of implementing and managing pattern matching > correctly obviously causes security issues. This document therefore > advises to simplify the required logic and configuration by using > exact redirect URI matching. This means the authorization server > MUST compare the two URIs using simple string comparison as defined > in [RFC3986], Section 6.2.1. > > > I'm no expert on OAuth, but it appears that current recommendations > require an exact match for domain names. Wildcard or partial domain > name strings leave openings for attackers to exploit. But what on earth does this have to do with mailing lists? You don't use OAuth2 to connect to mail servers as an MTA, and Unicode shouldn't be using OAuth2 when we manage our subscriptions. What's completely unclear in the explanation is where OAuth2 enters into mailing list administration. 
From jameskass at code2001.com Fri May 21 15:27:20 2021 From: jameskass at code2001.com (James Kass) Date: Fri, 21 May 2021 20:27:20 +0000 Subject: Unicode.org mail system maintenance In-Reply-To: References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> <20210520143436.E-hUr%steffen@sdaoden.eu> <37947ff5-6bfd-5b2e-7008-09c8ae4e5764@code2001.com> Message-ID: <7b5b7d38-57b2-6d68-ba6c-bfb85cbae6e6@code2001.com> On 2021-05-21 12:50 PM, Julian Bradfield via Unicode wrote: > But what on earth does this have to do with mailing lists? > You don't use OAuth2 to connect to mail servers as an MTA, and Unicode > shouldn't be using OAuth2 when we manage our subscriptions. > > What's completely unclear in the explanation is where OAuth2 enters > into mailing list administration. https://developers.google.com/gmail/api/auth/web-server "Requests to the Gmail API must be authorized using OAuth 2.0 credentials." 
From jameskass at code2001.com Sun May 23 02:28:15 2021 From: jameskass at code2001.com (James Kass) Date: Sun, 23 May 2021 07:28:15 +0000 Subject: Unicode.org mail system maintenance In-Reply-To: <7b5b7d38-57b2-6d68-ba6c-bfb85cbae6e6@code2001.com> References: <6089D2AE.30209@unicode.org> <7F66223E-5464-40E0-BF5A-9B3E62907F68@evertype.com> <204473e9-ea9b-d635-0865-ad4cacf74b96@sonic.net> <398AF8A4-C60C-4481-A366-B1810EF5E567@evertype.com> <57d4de9b-4b8b-f46e-001b-6df7c63ed440@ix.netcom.com> <9cf38e38-5833-3acf-f069-f2ad3467d493@code2001.com> <20210520143436.E-hUr%steffen@sdaoden.eu> <37947ff5-6bfd-5b2e-7008-09c8ae4e5764@code2001.com> <7b5b7d38-57b2-6d68-ba6c-bfb85cbae6e6@code2001.com> Message-ID: On 2021-05-21 8:27 PM, James Kass via Unicode wrote: > > https://developers.google.com/gmail/api/auth/web-server > > "Requests to the Gmail API must be authorized using OAuth 2.0 > credentials." A kind list member advised me off-list that my response didn't really apply to the questions. I apologize for adding to any confusion in this thread. It's not clear why an MTA would need to OAuthenticate with Gmail. But I'm happy to speculate that the Unicode administrators will not be blueprinting their security measures on a public forum. From copypaste at kittens.ph Mon May 24 11:01:21 2021 From: copypaste at kittens.ph (Fredrick Brennan) Date: Mon, 24 May 2021 12:01:21 -0400 Subject: ICU data (only .ucm mappings for now) in Rust Message-ID: <3153942.AZGIkJxqGE@laptop> Hello! I am writing a font editor called MFEK in Rust which required me to explore the ICU data. I didn't want to use libicu's C binding just to get this data, so after some trial and error I figured out how to include the data in a compressed form in the library, and added an in-memory index for it. 
The crate is called icu-data: https://docs.rs/icu-data/0.1.0/icu_data/ crates.io page: https://crates.io/crates/icu-data For example, if you look at "glibc-IBM437-2.1.2", you get something back that starts like this: Encoding { metadata: {"mb_cur_max": "1", "mb_cur_min": "1", "subchar": "\\x1A", "uconv_class": "SBCS", "code_set_name": "IBM437"}, codepoints: [Codepoint { uni: '\u{0}', eq_type: Type0, bytestring: [0] }, Codepoint { uni: '\u{1}', eq_type: Type0, bytestring: [1] }, Codepoint { uni: '\u{2}', eq_type: Type0, bytestring: [2] }, Codepoint { uni: '\u{3}', eq_type: Type0, bytestring: [3] }, ... states: [] } I didn't have a need for anything but the UCM charset files at this time, but if anyone is interested in adding a module for the other ICU data this is a place to start. Best, Fred Brennan From wjgo_10009 at btinternet.com Thu May 27 14:16:12 2021 From: wjgo_10009 at btinternet.com (William_J_G Overington) Date: Thu, 27 May 2021 20:16:12 +0100 (BST) Subject: Using ZWNJ ZERO WIDTH NON-JOINER in practice Message-ID: <25513e9a.4da9.179af4173b7.Webtop.87@btinternet.com> Using ZWNJ ZERO WIDTH NON-JOINER in practice Today I had need to use ZWNJ U+200C in a practical application. https://www.unicode.org/charts/PDF/U2000.pdf I was using the EB Garamond font. https://forum.affinity.serif.com/index.php?/topic/142990-eb-garamond-a-font-with-lots-of-ligatures/ I had the word 'distinctive' so as to get a ligature for 'st' and a ligature for 'ct' in a design for a greetings card. In the event, the result was that the ligature for 'is' prevented the ligature for 'st' being displayed. I already had a ligature for 'is' elsewhere in the text. So, with some difficulty, I put a ZWNJ between the 'i' and the 's'. It worked. 
The whole story, with illustrations, is in the later part of page 7 of the following thread.

https://forum.affinity.serif.com/index.php?/topic/138654-artwork-for-greetings-cards/

How do readers use ZWNJ, please?

William Overington

Thursday 27 May 2021

From jukkakk at gmail.com Fri May 28 02:04:00 2021
From: jukkakk at gmail.com (Jukka K. Korpela)
Date: Fri, 28 May 2021 10:04:00 +0300
Subject: Using ZWNJ ZERO WIDTH NON-JOINER in practice
In-Reply-To: <25513e9a.4da9.179af4173b7.Webtop.87@btinternet.com>
References: <25513e9a.4da9.179af4173b7.Webtop.87@btinternet.com>
Message-ID:

William_J_G Overington wrote:

> I had the word 'distinctive' so as to get a ligature for 'st' and a
> ligature for 'ct' in a design for a greetings card.
>
> In the event, the result was that the ligature for 'is' prevented the
> ligature for 'st' being displayed.

This looks natural to me: when a program is applying ligatures, it processes the text sequentially, so it detects that there is a ligature for "is" in the font and uses it (when there is no ligature for the longer sequence "ist"). (I didn't quite understand in which software you used this, but I tested it in Word 365, with EB Garamond installed and with historical ligatures enabled in font settings in Word.)

And it is a typical intended use of ZWNJ to break a character pair so that a ligature is not used for it. This in turn causes "st" to be detected as a pair for which there is a ligature.

ZWNJ could also be used to prevent a ligature in a context where it is not desired for some reason and program settings in general have enabled ligatures for the text. In typesetting, you would probably just select a piece of text and turn ligatures off, but ZWNJ handles the issue at the character level and might therefore be better in some cases (e.g., it might be carried from a word processor to a typesetting program, or it might not).

Jukka K.
Korpela
https://jkorpela.fi

From wjgo_10009 at btinternet.com Fri May 28 05:30:37 2021
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 28 May 2021 11:30:37 +0100 (BST)
Subject: Using ZWNJ ZERO WIDTH NON-JOINER in practice
In-Reply-To: <5868a2b1.5651.179b2657bbb.Webtop.87@btinternet.com>
References: <25513e9a.4da9.179af4173b7.Webtop.87@btinternet.com> <5868a2b1.5651.179b2657bbb.Webtop.87@btinternet.com>
Message-ID: <24be96a7.57fb.179b286a07d.Webtop.87@btinternet.com>

Thank you for replying.

> I didn't quite understand in which software you used this, ...

The software is Affinity Designer, made by Serif (Europe) Limited, a company based in Nottingham, a city in England.

https://affinity.serif.com/en-gb/
https://affinity.serif.com/en-gb/designer/
https://forum.affinity.serif.com/
https://forum.affinity.serif.com/index.php?/forum/10-share-your-work/

Best regards,

William Overington

Friday 28 May 2021

From jameskass at code2001.com Fri May 28 17:39:28 2021
From: jameskass at code2001.com (James Kass)
Date: Fri, 28 May 2021 22:39:28 +0000
Subject: Using ZWNJ ZERO WIDTH NON-JOINER in practice
In-Reply-To:
References: <25513e9a.4da9.179af4173b7.Webtop.87@btinternet.com>
Message-ID: <8a519ffb-c235-f850-4ed3-4a0e9e6e28ae@code2001.com>

On 2021-05-28 7:04 AM, Jukka K. Korpela via Unicode wrote:
> And it is a typical intended use of ZWNJ to break a character pair so that
> a ligature is not used for it.

Exac‍tly.

di‌stinctive - ZWNJ between i-s pair

But it should not be typical for the ZWNJ insertion to break spell checking. Yet Mozilla Thunderbird flags English words with either ZWNJ or ZWJ insertions as errors. (The word "Exactly" above has a ZWJ between the c-t pair.)
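
The two operations discussed in this thread, inserting ZWNJ inside a letter pair to suppress its ligature and ignoring the joiners when a word is looked up for spell checking, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the helper names `break_pair` and `strip_joiners` are made up here, and nothing below reflects how Thunderbird, Word, Affinity Designer, or any other application actually works.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def break_pair(word: str, pair: str) -> str:
    """Insert ZWNJ inside the first occurrence of a two-letter pair.

    Hypothetical helper: it only edits the string; whether a ligature
    is then suppressed depends on the font and the rendering software.
    """
    if len(pair) != 2:
        raise ValueError("pair must be exactly two characters")
    i = word.find(pair)
    if i < 0:
        return word  # pair not present, nothing to break
    return word[: i + 1] + ZWNJ + word[i + 1 :]


def strip_joiners(text: str) -> str:
    """Remove ZWNJ and ZWJ, e.g. before a dictionary lookup.

    Hypothetical helper, assumed suitable for English text only.
    """
    return text.replace(ZWNJ, "").replace(ZWJ, "")


word = break_pair("distinctive", "is")
# Show the code points of the first three characters of the result.
print([f"U+{ord(c):04X}" for c in word[:3]])
# The word a spell checker would see after stripping the joiners.
print(strip_joiners(word) == "distinctive")
```

Stripping the joiners is harmless for English, but it would be wrong to apply blindly to scripts where ZWNJ or ZWJ is orthographically significant, such as Persian.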