From cldr-users at unicode.org Sat Aug 11 19:07:15 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Sun, 12 Aug 2018 02:07:15 +0200 (CEST)
Subject: Exemplar punctuation in fr-FR
Message-ID: <1225068146.7535.1534032435438.JavaMail.www@wwinf1m18>

Mark,

Thank you for looking into the French exemplar character sets and helping to complete them with TC votes. I greatly appreciated the support of our work, but I didn’t look at it as a favor, just as a due response to our efforts. Hence I’m deeply disappointed that filing a ticket [1], which I was prompted to file on the ST French forum, triggered the withdrawal of your votes for a set of punctuation.

I did not believe that TC would revert their decision, only wait for additional rationales prior to including the ASCII apostrophe and the underscore, and eventually the horizontal bar U+2015. I didn’t think of these, especially the former two, as much of an issue, given that the actual fr-FR Accepted Data already includes the ASCII quote and the at and number signs, and given that the semantics of U+2015 is used in French, and use of #2015 is permitted there as per the French translation of the Code Charts [2].

Now I’ve changed my mind and am only asking you to be so kind as to restore your vote for the set you devised, i.e. [!-#\&(-*,-/\:;?@\[\]????-??????????] where the last dash is U+2014.

For those unfamiliar with French data on CLDR, the Accepted value here is (as a full enumeration):
[\- ? ? ? , ; \: ! ? . ? ? " ? ? ? ? ( ) \[ \] ? @ * / \& # ? ?].

The proposed value that now has the most support is
[!-#\&-*,-/\:;?@\[\]_????-??????????].

Another proposed value, which actually has less support, is
[!-#\&(-*,-/\:;?@\[\]????????????]

We see that the TC position is a good compromise between the two proposed sets.

TC’s decision of keeping the ASCII apostrophe off that list will be extremely useful to fight those who meanly tolerate and promote the use of U+0027 in publishing, e.g. on Wikipédia, especially in the spelling of entries.
Regardless of a persisting keyboarding issue, online publishers can easily set up routines (bots) ensuring correct typography and more appropriate redirections. Therefore I now welcome your initiative of locking APOSTROPHE out of French.

I’m now able to tell communities that I honestly strove to make it legal for backwards compatibility, and that CLDR Chair and Unicode President Mark Davis [if quoting you by name is permitted in this context] assessed that idea as bad, and supports those in France who struggle against poor typography and lazy fallbacks. I know that many people on Wikipédia will be glad.

Thanks a lot. And please be so kind as to follow through. The data isn’t frozen yet.

Best regards,

Marcel

[1] http://unicode.org/cldr/trac/ticket/11332
[2] http://hapax.qc.ca/Tableaux-10.0/U2000.pdf

Please see also this proposal to make the data more comprehensive:
http://unicode.org/cldr/trac/ticket/11339

From cldr-users at unicode.org Mon Aug 13 09:15:39 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Mon, 13 Aug 2018 16:15:39 +0200 (CEST)
Subject: Exemplar punctuation in fr-FR
In-Reply-To: <1225068146.7535.1534032435438.JavaMail.www@wwinf1m18>
References: <1225068146.7535.1534032435438.JavaMail.www@wwinf1m18>
Message-ID: <835510801.5673.1534169739678.JavaMail.www@wwinf1f34>

Since I became aware that the issue that triggered this thread is mainly due to a flaw in the Information Hub for Linguists, I’ve filed a ticket suggesting a correction to the documentation:
https://unicode.org/cldr/trac/ticket/11343

Marcel

From cldr-users at unicode.org Wed Aug 15 23:41:51 2018
From: cldr-users at unicode.org (Martin Hosken via CLDR-Users)
Date: Thu, 16 Aug 2018 11:41:51 +0700
Subject: english language names
Message-ID: <20180816114151.652a4003@sil-mh8>

Dear All,

I notice that the en locale is missing a lot of language names.
Is this merely a lack of time/will/desire to copy the names from the IETF list, or is there some stronger reason for not having at least an English entry for every language in the world?

Yours,
Martin

From cldr-users at unicode.org Thu Aug 16 06:37:43 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Thu, 16 Aug 2018 13:37:43 +0200
Subject: english language names
In-Reply-To: <20180816114151.652a4003@sil-mh8>
References: <20180816114151.652a4003@sil-mh8>

It is more that we don't want to duplicate the IANA language subtag registry:
https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

We have added languages where we have locale XML files for them, and a few others (not completely consistent as to criteria: there is a ticket to clean that up).

{phone}
From cldr-users at unicode.org Thu Aug 16 12:59:54 2018
From: cldr-users at unicode.org (Hugh Paterson via CLDR-Users)
Date: Thu, 16 Aug 2018 10:59:54 -0700
Subject: english language names
References: <20180816114151.652a4003@sil-mh8>

If CLDR were to include the language names, and the language names were then to change at IANA or in ISO 639-3, my understanding is that it would take a significant amount of effort to change the CLDR entries, because of the necessary voting system and the lack of a formal commitment to congruence with standards like IANA or ISO 639-3. Is this a correct understanding?

- Hugh

-- 
Hugh Paterson III
Innovation Analyst
Innovation Development & Experimentation, SIL International

From cldr-users at unicode.org Thu Aug 16 13:56:52 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Thu, 16 Aug 2018 20:56:52 +0200
Subject: english language names
References: <20180816114151.652a4003@sil-mh8>

The language subtag registry would only be used for English names, so voting doesn't really come into play except for regional English variants.

But in any event, the main reason, as I said, was that there wouldn't be much value in CLDR simply duplicating entries in the language subtag registry that would not otherwise be used for translations. It is not a goal to translate all of the 7K+ language names in the language subtag registry.

Mark

From cldr-users at unicode.org Fri Aug 17 07:46:10 2018
From: cldr-users at unicode.org (Denis Jacquerye via CLDR-Users)
Date: Fri, 17 Aug 2018 13:46:10 +0100
Subject: Still unable to open issues on unicode.org/cldr/trac

Hi,

I’m still unable to post or comment on issues on unicode.org/cldr/trac. Is there a way to resolve this?

Thank you
From cldr-users at unicode.org Fri Aug 17 08:03:24 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Fri, 17 Aug 2018 15:03:24 +0200
Subject: Still unable to open issues on unicode.org/cldr/trac

What is the behavior when you try to do that?

Mark

From cldr-users at unicode.org Sun Aug 19 11:05:50 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Sun, 19 Aug 2018 18:05:50 +0200 (CEST)
Subject: Still unable to open issues on unicode.org/cldr/trac
Message-ID: <635067303.5373.1534694750480.JavaMail.www@wwinf1p26>

Denis,

Did you file a complaint via the Contact form, as suggested earlier? No doubt the problem is then already solved. If so, please disregard the rest below.

What do you see when your post is rejected? That information would make it possible to identify the buggy component (Akismet, LinkSleeve, or whatever). Hopefully that will then be disabled, until a more appropriate system comes into existence. (Anyway, the current scheme does not prevent some ads from being posted as bug reports.)

Best,

Marcel

From cldr-users at unicode.org Mon Aug 20 09:56:30 2018
From: cldr-users at unicode.org (Denis Jacquerye via CLDR-Users)
Date: Mon, 20 Aug 2018 15:56:30 +0100
Subject: Still unable to open issues on unicode.org/cldr/trac

Whatever I try, I end up with the following message:

Submission rejected as potential spam (BotScout says this is spam (Y|MULTI|IP|1|MAIL|1|NAME|1))

From cldr-users at unicode.org Mon Aug 20 11:07:00 2018
From: cldr-users at unicode.org (Steven R Loomis via CLDR-Users)
Date: Mon, 20 Aug 2018 16:07:00 +0000
Subject: Still unable to open issues on unicode.org/cldr/trac

An HTML attachment was scrubbed...
From cldr-users at unicode.org Mon Aug 20 13:47:12 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Mon, 20 Aug 2018 20:47:12 +0200
Subject: Charts (rough cut)

We are in the process of resolving the data, and I generated an early version of the charts, in case people are interested. Of course, there are more cleanup/data fixes to be made, but this is a rough cut.

https://unicode.org/repos/cldr-aux/charts/34/index.html

For the data that changed from v33 to the present, see https://unicode.org/repos/cldr-aux/charts/34/delta/index.html.

For example, here's the Swedish delta chart: https://unicode.org/repos/cldr-aux/charts/34/delta/sv.html

Mark

From cldr-users at unicode.org Mon Aug 20 16:44:23 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Mon, 20 Aug 2018 23:44:23 +0200 (CEST)
Subject: Charts (rough cut)
Message-ID: <732616753.15061.1534801464109.JavaMail.www@wwinf1p13>

On 20/08/18 20:49, Mark Davis ☕️ via CLDR-Users wrote:
>
> We are in the process of resolving the data, and I generated an early version of the charts, in case people are interested.

Thanks for the data. I’ve checked some sensitive parts.

> Of course, there are more cleanup/data fixes to be made, but this is a rough cut.

I see that the set of fr punctuation is not fixed. Do you plan to extend it? E.g.:

• The single angle quotation marks are still excluded, although they are heavily used in fr-CH AFAIK.
• I’m not advocating the current use of ASCII quotes, but the double one is present while the single one is missing.
• The horizontal bar is specified by Unicode and admitted by experts, as is HYPHEN, which is included though scarcely used AFAIK.
You may wish to refer to an older thread and to ticket #11332 and the Xrefs there:
https://unicode.org/cldr/trac/ticket/11332

> https://unicode.org/repos/cldr-aux/charts/34/index.html
> For the data that changed from v33 to the present, see https://unicode.org/repos/cldr-aux/charts/34/delta/index.html.
> For example, here's the Swedish delta chart: https://unicode.org/repos/cldr-aux/charts/34/delta/sv.html

Thanks for sharing the data in an early production stage.

Regards,

Marcel

P.S. The group separator for fr-CA is still U+00A0 instead of the preferred U+202F. I guess this is an inheritance bug. It needs a fix.

From cldr-users at unicode.org Tue Aug 21 00:38:46 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Tue, 21 Aug 2018 07:38:46 +0200 (CEST)
Subject: Charts (rough cut)
In-Reply-To: <732616753.15061.1534801464109.JavaMail.www@wwinf1p13>
References: <732616753.15061.1534801464109.JavaMail.www@wwinf1p13>
Message-ID: <895699240.325.1534829926714.JavaMail.www@wwinf1p26>

Mark,

What bothers me severely is that we don’t have the non-breaking hyphen, nor the single angle quotes. If you could just add these three, that would be fine, and also superscript small o to replace the Latin-1 fallback DEGREE SIGN that is systematically used for the numero abbreviation, though not Unicode conformant. It is not preferred, just available right on the keyboard. There I propose to have "n??" and "N??" sequences for use instead, with real superscript o.

Regards,

Marcel

From cldr-users at unicode.org Tue Aug 21 18:30:08 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Wed, 22 Aug 2018 01:30:08 +0200 (CEST)
Subject: Charts (rough cut)
Message-ID: <1762786961.15557.1534894208630.JavaMail.www@wwinf1p21>

Philippe,

I’m sorry not to have joined in your proposal of pushing [!-#\&(-*,-/\:;?@\[\]????????????] as a punctuation set. Clearly that made it difficult for TC to do anything for French here, all the more as their attempt to push a compromise through was not welcomed. In reality it was (better than nothing); I just didn’t acknowledge it while responding.

Now I’m suggesting to discuss that here so TC can see why there is a latent “dispute”
(OK, I see there are many “disputed items”, given that a huge part of the corrections I proposed weren’t accepted by fellow vetters).

There was a more complete punctuation set that was gaining traction: [!-#\&-*,-/\:;?@\[\]_????-??????????].
In comparison with that set, you are excluding the following characters:

• U+0027 APOSTROPHE. This is not preferred, but neither is U+0022 QUOTATION MARK, which you keep including. And you yourself are using the ASCII apostrophe when typing, due to its exclusive presence on widespread keyboard layouts.

• U+005F LOW LINE. This is not more typically French than the NUMBER SIGN and the AT SIGN, both of which you include. And you do have an underscore in your own e-mail address.

• U+2011 NON-BREAKING HYPHEN. See below.

• U+2012 FIGURE DASH

• U+2015 HORIZONTAL BAR

[I’ve lost comments and references on U+2012 and U+2015 by inadvertently hitting the shortcut closing the browser. No doubt I was too detailed. Now I’ll post further information on request only.]

• U+2018 LEFT SINGLE QUOTATION MARK

• U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK

• U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK

Can anybody tell us why U+2011 NON-BREAKING HYPHEN is not default in every Latin-script using locale?

Obviously contributors and vetters are lacking guidance, because CLDR documentation is still a stub compared to what it could and should be.

I don’t currently have time to rewrite more parts of it, not even knowing whether TC will use suggested updates or not.

In my belief, the engineering effort ought to be done basically by those who are in charge of maintaining the data. I’m ready to contribute if there is a demand that I may cater for.
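
The difference between the two proposed sets can be checked mechanically for their ASCII members. The following is only a minimal sketch under stated assumptions, not ICU's UnicodeSet parser: `expand_ascii_set` is a hypothetical helper that understands nothing beyond backslash escapes and simple `a-b` ranges, and the two patterns are the ones quoted in this message, cut down to their ASCII part for illustration.

```python
def expand_ascii_set(pattern):
    """Expand a simplified, ASCII-only bracket pattern into a set of characters.

    Hypothetical helper: handles only backslash escapes and 'a-b' ranges.
    """
    if len(pattern) < 2 or pattern[0] != "[" or pattern[-1] != "]":
        raise ValueError("pattern must be enclosed in [ ]")
    body = pattern[1:-1]

    # First pass: turn the body into (character, was_escaped) tokens.
    tokens = []
    i = 0
    while i < len(body):
        if body[i] == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            tokens.append((body[i + 1], True))
            i += 2
        else:
            tokens.append((body[i], False))
            i += 1

    # Second pass: an unescaped '-' between two tokens denotes a range.
    chars = set()
    j = 0
    while j < len(tokens):
        start = tokens[j][0]
        if j + 2 < len(tokens) and tokens[j + 1] == ("-", False):
            end = tokens[j + 2][0]
            if ord(start) > ord(end):
                raise ValueError("range end precedes range start")
            chars.update(chr(c) for c in range(ord(start), ord(end) + 1))
            j += 3
        else:
            chars.add(start)
            j += 1
    return chars


# ASCII portions only (assumption: non-ASCII members are left out here).
SMALLER = r"[!-#\&(-*,-/\:;?@\[\]]"
LARGER = r"[!-#\&-*,-/\:;?@\[\]_]"

added = sorted(expand_ascii_set(LARGER) - expand_ascii_set(SMALLER))
print([f"U+{ord(c):04X}" for c in added])  # ['U+0027', 'U+005F']
```

Under these assumptions, the only ASCII characters that the larger set adds are U+0027 and U+005F, which matches the first two items in the list above. The expansion also makes visible that the `!-#` range brings in U+0022 in both sets.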
Regards,

Marcel

From cldr-users at unicode.org Wed Aug 22 01:36:59 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Wed, 22 Aug 2018 08:36:59 +0200
Subject: Charts (rough cut)
In-Reply-To: <1762786961.15557.1534894208630.JavaMail.www@wwinf1p21>
References: <1762786961.15557.1534894208630.JavaMail.www@wwinf1p21>

Thanks for your message. Briefly,

I think it is a good idea to add the non-breaking variants of included characters. That is something we can do automatically (there is a processor that can modify display of data in the survey tool, and modify data that is typed into the survey tool). Can you file a ticket for that?

We have to strike a balance here, because often the typographically more desired form is not present in fonts. We typically delay, for example, using new currency symbols until widespread fonts have caught up. We stay away from extensive use of the super or subscript Latin characters. Those are not uniformly supported in fonts, and tend to have a ransom-note appearance. So for English we don't use 13ᵗʰ, for example, even though that form would be in theory preferred to 13th. However, the Latin-1 characters are well supported:

• U+00AA FEMININE ORDINAL INDICATOR
• U+00BA MASCULINE ORDINAL INDICATOR

We don't include ' and " in the regular punctuation, because they are rarely the preferred form for display. (There is a mechanism called parseLenients that we could consider extending for cases where it could be useful to indicate that various input forms might be equivalent...).

As for more documentation, we'd welcome that. Some thoughts (not complete):

- We need to think about the best forum for it. The LDML spec is heavy-weight and slow to modify, while the http://cldr.unicode.org/translation pages have a very fast turn-around.
  The connections between the Survey Tool info panel for a path (or set of paths) and a particular http://cldr.unicode.org/translation page do require rebuilding and deploying the tool, which is not as light-weight, but fairly straightforward.
- Best is to pick out obvious fixes or enhancements to the documentation with suggested rewording or additions for clearly identified places.
- Suggestions for policy changes or enhancements should be kept separate, so that they can be reviewed and discussed first before specific text is considered.

Mark

From cldr-users at unicode.org Wed Aug 22 10:33:02 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Wed, 22 Aug 2018 17:33:02 +0200 (CEST)
Subject: Charts (rough cut)
References: <1762786961.15557.1534894208630.JavaMail.www@wwinf1p21>
Message-ID: <1834819947.7567.1534951982889.JavaMail.www@wwinf1n27>

On 22/08/18 08:37, Mark Davis ☕️ wrote:
>
> Thanks for your message.

You are welcome; thank you for your comprehensive response. I’ll try to dive into some problems.

> Can you file a ticket for that?

Currently my Linux keyboard layout is wrecked after a system reinstall. I’m debugging the backup, but I’ll file this and all other required tickets as soon as possible. Sorry for the delay.
Thanks,

Marcel

From cldr-users at unicode.org Wed Aug 22 18:28:10 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Thu, 23 Aug 2018 01:28:10 +0200 (CEST)
Subject: Charts (rough cut)
Message-ID: <149995647.14166.1534980490471.JavaMail.www@wwinf1m18>

On 22/08/18 08:39, Mark Davis ☕️ via CLDR-Users wrote:
>
> I think it is a good idea to add the non-breaking variants of included
> characters. That is something we can do automatically (there is a processor
> that can modify display of data in the survey tool, and modify data that is
> typed into the survey tool). Can you file a ticket for that?

Yes, gladly; I’ve also quoted your text in
http://unicode.org/cldr/trac/ticket/11376

> We have to strike a balance here, because often the typographically more
> desired form is not present in fonts. We typically delay, for example,
> using new currency symbols until widespread fonts have caught up. We stay
> away from extensive use of the super or subscript Latin characters. Those
> are not uniformly supported in fonts, and tend to have a ransom-note
> appearance. So for English we don't use 13ᵗʰ, for example, even though that
> form would be in theory preferred to 13th. However, the Latin-1 characters
> are well supported:
> • U+00AA FEMININE ORDINAL INDICATOR
> • U+00BA MASCULINE ORDINAL INDICATOR

Yes they are, thanks to well-established good policies. I’ve discussed superscripts in the next ticket:
http://unicode.org/cldr/trac/ticket/11377

> We don't include ' and " in the regular punctuation, because they are
> rarely the preferred form for display. (There is a mechanism
> called parseLenients that we could consider extended for cases where it
> could be useful to indicate that various input forms might be
> equivalent...).
The point about the ASCII single and double quotes, among others, made me think and file ticket #11343 suggesting to remove them in English, too (should be another ticket, will post and Xref):
http://unicode.org/cldr/trac/ticket/11343

> As for more documentation, we'd welcome that. Some thoughts (not complete):
>
> - We need to think about the best forum for it. The LDML spec is
> heavy-weight and slow to modify, while the
> http://cldr.unicode.org/translation pages have a very fast turn-around.
> The connections between the Survey Tool info panel for a path (or set of
> paths) and a particular http://cldr.unicode.org/translation page do
> require rebuilding and deploying the tool, which is not as light-weight,
> but fairly straightforward.
> - Best is to pick out obvious fixes or enhancements to the documentation
> with suggested rewording or additions for clearly identified places.

The above-linked ticket #11343 is one example I had in mind.

> - Suggestions for policy changes or enhancements should be kept
> separate, so that they can be reviewed and discussed first before specific
> text is considered.

The last posted ticket, #11377, falls in this category. Hopefully CLDR-TC and UTC may consider reviewing those policies when seeing what’s at stake.

Thanks for the follow-up,

Marcel

From cldr-users at unicode.org Thu Aug 23 17:28:51 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Fri, 24 Aug 2018 00:28:51 +0200 (CEST)
Subject: ASCII quotes removal (was Re: Charts (rough cut))
Message-ID: <171565728.14261.1535063331117.JavaMail.www@wwinf1m18>

I fully agree that the ASCII quotes should not be included in any locale punctuation, and have filed ticket #11378 about that topic.
http://unicode.org/cldr/trac/ticket/11378

I guess that their widespread presence across locales could possibly be due to the English example provided in the information hub for linguists (please see the links to the By-Type charts at the end of the Description part of the cited ticket, and the Xref at the bottom of the body).

As a remedy I’ve suggested drawing the attention of all vetters, at least in Latin script, to this topic, asking people to focus on reviewing punctuation, keeping in mind that, following a new guideline yet to be set up, ASCII punctuation is referenced by default on a fallback basis and is not locale-specific. The exemplar punctuation set should be streamlined for publishing, as it is needed for user interfaces.

All vetters being supposed to monitor this list, I think it could be a good idea to get familiar with the scheme, which isn’t new for some locales, but still would be so for many others.

> > We don't include ' and " in the regular punctuation, because they are rarely the preferred form for display.
> The point of the ASCII single and double quotes among others made me think and file ticket #11343
> suggesting to remove them […] (should be another ticket, will post and Xref)

That is the abovementioned http://unicode.org/cldr/trac/ticket/11378

From cldr-users at unicode.org Thu Aug 30 01:47:00 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Thu, 30 Aug 2018 08:47:00 +0200
Subject: CLDR data freeze

The CLDR data for v34 is now frozen, so the remaining data bugs will be pushed to a future release. For CLDR developers, this means that no further data bug fixes should be made without agreement from the TC or in accordance with the BRS checklist.

Note: this *doesn't* mean no changes to CLDR data before the release:

1. There is a lot of processing of the data files remaining before alpha (currently scheduled for Sept 10)
2. The data needs to be integrated into ICU and tested there.

So there will be changes resulting from that processing, and there will (very likely) be fixes for problems found in that processing and in the integration into ICU.