From emmo at us.ibm.com Wed Jun 11 17:44:35 2014
From: emmo at us.ibm.com (John Emmons)
Date: Wed, 11 Jun 2014 17:44:35 -0500
Subject: Supplemental data - characters.xml
Message-ID:

The CLDR TC is considering retiring the "characters.xml" file from our distribution. The file is intended to provide a set of fallback characters that can be used as reasonable alternatives for a given character if the character does not exist in a particular font. See http://unicode.org/cldr/trac/ticket/7123.

We would like to get feedback from any interested parties who might be using this data, and who would want to make a case for keeping this information in the CLDR. If we don't get any response at all, or very little response, then we will likely remove this data in the 26 release.

Regards,
John C. Emmons
Globalization Architect & Unicode CLDR TC Chairman
IBM Software Group
Internet: emmo at us.ibm.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From markus.icu at gmail.com Thu Jun 12 02:55:06 2014
From: markus.icu at gmail.com (Markus Scherer)
Date: Thu, 12 Jun 2014 00:55:06 -0700
Subject: Supplemental data - characters.xml
In-Reply-To:
References:
Message-ID:

On Wed, Jun 11, 2014 at 3:44 PM, John Emmons wrote:

> The CLDR TC is considering retiring the "characters.xml" file from our
> distribution. The file is intended to provide a set of fallback characters
> that can be used as reasonable alternatives for a given character if the
> character does not exist in a particular font.

The data could be at least as useful for conversion from Unicode to other charsets, except we don't do that so much any more, relatively speaking.

> See http://unicode.org/cldr/trac/ticket/7123.

I added a link to the data file there.

markus

From mark at macchiato.com Thu Jun 12 04:12:10 2014
From: mark at macchiato.com (Mark Davis ☕)
Date: Thu, 12 Jun 2014 11:12:10 +0200
Subject: Supplemental data - characters.xml
In-Reply-To:
References:

Message-ID:

I think the main issue is that we have not really maintained and extended that file, and without people committed to working on it, it just becomes more and more stale. We don't know of a big user of the data, either.

So I think the appropriate action to take, unless some people are willing to step up to the plate, is to deprecate that file in some way. Some options are:

- remove it
- replace the contents of the file by a pointer to the last CLDR version where it occurs
- keep it, but document in the file and outside that it is stale data

In any case, we would need to document the status, and also remove the chart:
http://www.unicode.org/cldr/charts/25/supplemental/character_fallback_substitutions.html

Mark

*Il meglio è l’inimico del bene*

On Thu, Jun 12, 2014 at 9:55 AM, Markus Scherer wrote:

> On Wed, Jun 11, 2014 at 3:44 PM, John Emmons wrote:
>
>> The CLDR TC is considering retiring the "characters.xml" file from our
>> distribution. The file is intended to provide a set of fallback characters
>> that can be used as reasonable alternatives for a given character if the
>> character does not exist in a particular font.
>>
> The data could be at least as useful for conversion from Unicode to other
> charsets, except we don't do that so much any more, relatively speaking.
>
>> See http://unicode.org/cldr/trac/ticket/7123.
>>
> I added a link to the data file there.
>
> markus
>
> _______________________________________________
> CLDR-Users mailing list
> CLDR-Users at unicode.org
> http://unicode.org/mailman/listinfo/cldr-users

From verdy_p at wanadoo.fr Thu Jun 12 05:59:23 2014
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Thu, 12 Jun 2014 12:59:23 +0200
Subject: Supplemental data - characters.xml
In-Reply-To:
References:

Message-ID:

2014-06-12 11:12 GMT+02:00 Mark Davis ☕:

> replace the contents of the file by a pointer to the last CLDR version where it occurs

That sounds like a good solution. Keep it in the archived versions of CLDR, and just place a #comment line in the new version pointing to the last version.

From verdy_p at wanadoo.fr Thu Jun 12 06:05:55 2014
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Thu, 12 Jun 2014 13:05:55 +0200
Subject: Supplemental data - characters.xml
In-Reply-To:
References:

Message-ID:

Another solution: keep only the fallbacks from the Auxiliary Character Set to the Basic Character Set (as defined in the core data for the locale). This should require little maintenance per locale.

Fallbacks for other characters could come from their core properties (e.g. whitespace characters that are not all mapped in fonts can be inferred by renderers from basic font metrics, without even using a font fallback that may have the wrong metrics).

2014-06-12 12:59 GMT+02:00 Philippe Verdy:

> 2014-06-12 11:12 GMT+02:00 Mark Davis ☕:
> > replace the contents of the file by a pointer to the last CLDR version
> > where it occurs
>
> That sounds like a good solution. Keep it in the archived versions of CLDR,
> and just place a #comment line in the new version pointing to the last version.

From verdy_p at wanadoo.fr Fri Jun 13 14:57:09 2014
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 13 Jun 2014 21:57:09 +0200
Subject: Missing locales (lrc, hrx)
Message-ID:

New localized Wikimedia projects have been approved for creation after their test in Wikimedia Incubator:

* (azb) South Azerbaijani Wikipedia
* (bqi) Bakhtiari Wikipedia
* (hrx) Rio Grande German Wikipedia
* (lrc) Northern Luri Wikipedia
* (sli) Silesian German Wikipedia
* (tcy) Tulu Wikipedia
* (tly) Talysh Wikipedia
* (pnb) Western Punjabi Wikiquote (already existing as Wikipedia)

Among them, the Northern Luri Wikipedia shows significant advances:
* https://incubator.wikimedia.org/wiki/Wp/lrc/???????
Its localisation is in progress:
* https://tools.wmflabs.org/robin/?tool=codelookup&code=lrc
But the "lrc" locale code is still not in CLDR (even at the Comprehensive level).

The same applies to the "hrx" locale code for Rio Grande German.

Could I ask for these "lrc" and "hrx" locale codes to be added to the list of supported languages (at least in beta), to allow input in the survey and translation of their language names?
-------------- next part -------------- An HTML attachment was scrubbed... URL: From shervinafshar at gmail.com Fri Jun 13 15:10:55 2014 From: shervinafshar at gmail.com (Shervin Afshar) Date: Fri, 13 Jun 2014 13:10:55 -0700 Subject: Missing locales (lrc, hrx) In-Reply-To: References: Message-ID: To add a new locale, seed data is needed to be provided first: http://cldr.unicode.org/index/cldr-spec/minimaldata If you have the linguistic data, I can help with submitting the seed data and adding the locale. Shervin ? shervinafshar.name On Fri, Jun 13, 2014 at 12:57 PM, Philippe Verdy wrote: > New localized Wikimedia projects have been approved for creation after their > test in Wikimedia Incubator: > > * (azb) South Azerbaijani Wikipedia > * (bqi) Bakhtiari Wikipedia > * (hrx) Rio Grande German Wikipedia > * (lrc) Northern Luri Wikipedia > * (sli) Silesian German Wikipedia > * (tcy) Tulu Wikipedia > * (tly) Talysh Wikipedia > * (pnb) Western Punjabi Wikiquote (already existing as Wikipedia) > > Within them the Northern Luri Wikipedia shows significant advances: > * https://incubator.wikimedia.org/wiki/Wp/lrc/??????? > Its localisation is in progress: > * https://tools.wmflabs.org/robin/?tool=codelookup&code=lrc > But the "lrc" locale code is still not in CLDR (even in the Comprehensive > level) > > Same remark about the "hrx" locale code for Rio Grande German. > > Could I ask these "lrc" and "hrx" locale codes being added to the list of > supported languages (at least in beta) for allowing input in survey, and > translation of their language name ? > > > _______________________________________________ > CLDR-Users mailing list > CLDR-Users at unicode.org > http://unicode.org/mailman/listinfo/cldr-users > From srl at icu-project.org Fri Jun 13 15:25:01 2014 From: srl at icu-project.org (Steven R. 
Loomis) Date: Fri, 13 Jun 2014 13:25:01 -0700 Subject: Missing locales (lrc, hrx) In-Reply-To: References: Message-ID: <539B5E1D.5070700@icu-project.org> Good, also probably should have a separate ticket for the English names + locale codes. Steven On 06/13/2014 01:10 PM, Shervin Afshar wrote: > To add a new locale, seed data is needed to be provided first: > > http://cldr.unicode.org/index/cldr-spec/minimaldata > > If you have the linguistic data, I can help with submitting the seed > data and adding the locale. > > Shervin > ? shervinafshar.name > > > On Fri, Jun 13, 2014 at 12:57 PM, Philippe Verdy wrote: >> New localized Wikimedia projects have been approved for creation after their >> test in Wikimedia Incubator: >> >> * (azb) South Azerbaijani Wikipedia >> * (bqi) Bakhtiari Wikipedia >> * (hrx) Rio Grande German Wikipedia >> * (lrc) Northern Luri Wikipedia >> * (sli) Silesian German Wikipedia >> * (tcy) Tulu Wikipedia >> * (tly) Talysh Wikipedia >> * (pnb) Western Punjabi Wikiquote (already existing as Wikipedia) >> >> Within them the Northern Luri Wikipedia shows significant advances: >> * https://incubator.wikimedia.org/wiki/Wp/lrc/??????? >> Its localisation is in progress: >> * https://tools.wmflabs.org/robin/?tool=codelookup&code=lrc >> But the "lrc" locale code is still not in CLDR (even in the Comprehensive >> level) >> >> Same remark about the "hrx" locale code for Rio Grande German. >> >> Could I ask these "lrc" and "hrx" locale codes being added to the list of >> supported languages (at least in beta) for allowing input in survey, and >> translation of their language name ? >> >> >> _______________________________________________ >> CLDR-Users mailing list >> CLDR-Users at unicode.org >> http://unicode.org/mailman/listinfo/cldr-users >> > _______________________________________________ > CLDR-Users mailing list > CLDR-Users at unicode.org > http://unicode.org/mailman/listinfo/cldr-users -- IBMer but all opinions are mine. 
https://www.ohloh.net/accounts/srl295 // fingerprint @ https://ssl.icu-project.org/trac/wiki/Srl From jkorpela at cs.tut.fi Fri Jun 13 15:42:30 2014 From: jkorpela at cs.tut.fi (Jukka K. Korpela) Date: Fri, 13 Jun 2014 23:42:30 +0300 Subject: Missing locales (lrc, hrx) In-Reply-To: References: Message-ID: <539B6236.6010204@cs.tut.fi> 2014-06-13 22:57, Philippe Verdy wrote: > New localized Wikimedia projects have been approved for creation after > their test in Wikimedia Incubator: Wiki* things as such are comparable to drawings on public toilet walls, except that such drawings may carry the author?s name or signature. My point is that Wiki* stuff as such is not significant as a reference. If reliable references are cited on a Wiki* page, it is better to cite those references directly. Please remember that anyone can write anything at Wiki* and often will. Yucca From shervinafshar at gmail.com Fri Jun 13 15:57:48 2014 From: shervinafshar at gmail.com (Shervin Afshar) Date: Fri, 13 Jun 2014 13:57:48 -0700 Subject: Missing locales (lrc, hrx) In-Reply-To: <539B6236.6010204@cs.tut.fi> References: <539B6236.6010204@cs.tut.fi> Message-ID: To be clear, nothing from the wiki would directly go into the CLDR. The idea is to use the help of community members to collect the minimal data and get them involved with the submission procedure. And of course, everything is voted on by members. Shervin ? shervinafshar.name On Fri, Jun 13, 2014 at 1:42 PM, Jukka K. Korpela wrote: > 2014-06-13 22:57, Philippe Verdy wrote: > >> New localized Wikimedia projects have been approved for creation after >> their test in Wikimedia Incubator: > > > Wiki* things as such are comparable to drawings on public toilet walls, > except that such drawings may carry the author?s name or signature. > > My point is that Wiki* stuff as such is not significant as a reference. If > reliable references are cited on a Wiki* page, it is better to cite those > references directly. 
> > Please remember that anyone can write anything at Wiki* and often will. > > Yucca > > > > _______________________________________________ > CLDR-Users mailing list > CLDR-Users at unicode.org > http://unicode.org/mailman/listinfo/cldr-users From gerard.meijssen at gmail.com Fri Jun 13 17:13:20 2014 From: gerard.meijssen at gmail.com (Gerard Meijssen) Date: Sat, 14 Jun 2014 00:13:20 +0200 Subject: Missing locales (lrc, hrx) In-Reply-To: <539B6236.6010204@cs.tut.fi> References: <539B6236.6010204@cs.tut.fi> Message-ID: Hoi, For your information. When a language gets to the stage where it is promoted to a full Wikipedia, the articles written in the Incubator will be verified by a linguist who knows that language. The question whether Wiki stuff is significant as a reference is not relevant. What is relevant is that an expert opinion is sought to verify that the language used is indeed the language in use. It would not be a problem to ask the expert to look at the data provided to the CLDR. Thanks, GerardM On 13 June 2014 22:42, Jukka K. Korpela wrote: > 2014-06-13 22:57, Philippe Verdy wrote: > > New localized Wikimedia projects have been approved for creation after >> their test in Wikimedia Incubator: >> > > Wiki* things as such are comparable to drawings on public toilet walls, > except that such drawings may carry the author?s name or signature. > > My point is that Wiki* stuff as such is not significant as a reference. If > reliable references are cited on a Wiki* page, it is better to cite those > references directly. > > Please remember that anyone can write anything at Wiki* and often will. > > Yucca > > > > _______________________________________________ > CLDR-Users mailing list > CLDR-Users at unicode.org > http://unicode.org/mailman/listinfo/cldr-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srl at icu-project.org Fri Jun 13 17:35:54 2014 From: srl at icu-project.org (Steven R. 
Loomis) Date: Fri, 13 Jun 2014 15:35:54 -0700 Subject: Missing locales (lrc, hrx) In-Reply-To: References: <539B6236.6010204@cs.tut.fi> Message-ID: <539B7CCA.1070007@icu-project.org> On 06/13/2014 03:13 PM, Gerard Meijssen wrote: > Hoi, > For your information. When a language gets to the stage where it is > promoted to a full Wikipedia, the articles written in the Incubator > will be verified by a linguist who knows that language. > > The question whether Wiki stuff is significant as a reference is not > relevant. What is relevant is that an expert opinion is sought to > verify that the language used is indeed the language in use. It would > not be a problem to ask the expert to look at the data provided to the > CLDR. > Thanks, > GerardM > Right, and the point is that there are active users creating content in these languages,thus a demand (a use case) for considering the localization of those language codes. Creating/contributing the locales is via the usual CLDR process. -s -- IBMer but all opinions are mine. https://www.ohloh.net/accounts/srl295 // fingerprint @ https://ssl.icu-project.org/trac/wiki/Srl From shervinafshar at gmail.com Fri Jun 13 19:16:48 2014 From: shervinafshar at gmail.com (Shervin Afshar) Date: Fri, 13 Jun 2014 17:16:48 -0700 Subject: Missing locales (lrc, hrx) In-Reply-To: <87776977097f4641906c54f150a29cf0@BY2PR03MB491.namprd03.prod.outlook.com> References: <87776977097f4641906c54f150a29cf0@BY2PR03MB491.namprd03.prod.outlook.com> Message-ID: On Fri, Jun 13, 2014 at 4:27 PM, Shawn Steele wrote: >> To add a new locale, seed data is needed to be provided first: > >> http://cldr.unicode.org/index/cldr-spec/minimaldata > > Why? > > I mean, I get that locales need a certain amount of data to be helpful, but perhaps if all someone had were the month & day names, maybe that'd help start getting the data so that when a few more people contribute that it gets fleshed out sufficiently to be usable? 
FWIW, any data repository has its own rules for accepting public submissions, and CLDR is no exception.

> Seems like the stuff necessary to begin collecting data for a locale would pretty much be knowing the appropriate language tag? Everything else "just" gets attached to that?

It's not quite like that. To collect the rest of the data through the Survey Tool, the locale has to be added there first, and having the core data is a prerequisite for that.

From verdy_p at wanadoo.fr Sat Jun 14 00:43:27 2014
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sat, 14 Jun 2014 07:43:27 +0200
Subject: Missing locales (lrc, hrx)
In-Reply-To: <539B6236.6010204@cs.tut.fi>
References: <539B6236.6010204@cs.tut.fi>
Message-ID:

The content of the articles does not matter. The content was evaluated before being accepted for project creation. These two languages have passed the test: there is a significant enough community, the software can also be translated, and there is interest in writing in these two languages.

I have absolutely no opinion on the "encyclopedic" value of the beta articles found in that wiki, and in fact I absolutely don't care about that. All languages start with a few contributors making some beta texts; then gems get developed. Also, having content in a wiki for languages that are live in all sources means that they merit localization.

Northern Luri is as significant as Bakhtiari in CLDR, and even more important in terms of population. Bakhtiari is present in CLDR because it is the language of the most important rulers in Iran, and also of the former Shah, if I remember well. Both are located in the most active area of Iran, with the most resources. And both are important resources for the Persian/Farsi language.

I'm just concerned by the fact that MediaWiki will now be localized in those languages.
And remember that MediaWiki is also used by many academic groups for their own publications, outside Wikimedia projects, and for documenting lots of private collaborative processes: there's a world after Wikimedia sites. 2014-06-13 22:42 GMT+02:00 Jukka K. Korpela : > 2014-06-13 22:57, Philippe Verdy wrote: > > New localized Wikimedia projects have been approved for creation after >> their test in Wikimedia Incubator: >> > > Wiki* things as such are comparable to drawings on public toilet walls, > except that such drawings may carry the author?s name or signature. > > My point is that Wiki* stuff as such is not significant as a reference. If > reliable references are cited on a Wiki* page, it is better to cite those > references directly. > > Please remember that anyone can write anything at Wiki* and often will. > > Yucca > > > > _______________________________________________ > CLDR-Users mailing list > CLDR-Users at unicode.org > http://unicode.org/mailman/listinfo/cldr-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shervinafshar at gmail.com Sat Jun 14 12:43:21 2014 From: shervinafshar at gmail.com (Shervin Afshar) Date: Sat, 14 Jun 2014 10:43:21 -0700 Subject: Missing locales (lrc, hrx) In-Reply-To: References: Message-ID: In conclusion, I will file a ticket to have English names of these languages added, but for adding the languages themselves to the CLDR, I need a couple of linguists for each language to get in touch with me directly regarding core data. Here is the direct link to the list of needed data items for core data: http://cldr.unicode.org/index/cldr-spec/minimaldata/form Shervin ? 
shervinafshar.name

On Fri, Jun 13, 2014 at 12:57 PM, Philippe Verdy wrote:
> New localized Wikimedia projects have been approved for creation after their
> test in Wikimedia Incubator:
>
> * (azb) South Azerbaijani Wikipedia
> * (bqi) Bakhtiari Wikipedia
> * (hrx) Rio Grande German Wikipedia
> * (lrc) Northern Luri Wikipedia
> * (sli) Silesian German Wikipedia
> * (tcy) Tulu Wikipedia
> * (tly) Talysh Wikipedia
> * (pnb) Western Punjabi Wikiquote (already existing as Wikipedia)
>
> Within them the Northern Luri Wikipedia shows significant advances:
> * https://incubator.wikimedia.org/wiki/Wp/lrc/???????
> Its localisation is in progress:
> * https://tools.wmflabs.org/robin/?tool=codelookup&code=lrc
> But the "lrc" locale code is still not in CLDR (even in the Comprehensive
> level)
>
> Same remark about the "hrx" locale code for Rio Grande German.
>
> Could I ask these "lrc" and "hrx" locale codes being added to the list of
> supported languages (at least in beta) for allowing input in survey, and
> translation of their language name ?

From verdy_p at wanadoo.fr Sat Jun 14 21:07:47 2014
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sun, 15 Jun 2014 04:07:47 +0200
Subject: Missing locales (lrc, hrx)
In-Reply-To:
References:
Message-ID:

That's OK. This will at least allow linking to these languages and designating them.

Wikimedia can still develop its own locale data but will benefit from translations of these two names, provided or reviewed by expert linguists. That was what I was asking for anyway (when I spoke about Bakhtiari, this was already the case: its language name is translatable, even without the core data for its own locale).
Unfortunately, this means that the autonym cannot be added to the CLDR (it would require their own locale), but I suspect that the autonym for Luri, or Lori, is the same as in Persian, which would likely be a good fallback for this language.

2014-06-14 19:43 GMT+02:00 Shervin Afshar:

> In conclusion, I will file a ticket to have English names of these
> languages added, but for adding the languages themselves to the CLDR,
> I need a couple of linguists for each language to get in touch with me
> directly regarding core data. Here is the direct link to the list of
> needed data items for core data:
>
> http://cldr.unicode.org/index/cldr-spec/minimaldata/form
> Shervin
> ? shervinafshar.name
>
>
> On Fri, Jun 13, 2014 at 12:57 PM, Philippe Verdy wrote:
> > New localized Wikimedia projects have been approved for creation after their
> > test in Wikimedia Incubator:
> >
> > * (azb) South Azerbaijani Wikipedia
> > * (bqi) Bakhtiari Wikipedia
> > * (hrx) Rio Grande German Wikipedia
> > * (lrc) Northern Luri Wikipedia
> > * (sli) Silesian German Wikipedia
> > * (tcy) Tulu Wikipedia
> > * (tly) Talysh Wikipedia
> > * (pnb) Western Punjabi Wikiquote (already existing as Wikipedia)
> >
> > Within them the Northern Luri Wikipedia shows significant advances:
> > * https://incubator.wikimedia.org/wiki/Wp/lrc/???????
> > Its localisation is in progress:
> > * https://tools.wmflabs.org/robin/?tool=codelookup&code=lrc
> > But the "lrc" locale code is still not in CLDR (even in the Comprehensive
> > level)
> >
> > Same remark about the "hrx" locale code for Rio Grande German.
> >
> > Could I ask these "lrc" and "hrx" locale codes being added to the list of
> > supported languages (at least in beta) for allowing input in survey, and
> > translation of their language name ?
> > > > > > _______________________________________________ > > CLDR-Users mailing list > > CLDR-Users at unicode.org > > http://unicode.org/mailman/listinfo/cldr-users > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srl at icu-project.org Sun Jun 15 00:07:40 2014 From: srl at icu-project.org (Steven R. Loomis) Date: Sat, 14 Jun 2014 22:07:40 -0700 Subject: Missing locales (lrc, hrx) In-Reply-To: References:

Message-ID: <2BC0C8A8-6D1E-4BF1-87E2-94201B5D40D8@icu-project.org>

Sent from our iPhone.

> On Jun 14, 2014, at 7:07 PM, Philippe Verdy wrote:
>
> That's OK. This will at least allow linking to these languages and designating them.
> Wikimedia can still develop its own locale data but will benefit from translations of these two names, provided or reviewed by expert linguists. That was what I was asking for anyway (when I spoke about Bakhtiari, this was already the case: its language name is translatable, even without the core data for its own locale).

Each locale has to be handled on a case-by-case basis: which code, which script(s), who is signed up to provide core data, etc.

> Unfortunately, this means that the autonym cannot be added to the CLDR (it would require their own locale), but I suspect that the autonym for Luri, or Lori, is the same as in Persian, which would likely be a good fallback for this language.

CLDR doesn't handle "autonyms" separately; they are just a part of core data.

>
> 2014-06-14 19:43 GMT+02:00 Shervin Afshar:
>> In conclusion, I will file a ticket to have English names of these
>> languages added, but for adding the languages themselves to the CLDR,
>> I need a couple of linguists for each language to get in touch with me
>> directly regarding core data. Here is the direct link to the list of
>> needed data items for core data:
>>
>> http://cldr.unicode.org/index/cldr-spec/minimaldata/form
>> Shervin
>> ?
shervinafshar.name >> >> >> On Fri, Jun 13, 2014 at 12:57 PM, Philippe Verdy wrote: >> > New localized Wikimedia projects have been approved for creation after their >> > test in Wikimedia Incubator: >> > >> > * (azb) South Azerbaijani Wikipedia >> > * (bqi) Bakhtiari Wikipedia >> > * (hrx) Rio Grande German Wikipedia >> > * (lrc) Northern Luri Wikipedia >> > * (sli) Silesian German Wikipedia >> > * (tcy) Tulu Wikipedia >> > * (tly) Talysh Wikipedia >> > * (pnb) Western Punjabi Wikiquote (already existing as Wikipedia) >> > >> > Within them the Northern Luri Wikipedia shows significant advances: >> > * https://incubator.wikimedia.org/wiki/Wp/lrc/??????? >> > Its localisation is in progress: >> > * https://tools.wmflabs.org/robin/?tool=codelookup&code=lrc >> > But the "lrc" locale code is still not in CLDR (even in the Comprehensive >> > level) >> > >> > Same remark about the "hrx" locale code for Rio Grande German. >> > >> > Could I ask these "lrc" and "hrx" locale codes being added to the list of >> > supported languages (at least in beta) for allowing input in survey, and >> > translation of their language name ? >> > >> > >> > _______________________________________________ >> > CLDR-Users mailing list >> > CLDR-Users at unicode.org >> > http://unicode.org/mailman/listinfo/cldr-users >> > > > _______________________________________________ > CLDR-Users mailing list > CLDR-Users at unicode.org > http://unicode.org/mailman/listinfo/cldr-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From verdy_p at wanadoo.fr Sun Jun 15 01:19:06 2014 From: verdy_p at wanadoo.fr (Philippe Verdy) Date: Sun, 15 Jun 2014 08:19:06 +0200 Subject: Missing locales (lrc, hrx) In-Reply-To: <2BC0C8A8-6D1E-4BF1-87E2-94201B5D40D8@icu-project.org> References:

<2BC0C8A8-6D1E-4BF1-87E2-94201B5D40D8@icu-project.org>
Message-ID:

2014-06-15 7:07 GMT+02:00 Steven R. Loomis:

> CLDR doesn't handle "autonyms" separately; they are just a part of core
> data.

You don't understand what I mean: an "autonym" is a language name defined in that language itself. To be able to define an autonym for a language, support for its associated locale is required. This is not required when translating language names into other languages such as English, so the English name "Northern Luri" can easily be defined in the existing survey (just by adding the language code to the list of languages in the root locale), and the language can be named in all languages that have a supported locale (e.g. Persian here), but you won't be able to input the autonym without the locale.

So yes, an autonym requires some core data to be filled in: the language code; the initial suggested language name, before it is surveyed; the script (and implicitly its direction); the exemplar characters (can be surveyed too); the set of digits and some basic punctuation (can also be surveyed); a few supplementary data (population by country, with some very rough estimate, which is in fact not essential for creating a locale); and the plural rule (cannot be surveyed; in PO/POT locales, this is in fact the only core data really needed).
So I think that you don't really need more than the plural rule and the numbering system used. However, some validation tests in the Survey Tool try to check the characters in the data:

* this could be avoided by starting without the exemplar or auxiliary set;
* after the initial survey, if there's still no agreement on the exemplar set, all the data would remain in "draft" state (not published in the release), because the data would not have been checked as using the recommended subset of their script, or because there could have been disagreements about the script to use, meaning that locale variants may be needed, or data for transliterations specified.

Even the country location may be left out, if the CLDR Survey Tool did not use it to group locales: there could be an "other" group, or a locale could even start with no country at all, meaning that there would initially be no per-country variants of the base locale for that language.

The initial submission for a new language could use the same CLDR tool, but with input checks relaxed, and probably in a separate draft database. All that would be required would be to define the number and type of plural forms and the language code. Even the language name would not be necessary, but it would likely be prefilled with a suggested English name (for easy selection of the locale) and the suggested autonym. Both would be surveyed: the English name (or French, German, Arabic, Persian, etc.) in the main database (because English has a supported locale), and the autonym in the draft database.

The benefit would be a lower administrative cost to initiate a new locale, and less difficulty for the initial requester in providing the initial data (which would be much less than the "Core" data we see in the Survey!).

From martin_hosken at sil.org Mon Jun 23 01:22:18 2014
From: martin_hosken at sil.org (Martin Hosken)
Date: Mon, 23 Jun 2014 13:22:18 +0700
Subject: What script is bm.xml?
Message-ID: <20140623132218.38bcc0d2@sil-mh6>

Dear All,

I notice that there is bm.xml and bm_Latn.xml. How is one to know that bm.xml should really be bm_Nkoo.xml? In supplementalData.xml there is a scripts="Latn Nkoo" entry, so we can tell that the script for bm.xml should be either Latn or Nkoo. Further in the same file it lists that 46% of the population of Mali is literate in Bambara in Latin, but only 2% in Nkoo script.

Is there a data driven approach to resolving the script for a particular .xml file or do I have to use a process of elimination (bm_Latn.xml exists, so bm.xml must be Nkoo)?

TIA,
Yours,
Martin

From markus.icu at gmail.com Mon Jun 23 02:44:48 2014
From: markus.icu at gmail.com (Markus Scherer)
Date: Mon, 23 Jun 2014 09:44:48 +0200
Subject: What script is bm.xml?
In-Reply-To: <20140623132218.38bcc0d2@sil-mh6>
References: <20140623132218.38bcc0d2@sil-mh6>
Message-ID:

On Mon, Jun 23, 2014 at 8:22 AM, Martin Hosken wrote:

> I notice that there is bm.xml and bm_Latn.xml. How is one to know that
> bm.xml should really be bm_Nkoo.xml? In supplementalData.xml there is a
> scripts="Latn Nkoo" entry, so we can tell that the script for bm.xml should
> be either Latn or Nkoo. Further in the same file it lists that 46% of the
> population of Mali is literate in Bambara in Latin, but only 2% in Nkoo
> script.
>
> Is there a data driven approach to resolving the script for a particular
> .xml file or do I have to use a process of elimination (bm_Latn.xml exists,
> so bm.xml must be Nkoo)?

Maybe it's a mistake?
likelySubtags.xml seems to confirm the supplementalData:

I suggest you submit a ticket.

markus

From martin_raymond at sil.org Mon Jun 23 08:53:45 2014
From: martin_raymond at sil.org (Martin Raymond)
Date: Mon, 23 Jun 2014 14:53:45 +0100
Subject: What script is bm.xml?
In-Reply-To:
References: <20140623132218.38bcc0d2@sil-mh6>
Message-ID: <53A83169.2040801@sil.org>

On 23/06/2014 08:44, Markus Scherer wrote:
> On Mon, Jun 23, 2014 at 8:22 AM, Martin Hosken wrote:
>
> > I notice that there is bm.xml and bm_Latn.xml. How is one to know that bm.xml should really be bm_Nkoo.xml? In
> > supplementalData.xml there is a scripts="Latn Nkoo" entry, so we can tell that the script for bm.xml should be
> > either Latn or Nkoo. Further in the same file it lists that 46% of the population of Mali is literate in Bambara
> > in Latin, but only 2% in Nkoo script.
> >
> > Is there a data driven approach to resolving the script for a particular .xml file or do I have to use a process
> > of elimination (bm_Latn.xml exists, so bm.xml must be Nkoo)?
>
> Maybe it's a mistake?
> likelySubtags.xml seems to confirm the supplementalData:
>
> I suggest you submit a ticket.
>
> markus

Hi Martin,

This is a work in progress. The current CLDR release, 25, contains just bm.xml and bm_ML.xml. bm.xml is Latin script data, which concurs with the entry in likelySubtags.xml.

Ticket http://unicode.org/cldr/trac/ticket/7232 is to set up a separate bm_Nkoo locale, but that work has not yet been completed.

Martin
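
For readers who want the data-driven lookup asked about in this thread, here is a minimal Python sketch that reads the default script of a base locale from likelySubtags-style data. The inline sample entry (bm mapping to bm_Latn_ML) and the helper name likely_script are assumptions made for illustration, since the snippet quoted in the thread was scrubbed from the archive; check the likelySubtags.xml shipped with your CLDR release for the real entry.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: the from/to values below are an assumption,
# not a copy of the CLDR 25 data.
SAMPLE = """<supplementalData>
  <likelySubtags>
    <likelySubtag from="bm" to="bm_Latn_ML"/>
  </likelySubtags>
</supplementalData>"""


def likely_script(locale_id, xml_text):
    """Return the script subtag that the data maps locale_id to, or None."""
    root = ET.fromstring(xml_text)
    for entry in root.iter("likelySubtag"):
        if entry.get("from") == locale_id:
            # Skip the language subtag; a script subtag is four letters,
            # e.g. "Latn" or "Nkoo".
            for subtag in entry.get("to").split("_")[1:]:
                if len(subtag) == 4 and subtag.isalpha():
                    return subtag
    return None


print(likely_script("bm", SAMPLE))
```

With data like this, a file named bm.xml would be read as Latin-script data, which matches the explanation given above, and no process of elimination over file names is needed.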