From cldr-users at unicode.org Sat Dec 2 05:52:43 2017
From: cldr-users at unicode.org (Kip Cole via CLDR-Users)
Date: Sat, 2 Dec 2017 22:52:43 +1100
Subject: UCA question / Produce Collation Element Arrays
Message-ID: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com>

Markus, probably another dumb question but I'm making progress. In section 7.2 of TR10 the algorithm for producing a CE array says:

S2.1 Find the longest initial substring S at each point that has a match in the collation element table.

S2.1.1 If there are any non-starters following S, process each non-starter C.

S2.1.2 If C is an unblocked non-starter with respect to S, find if S + C has a match in the collation element table.

Note: This condition is specific to non-starters, and is not precisely the same as the concept of blocking in normalization, since it is dealing with look ahead for a discontiguous match, rather than with normalization forms. Hangul jamos and other starters are only supported with contiguous matches.

S2.1.3 If there is a match, replace S by S + C, and remove C.

For S2.1.1 I'm trying to confirm what "process each non-starter C" means. Best I understand so far it means "ignore" or "skip" all C that are non-starters. Is that the correct interpretation?

From cldr-users at unicode.org Sat Dec 2 06:32:55 2017
From: cldr-users at unicode.org (Kip Cole via CLDR-Users)
Date: Sat, 2 Dec 2017 23:32:55 +1100
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com>
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com>
Message-ID: <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com>

Markus and co, probably another dumb question but I'm making progress. In section 7.2 of TR10 the algorithm for producing a CE array says:

> S2.1 Find the longest initial substring S at each point that has a match in the collation element table.
>
> S2.1.1 If there are any non-starters following S, process each non-starter C.
>
> S2.1.2 If C is an unblocked non-starter with respect to S, find if S + C has a match in the collation element table.
>
> Note: This condition is specific to non-starters, and is not precisely the same as the concept of blocking in normalization, since it is dealing with look ahead for a discontiguous match, rather than with normalization forms. Hangul jamos and other starters are only supported with contiguous matches.
>
> S2.1.3 If there is a match, replace S by S + C, and remove C.

For S2.1.1 I'm trying to confirm what "process each non-starter C" means. Best I understand so far it means "ignore" or "skip" all C that are non-starters. Is that the correct interpretation? It would seem to be consistent with the annotation:

Steps 2.1.1 "process each non-starter C" and 2.1.2 "find if S + C has a match in the table", where one or more intermediate non-starters may be skipped (making it discontiguous), extends a contraction match by one code point at a time to find the next match. In particular, if C is a non-starter and if the table had a mapping for ABC but not one for AB, then a discontiguous-contraction match on text ABMC (with M being a skippable non-starter) would never be found. Well-formedness condition 5 requires the presence of the prefix contraction AB.

From cldr-users at unicode.org Sat Dec 2 09:25:30 2017
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Sat, 2 Dec 2017 16:25:30 +0100
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To: <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com>
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com> <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com>
Message-ID:

Suppose that you have the following, where S are starters and n are non-starters. | represents the current position.

| S1 S2 S3 n1 n2 n3 n4 S4

S1 S2 isn't in the CET, so you emit and logically change the input. I'll represent that as:

w(S1) | S2 S3 n1 n2 n3 n4 S4

S2 S3 are in the CET, so set S to them. I'll show S by [...]

w(S1) [ S2 S3 ] | n1 n2 n3 n4 S4

You then successively look through each of the n's.

Suppose S2 S3 n1 isn't in the CET, so you continue.
Suppose S2 S3 n2 is in the CET, but n2 is blocked, so you also continue.
Suppose S2 S3 n3 is in the CET, and n3 is not blocked, so you set S to them. Logically the input list now looks like the following:

w(S1) [ S2 S3 n3 ] n1 n2 | n4 S4

Suppose S2 S3 n3 n4 is in the CET, and n4 is not blocked, so you set S to them. You now have:

w(S1) [ S2 S3 n3 n4 ] n1 n2 | S4

You have run out of non-starters so you stop and emit weight(S2 S3 n3 n4), and reset the current position to after them.

w(S1) w(S2 S3 n3 n4) | n1 n2 S4

So the next item you consider is n1.

There is just one subtlety. Notice that when considering whether n4 is blocked, you don't consider the items you have already put into S. So n3 and n4 can have the same ccc. Normally people don't actually modify the input stream, so thinking n4 is blocked is an easy error to make.

Mark

From cldr-users at unicode.org Sat Dec 2 13:52:15 2017
From: cldr-users at unicode.org (Richard Wordingham via CLDR-Users)
Date: Sat, 2 Dec 2017 19:52:15 +0000
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To:
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com> <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com>
Message-ID: <20171202195215.08ae11a9@JRWUBU2>

On Sat, 2 Dec 2017 16:25:30 +0100
Mark Davis ☕️ via CLDR-Users wrote:

> Suppose that you have the following, where S are starters and n are
> non-starters. | represents the current position.
>
> | S1 S2 S3 n1 n2 n3 n4 S4
>
> S1 S2 isn't in the CET, so you emit and logically change the input.
> I'll represent that as:
>
> w(S1) | S2 S3 n1 n2 n3 n4 S4

One subtle nitpick here.
One also has to eliminate <...>, <... S3 n1>, ... and <...> before one can conclude that the relevant collating element is <...>. I do this by recording whether each collating element and prefix of a collating element is the prefix of a collating element. This sort of tagging is not logically necessary, but is practically very useful.

The simplest example of this issue in the DUCET is <... U+0F80>. Or is a conformant implementation of the UCA allowed to reject DUCET even if one can find a way to specify that it be used? There's no explicit concession that a CET has to be well-formed.

Richard.

From cldr-users at unicode.org Sun Dec 3 06:36:57 2017
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Sun, 3 Dec 2017 13:36:57 +0100
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To: <20171202195215.08ae11a9@JRWUBU2>
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com> <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com> <20171202195215.08ae11a9@JRWUBU2>
Message-ID:

The algorithm is predicated on any input table being well formed. (http://unicode.org/reports/tr10/#Well-Formed)

Tibetan is a documented exception in the DUCET, but it also documents how to fix it.

Mark

From cldr-users at unicode.org Sun Dec 3 13:23:51 2017
From: cldr-users at unicode.org (Richard Wordingham via CLDR-Users)
Date: Sun, 3 Dec 2017 19:23:51 +0000
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To:
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com> <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com> <20171202195215.08ae11a9@JRWUBU2>
Message-ID: <20171203192351.70e2f2ed@JRWUBU2>

On Sun, 3 Dec 2017 13:36:57 +0100
Mark Davis ☕️ via CLDR-Users wrote:

> The algorithm is predicated on any input table being well formed. (
> http://unicode.org/reports/tr10/#Well-Formed)
>
> Tibetan is a documented exception in the DUCET, but it also documents
> how to fix it.

But adding the fix does not preserve the order of all strings in the Tibetan script, only the order of linguistically plausible strings. The example is the order of the non-defective NFD strings

0F40 0FB2 0F84 0F71
0F40 0FB2 0F84
0F40 0FB2 0F71

(I've only added U+0F40 to make the strings non-defective.)

Relevant facts are:

ccc(0F84) = 9
ccc(0F71) = 129
CE(0F71) < CE(0F84)
All relevant collation elements have different primary weights.
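
The two ccc values above can be checked against the Unicode Character Database. The following is a minimal sketch, assuming Python's standard `unicodedata` module as the lookup tool (it is not something used in this thread); the helper name `ccc` is made up for illustration. It also confirms that the four-code-point example string is in NFD order only with 0F84 before 0F71.

```python
import unicodedata


def ccc(cp: int) -> int:
    """Canonical combining class of a code point (hypothetical helper name)."""
    return unicodedata.combining(chr(cp))


# Print the combining class of each code point used in the example strings.
for cp in (0x0F40, 0x0FB2, 0x0F84, 0x0F71):
    print(f"ccc({cp:04X}) = {ccc(cp)}")

# Because ccc(0F84) < ccc(0F71), canonical reordering puts 0F84 first,
# so 0F40 0FB2 0F84 0F71 is already in NFD order ...
nfd_order = "\u0F40\u0FB2\u0F84\u0F71"
assert unicodedata.normalize("NFD", nfd_order) == nfd_order

# ... and the swapped spelling normalizes to that same string.
swapped = "\u0F40\u0FB2\u0F71\u0F84"
assert unicodedata.normalize("NFD", swapped) == nfd_order
```

The CE comparison and the primary-weight statement are properties of the collation table, so they cannot be checked this way.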

Under DUCET, we get:

Key of 0F40 0FB2 0F71 = CE(0F40) CE(0FB2) CE(0F71)
Key of 0F40 0FB2 0F84 = CE(0F40) CE(0FB2) CE(0F84)
Key of 0F40 0FB2 0F84 0F71 = CE(0F40) CE(0FB2) CE(0F84) CE(0F71)

Tailoring DUCET by adding 'all ten' contractions, making a well formed collation while not perturbing the sorting of Sanskrit, yields a different order:

Key of 0F40 0FB2 0F71 = CE(0F40) CE(0FB2) CE(0F71)
Key of 0F40 0FB2 0F84 0F71 = CE(0F40) CE(0FB2) CE(0F71) CE(0F84)
Key of 0F40 0FB2 0F84 = CE(0F40) CE(0FB2) CE(0F84)

To create a well-formed collation equivalent to DUCET, one has to add many more contractions - about 650 by my reckoning.

So, are you saying that a UCA-conformant implementation can simply reject DUCET for not being well-formed? Alternatively, are you claiming that there is a known, straightforward algorithm to repair any case of non-compliance with WF5 without changing the ordering of strings?

Richard.

From cldr-users at unicode.org Sun Dec 3 13:49:03 2017
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Sun, 3 Dec 2017 20:49:03 +0100
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To: <20171203192351.70e2f2ed@JRWUBU2>
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com> <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com> <20171202195215.08ae11a9@JRWUBU2> <20171203192351.70e2f2ed@JRWUBU2>
Message-ID:

Mark

On Sun, Dec 3, 2017 at 8:23 PM, Richard Wordingham via CLDR-Users <cldr-users at unicode.org> wrote:

> To create a well-formed collation equivalent to DUCET, one has to add
> many more contractions - about 650 by my reckoning.
>
> So, are you saying that a UCA-conformant implementation can simply
> reject DUCET for not being well-formed?

Well, yes, if they don't use http://unicode.org/reports/tr10/#Well_Formed_DUCET to fix it in one way or another. CLDR does do adjustments, for example.

> Alternatively, are you
> claiming that there is a known, straightforward algorithm to repair
> any case of non-compliance with WF5 without changing the ordering of
> strings?

The algorithm is not defined for non-well-formed strings, so it is odd to talk about "without changing the ordering of strings". I think your main point (above) is that you think that a batch of other changes are necessary for it to work for Tibetan. That may be the case; I am not that familiar with Tibetan requirements.
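
The two key orders Richard lists earlier in the thread can be reproduced with a small sketch of steps S2.1 to S2.1.3. Everything here is an illustration under stated assumptions, not any implementation's actual code: `ce_array`, `ce`, `toy_ducet` and `tailored` are hypothetical names, the tables are toy tables holding symbolic labels rather than real DUCET weights, and the only contraction added by the "tailoring" is 0FB2 0F71 (one of the additions discussed above). The blocking test follows my reading of the UTS #10 definition, together with Mark's note that characters already absorbed into S are not considered.

```python
import unicodedata


def ccc(cp):
    """Canonical combining class of a code point."""
    return unicodedata.combining(chr(cp))


def ce(*cps):
    """Symbolic label standing in for a real collation element."""
    return "CE(" + " ".join(f"{c:04X}" for c in cps) + ")"


def ce_array(cps, table):
    """Symbolic CE array for a list of code points (sketch of S2.1 to S2.1.3)."""
    cps = list(cps)
    out = []
    while cps:
        # S2.1: longest initial substring that has a match in the table.
        n = next((k for k in range(len(cps), 0, -1) if tuple(cps[:k]) in table), 0)
        if n == 0:
            # Not in the toy table: placeholder for an implicit weight.
            out.append(f"implicit({cps[0]:04X})")
            del cps[0]
            continue
        s, rest = tuple(cps[:n]), cps[n:]
        # S2.1.1: walk the non-starters that follow S.
        i, max_skipped = 0, 0
        while i < len(rest) and ccc(rest[i]) != 0:
            c = rest[i]
            # S2.1.2: C is unblocked if every skipped character before it has
            # a lower ccc; characters already absorbed into S are not counted.
            if ccc(c) > max_skipped and s + (c,) in table:
                s += (c,)    # S2.1.3: replace S by S + C ...
                del rest[i]  # ... and remove C from the input.
            else:
                max_skipped = max(max_skipped, ccc(c))
                i += 1
        out.extend(table[s])
        cps = rest
    return out


# Toy table: single code points plus two contractions, but no 0FB2 0F71.
toy_ducet = {(c,): [ce(c)] for c in (0x0F40, 0x0FB2, 0x0F71, 0x0F80, 0x0F84)}
toy_ducet[(0x0FB2, 0x0F80)] = [ce(0x0FB2, 0x0F80)]
toy_ducet[(0x0FB2, 0x0F71, 0x0F80)] = [ce(0x0FB2, 0x0F71, 0x0F80)]

# Same table with the prefix contraction that WF5 asks for.
tailored = dict(toy_ducet)
tailored[(0x0FB2, 0x0F71)] = [ce(0x0FB2), ce(0x0F71)]

for name, table in (("toy DUCET", toy_ducet), ("tailored", tailored)):
    for text in ([0x0F40, 0x0FB2, 0x0F71],
                 [0x0F40, 0x0FB2, 0x0F84],
                 [0x0F40, 0x0FB2, 0x0F84, 0x0F71]):
        print(name, " ".join(f"{c:04X}" for c in text), "=",
              " ".join(ce_array(text, table)))
```

With the toy table, 0F40 0FB2 0F84 0F71 comes out as CE(0F40) CE(0FB2) CE(0F84) CE(0F71); with the added contraction, 0F71 is picked up across the skipped 0F84 and the result is CE(0F40) CE(0FB2) CE(0F71) CE(0F84), which is the reordering Richard describes.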

From cldr-users at unicode.org Sun Dec 3 16:48:57 2017
From: cldr-users at unicode.org (Richard Wordingham via CLDR-Users)
Date: Sun, 3 Dec 2017 22:48:57 +0000
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To:
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com> <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com> <20171202195215.08ae11a9@JRWUBU2> <20171203192351.70e2f2ed@JRWUBU2>
Message-ID: <20171203224857.6a805539@JRWUBU2>

On Sun, 3 Dec 2017 20:49:03 +0100
Mark Davis ☕️ via CLDR-Users wrote:

> On Sun, Dec 3, 2017 at 8:23 PM, Richard Wordingham via CLDR-Users <
> cldr-users at unicode.org> wrote:

> > So, are you saying that a UCA-conformant implementation can simply
> > reject DUCET for not being well-formed?

> Well, yes, if they don't use
> http://unicode.org/reports/tr10/#Well_Formed_DUCET to fix it in one
> way or another. CLDR does do adjustments, for example.

Interesting. So an implementation can reject the conformance test as invalid. It would seem that an implementation that simply prints "DUCET is not well-formed!" passes the conformance test provided.

What do you mean by 'CLDR does...'? I have seen ICU wrongly reject apparently redundant collating elements of a collation - but perhaps I was doing something wrong. Do you just mean that the CLDR root collation includes the ten additions?

> > Alternatively, are you
> > claiming that there is a known, straightforward algorithm to repair
> > any case of non-compliance with WF5 without changing the ordering of
> > strings?

> The algorithm is not defined for non-well-formed strings, so it is
> odd to talk about "without changing the ordering of strings".

I think you've misunderstood my assertion. By the "ordering of strings" I mean the order in which they are sorted, not the ordering of the bytes within the strings. I was not talking about strings that are not well-formed.

> I think
> your main point (above) is that you think that a batch of other
> changes are necessary for it to work for Tibetan. That may be the
> case; I am not that familiar with Tibetan requirements.

No, my new point was that to make DUCET comply with WF5 without altering the ordering, it requires about 650 additional contractions. However, only the 10 (really 6) contractions are needed for natural language strings. The 650, for example, include four contractions for each virama, though in natural language there is only one virama that occurs with Tibetan consonants.

The UCA conformance test includes many strings that do not occur in natural language, as in the example given in https://www.unicode.org/Public/UCA/10.0.0/CollationTest.html, namely 0FB2 0F80 0F71 0334, which does not sort equal to 0F77 0334 under DUCET, but does when just the ten contractions are added. This pair no longer appears in the conformance test.

Richard.

From cldr-users at unicode.org Mon Dec 4 05:59:21 2017
From: cldr-users at unicode.org (Richard Wordingham via CLDR-Users)
Date: Mon, 4 Dec 2017 11:59:21 +0000
Subject: UCA question / Produce Collation Element Arrays
In-Reply-To: <20171203192351.70e2f2ed@JRWUBU2>
References: <56E27983-5488-41D7-818F-AA6E8AD35A47@gmail.com> <88B29D14-0D41-470A-9BE7-4E80C8191B02@gmail.com> <20171202195215.08ae11a9@JRWUBU2> <20171203192351.70e2f2ed@JRWUBU2>
Message-ID: <20171204115921.2455d761@JRWUBU2>

On Sun, 3 Dec 2017 19:23:51 +0000
Richard Wordingham via CLDR-Users wrote:

> But adding the fix does not preserve the order of all strings in
> the Tibetan script, only the order of linguistically plausible
> strings.

> To create a well-formed collation equivalent to DUCET, one has to add
> many more contractions - about 650 by my reckoning.

I've checked my calculations, and it's actually about 970 NFD entries. They are:

CE(0FB2 x) = CE(0FB2) CE(x)
CE(0FB2 x 0F80) = CE(0FB2 0F80) CE(x)
CE(0FB2 x 0F71 0F80) = CE(0FB2 0F71 0F80) CE(x)
CE(0FB3 x) = CE(0FB3) CE(x)
CE(0FB3 x 0F80) = CE(0FB3 0F80) CE(x)
CE(0FB3 x 0F71 0F80) = CE(0FB3 0F71 0F80) CE(x)

wherever ccc(x) < ccc(0F71), i.e. ccc(x) < 129.

The first set undoes the changes wrought by adding the contraction CE(0FB2 0F71) for the sake of WF5. The second and third sets undo the changes wrought by the first set.

Richard.

From cldr-users at unicode.org Thu Dec 21 17:25:56 2017
From: cldr-users at unicode.org (Loic Dachary via CLDR-Users)
Date: Fri, 22 Dec 2017 00:25:56 +0100
Subject: Kurdish Kurmanji progress
Message-ID:

Hi,

I'm interested in following the progress of the work done on Kurdish Kurmanji[1] to be notified when it transitions from "seed" to "common". How can I do that?

Thanks in advance for any pointers you can provide :-)

[1] http://www.unicode.org/cldr/charts/32/supplemental/locale_coverage.html

--
Loïc Dachary, Artisan Logiciel Libre

From cldr-users at unicode.org Thu Dec 28 19:54:03 2017
From: cldr-users at unicode.org (Shervin Afshar via CLDR-Users)
Date: Thu, 28 Dec 2017 17:54:03 -0800
Subject: Kurdish Kurmanji progress
In-Reply-To:
References:
Message-ID:

You could monitor the data files, which at the moment live in the seed directory in the codebase. When the locale data files are mature enough, they would be moved to the common directory. Also see this comment <https://unicode.org/cldr/trac/ticket/9964#comment:2> regarding "seed" vs. "common" and why a locale being under either of these shouldn't make a difference for contributors.

Shervin

From cldr-users at unicode.org Fri Dec 29 02:59:46 2017
From: cldr-users at unicode.org (Loic Dachary via CLDR-Users)
Date: Fri, 29 Dec 2017 09:59:46 +0100
Subject: Kurdish Kurmanji progress
In-Reply-To:
References:
Message-ID: <84b915e8-fd7c-941f-bb09-826c1a86b4bc@dachary.org>

Hi,

On 12/29/2017 02:54 AM, Shervin Afshar wrote:
> You could monitor the data files, which at the moment live in the seed directory in the codebase. When the locale data files are mature enough, they would be moved to the common directory. Also see this comment regarding "seed" vs. "common" and why a locale being under either of these shouldn't make a difference for contributors.

Thanks a lot for the pointer :-)

I'm not fluent in Kurdish Kurmanji and therefore unable to participate, unfortunately. Should I find someone motivated to help, is http://cldr.unicode.org/development/new-cldr-developers the best place to suggest to get them started? Or is there another guide that I may have missed?

Cheers

--
Loïc Dachary, Artisan Logiciel Libre

From cldr-users at unicode.org Fri Dec 29 12:00:44 2017
From: cldr-users at unicode.org (Shervin Afshar via CLDR-Users)
Date: Fri, 29 Dec 2017 10:00:44 -0800
Subject: Kurdish Kurmanji progress
In-Reply-To: <84b915e8-fd7c-941f-bb09-826c1a86b4bc@dachary.org>
References: <84b915e8-fd7c-941f-bb09-826c1a86b4bc@dachary.org>
Message-ID:

That page you pointed to is for developers. Data is collected for most of the entries through the Survey Tool. You can find more information here: http://cldr.unicode.org/index/survey-tool/accounts

Shervin

From cldr-users at unicode.org Fri Dec 29 14:20:07 2017
From: cldr-users at unicode.org (Loic Dachary via CLDR-Users)
Date: Fri, 29 Dec 2017 21:20:07 +0100
Subject: Kurdish Kurmanji progress
In-Reply-To:
References: <84b915e8-fd7c-941f-bb09-826c1a86b4bc@dachary.org>
Message-ID: <6fdc9a90-a45d-4996-4fbd-5236c4649c94@dachary.org>

Thanks for clearing the confusion, this is most helpful :-)

--
Loïc Dachary, Artisan Logiciel Libre