From cldr-users at unicode.org Sat Dec 1 07:59:58 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Sat, 1 Dec 2018 14:59:58 +0100
Subject: Minifying CLDR sources (also: Re: Hard-to-use "annotations" files in LDML)
In-Reply-To: <531cf9b2-b817-eba4-fd3e-a8d82e3ae993@orange.fr>
References: <32b9eb3b-6fdb-2b22-a6bb-0f34fdf1d34d@orange.fr> <4ace2465-2a07-4ec6-8ece-a25ec31300fa@ix.netcom.com> <531cf9b2-b817-eba4-fd3e-a8d82e3ae993@orange.fr>
Message-ID: <4c077c9e-f132-a8b3-baec-6b6bec76979b@orange.fr>

VS Code is a great text editor. Thanks for sharing the hint. My only
issue was that, while it takes all my other key remappings into account,
it does not do so for BKSP, so the backspace key acted as Ctrl+Backspace.
I just fixed that by editing keybindings.json.

> Would you mind adding these comments to a copy of the following two files:

We may identify the emoji using their short names, already present in the
files next to the keywords. I now understand that leaving out the code
points is a way of minifying the files. E.g. the English flag element in
annotationsDerived/fr.xml would be:

drapeau : Angleterre

But indeed, for the survey we don't need that information. Sorry for my
request.

Given that minifying the files is an interesting issue, one might wish to
go even a step further by collapsing the element of the keywords and the
element of the short name. Taking again the first emoji (modifier) in
annotations/fr.xml:

Now:
peau | peau claire
peau claire

After collapsing:
peau | peau claire

That would reduce these files to almost half their current size without
any loss of data, given that extracting the short name from an attribute
value rather than from element content is only a matter of processing
XML/LDML.
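The collapsing idea above can be sketched in Python. This is only an illustration under assumptions, since the archive stripped the markup from the examples: it assumes `annotation` elements carrying a `cp` attribute, with the short name marked by `type="tts"`, and it stores the short name in a `tts` attribute, which is a name chosen here for the sketch. The function name `collapse_annotations` and the sample document are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the "peau claire" example above.
# \U0001F3FB is the light skin tone modifier.
SAMPLE = """<ldml>
  <annotations>
    <annotation cp="\U0001F3FB">peau | peau claire</annotation>
    <annotation cp="\U0001F3FB" type="tts">peau claire</annotation>
  </annotations>
</ldml>"""


def collapse_annotations(root):
    """Fold each type="tts" element into its keyword sibling.

    The short name is stored in a `tts` attribute (an assumed name)
    and the separate short-name element is removed.
    """
    for parent in list(root.iter("annotations")):
        keywords = {}      # cp -> keyword element
        short_names = []   # elements marked type="tts"
        for el in list(parent):
            if el.tag != "annotation":
                continue
            if el.get("type") == "tts":
                short_names.append(el)
            else:
                keywords[el.get("cp")] = el
        for tts in short_names:
            target = keywords.get(tts.get("cp"))
            if target is None:
                continue  # no keyword element for this cp: leave as is
            target.set("tts", (tts.text or "").strip())
            parent.remove(tts)
    return root


root = collapse_annotations(ET.fromstring(SAMPLE))
collapsed = root.find("annotations")
print(ET.tostring(collapsed, encoding="unicode"))
```

Reading the short name back then means reading an attribute instead of a second element, which is the "matter of processing" mentioned above.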
Best regards,
Marcel

From cldr-users at unicode.org Sun Dec 2 06:34:14 2018
From: cldr-users at unicode.org (Christoph Päper via CLDR-Users)
Date: Sun, 2 Dec 2018 13:34:14 +0100 (CET)
Subject: Hard-to-use "annotations" files in LDML
In-Reply-To:
References: <32b9eb3b-6fdb-2b22-a6bb-0f34fdf1d34d@orange.fr> <4ace2465-2a07-4ec6-8ece-a25ec31300fa@ix.netcom.com>
Message-ID: <2051409773.16150.1543754054531@ox.hosteurope.de>

Marcel Schneider via CLDR-Users:
>
> Please help people like me with solutions that work out of the box.

This. The web survey tool is slow and unstable, so editing the files
directly -- after one has figured out this is possible at all -- is
preferable for some tasks at least. XML is not the best format to do
that, but I can cope with it.

From cldr-users at unicode.org Sun Dec 2 16:33:24 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Sun, 2 Dec 2018 23:33:24 +0100
Subject: Failing to correct assessed errors via ST
Message-ID:

French has a number of remaining errors to correct, listed in ticket
#11303:

https://unicode.org/cldr/trac/ticket/11303

But the GUI of ST has no button enabled to do it. So I've tried to submit
this file (most comments stripped off):

{0} par kilogramme

But submission failed because that value is read-only in this limited
survey round.

It would be nice if ST were set up so that we can fix errors now instead
of leaving them for survey 36.

Thanks.
Marcel

From cldr-users at unicode.org Sun Dec 2 18:56:39 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Mon, 3 Dec 2018 01:56:39 +0100
Subject: Hard-to-use "annotations" files in LDML
In-Reply-To: <2051409773.16150.1543754054531@ox.hosteurope.de>
References: <32b9eb3b-6fdb-2b22-a6bb-0f34fdf1d34d@orange.fr> <4ace2465-2a07-4ec6-8ece-a25ec31300fa@ix.netcom.com> <2051409773.16150.1543754054531@ox.hosteurope.de>
Message-ID: <7f1524e6-f8d5-60a6-ad2e-2bac2591df2a@orange.fr>

On 02/12/2018 13:34, Christoph Päper via CLDR-Users wrote:
> Marcel Schneider via CLDR-Users:
>>
>> Please help people like me with solutions that work out of the box.
>
> This. The web survey tool is slow and unstable, so editing the files
> directly -- after one has figured out this is possible at all -- is
> preferable for some tasks at least. XML is not the best format to do
> that, but I can cope with it.

The new problem now is that this survey is a limited one, so many fixes
are impossible. Survey Tool itself is reported to be much more stable.

That said, I've added the emoji code points in trailing comments
following Steven R. Loomis' advice, using VS Code and LibreOffice Calc.

Using LDML files we can do our homework in expectation of survey 36,
while right now French has to catch up on vetting. (German is among the
most advanced locales, regarding the number of already submitted votes.)

http://st.unicode.org/cldr-apps/v#statistics///

Best regards,
Marcel

From cldr-users at unicode.org Mon Dec 3 01:34:52 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Mon, 3 Dec 2018 08:34:52 +0100
Subject: Failing to correct assessed errors via ST
In-Reply-To:
References:
Message-ID:

This is a limited-scope release, thus changes are specific to certain
types of items as described on http://cldr.unicode.org/translation.
Other items will be read-only.

This is due to the season
and because of a shift of 3 months forward in the release of Unicode 12,
giving limited time for data resolution. So please hold off on trying to
make other changes until the general submission in v36.

Note also that we are still in Shakedown until the limited submission is
begun: the target is for end of day today unless there are significant
problems in the tool.

Mark

From cldr-users at unicode.org Mon Dec 3 06:37:51 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Mon, 3 Dec 2018 13:37:51 +0100
Subject: Failing to correct assessed errors via ST
In-Reply-To:
References:
Message-ID:

On 03/12/2018 08:34, Mark Davis ☕️ via CLDR-Users wrote:
> This is a limited-scope release, thus changes are specific to certain
> types of items as described on http://cldr.unicode.org/translation.
> Other items will be read-only.
>
> This is due to the season and because of a shift of 3 months forward in
> the release of Unicode 12, giving limited time for data resolution. So
> please hold off on trying to make other changes until the general
> submission in v36.
>
> Note also that we are still in Shakedown until the limited submission
> is begun: the target is for end of day today unless there are
> significant problems in the tool.
>
> Mark

OK, got it. Thank you for the clarification.

Best Regards,
Marcel

From cldr-users at unicode.org Tue Dec 4 17:58:45 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Wed, 5 Dec 2018 00:58:45 +0100
Subject: Enhanced support of bulk submission during partial or full survey (was: Re: Hard-to-use "annotations" files in LDML)
In-Reply-To: <2051409773.16150.1543754054531@ox.hosteurope.de>
References: <32b9eb3b-6fdb-2b22-a6bb-0f34fdf1d34d@orange.fr> <4ace2465-2a07-4ec6-8ece-a25ec31300fa@ix.netcom.com> <2051409773.16150.1543754054531@ox.hosteurope.de>
Message-ID: <533eae0c-ff47-8eee-104f-3065d6afc230@orange.fr>

On 02/12/2018 13:34, Christoph Päper via CLDR-Users wrote:
[…]
> editing the files directly […] is preferable for some tasks […].
> […] I can cope with it [XML].

As suggested in ticket #11646

https://unicode.org/cldr/trac/ticket/11646

it would be helpful if all new items and other priority items were
summarized in one single LDML file for quick localization in a text
editor, which often involves bulk search-and-replace, leaving alone the
identifiers (values of `type` attributes) of course.

Doing the edits in a text editor has the additional benefit of making it
easy to check for correct non-breaking spaces when using appropriate
software. (I'm using Gedit with the Draw Spaces plugin installed and
"Draw non-breaking spaces" enabled.)

It's now up to the Community to see whether that would be handy for
everyone or at least for many people, as setting up such a file just for
a single community or, worse, a single vetter would be a non-starter.

Meanwhile we may already do it for our personal use and share the
resulting file(s) with co-vetters.

This is a call for feedback.

Best wishes,
Marcel

From cldr-users at unicode.org Fri Dec 7 04:52:08 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Fri, 7 Dec 2018 11:52:08 +0100
Subject: SurveyTool support request: How do you vote for empty?
Message-ID:

Hello,

Many time zones do not switch to daylight time, and many
zones seem not to have a short name like "PT" or "PST"
for Pacific [Standard] Time.

But I'm unable to find out how to get appropriate votes
into SurveyTool.

I've filed a ticket about the issue as I see it:

https://unicode.org/cldr/trac/ticket/11655

Thanks for any hint.

Marcel

From cldr-users at unicode.org Fri Dec 7 23:27:47 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Sat, 8 Dec 2018 06:27:47 +0100
Subject: SurveyTool support request: How do you vote for empty?
In-Reply-To:
References:
Message-ID:

> Many time zones do not switch to daylight time, and many
> zones seem not to have a short name like "PT" or "PST"
> for Pacific [Standard] Time.
>
> But I'm unable to find out how to get appropriate votes
> into SurveyTool.

For your information: I'm going to post "" in these empty fields.

https://unicode.org/cldr/trac/ticket/11655#comment:3

Regards,
Marcel

From cldr-users at unicode.org Sun Dec 9 06:55:26 2018
From: cldr-users at unicode.org (lmelonimamo via CLDR-Users)
Date: Sun, 09 Dec 2018 12:55:26 +0000
Subject: Converting translation files to xml
Message-ID:

Hello,

I'm trying to contribute to Sardinian by using some translation files
that I worked on some time ago (the ISO standards for Debian, which
contain things like translations for currencies, locales, language
families, countries and administrative divisions). I have them in these
formats: csv, po, tmx, tbx, xliff and xlsx. Is there a way to convert
them to the xml format that can be used for bulk data upload?

Best regards,
Luca

From cldr-users at unicode.org Mon Dec 10 05:06:40 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Mon, 10 Dec 2018 12:06:40 +0100
Subject: SurveyTool support request: How do you vote for empty?
In-Reply-To:
References:
Message-ID:

Please don't do that or encourage others to.

If the strings were ever to be used (such as if a timezone or metazone
adds daylight savings) then the literal string would show up to users,
eg "12:35 ".

You may be using Coverage: Comprehensive, which should not normally be
used without direction by administrators, especially for timezones,
since it contains many, many items that are not normally used and should
just be left alone. If so, you should reset down to Coverage: Modern.

If you have suggestions for hacks for getting around what you see as a
problem, please file a ticket. The committee will review and then let
people know if it is a recommended approach. (It has done so in the past
for various issues raised by vetters.)

Mark

From cldr-users at unicode.org Mon Dec 10 07:48:28 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Mon, 10 Dec 2018 14:48:28 +0100
Subject: SurveyTool support request: How do you vote for empty?
In-Reply-To:
References:
Message-ID: <98317add-4fc4-ddaf-4e3c-076fa27e2261@orange.fr>

On 10/12/2018 12:06, Mark Davis ☕️ wrote:
> Please don't do that or encourage others to.

I'm sorry.
>
> If the strings were ever to be used (such as if a timezone or metazone
> adds daylight savings) then the literal string would show up to users,
> eg "12:35 ".

That was not the intended behavior; I expected postprocessing and
resolution to replace the stopgap placeholder. Of course I will abstain,
thanks for the guideline.

> You may be using Coverage: Comprehensive, which should not normally be
> used without direction by administrators, especially for timezones,
> since it contains many many items that are not normally used and should
> just be left alone. If so, you should reset down to Coverage: Modern.

Indeed I am, but otherwise I cannot solve the many problems that would
then be hidden. E.g. Qyzylorda Time is not included in Coverage: Modern,
so it is hidden from me when I'm not under Coverage: Comprehensive, yet
it is untranslated and as such is on my SurveyTool Dashboard, but only
when I'm under Coverage: Comprehensive.

Would you allow me to stay under Coverage: Comprehensive under these
circumstances?

>
> If you have suggestions for a hacks for getting around what you see as
> a problem, please file a ticket. The committee will review and then let
> people know if it is a recommended approach. (It has done so in the
> past for various issues raised by vetters.)

Will do, thanks. Yet I can no longer find the values I wanted to get rid
of, and they are not in the XML either. Perhaps when I saw them it was
during ST shakedown and something went wrong for me.

Best regards,
Marcel

From cldr-users at unicode.org Tue Dec 11 09:55:17 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Tue, 11 Dec 2018 16:55:17 +0100
Subject: Converting translation files to xml
In-Reply-To:
References:
Message-ID:

There is no automatic way to do that, sorry.

Mark

From cldr-users at unicode.org Tue Dec 11 10:54:48 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Tue, 11 Dec 2018 17:54:48 +0100
Subject: Minifying CLDR sources
In-Reply-To: <4c077c9e-f132-a8b3-baec-6b6bec76979b@orange.fr>
References: <32b9eb3b-6fdb-2b22-a6bb-0f34fdf1d34d@orange.fr> <4ace2465-2a07-4ec6-8ece-a25ec31300fa@ix.netcom.com> <531cf9b2-b817-eba4-fd3e-a8d82e3ae993@orange.fr> <4c077c9e-f132-a8b3-baec-6b6bec76979b@orange.fr>
Message-ID: <80f6a76c-54ec-ff7d-0c49-a963fa9ef7d8@orange.fr>

On 01/12/2018 14:59, Marcel Schneider via CLDR-Users wrote:
[…]
> Given that minifying the files is an interesting issue, one might wish
> to go even a step further by collapsing the element of the keywords and
> the element of the short name.
[…]
> After collapsing:
> peau | peau claire

It turns out that this was how the data was stored at release 27 (the
first with an annotations/ subdirectory):

sourire; drôle

That must have been so impractical that it was decided to modify the
schema for release 30:

sourire grand sourire

That is how the emoji annotation data is stored today.

Best regards,
Marcel

From cldr-users at unicode.org Tue Dec 11 12:13:47 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Tue, 11 Dec 2018 19:13:47 +0100
Subject: Converting translation files to xml
In-Reply-To:
References:
Message-ID: <8d5f2733-20e2-08d6-e723-eb789d961270@orange.fr>

On 09/12/2018 13:55, lmelonimamo via CLDR-Users wrote:
> Hello,
>
> I'm trying to contribute to Sardinian by using some translation files
> that I worked on some time ago (the ISO standards for Debian, that
> contain things like translations for currencies, locales, language
> families, countries and administrative divisions). I have them in
> these formats: csv, po, tmx, tbx, xliff and xlsx. Is there a way to
> convert them to the xml format that can be used for bulk data
> upload?

On 11/12/2018 16:55, Mark Davis ☕️ via CLDR-Users wrote:
> There is no automatic way to do that, sorry.
>

I'm currently editing XML/LDML by hand, using text editors and
spreadsheet software, which is known as the quick-and-dirty way. There
is much copy-pasting, formulas add code around the data, and for final
formatting VS Code has the XML Tools extension.

Nothing new for you, but for my part I always thought of programs able
to take in format X, store the data and output it as an XML file based
on the provided DTD. It turns out it's not that easy.

Good luck.

Best regards,
Marcel

From cldr-users at unicode.org Tue Dec 11 16:34:28 2018
From: cldr-users at unicode.org (Steven R. Loomis via CLDR-Users)
Date: Tue, 11 Dec 2018 14:34:28 -0800
Subject: Converting translation files to xml
In-Reply-To: <8d5f2733-20e2-08d6-e723-eb789d961270@orange.fr>
References: <8d5f2733-20e2-08d6-e723-eb789d961270@orange.fr>
Message-ID:

Marcel,
The DTD gives you some, but not all, of the information needed to
produce LDML. The spec is needed as well.

An XML DTD is not enough information to automatically transform between
formats.

Luca,
As Mark said, there is currently no automatic way to do this transform
between XLIFF and LDML. It's not a bad idea, though. One issue is how
the naming would work. Some amount of configuration would be needed to
set up this transform even in the best case.

From cldr-users at unicode.org Wed Dec 12 01:59:18 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Wed, 12 Dec 2018 08:59:18 +0100
Subject: Converting translation files to xml
In-Reply-To:
References: <8d5f2733-20e2-08d6-e723-eb789d961270@orange.fr>
Message-ID: <1b67fc25-4e21-c085-8359-df01d28e9c15@orange.fr>

On 11/12/2018 23:34, Steven R. Loomis wrote:
> Marcel,
> The DTD gives you some, but not all of the information needed to
> produce LDML. The spec is needed as well.

Is that due to an insufficient level of support the DTD schema language
is actually capable of? Perhaps it needs to be upgraded like HTML, CSS
and PHP are regularly. Obviously the momentum to improve the latter
three is much stronger than for special usage like what is needed for
LDML. Wikipedia states:

    As of 2009, newer XML namespace-aware schema languages (such as
    W3C XML Schema and ISO RELAX NG) have largely superseded DTDs.

https://en.wikipedia.org/wiki/Document_type_definition

>
> An XML DTD is not enough information to automatically transform
> between formats.

Then it's surprising, and would be interesting to investigate, that
XLIFF is reported “to allow translation work to be standardised no
matter what the source format and to allow the work to be freely moved
from tool to tool.”

http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/xliff2po.html?id=toolkit/xliff2po&redirect=1

XLIFF benefits from support by many tools and it's backed by both OASIS
and Microsoft:

https://en.wikipedia.org/wiki/XLIFF#Related_tools

The spec's introduction is very promising and thus might raise the
question why LDML and XLIFF haven't been merged:

http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#SectionIntroduction

My guess is that either XLIFF is proprietary and thus unfit for a free
and collaborative database like CLDR, even though it has many open
source tools in its galaxy, or it's for the same reason its usage is
discouraged on the following project, among other reasons: “XLIFF
verbosity is unbearable.”

https://github.com/symfony/symfony/issues/22566

>
> Luca,
> As Mark said there is currently no automatic way to do this transform
> between xliff and ldml. It's not a bad idea, though. An issue though is
> how the naming would work. Some amount of configuration would be needed
> to set up this transform even in the best case.

Indeed a library is probably needed to get the types matching, and that
would need to be set up by hand.

Microsoft's XLIFF 2.0 object model is here, but where is the locale
data? Where else than in CLDR?

https://github.com/Microsoft/XLIFF2-Object-Model

Didn't XLIFF predate LDML (2002 vs 2003)? Perhaps they were too far away
from each other to be merged like Unicode and ISO/IEC 10646.
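
To illustrate what "set up by hand" could mean, here is a minimal Python sketch, not an existing CLDR tool. It assumes a hypothetical two-column CSV (region code, translated name) with illustrative sample values, and it maps each row to a `territory` element under `localeDisplayNames/territories`. The function name `csv_to_ldml` is made up for this example, and whether the output satisfies everything the bulk upload expects has not been checked.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input: one "code,name" row per territory.
# The names are illustrative sample values only.
CSV_TEXT = "code,name\nIT,Itàlia\nFR,Frantza\n"


def csv_to_ldml(csv_text, locale):
    """Build a small LDML tree from code/name rows.

    The hand-written part is the mapping itself: which column becomes
    the `type` attribute and which LDML element receives the text.
    """
    ldml = ET.Element("ldml")
    identity = ET.SubElement(ldml, "identity")
    ET.SubElement(identity, "language", type=locale)
    names = ET.SubElement(ldml, "localeDisplayNames")
    territories = ET.SubElement(names, "territories")
    for row in csv.DictReader(io.StringIO(csv_text)):
        el = ET.SubElement(territories, "territory", type=row["code"])
        el.text = row["name"]
    return ldml


ldml = csv_to_ldml(CSV_TEXT, "sc")
xml_text = ET.tostring(ldml, encoding="unicode")
print(xml_text)
```

Every other kind of item (currencies, languages, and so on) would need its own mapping of this sort, which is the configuration effort mentioned in the quoted reply.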

Marcel

From cldr-users at unicode.org Wed Dec 12 04:45:19 2018
From: cldr-users at unicode.org (Mark Davis ☕️ via CLDR-Users)
Date: Wed, 12 Dec 2018 11:45:19 +0100
Subject: Converting translation files to xml
In-Reply-To: <1b67fc25-4e21-c085-8359-df01d28e9c15@orange.fr>
References: <8d5f2733-20e2-08d6-e723-eb789d961270@orange.fr> <1b67fc25-4e21-c085-8359-df01d28e9c15@orange.fr>
Message-ID:

XLIFF didn't offer everything that we needed.
Note that the DTD in CLDR is augmented in order to give us much more
control over the structure, as needed to make inheritance work properly.

If some enterprising person wanted to put together and make available
other tools (eg on github) for generating CLDR XML from various types of
sources, that might be useful to experiment with.

Mark

From cldr-users at unicode.org Wed Dec 12 18:12:12 2018
From: cldr-users at unicode.org (lmelonimamo via CLDR-Users)
Date: Thu, 13 Dec 2018 00:12:12 +0000
Subject: Converting translation files to xml
In-Reply-To:
References: <8d5f2733-20e2-08d6-e723-eb789d961270@orange.fr> <1b67fc25-4e21-c085-8359-df01d28e9c15@orange.fr>
Message-ID:

Ok, thank you very much to everyone for the answers and the information.
I had been wondering about this for a while, since I saw that some
online translation tools let me export data in different formats, but
this particular kind of xml was not one of them. Well, too bad. I guess
we will have to hope there will be a way to do it in the future, then.

Best regards,
Luca
wrote:

> XLIFF didn't offer everything that we needed. Note that the DTD in CLDR is augmented in order to give us much more control over the structure, as needed to make inheritance work properly.
>
> If some enterprising person wanted to put together and make available other tools (e.g. on GitHub) for generating CLDR XML from various types of sources, that might be useful to experiment with.
>
> Mark
>
> On Wed, Dec 12, 2018 at 9:00 AM Marcel Schneider via CLDR-Users wrote:
>
>> On 11/12/2018 23:34, Steven R. Loomis wrote:
>>
>>> Marcel,
>>> The DTD gives you some, but not all, of the information needed to produce LDML. The spec is needed as well.
>>
>> Is that due to the limited level of support the DTD schema language is capable of? Perhaps it needs to be upgraded regularly, as HTML, CSS and PHP are. Obviously the momentum to improve the latter three is much stronger than for special usage like what is needed for LDML. Wikipedia states:
>>
>>>>> As of 2009, newer [XML namespace](https://en.wikipedia.org/wiki/XML_namespace)-aware [schema languages](https://en.wikipedia.org/wiki/XML_schema) (such as [W3C](https://en.wikipedia.org/wiki/W3C) [XML Schema](https://en.wikipedia.org/wiki/XML_Schema_%28W3C%29) and [ISO](https://en.wikipedia.org/wiki/International_Organization_for_Standardization) [RELAX NG](https://en.wikipedia.org/wiki/RELAX_NG)) have largely superseded DTDs.
>>
>> https://en.wikipedia.org/wiki/Document_type_definition
>>
>>> An XML DTD is not enough information to automatically transform between formats.
>>
>> Then it's surprising, and would be interesting to investigate, that XLIFF is reported "to allow translation work to be standardised no matter what the source format and to allow the work to be freely moved from tool to tool."
>> http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/xliff2po.html?id=toolkit/xliff2po&redirect=1
>>
>> XLIFF benefits from support by many tools and it's backed by both OASIS and Microsoft:
>> https://en.wikipedia.org/wiki/XLIFF#Related_tools
>>
>> The spec's introduction is very promising and thus might raise the question why LDML and XLIFF haven't been merged:
>> http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#SectionIntroduction
>>
>> My guess is that either XLIFF is proprietary and thus unfit for a free and collaborative database like CLDR, although it has many open source tools in its galaxy,
>> or it's for the same reason its usage is discouraged in the following project, among other reasons: "XLIFF verbosity is unbearable."
>> https://github.com/symfony/symfony/issues/22566
>>
>>> Luca,
>>> As Mark said, there is currently no automatic way to do this transform between XLIFF and LDML. It's not a bad idea, though. An issue, though, is how the naming would work. Some amount of configuration would be needed to set up this transform even in the best case.
>>
>> Indeed a library is probably needed to get the types matching, and that would need to be set up by hand.
>>
>> Microsoft's XLIFF 2.0 object model is here, but where is the locale data? Where else than in CLDR?
>> https://github.com/Microsoft/XLIFF2-Object-Model
>>
>> Didn't XLIFF predate LDML (2002 vs 2003)? Perhaps they were too far away from each other to be merged like Unicode and ISO/IEC 10646.
>>
>> Marcel
>>
>>> On Tue, Dec 11, 2018 at 10:13 AM Marcel Schneider via CLDR-Users wrote:
>>>
>>>> On 09/12/2018 13:55, lmelonimamo via CLDR-Users wrote:
>>>>> Hello,
>>>>>
>>>>> I'm trying to contribute to Sardinian by using some translation files
>>>>> that I worked on some time ago (the ISO standards for Debian, that
>>>>> contain things like translations for currencies, locales, language
>>>>> families, countries and administrative divisions).
>>>>> I have them in
>>>>> these formats: csv, po, tmx, tbx, xliff and xlsx. Is there a way to
>>>>> convert them to the xml format that can be used for bulk data
>>>>> upload?
>>>>
>>>> On 11/12/2018 16:55, Mark Davis via CLDR-Users wrote:
>>>>> There is no automatic way to do that, sorry.
>>>>>
>>>>
>>>> I'm currently editing XML/LDML by hand and do that using text editors
>>>> and spreadsheet software, which is known as the quick-and-dirty way.
>>>> There's much copy-pasting, formulas add code around the data, and for
>>>> final formatting VS Code has the XML Tools extension.
>>>>
>>>> Nothing new for you but on my part I always thought of programs able
>>>> to take in format X, store the data and output it as an XML file
>>>> based on the provided DTD. Turns out it's not that easy.
>>>>
>>>> Good luck.
>>>>
>>>> Best regards,
>>>> Marcel

From cldr-users at unicode.org Thu Dec 20 08:45:18 2018
From: cldr-users at unicode.org (Marcel Schneider via CLDR-Users)
Date: Thu, 20 Dec 2018 15:45:18 +0100
Subject: Where is CLDR emoji annotation data used? And why is it not used?
Message-ID:

While striving to edit a subset of emoji data now under survey, I've tried to find out where that data is actually used, but couldn't find anything on the internet.
Most social media either don't have a search bar in the emoji palette and don't display TTS names in tooltips, or don't feature emoji palettes at all, or if they do, the data displayed or recognized at search doesn't match CLDR data (neither keywords nor emoji names), even though that platform (birdie microblogging) is reported to implement CLDR data. It's not a matter of using the latest release, as previous data doesn't match either.

Hence I feel a need to ask one simple question: What is the point in having emoji annotations in CLDR, if CLDR users are editing that data, considered "raw data", or even disregard it and make up their own databases for all the locales they are supporting, so as to use fully proprietary data and not to contribute to common efforts in maintaining the data in the Common Locale Data Repository?

Any hints are welcome, notably those pertaining to the quality of the data, so that we can assess whether the data meets the requirements and, if it doesn't, what can be done to improve its quality so that it conforms to the standards set by CLDR users.

Marcel