Converting translation files to xml

lmelonimamo via CLDR-Users cldr-users at unicode.org
Wed Dec 12 18:12:12 CST 2018


OK, thank you very much to everyone for the answers and the information. I had been wondering about this for a while, since some online translation tools let me export data in different formats, but this particular kind of XML was not one of them. Well, too bad. I guess we will have to hope there will be a way to do it in the future, then.
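
For anyone who wants to experiment in the meantime, along the lines Mark suggests below, here is a minimal Python sketch. It is not an official CLDR tool. It assumes a hypothetical two-column CSV (code,name) keyed by ISO 3166 region codes and writes only territory names. The function name csv_to_ldml, the column names, the sample rows and the DTD path are my own assumptions, and the output would still have to be checked against the LDML spec before any bulk upload.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed relative DTD path; adjust it to wherever the file will live.
DOCTYPE = '<!DOCTYPE ldml SYSTEM "../../common/dtd/ldml.dtd">'


def csv_to_ldml(csv_text, language):
    """Build a small LDML document with territory names from CSV text.

    Hypothetical input layout: a header row "code,name", then one row
    per region code. Rows with an empty code or name are skipped.
    """
    ldml = ET.Element("ldml")
    identity = ET.SubElement(ldml, "identity")
    ET.SubElement(identity, "version", number="$Revision$")
    ET.SubElement(identity, "language", type=language)
    names = ET.SubElement(ldml, "localeDisplayNames")
    territories = ET.SubElement(names, "territories")

    for row in csv.DictReader(io.StringIO(csv_text)):
        code = (row.get("code") or "").strip()
        name = (row.get("name") or "").strip()
        if not code or not name:
            continue
        territory = ET.SubElement(territories, "territory", type=code)
        territory.text = name

    # ET.indent only exists in Python 3.9+; without it the XML is one line.
    if hasattr(ET, "indent"):
        ET.indent(ldml, space="\t")
    body = ET.tostring(ldml, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8" ?>\n' + DOCTYPE + "\n" + body + "\n"


# Illustrative sample rows only.
sample = "code,name\nIT,Itàlia\nFR,Frantza\n"
print(csv_to_ldml(sample, "sc"))
```

This only works so directly because the CSV key is already the value of the LDML type attribute. For currencies, languages or subdivisions, each data type would need its own mapping to the right element, which is the hand-made configuration mentioned further down in the thread.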

Best regards,
Luca

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, December 12, 2018 11:45 AM, Mark Davis ☕️ <mark at macchiato.com> wrote:

> XLIFF didn't offer everything that we needed. Note that the DTD in CLDR is augmented in order to give us much more control over the structure, as needed to make inheritance work properly.
>
> If some enterprising person wanted to put together and make available other tools (e.g. on GitHub) for generating CLDR XML from various types of sources, that might be useful to experiment with.
>
> Mark
>
> On Wed, Dec 12, 2018 at 9:00 AM Marcel Schneider via CLDR-Users <cldr-users at unicode.org> wrote:
>
>> On 11/12/2018 23:34, Steven R. Loomis wrote:
>>
>>> Marcel,
>>>  The DTD gives you some, but not all, of the information needed to produce LDML. The spec is needed as well.
>>
>> Is that due to limits in what the DTD schema language is actually able to express? Perhaps it needs to be upgraded, as HTML, CSS and PHP regularly are. Obviously the momentum to improve the latter three is much stronger than for a special use like what is needed for LDML. Wikipedia states:
>>
>>>>> As of 2009, newer [XML namespace](https://en.wikipedia.org/wiki/XML_namespace)-aware [schema languages](https://en.wikipedia.org/wiki/XML_schema) (such as [W3C](https://en.wikipedia.org/wiki/W3C) [XML Schema](https://en.wikipedia.org/wiki/XML_Schema_%28W3C%29) and [ISO](https://en.wikipedia.org/wiki/International_Organization_for_Standardization) [RELAX NG](https://en.wikipedia.org/wiki/RELAX_NG)) have largely superseded DTDs.
>>
>> https://en.wikipedia.org/wiki/Document_type_definition
>>
>>>  An XML DTD is not enough information to automatically transform between formats.
>>
>> Then it’s surprising, and would be interesting to investigate, that XLIFF is reported “to allow translation work to be standardised no matter what the source format and to allow the work to be freely moved from tool to tool.”
>> http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/xliff2po.html?id=toolkit/xliff2po&redirect=1
>>
>> XLIFF benefits from support by many tools and it’s backed by both OASIS and Microsoft:
>> https://en.wikipedia.org/wiki/XLIFF#Related_tools
>>
>> The spec’s introduction is very promising and thus might raise the question why LDML and XLIFF haven’t been merged:
>> http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#SectionIntroduction
>>
>> My guess is that either XLIFF is proprietary and thus unfit for a free and collaborative database like CLDR, despite the many open source tools in its galaxy,
>> or it’s for the same reason its usage is discouraged in the following project, among other reasons: “XLIFF verbosity is unbearable.”
>> https://github.com/symfony/symfony/issues/22566
>>
>>> Luca,
>>>  As Mark said, there is currently no automatic way to do this transform between XLIFF and LDML. It's not a bad idea, though. One issue is how the naming would work. Some amount of configuration would be needed to set up this transform even in the best case.
>>
>> Indeed a library is probably needed to get the types matching, and that would need to be set up by hand.
>>
>> Microsoft’s XLIFF 2.0 object model is here, but where is the locale data? Where else than in CLDR?
>> https://github.com/Microsoft/XLIFF2-Object-Model
>>
>> Didn’t XLIFF predate LDML (2002 vs 2003)? Perhaps they were too far away from each other to be merged like Unicode and ISO/IEC 10646.
>>
>> Marcel
>>
>>> On Tue, Dec 11, 2018 at 10:13 AM Marcel Schneider via CLDR-Users <cldr-users at unicode.org> wrote:
>>>
>>>> On 09/12/2018 13:55, lmelonimamo via CLDR-Users wrote:
>>>>> Hello,
>>>>>
>>>>> I'm trying to contribute to Sardinian by using some translation files
>>>>> that I worked on some time ago (the ISO standards for Debian, that
>>>>> contain things like translations for currencies, locales, language
>>>>> families, countries and administrative divisions). I have them in
>>>>> these formats: csv, po, tmx, tbx, xliff and xlsx. Is there a way to
>>>>> convert them to the xml format that can be used for bulk data
>>>>> upload?
>>>>
>>>> On 11/12/2018 16:55, Mark Davis ☕️ via CLDR-Users wrote:
>>>>> There is no automatic way to do that, sorry.
>>>>>
>>>>
>>>> I’m currently editing XML/LDML by hand, using text editors and
>>>> spreadsheet software, which is known as the quick-and-dirty way.
>>>> There’s much copy-pasting, formulas add code around the data, and for
>>>> final formatting VS Code has the XML Tools extension.
>>>>
>>>> Nothing new for you, but for my part I always thought of programs able
>>>> to take in format X, store the data and output it as an XML file
>>>> based on the provided DTD. Turns out it’s not that easy.
>>>>
>>>> Good luck.
>>>>
>>>> Best regards,
>>>> Marcel
>>>> _______________________________________________
>>>> CLDR-Users mailing list
>>>> CLDR-Users at unicode.org
>>>> http://unicode.org/mailman/listinfo/cldr-users