UCD in XML or in CSV? (was: UCD data consumption)

Janusz S. Bień via Unicode unicode at unicode.org
Mon Sep 3 01:24:06 CDT 2018

On Sun, Sep 02 2018 at  4:16 +0200, 


> So you can understand that I’m not unaware of the complexity of UCD. Though
> I don’t think that this could be an argument for not publishing a medium-size 
> CSV file with scalar values listed as in UnicodeData.txt.

For a non-programmer like me, CSV is a much more convenient form than XML -
I can use it not only with a spreadsheet, but also import it directly into
a database and analyse it with various queries. XML is politically correct,
but practically almost unusable without a specialised parser.
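
To illustrate the database route, here is a minimal sketch, assuming a
semicolon-separated file laid out like UnicodeData.txt (code point, name,
General_Category in the first three fields). The function name
load_unicode_data and the table name chars are made up for this example.

```python
import csv
import sqlite3

def load_unicode_data(lines, conn):
    """Load the first three fields of UnicodeData.txt-style lines into SQLite.

    `lines` is any iterable of text lines (for example an open file);
    `conn` is an open sqlite3 connection. Both names are illustrative.
    """
    conn.execute(
        "CREATE TABLE chars (cp TEXT PRIMARY KEY, name TEXT, gc TEXT)"
    )
    # The file is semicolon-separated and uses no quoting.
    rows = csv.reader(lines, delimiter=";", quoting=csv.QUOTE_NONE)
    # Keep only code point, name and General_Category; skip empty lines.
    conn.executemany(
        "INSERT INTO chars VALUES (?, ?, ?)",
        (row[:3] for row in rows if row),
    )
    conn.commit()
```

With the data loaded, a query such as
SELECT cp, name FROM chars WHERE gc = 'Co' would list the private use entries.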

On Sat, Sep 01 2018 at 15:15 +0200, unicode at unicode.org writes:
> On 31/08/18 10:47 Manuel Strehl via Unicode wrote:
>> To handle the UCD XML file a streaming parser like Expat is necessary.
> Thanks for the tip. However for my needs, Expat looks like overkill, and I’m 
> looking out for a much simpler standalone tool, just converting XML to CSV.

I think CSV and XML can coexist peacefully, we just need an open source
round-trip converter.
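
One direction of such a converter can be sketched with a streaming parser
from the Python standard library. This is only a rough sketch under several
assumptions: the input is a flat (not grouped) UCD XML file, each character
is a char element in the namespace given in UAX #42, and the attributes cp,
na and gc hold the code point, name and General_Category. Ranges given with
first-cp/last-cp would come out with an empty cp field. The function name
ucdxml_to_csv is hypothetical.

```python
import csv
import xml.etree.ElementTree as ET

# Namespace of the UCD XML files (assumed here, per UAX #42).
UCD_NS = "{http://www.unicode.org/ns/2003/ucd/1.0}"

def ucdxml_to_csv(xml_source, csv_out, fields=("cp", "na", "gc")):
    """Stream <char> elements and write the chosen attributes as CSV rows.

    `xml_source` is a file name or a binary file object; `csv_out` is a
    text file object. Missing attributes are written as empty fields.
    """
    # Semicolon delimiter, as in UnicodeData.txt.
    writer = csv.writer(csv_out, delimiter=";", lineterminator="\n")
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == UCD_NS + "char":
            writer.writerow([elem.get(name, "") for name in fields])
            # Drop the element's attributes and children to keep memory low.
            elem.clear()
```

The reverse direction (CSV back to XML) is not covered here; a true
round-trip tool would also have to preserve the remaining attributes.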

Last but not least, let me remind you that the thread was started by a
question about the most convenient way to describe the properties of
PUA characters.

Best regards


Janusz S. Bien
emeryt (emeritus)
