UCD in XML or in CSV? (was: UCD data consumption)

Adam Borowski via Unicode unicode at unicode.org
Mon Sep 3 05:07:39 CDT 2018


On Mon, Sep 03, 2018 at 08:24:06AM +0200, Janusz S. Bień via Unicode wrote:
> For a non-programmer like me CSV is a much more convenient form than XML -
> I can use it not only with a spreadsheet, but also import directly into
> a database and analyse with various queries. XML is politically correct,
> but practically almost unusable without a specialised parser.

And for a programmer, XML is outright insane.  You need a complex library to
parse it, and those libraries fail KISS so badly that you get a CVE roughly
yearly.  On the other hand, writing a parser for the current headerless
;-separated data completely from scratch is just:

cut -d';' -f 1,6 </usr/share/unicode/UnicodeData.txt
or, in Perl:
(split/;/)[0,5]

JSON is somewhat better, but still needs drastically more effort.
CSV (especially with no escapes) is trivial to handle.
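
To make the "from scratch" claim concrete, here is a minimal sketch in plain
sh.  The function names (sample, fields_1_6, with_decomposition) are made up
for illustration, and the three sample lines are meant to follow the
UnicodeData.txt layout (15 fields, ';'-separated, no header, no escapes);
check them against your local copy of the file before relying on them.

```shell
#!/bin/sh
# Illustrative sample in UnicodeData.txt format; not the full file.
sample() {
    cat <<'EOF'
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
00C0;LATIN CAPITAL LETTER A WITH GRAVE;Lu;0;L;0041 0300;;;;N;LATIN CAPITAL LETTER A GRAVE;;;00E0;
EOF
}

# Same as the cut one-liner: code point (field 1) and decomposition (field 6).
fields_1_6() {
    cut -d';' -f 1,6
}

# The same split done by the shell itself: IFS=';' makes read break each
# line on semicolons, and empty fields are kept because ';' is not whitespace.
# Only lines with a non-empty decomposition field are printed.
with_decomposition() {
    while IFS=';' read -r code name gc ccc bidi decomp rest; do
        if [ -n "$decomp" ]; then
            printf '%s -> %s\n' "$code" "$decomp"
        fi
    done
}

sample | fields_1_6
sample | with_decomposition
```

To run it against the real data, replace "sample |" with a redirect from the
file, e.g. with_decomposition </usr/share/unicode/UnicodeData.txt (that path
is the one used above and may differ on your system).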


ᛗᛖᛟᚹ!
-- 
⢀⣴⠾⠻⢶⣦⠀ What Would Jesus Do, MUD/MMORPG edition:
⣾⠁⢰⠒⠀⣿⡁ • multiplay with an admin char to benefit your mortal [Mt3:16-17]
⢿⡄⠘⠷⠚⠋⠀ • abuse item cloning bugs [Mt14:17-20, Mt15:34-37]
⠈⠳⣄⠀⠀⠀⠀ • use glitches to walk on water [Mt14:25-26]

