UTC public review issues to close January 2

Peter Constable pgcon6 at msn.com
Tue Dec 19 13:24:34 CST 2023


Human effort, in the form of a committed volunteer, was indeed the missing factor that led to asking whether it was worth continuing to maintain UCDXML.

Peter

-----Original Message-----
From: Doug Ewell <doug at ewellic.org>
Sent: Monday, December 18, 2023 10:32 PM
To: Peter Constable <pgcon6 at msn.com>; unicode at unicode.org <unicode at corp.unicode.org>
Subject: RE: UTC public review issues to close January 2

Peter Constable wrote:

> https://www.unicode.org/review/pri486/ - UAX #42 provides the data for
> the Unicode Character Database in XML format. (UCD is character
> property data for use in processing algorithms that is provided with
> each version of Unicode.) This PRI is for feedback on a planned UTC
> action to freeze UAX #42 as of Unicode 15.1.

This is a shame. I don’t know how widely the XML files were adopted, but I certainly found them easier to process than the traditional Unicode data files.

I imagine creating these files was a matter of auto-generation with custom tools, combined with human fine-tuning and judgment (i.e. where to draw the line when grouping characters). It would be great if Eric and/or Laurențiu could donate any tools, but the human effort is probably what could not be replaced.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org



