UTC public review issues to close January 2

Doug Ewell doug at ewellic.org
Mon Dec 18 23:31:40 CST 2023


Peter Constable wrote:

> https://www.unicode.org/review/pri486/
> — UAX #42 provides the data for the Unicode Character Database in XML
> format. (UCD is character property data for use in processing
> algorithms that is provided with each version of Unicode.) This PRI is
> for feedback on a planned UTC action to freeze UAX #42 as of Unicode
> 15.1.

This is a shame. I don’t know how widely the XML files were adopted, but I certainly found them easier to process than the traditional Unicode data files.

I imagine creating these files was a matter of auto-generation with custom tools, combined with human fine-tuning and judgment (i.e. where to draw the line when grouping characters). It would be great if Eric and/or Laurențiu could donate any tools, but the human effort is probably what could not be replaced.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org



