NamesList.txt as data source

Doug Ewell doug at ewellic.org
Mon Mar 28 13:18:25 CDT 2016


Mark Davis wrote:

> I think there is a misunderstanding because of the online utilities
> which have been, for convenience, hosted with the same server as the
> CLDR survey tool. So one sees "cldr" in the following URL, but that
> doesn't mean a particular association with CLDR.

Yes, that was my fault.

> But subheads are *not* Unicode Character Properties. And repeating the
> caveats expressed earlier, the Nameslist data is designed for chart
> production, not as a reliable source of machine-readable data. While
> it may be in some cases useful to look at, the subheads are not
> designed to be a consistent source of data. For example, one couldn't
> use them effectively to find non-modern-use characters, because
> different terms are used for that, and the groupings mix in other
> characters.

I don't recall anyone asking for that.

> Other examples: the NamesList data doesn't include all the case
> mappings, nor all the normative name aliases.

Nor that.

> One needs to use the UCD instead of trying to dig this information out
> of the NamesList.txt file — because such information will be wrong and
> incomplete.

I don't recall anyone suggesting that data from NamesList be used in
preference to other UCD files. The issue is what to do when NamesList
is the only source.

To circle back to the original topic, I suggested using NamesList data
to find the cross-references from holes in the Mathematical Alphanumeric
Symbols to existing BMP characters, in preference to using (a) comments
in the (b) non-UCD MathClass* files. Both (a) and (b) prevent this
scenario from being a matter of "use the UCD."
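
As a rough illustration of the lookup I have in mind, here is a minimal
Python sketch. It assumes that each hole appears in NamesList.txt as a
"<reserved>" entry followed by a tab-indented "x" cross-reference line
ending in the target code point, as in the sample lines below. The
function name, the regular expressions and the sample are mine, for
illustration only; they are not part of any UCD tooling, and the file
layout is not guaranteed to stay this way.

```python
import re

# Assumed layout of a hole in NamesList.txt (illustrative, not normative):
#   1D455<TAB><reserved>
#   <TAB>x (planck constant - 210E)
RESERVED = re.compile(r"^([0-9A-F]{4,6})\t<reserved>")
XREF = re.compile(r"^\tx .*?([0-9A-F]{4,6})\)?\s*$")


def reserved_cross_references(lines, first=0x1D400, last=0x1D7FF):
    """Map each reserved code point in [first, last] to its cross-references."""
    result = {}
    current = None  # reserved code point whose detail lines we are reading
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith("\t"):
            # A non-indented line starts a new entry (or a header or blank line).
            m = RESERVED.match(line)
            current = int(m.group(1), 16) if m else None
            if current is not None and not (first <= current <= last):
                current = None
        elif current is not None:
            # Indented detail line belonging to the current reserved entry.
            m = XREF.match(line)
            if m:
                result.setdefault(current, []).append(int(m.group(1), 16))
    return result


# Hypothetical excerpt in the assumed layout, used only to exercise the sketch.
SAMPLE = [
    "1D454\tMATHEMATICAL ITALIC SMALL G\n",
    "\t# <font> 0067 latin small letter g\n",
    "1D455\t<reserved>\n",
    "\tx (planck constant - 210E)\n",
    "1D456\tMATHEMATICAL ITALIC SMALL I\n",
]

if __name__ == "__main__":
    for hole, targets in reserved_cross_references(SAMPLE).items():
        print(f"U+{hole:04X} -> " + ", ".join(f"U+{t:04X}" for t in targets))
```

To run it against the real file, one would pass an open file object
instead of SAMPLE. Whether the result is complete depends entirely on
the cross-references being present and consistently formatted, which is
exactly the stability question under discussion.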

Sorry to keep dragging this out, but I think there are still some
misunderstandings and mischaracterizations surrounding the expectations
of stability, formality, comprehensiveness, etc. of this data and its
availability in other places.

--
Doug Ewell | http://ewellic.org | Thornton, CO



