ID_Start, ID_Continue, and stability extensions
public at khwilliamson.com
Fri Apr 25 13:53:27 CDT 2014
On 04/24/2014 01:56 PM, Steffen Nurpmeso wrote:
> Markus Scherer <markus.icu at gmail.com> wrote:
> |I strongly recommend you parse the derived properties rather than trying to
> |follow the derivation formula, because that can change over time.
> ..this file includes only those core properties that have
> themselves a derivation-may-change property?
> (I long hesitated to write this though.)
Somewhere it says that the derived property files are subservient to the
other files. And in fact in some Unicode releases, they contained
errors. I therefore changed my parser to populate my internal db first
with the derived files, and then to populate using the non-derived
files. Any conflicts were thus automatically resolved in favor of the
non-derived. But if the derived files contained things not in the
non-derived ones, they would be used.
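A minimal sketch of that load order follows. It is not my actual parser: the function names and sample lines are made up for illustration, and it assumes a simplified "range ; value # comment" line layout with one property per database (UnicodeData.txt, for one, is laid out differently).

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_ranges(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (code point, value) pairs from 'XXXX..YYYY ; value # comment' lines."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 2:
            continue  # not a data line in the assumed layout
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        for cp in range(start, end + 1):
            yield cp, fields[1]


def load_layered(derived: Iterable[str], primary: Iterable[str]) -> Dict[int, str]:
    """Load derived data first, then let the non-derived data overwrite it."""
    db: Dict[int, str] = {}
    for cp, value in parse_ranges(derived):
        db[cp] = value
    for cp, value in parse_ranges(primary):
        db[cp] = value  # conflicts resolve in favor of the non-derived file
    return db


# Hypothetical sample lines; the derived entry for 0061 is deliberately wrong.
derived_lines = ["0041..0043 ; Lu # comment", "0061 ; Lu"]
primary_lines = ["0061 ; Ll", "0030 ; Nd"]
db = load_layered(derived_lines, primary_lines)
```

Because the second pass simply overwrites, entries that appear only in the derived input (0041..0043 here) are kept, while the conflicting entry for 0061 takes the non-derived value.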
I think that Unicode is doing a better job of making their files
consistent and accurate these days, but I haven't had to worry since I
made that change. (I no longer remember any details of what the
If I were starting from scratch, I would try the xml version first.