ID_Start, ID_Continue, and stability extensions

Karl Williamson public at khwilliamson.com
Fri Apr 25 13:53:27 CDT 2014


On 04/24/2014 01:56 PM, Steffen Nurpmeso wrote:
> Markus Scherer <markus.icu at gmail.com> wrote:
>   |I strongly recommend you parse the derived properties rather than trying to
>   |follow the derivation formula, because that can change over time.
>
> ..this file includes only those core properties that have
> themselves a derivation-may-change property?
> (I long hesitated to write this though.)
>
> --steffen

Somewhere it says that the derived property files are subservient to the 
other files.  And in fact, in some Unicode releases, the derived files 
contained errors.  I therefore changed my parser to populate my internal 
db first from the derived files, and then from the non-derived files.  
Any conflicts were thus automatically resolved in favor of the 
non-derived data.  But anything that appeared only in the derived files 
would still be used.
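
A minimal sketch of that load order in Python, for illustration only. The
function names, the property names, and the sample lines are all made up,
and the input is assumed to be in the simple "range ; value # comment"
layout; this is not the actual parser described above.

```python
def parse_ucd_lines(lines):
    """Yield (code_point, value) from 'XXXX[..YYYY] ; value # comment' lines."""
    for line in lines:
        data = line.split("#", 1)[0].strip()   # drop trailing comment
        if not data:
            continue                           # blank or comment-only line
        fields = [f.strip() for f in data.split(";")]
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        for cp in range(start, end + 1):
            yield cp, fields[1]


def load(db, prop, lines):
    """Store each value under (prop, cp), overwriting any earlier entry."""
    for cp, value in parse_ucd_lines(lines):
        db[(prop, cp)] = value


def build_db(derived, non_derived):
    """Load derived data first, then non-derived, so the latter wins."""
    db = {}
    for prop, lines in derived:
        load(db, prop, lines)
    for prop, lines in non_derived:
        load(db, prop, lines)
    return db


# Hypothetical sample data: the derived source disagrees about U+0043.
derived = [
    ("Example_Prop", ["0041..0043 ; A # made-up range", "0044 ; B"]),
    ("Derived_Only", ["0041 ; Y"]),
]
non_derived = [
    ("Example_Prop", ["0043 ; C"]),
]

db = build_db(derived, non_derived)
```

Here the entry for ("Example_Prop", 0x43) ends up with the non-derived
value, while ("Derived_Only", 0x41) survives because nothing in the
non-derived data overrides it.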

I think that Unicode is doing a better job of making their files 
consistent and accurate these days, but I haven't had to worry since I 
made that change.  (I no longer remember any details of what the 
problems were.)

If I were starting from scratch, I would try the XML version first.
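
A rough sketch of what reading ID_Start and ID_Continue from the XML
version might look like, assuming the flat flavor of the file and the
attribute names as I understand them from UAX #42 (cp, first-cp, last-cp,
IDS, IDC). The sample document below is a tiny made-up stand-in for the
real file, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from UAX #42.
UCD_NS = "{http://www.unicode.org/ns/2003/ucd/1.0}"

# Made-up miniature stand-in for the real flat XML file.
SAMPLE = """<ucd xmlns="http://www.unicode.org/ns/2003/ucd/1.0">
  <repertoire>
    <char cp="0041" IDS="Y" IDC="Y"/>
    <char cp="0030" IDS="N" IDC="Y"/>
    <char first-cp="4E00" last-cp="4E02" IDS="Y" IDC="Y"/>
  </repertoire>
</ucd>"""


def code_points_with(root, attr):
    """Return the set of code points whose binary property `attr` is 'Y'."""
    result = set()
    for elem in root.iter(UCD_NS + "char"):
        if elem.get(attr) != "Y":
            continue
        if elem.get("cp") is not None:
            first = last = int(elem.get("cp"), 16)       # single code point
        else:
            first = int(elem.get("first-cp"), 16)        # range entry
            last = int(elem.get("last-cp"), 16)
        result.update(range(first, last + 1))
    return result


root = ET.fromstring(SAMPLE)
id_start = code_points_with(root, "IDS")
id_continue = code_points_with(root, "IDC")
```

In the grouped flavor, attributes can be inherited from an enclosing
group element, so this sketch would need extra handling there.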


