Potential contradiction between the WordBreak test data and UAX #29
daniel.buenzli at erratique.ch
Wed Nov 23 05:45:04 CST 2016
On Wednesday 23 November 2016 at 12:28, Tom Hacohen wrote:
> I took a look at the ICU sources, and they explicitly mention this case,
> so it seems I was mistaken with interpreting the intention of the UAX. I
> still find it confusing, but based on this thread, it seems to just be me.
It's not only you; I also sometimes get confused by it (see for example  and subsequent messages). Maybe the operational model could be clarified a bit.
I also think it would be better if UAX #29 didn't use ignore rules at all, so that going from rules to implementation would be more straightforward, though I understand that this might make the spec harder to maintain.
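
As a rough illustration of the operational model being discussed, here is a minimal sketch of how an ignore rule of the form "X (Extend | Format | ZWJ)* -> X" can be applied as a preprocessing pass before the other word-break rules are consulted. It is a toy model under stated assumptions: the input is a list of Word_Break property names rather than real text, the helper name collapse_ignored is hypothetical, and the set of ignored classes reflects one reading of rule WB4 in the Unicode 9.0 version of UAX #29. It is not taken from ICU or from the UAX itself.

```python
# Toy model of an "ignore rule" pass. Input: one Word_Break property
# name per code point. All names here are illustrative assumptions.

IGNORED = {"Extend", "Format", "ZWJ"}   # classes absorbed by the rule
NO_ABSORB = {"CR", "LF", "Newline"}     # the rule does not apply after these


def collapse_ignored(props):
    """Apply X (Extend | Format | ZWJ)* -> X.

    Returns a list of (property, start, end) runs, where start/end are
    half-open indices into the original sequence. Ignored characters at
    the start of text, or right after CR/LF/Newline, are kept as their
    own run instead of being absorbed.
    """
    runs = []
    for i, prop in enumerate(props):
        if prop in IGNORED and runs and runs[-1][0] not in NO_ABSORB:
            # Extend the previous run to cover this ignored character.
            base, start, _ = runs[-1]
            runs[-1] = (base, start, i + 1)
        else:
            runs.append((prop, i, i + 1))
    return runs


if __name__ == "__main__":
    # The remaining rules now see ALetter next to ALetter, so a rule
    # such as "ALetter x ALetter" applies across the Extend and Format.
    print(collapse_ignored(["ALetter", "Extend", "Format", "ALetter"]))
```

With this shape, every later rule is matched against the collapsed runs, and a boundary can only fall between runs, never inside one. An implementation without ignore rules would instead have to spell out the optional (Extend | Format | ZWJ)* sequence inside each rule.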