UAX #29 and WB4

Daniel Bünzli via Unicode unicode at
Wed Mar 4 11:48:09 CST 2020

On 4 March 2020 at 18:01:25, Daniel Bünzli (daniel.buenzli at wrote:

> Re-reading the text, I suspect I should not restart the rules from the first one when a WB4
> rewrite occurs, but only apply the subsequent rules. Is that correct?

However, even if that's correct, I don't understand how this test case works:

÷ 1F6D1 × 200D × 1F6D1 ÷ #  ÷ [0.2] OCTAGONAL SIGN (ExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ_FE) × [3.3] OCTAGONAL SIGN (ExtPict) ÷ [0.3]

Here the first two characters get rewritten by WB4 to ExtPict. Then, if only the subsequent rules are applied, we end up in WB999 with a break between 200D and 1F6D1. The justification in the comment says to use WB3c on the ZWJ, but that ZWJ should already have been rewritten to ExtPict by WB4.
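
To make the two readings concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a conforming UAX #29 implementation: it models just WB1, WB2, WB3c, WB4 and WB999, the property table covers only the two code points of the test case, and the names `boundaries`, `is_ignorable` and `wb3c_before_wb4` are hypothetical. With `wb3c_before_wb4=True`, WB3c is tested on the raw characters at each boundary position before WB4 has any effect. With `False`, the ZWJ is treated as already absorbed by WB4, which is the reading described above.

```python
# Toy property data: only the two code points from the test case.
# (Assumption: a real implementation would read WordBreakProperty.txt
# and emoji-data.txt instead.)
ZWJ = 0x200D
EXT_PICT = {0x1F6D1}  # OCTAGONAL SIGN


def is_ignorable(cp):
    # WB4 ignores Extend | Format | ZWJ; only ZWJ is modelled here.
    return cp == ZWJ


def boundaries(cps, wb3c_before_wb4=True):
    """Return one boolean per position 0..len(cps); True means break."""
    n = len(cps)
    result = []
    for i in range(n + 1):
        if i == 0 or i == n:
            result.append(True)  # WB1 / WB2: break at start and end of text
            continue
        left, right = cps[i - 1], cps[i]
        # WB3c: ZWJ x ExtPict, tested on the raw characters.
        if wb3c_before_wb4 and left == ZWJ and right in EXT_PICT:
            result.append(False)
            continue
        # WB4: X (Extend | Format | ZWJ)* -> X, so no break before an ignorable.
        if is_ignorable(right):
            result.append(False)
            continue
        # No other rules are modelled, so fall through to WB999: break.
        result.append(True)
    return result


test_case = [0x1F6D1, 0x200D, 0x1F6D1]
print(boundaries(test_case, wb3c_before_wb4=True))
print(boundaries(test_case, wb3c_before_wb4=False))
```

In this sketch the first call yields no break at either inner position, matching the expected ÷ 1F6D1 × 200D × 1F6D1 ÷ of the test case, while the second call breaks between 200D and 1F6D1, which is the result I described above.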


