UAX #29 and WB4

Daniel Bünzli via Unicode unicode at
Wed Mar 4 11:48:09 CST 2020

On 4 March 2020 at 18:01:25, Daniel Bünzli (daniel.buenzli at wrote:

> Re-reading the text, I suspect I should not restart the rules from the first one when a WB4
> rewrite occurs, but only apply the subsequent rules. Is that correct?

However, even if that's correct, I don't understand how this test case works:

÷ 1F6D1 × 200D × 1F6D1 ÷ #  ÷ [0.2] OCTAGONAL SIGN (ExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ_FE) × [3.3] OCTAGONAL SIGN (ExtPict) ÷ [0.3]

Here the first two characters are rewritten by WB4 to ExtPict. If only the subsequent rules are then applied, we end up at WB999, which gives a break between 200D and 1F6D1. The comment justifies the absence of that break by applying WB3c to the ZWJ, but that character should already have been rewritten to ExtPict by WB4.
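
To make the two readings concrete, here is a minimal Python sketch. It is not a UAX #29 implementation: it models only WB3c, WB4 and WB999, and the names WB_PROP, IGNORED, interior_breaks and wb3c_sees_raw_zwj are made up for illustration. The property table is hand-written for this one test case. The flag assumes that WB3c, which is listed before WB4, may be matched against the original ZWJ before WB4 absorbs it; with the flag off, the sketch follows the reading described above.

```python
# Hypothetical, hand-written property table covering only this test case.
WB_PROP = {0x1F6D1: "ExtPict", 0x200D: "ZWJ"}

# Property values that WB4 folds into the preceding character.
IGNORED = {"Extend", "Format", "ZWJ"}


def interior_breaks(cps, wb3c_sees_raw_zwj):
    """Return one bool per position between cps[i] and cps[i + 1].

    True means a break (÷), False means no break (×).
    Only WB3c, WB4 and WB999 are modelled; every other rule is omitted.
    """
    props = [WB_PROP[cp] for cp in cps]
    result = []
    for i in range(len(props) - 1):
        left, right = props[i], props[i + 1]
        if wb3c_sees_raw_zwj and left == "ZWJ" and right == "ExtPict":
            # WB3c: ZWJ × ExtPict, matched on the un-rewritten ZWJ.
            result.append(False)
        elif right in IGNORED:
            # WB4: X (Extend | Format | ZWJ)* → X, so no break inside the run.
            result.append(False)
        else:
            # WB999: Any ÷ Any. With the flag off, the ZWJ has already been
            # absorbed into the preceding ExtPict and WB3c is never tried.
            result.append(True)
    return result


test_case = [0x1F6D1, 0x200D, 0x1F6D1]

# Reading described above: WB4 rewrites first, then only later rules apply.
print(interior_breaks(test_case, wb3c_sees_raw_zwj=False))

# Assumed alternative: WB3c is checked on the original code points.
print(interior_breaks(test_case, wb3c_sees_raw_zwj=True))
```

With the flag off, the sketch reports a break between 200D and 1F6D1, which is the result I arrive at. With the flag on, it reports no interior break, which is what the test case expects.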



