Need help understanding a test case for the Line Break property in 16.0

Karl Williamson public at khwilliamson.com
Sat Apr 5 23:15:44 CDT 2025


In the file LineBreakTest.txt, there is this test case:

× 23E9 × 00AB ÷	#  × [0.3] BLACK RIGHT-POINTING DOUBLE TRIANGLE (AL) × [19.11] LEFT-POINTING DOUBLE ANGLE QUOTATION MARK (QU_QU_Pi_QUmPf_NotEastAsian) ÷ [0.3]
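
For anyone who wants to read that line mechanically, here is a minimal 
Python sketch. It assumes the data part of a test line is alternating 
break marks (× or ÷) and hexadecimal code points, followed by a # 
comment. parse_break_test is a hypothetical helper name of my own, not 
part of any Unicode tooling.

```python
def parse_break_test(line):
    """Split one test line into its code points and its break marks."""
    data = line.split("#", 1)[0]   # drop the trailing comment
    tokens = data.split()          # marks and code points alternate
    marks = tokens[0::2]           # one more mark than code points
    code_points = [int(t, 16) for t in tokens[1::2]]
    return code_points, marks


# The test case from this message, with its comment kept intact.
TEST_LINE = (
    "× 23E9 × 00AB ÷\t#  × [0.3] BLACK RIGHT-POINTING DOUBLE TRIANGLE (AL) "
    "× [19.11] LEFT-POINTING DOUBLE ANGLE QUOTATION MARK "
    "(QU_QU_Pi_QUmPf_NotEastAsian) ÷ [0.3]"
)

code_points, marks = parse_break_test(TEST_LINE)
# marks[i] is the expected status before code_points[i];
# marks[-1] is the status at the end of the text.
between = marks[1]  # expected status between U+23E9 and U+00AB
```

Here marks[1] is the mark that sits between the two characters, which is 
the one this question is about.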

U+23E9 has General Category 'So', and has Line Break classification AL
U+00AB has General Category 'Pi', and has Line Break classification QU

Rule LB19 says

LB19 Do not break before non-initial unresolved quotation marks, such as 
‘ ” ’ or ‘ " ’, nor after non-final unresolved quotation marks, such as 
‘ “ ’ or ‘ " ’.

× [ QU - \p{Pi} ]

[ QU - \p{Pf} ] ×

The test expects no break between the two characters, and cites 19.11 
as the reason.  (I've never understood where suffixes like the .11 come 
from.)

But LB19 does not apply in this case, as far as I can tell.  U+00AB is 
Pi.  The first part of the rule applies only to QU characters that are 
not Pi, so it doesn't apply here.  The second part of the rule does 
apply to Pi, but only when the QU character is the first character in 
the pair, which it isn't here.  So neither part of LB19 applies.
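
To make that reading concrete, here is a minimal Python sketch of it. It 
models only the two rule lines quoted above and nothing else in the line 
breaking algorithm. The Line Break classes are hard-coded from the values 
given in this message, because Python's unicodedata module does not 
expose that property; the General Categories do come from unicodedata. 
lb19_parts is a hypothetical helper name.

```python
import unicodedata

# Line Break classes as stated in this message (an assumption of the
# sketch; they are not looked up from the Unicode data files).
LINE_BREAK = {0x23E9: "AL", 0x00AB: "QU"}


def lb19_parts(before, after):
    """Report whether each quoted LB19 rule line matches the pair."""
    gc_before = unicodedata.category(chr(before))
    gc_after = unicodedata.category(chr(after))
    # × [ QU - \p{Pi} ] : the second character is QU but not Pi
    part1 = LINE_BREAK.get(after) == "QU" and gc_after != "Pi"
    # [ QU - \p{Pf} ] × : the first character is QU but not Pf
    part2 = LINE_BREAK.get(before) == "QU" and gc_before != "Pf"
    return part1, part2


print(lb19_parts(0x23E9, 0x00AB))  # (False, False)
```

Under these assumptions neither rule line matches the pair, which is the 
conclusion I describe above.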

It appears to me that the highest priority rule that applies to this 
pair of Line Break classes is LB31, "Break everywhere else".

I'm hoping someone can explain this to me.


