Bidi Parenthesis Algorithm and BidiCharacterTest.txt

Eli Zaretskii eliz at gnu.org
Tue Oct 14 14:56:42 CDT 2014


> From: "Andrew Glass (WINDOWS)" <Andrew.Glass at microsoft.com>
> Date: Tue, 14 Oct 2014 18:07:24 +0000
> 
> The difference is that N0 is applied per bracket pair and the result of the
> resolution of one bracket pair may impact the resolution of other bracket pairs
> in the same isolating run sequence. So in your example:
> 
> · 2-17 is resolved to R as you say.
> 
> · Since 2-17 is now R and not neutral, the resolution of 3-9 is R because the
> check for context finds the opening parenthesis at 2 (now R) before the a at 1.
> Therefore 3-9 is R under N0c2.

But there's nothing about this in the UAX#9 language!  How did you
arrive at this dependency, using just what the UBA says?

> The proposed update attempts to make this clearer in the intro to 3.3.5:
> 
> http://www.unicode.org/reports/tr9/tr9-32.html#N0
> 
> Note that this rule is applied based on the current bidirectional character
> type of each paired bracket and not the original type, as this could have
> changed under X6.
> 
> Perhaps this should be emended to include that N0 can also update the type for
> subsequent tests under N0, which is the case here.

There's a big difference between X6 and N0.  X6 is about the explicit
override, and is applied before N0.  Your interpretation makes N0 a
recursive rule, something that is not even hinted at by the UBA spec.

> Currently N0 states:
> 
> N0. Process bracket pairs in an isolating run sequence sequentially in the
> logical order of the text positions of the opening paired brackets using the
> logic given below.
> 
> Example 1 illustrates a similar case in that the neutral ! resolves to R
> because of the bracket resolution to R rather than the context between two Ls.
> This of course takes place in N1 and not N0 as in the example you ask about.

Of course!  And so Example 1 is very different from what we are
discussing, because each stage of the algorithm is applied to the
results of the previous stage.  But there's no other place, AFAICS,
where the same stage is applied recursively.  So I really don't see
how this interpretation could be gleaned from the UBA description.
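
To make the two readings concrete, here is a rough Python sketch of a
simplified N0.  It is not the reference implementation: the function
name, the `see_updates` flag, and the seven-position sample are all
made up for illustration, only L and R count as strong types, and sos
is assumed to equal the embedding direction.  The only thing it shows
is how the result for a nested pair changes depending on whether the
context check sees the types already resolved for an enclosing pair.

```python
def resolve_pairs(types, pairs, embedding, see_updates, sos=None):
    """Simplified N0 sketch (hypothetical helper, not from UAX#9).

    types       -- list of bidi types, here only 'L', 'R' and 'ON'
    pairs       -- list of (open_index, close_index) bracket pairs
    embedding   -- embedding direction, 'L' or 'R'
    see_updates -- True: later pairs see types set by earlier pairs;
                   False: every pair is tested against the original types
    """
    sos = sos or embedding  # assumption: sos defaults to the embedding direction
    opposite = 'L' if embedding == 'R' else 'R'
    work = list(types)
    # Which array the rule reads from is the point under discussion.
    view = work if see_updates else list(types)

    # Process pairs in logical order of their opening brackets.
    for open_i, close_i in sorted(pairs):
        inside = [t for t in view[open_i + 1:close_i] if t in ('L', 'R')]
        if embedding in inside:
            # N0b: a strong type matching the embedding direction is inside.
            new = embedding
        elif opposite in inside:
            # N0c: look backwards for the first strong type before the opener.
            before = next(
                (t for t in reversed(view[:open_i]) if t in ('L', 'R')), sos)
            # c1 if the preceding context is also opposite, otherwise c2.
            new = opposite if before == opposite else embedding
        else:
            # N0d: no strong types inside, leave the brackets alone.
            continue
        work[open_i] = work[close_i] = new
    return work


# Hypothetical sample, RTL embedding:  a ( ( b ) R )
sample = ['L', 'ON', 'ON', 'L', 'ON', 'R', 'ON']
nested = [(1, 6), (2, 4)]

updated = resolve_pairs(sample, nested, 'R', see_updates=True)
snapshot = resolve_pairs(sample, nested, 'R', see_updates=False)
```

With `see_updates=True` the inner pair finds the outer opening bracket
(already R) as its preceding context and falls through to c2, so it
becomes R.  With `see_updates=False` it finds the L before the outer
bracket and becomes L under c1.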

Thanks for explaining, but it is really frustrating to find out about
these untold subtleties at this late stage.  (And yes, I've read the
proposed changes in tr9-32.html, and not even they say anything about
this.)  How can we be sure that your interpretation is indeed correct,
if it is not even hinted at anywhere?

