Problems with BidiCharTest.txt
Dov Grobgeld via Unicode
unicode at unicode.org
Sun Jul 16 14:56:41 CDT 2017
Thanks Eli. That makes sense of the test. Now I just need to figure out how
to implement it...
Indeed, Philippe, the isolate semantics is much easier to wrap your head
around.
On Sun, Jul 16, 2017 at 8:09 PM, Eli Zaretskii <eliz at gnu.org> wrote:
> > Date: Sun, 16 Jul 2017 07:13:02 +0300
> > From: Dov Grobgeld via Unicode <unicode at unicode.org>
> > While implementing UAX#9 for Unicode 6.3 (and beyond) in FriBidi, I'm
> > trying to pass all the tests of BidiCharacterTest.txt, and I'm having
> > problems understanding a few of the tests that to me appear to
> > contradict the specification. The problematic lines in
> > BidiCharacterTest-10.0.0.txt are the tests on lines 262, 263, and 264.
> > Let's consider the test from line 262:
> (I believe you meant line 264.)
> > Dir: RTL
> > Input: a ( b <RLE> c <PDF> ) _ 1
> > Level: 2 2 2 x 4 x 1 1 2
> > The problem I'm having is that the first opening bracket is assigned
> > level 2 and the closing bracket level 1.
> > This seems to contradict the three rules N0.b, N0.c.1, and N0.c.2 in the
> > specification that all describe overriding the type of both brackets with
> > either the embedding or the opposite direction. The only case we can
> > possibly get different levels (correct me if I'm wrong!) is if rule N0.d
> > is applied and the brackets retain their neutral status until they are
> > resolved in subsequent rules.
> The example is correct, IMO. (FWIW, Emacs produces the same reordered
> display as expected by the test.) I think the effect you mention is
> produced by the RLE..PDF embedding: it causes the opening and the
> closing parentheses to be in 2 different isolating run sequences, see
> examples in BD13. Bracket pairs are processed as such only if they
> are in the same isolating run sequence. Try the same test without the
> RLE..PDF part, and you will see the result you expect.
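
Eli's explanation can be checked with a minimal sketch in Python. It is
not FriBidi code and not a full UAX#9 implementation: it only handles
LRE, RLE and PDF (no overrides, no isolates, no depth limit), and the
token spelling and the function names explicit_levels and level_runs
are made up for this illustration. Because the input contains no
isolates, each level run is assumed to form its own isolating run
sequence.

```python
FORMATTING = {"LRE", "RLE", "PDF"}

def explicit_levels(tokens, paragraph_level):
    """Assign explicit embedding levels; None marks characters removed by X9."""
    stack = [paragraph_level]
    levels = []
    for tok in tokens:
        if tok == "RLE":
            stack.append((stack[-1] + 1) | 1)    # next greater odd level
        elif tok == "LRE":
            stack.append((stack[-1] + 2) & ~1)   # next greater even level
        elif tok == "PDF" and len(stack) > 1:
            stack.pop()
        levels.append(None if tok in FORMATTING else stack[-1])
    return levels

def level_runs(levels):
    """Group the indices of the remaining characters into maximal level runs."""
    runs = []
    for i, lvl in enumerate(levels):
        if lvl is None:
            continue                             # skipped, as if removed by X9
        if runs and runs[-1][0] == lvl:
            runs[-1][1].append(i)
        else:
            runs.append((lvl, [i]))
    return runs

def run_index_of(runs, position):
    """Return the index of the run that contains the given character position."""
    for n, (_, indices) in enumerate(runs):
        if position in indices:
            return n
    return None

# The test discussed above, paragraph direction RTL (paragraph level 1).
with_embedding = ["a", "(", "b", "RLE", "c", "PDF", ")", "_", "1"]
runs_a = level_runs(explicit_levels(with_embedding, 1))

# The same text without the RLE..PDF part.
without_embedding = ["a", "(", "b", "c", ")", "_", "1"]
runs_b = level_runs(explicit_levels(without_embedding, 1))

print(runs_a)
print(runs_b)
```

With the embedding, the text splits into three level runs, and the
opening and closing parentheses land in the first and the third one, so
they are never considered as a bracket pair by N0. Without the
embedding there is a single run containing both parentheses.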