Unclear text in the UBA (UAX#9) of Unicode 6.3

Ilya Zakharevich nospam-abuse at ilyaz.org
Mon Apr 21 22:32:15 CDT 2014


On Mon, Apr 21, 2014 at 06:08:12PM -0700, Asmus Freytag wrote:
> Here's the text I supplied, with numbers added for discussion. It
> definitely needs some editing, but the point of the exercise would be
> to see what:
> 
>     1.  A bracket pair is a pair of characters consisting of an opening
>          paired bracket and a closing paired bracket such that the
>          Bidi_Paired_Bracket property value of the former equals the
>          latter, subject to the following constraints.
> 
>         a - both characters of a pair occur in the same isolating run
>             sequence
>         b - the closing character of a pair follows the opening character
>         c - any bracket character can belong at most to one pair, the
>             earliest possible one
>         d - any bracket character not part of a pair is treated like an
>             ordinary character
>         e - pairs may nest properly, but their spans may not overlap
>             otherwise
> 
> 
>     2.  Bracket characters with canonical decompositions are supposed
>          to be treated as if they had been normalized, to allow
>          normalized and non-normalized text to give the same result.
> 
> 
> c) needs rewording, because it is not correct
> 
> The BD16 examples show
> 
> 	a ( b ) c ) d		2-4
> 	a ( b ( c ) d		4-6
> 
> From that, it follows that it's not the earliest but the one with the smallest span.

Sorry, I do not see a definition here.  Just a collection of words
that looks like a definition, but only locally…

And I think I can even invent an example which I cannot parse using
your definition:

  1(  2[  3(  4]  5)  6)

Does looking at 1 force a match of 3 with 5?  Or what?
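
For comparison, here is a minimal sketch of a stack-based matcher, modeled
loosely on a procedural reading of BD16 rather than on the wording quoted
above. It is an illustration under assumptions, not the text of the
standard: the names bracket_pairs, PAIRS and CLOSERS are hypothetical, only
ASCII brackets are handled, canonical equivalents and any stack-depth limit
are ignored, and the whole string is treated as one isolating run sequence.
Positions count non-space characters from 1, as in the BD16 examples.

```python
# Hypothetical table: opening bracket -> closing bracket it pairs with.
# A real implementation would use the Bidi_Paired_Bracket property instead.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def bracket_pairs(text):
    """Return sorted (open_pos, close_pos) pairs, 1-based, spaces ignored."""
    tokens = [ch for ch in text if not ch.isspace()]
    stack = []   # entries: (expected closing bracket, position of the opener)
    found = []
    for pos, ch in enumerate(tokens, start=1):
        if ch in PAIRS:
            stack.append((PAIRS[ch], pos))
        elif ch in CLOSERS:
            # Search from the top of the stack down for a matching opener.
            for i in range(len(stack) - 1, -1, -1):
                if stack[i][0] == ch:
                    found.append((stack[i][1], pos))
                    # Pop the match and every opener pushed after it;
                    # those skipped openers stay unpaired.
                    del stack[i:]
                    break
            # No match found: the closer is left unpaired, stack untouched.
    return sorted(found)


print(bracket_pairs("a ( b ) c ) d"))   # [(2, 4)]
print(bracket_pairs("a ( b ( c ) d"))   # [(4, 6)]
print(bracket_pairs("( [ ( ] ) )"))     # [(1, 5), (2, 4)]
```

Under this sketch the six-bracket example pairs 1 with 5 and 2 with 4, and
leaves 3 and 6 unpaired. Whether the declarative wording is meant to give
the same answer is exactly what I cannot tell from it.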

Thanks,
Ilya


