BidiMirrored property and ancient scripts (Was Re: Plain text custom fraction input)
richard.wordingham at ntlworld.com
Sat Jul 25 08:36:51 CDT 2015
On Sat, 25 Jul 2015 12:52:53 +0300
Eli Zaretskii <eliz at gnu.org> wrote:
> > Date: Sat, 25 Jul 2015 10:11:02 +0100
> > From: Richard Wordingham <richard.wordingham at ntlworld.com>
> > On Sat, 25 Jul 2015 10:51:19 +0300
> > Eli Zaretskii <eliz at gnu.org> wrote:
> If your implementation's purpose is to illustrate random permutations
> of glyphs, or artificially scrambling the text appearance, maybe.
Obviously the purpose would be to demonstrate that a cart and horses
can be driven through the Unicode standard.
> if the implementation's purpose is to present a legible text using
> that character in some modern script, then no, it makes no sense and
> would be perceived as a bug. Although it'd probably be rendered "not
> guilty for lack of evidence" in a court of UBA law.
No, it should be "not guilty because acting lawfully".
> > Similarly, an arrow with a resolved directionality of R may be
> > mirrored if a higher level protocol so dictates.
> Again, you'd have to present a protocol that makes sense in the
> context of the specific implementation. Otherwise, it's a bug.
No, it's a feature. :-) It's only a bug if there's a requirement to be
fit for purpose. If the purpose of the implementation is to gobble up
disk space, then it's not a bug.
> > The issue lies with the wording of condition (1). One might expect
> > it to apply only to characters with a bidirectional type of L.
> I see no reason to restrict this to L characters. I'd be interested
> to hear your rationale for that.
A) A strong character's form in the corresponding directional context
is the form identified by the Unicode charts. If it is of type AL or
R, it will, by definition, not be mirrored.
B) A weak or neutral character's form in the charts is the form that
occurs in the left-to-right direction. Such a character has
Bidi-mirrored set to Yes if it has different forms for left-to-right and
right-to-left. By rule L4, it will be mirrored if it receives a
resolved direction of R.
C) A character of type L may need to be mirrored if it receives a
resolved directionality of R. The most notable example is Egyptian
hieroglyphs, but the same applies to Greek.
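
As a rough illustration of cases A to C, the properties involved can be
inspected with Python's standard unicodedata module. This is only a
sketch: the helper name l4_would_mirror is made up for the example, and
the "resolved direction" is simply passed in rather than computed by the
UBA.

```python
import unicodedata

def l4_would_mirror(ch, resolved_direction):
    """Return True if rule L4 alone would mirror `ch`.

    L4 mirrors a character only when its resolved direction is R and its
    Bidi_Mirrored property is Yes. `resolved_direction` is assumed to be
    supplied by the caller as "L" or "R".
    """
    return resolved_direction == "R" and bool(unicodedata.mirrored(ch))

def describe(ch):
    """Return (bidirectional type, Bidi_Mirrored flag) for one character."""
    return unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch))

samples = {
    "LEFT PARENTHESIS": "(",             # case B: neutral, Bidi_Mirrored=Yes
    "HEBREW LETTER ALEF": "\u05D0",      # case A: strong R
    "ARABIC LETTER ALEF": "\u0627",      # case A: strong AL
    "GREEK SMALL LETTER ALPHA": "\u03B1",  # case C: strong L
    "EGYPTIAN HIEROGLYPH A001": "\U00013000",  # case C: strong L
    "RIGHTWARDS ARROW": "\u2192",        # neutral, but not Bidi_Mirrored
}

for name, ch in samples.items():
    bidi_type, mirrored = describe(ch)
    print(f"{name}: type={bidi_type} Bidi_Mirrored={mirrored} "
          f"L4 mirrors at R: {l4_would_mirror(ch, 'R')}")
```

The point of the sketch is that, for a character of type L such as a
hieroglyph, L4 by itself never asks for mirroring, so any mirroring in a
right-to-left context has to come from somewhere else.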
There is a definite hole in my argument for non-spacing marks; marks
used primarily in the Arabic script are shown in a form they take in a
> > My surmise is that it attempts to address text whose directionality
> > is not known before rendering.
> Indeed, UBA mirroring is only relevant to neutral characters.
Then how do you explain condition (2):
"Characters with a resolved directionality of L and whose
bidirectional type is R or AL"
Obviously these characters are not neutral characters. The only way
they can acquire a resolved directionality of L is by application of
> I don't think so. I agree with those who maintain that boustrophedon
> is unidirectional text, and so out of scope for the UBA.
There are three main parts to the UBA:
1) Interpreting the text as nested runs of text in the same order.
2) Sorting out the left-to-right order in which to write them (L2).
3) Sorting out mirroring (L4).
Interpreting LRO and RLO is part of (1). I'd like to know what the
justification for having directionality overrides is.
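
Parts (2) and (3) can be sketched in a few lines of Python, assuming
part (1) has already produced an embedding level for every character.
The function names and the tiny MIRROR_GLYPH table are assumptions made
for the example; a real implementation would take the mirror pairs from
BidiMirroring.txt.

```python
import unicodedata

# Illustrative subset of mirror pairs only (an assumption for this sketch).
MIRROR_GLYPH = {"(": ")", ")": "(", "[": "]", "]": "[", "<": ">", ">": "<"}

def reorder_l2(chars, levels):
    """Rule L2: from the highest level down to the lowest odd level,
    reverse each maximal run of characters at that level or higher."""
    items = list(zip(chars, levels))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return items  # purely left-to-right line: nothing to reverse
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(items):
            if items[i][1] >= level:
                j = i
                while j < len(items) and items[j][1] >= level:
                    j += 1
                items[i:j] = reversed(items[i:j])
                i = j
            else:
                i += 1
    return items

def mirror_l4(items):
    """Rule L4: swap in the mirrored glyph for Bidi_Mirrored characters
    whose resolved direction is R (odd level)."""
    out = []
    for ch, level in items:
        if level % 2 == 1 and unicodedata.mirrored(ch):
            ch = MIRROR_GLYPH.get(ch, ch)
        out.append(ch)
    return "".join(out)

def render_line(chars, levels):
    """Visual left-to-right glyph order for one line."""
    return mirror_l4(reorder_l2(chars, levels))

# Hebrew letters with parentheses, everything at level 1.
logical = "\u05D0(\u05D1)\u05D2"
visual = render_line(logical, [1] * len(logical))
```

Here every character sits at level 1, so L2 reverses the whole line and
L4 then swaps the parentheses back so that they still enclose the middle
letter.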
Now, ancient boustrophedon text, to the best of my knowledge, does not
need parts 1 and 2. Modern numerical place notation should be a problem
when writing boustrophedon. Boustrophedon starts from the assumption
that text has an order from start to finish, but numbers in place
notation have a left and a right.
Where we may part company is in our view of Hebrew text (no Arabic
numbers) with parentheses in a right-to-left paragraph. I think such
text is really just as unidirectional as equivalent Latin text in a
left-to-right paragraph. However, one needs the UBA to sort out the
rendering of the parentheses in the Hebrew text. Indeed, one may rely
on the bidi algorithm to declare the Latin example unidirectional.
If one can determine that text to be rendered boustrophedon is genuinely
'unidirectional', it seems entirely reasonable to call upon the Bidi
algorithm to sort out the mirroring of glyphs on a *line* once one has
chosen the direction of a line.
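
A minimal sketch of that idea, assuming the text really is
unidirectional and has already been broken into lines. The function
name, the first_direction parameter and the small MIRROR_GLYPH table are
all invented for the example, and combining marks are ignored.

```python
import unicodedata

# Illustrative subset of mirror pairs only (an assumption for this sketch).
MIRROR_GLYPH = {"(": ")", ")": "(", "[": "]", "]": "[", "<": ">", ">": "<"}

def boustrophedon(lines, first_direction="L"):
    """Lay out lines in alternating directions.

    Each returned string lists glyphs in visual left-to-right order.
    On a right-to-left line the character order is reversed and
    Bidi_Mirrored characters are swapped for their mirror glyphs,
    which is what rule L4 would do for a resolved direction of R.
    """
    out = []
    direction = first_direction
    for line in lines:
        if direction == "R":
            visual = "".join(
                MIRROR_GLYPH.get(c, c) if unicodedata.mirrored(c) else c
                for c in reversed(line)
            )
        else:
            visual = line
        out.append(visual)
        direction = "R" if direction == "L" else "L"
    return out

laid_out = boustrophedon(["ab(c)", "de(f)"])
```

Letters of type L are left untouched by this sketch, since L4 only looks
at Bidi_Mirrored; mirroring the letters themselves on the right-to-left
lines would still be a matter for a higher-level protocol or the font.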
Where we may have a problem is that the Latin and Hebrew commas have
the same codepoint, *despite* having the same appearance.
What I can accept is that handling a mixture of boustrophedon,
left-to-right and right-to-left text is too much to ask of the Bidi
algorithm. The very first problem is that of defining what would
constitute unidirectional boustrophedon text.