BidiMirrored property and ancient scripts (Was Re: Plain text custom fraction input)

Frédéric Grosshans frederic.grosshans at gmail.com
Thu Jul 23 05:00:06 CDT 2015


On 23/07/2015 00:54, Richard Wordingham wrote:
> Which means that Ancient Egyptian hieroglyphs are unencoded!  Their
> default direction is right-to-left, but that's only the start of the
> trouble.  The encoded hieroglyphs aren't Bidi-mirrored, so if I embed
> then in a right-to-left override, I should get retrograde characters.
The text of the standard says that they should be mirrored in this case. 
Version 7.0.0 has the following comment on Egyptian hieroglyphs 
(p. 424, p. 9 of the PDF):

    “When left-to-right directionality is overridden to display Egyptian
    hieroglyphic text right to left, the glyphs should be mirrored from
    those shown in the code charts.”

Similar comments are present for other historic scripts (Old Italic, 
Runic), and also for Old North Arabian, which is encoded as RTL but for 
which “Glyphs may be mirrored in lines when they have left-to-right 
directionality”. This kind of mirroring can be implemented at the font 
level, and sometimes is (see e.g. Andrew West’s Anglo-Saxon runic fonts, 
http://babelstone.co.uk/Fonts/AngloSaxon.html).

The BidiMirrored property is not suited to this case because it is meant 
for a few “characters such as parentheses” (Unicode 8.0.0, §4.7, p. 180 = 
p. 23 of ch04.pdf), and it is designed around an LTR default: it cannot 
cover the case of Old North Arabian, which is RTL by default.
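To see what the property currently says for the characters discussed 
here, one can query the Unicode Character Database through Python's 
standard unicodedata module. This is only a small sketch: the sample 
characters and the helper name bidi_summary are my own choices for 
illustration, and the values reported depend on the UCD version shipped 
with the Python interpreter.

```python
import unicodedata

# Sample characters chosen for illustration (not an exhaustive list).
SAMPLES = [
    "(",            # U+0028 LEFT PARENTHESIS
    "\u221B",       # U+221B CUBE ROOT
    "\u2232",       # U+2232 CLOCKWISE CONTOUR INTEGRAL
    "\u16A0",       # first Runic letter
    "\U00013000",   # first Egyptian hieroglyph
    "\U00010A80",   # first Old North Arabian letter
]


def bidi_summary(ch):
    """Return (Bidi_Class, Bidi_Mirrored as a bool) for one character."""
    return unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch))


if __name__ == "__main__":
    for ch in SAMPLES:
        bidi_class, mirrored = bidi_summary(ch)
        name = unicodedata.name(ch, "<unnamed>")
        print(f"U+{ord(ch):04X} {name}: "
              f"Bidi_Class={bidi_class}, BidiMirrored={'Y' if mirrored else 'N'}")
```

The parenthesis and the two mathematical operators come out as 
BidiMirrored=Y, while the runic, hieroglyphic and Old North Arabian 
letters come out as N, whatever their default direction.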

Extending this property to whole scripts would be a lot of work, and it 
would need to be more than the current Y/N property, since it should 
account for cases where the glyphs are

 1. always mirrored (Egyptian, Old Italic, Runic; Greek?),
 2. sometimes mirrored (I have examples of both cases in Latin; Old
    North Arabian seems to be in this case too),
 3. never mirrored (Han),
 4. not exactly mirrored (as for U+2232 CLOCKWISE CONTOUR INTEGRAL
    and U+221B CUBE ROOT),
 5. of undefined behaviour under a change of direction (I find it hard
    to guess what LTR Arabic or Syriac, or RTL Devanagari, would mean.
    Maybe there are traditions for some complex scripts, but it makes
    no sense to invent a uniform behaviour for them).

Currently, BidiMirrored=N can mean any of the above, and 
BidiMirrored=Y means 1 or 4.
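The mismatch between the five cases and the two-valued property can be 
written down as a small data structure. This is purely a hypothetical 
sketch: no such multi-valued property exists in the standard, and the 
names MirrorBehaviour and possible_behaviours are invented here for 
illustration.

```python
from enum import Enum


class MirrorBehaviour(Enum):
    """Hypothetical five-valued classification (not a real Unicode property)."""
    ALWAYS = 1        # e.g. Egyptian hieroglyphs, Runic
    SOMETIMES = 2     # e.g. Latin, Old North Arabian
    NEVER = 3         # e.g. Han
    APPROXIMATE = 4   # e.g. U+2232, U+221B
    UNDEFINED = 5     # e.g. LTR Arabic, RTL Devanagari


def possible_behaviours(bidi_mirrored):
    """Map the current Y/N value to the cases it may stand for."""
    if bidi_mirrored == "Y":
        return {MirrorBehaviour.ALWAYS, MirrorBehaviour.APPROXIMATE}
    if bidi_mirrored == "N":
        return set(MirrorBehaviour)  # N does not narrow anything down
    raise ValueError("BidiMirrored is either 'Y' or 'N'")


if __name__ == "__main__":
    for value in ("Y", "N"):
        names = sorted(b.name for b in possible_behaviours(value))
        print(f"BidiMirrored={value}: {', '.join(names)}")
```

Read this way, the existing binary value only ever separates two of the 
five cases from the rest, and N carries no information at all about 
which case applies.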

By the way, I think a comment should be added to §4.7 of the 
standard to clarify that the BidiMirrored property is not intended for 
cases like hieroglyphs or Old Italic.

     Frédéric




