BidiMirrored property and ancient scripts (Was Re: Plain text custom fraction input)

Richard Wordingham richard.wordingham at ntlworld.com
Thu Jul 23 13:42:50 CDT 2015


On Thu, 23 Jul 2015 12:00:06 +0200
Frédéric Grosshans <frederic.grosshans at gmail.com> wrote:

> Le 23/07/2015 00:54, Richard Wordingham a écrit :
> > Which means that Ancient Egyptian hieroglyphs are unencoded!  Their
> > default direction is right-to-left, but that's only the start of the
> > trouble.  The encoded hieroglyphs aren't Bidi-mirrored, so if I
> > embed them in a right-to-left override, I should get retrograde
> > characters.
> The text of the standard says that they should be mirrored in this
> case. Version 7.0.0 has the following comment on Egyptian
> hieroglyphs (p424, p9 of pdf):
> 
>     “When left-to-right directionality is overridden to display
> Egyptian hieroglyphic text right to left, the glyphs should be
> mirrored from those shown in the code charts.”

The UCD may trump the core specification; I'm expecting to be advised
not to trust anything in the core specification.

> Similar comments are present for other historic scripts (Old Italic,
> Runic), but also for Old North Arabian, which is encoded as RTL but
> “Glyphs may be mirrored in lines when they have left-to-right
> directionality”. This kind of implementation at the font level is
> perfectly possible and is indeed done sometimes (see e.g. Andrew
> West’s Anglo-Saxon runic fonts
> http://babelstone.co.uk/Fonts/AngloSaxon.html).

> The BidiMirrored property is not suited to this case because it is
> for a few “characters such as parentheses” (Unicode 8.0.0, §4.7,
> p180 = p23 of ch04.pdf), and it was designed with an LTR default in
> mind: it can in no way handle the case of Old North Arabian.

There had been hope until today.

> Extending this property to whole scripts would be a lot of work, and
> it should be more than a Y/N property as it is currently, since it
> should account for cases where the glyphs are
> 
>  1. always mirrored (Egyptian, Old Italic, Runic. Greek?),
>  2. sometimes mirrored (I have examples of both cases in Latin. Old
>     North Arabian seems to be in this case too),
>  3. never mirrored (Han),
>  4. not exactly mirrored (as for U+2232 CLOCKWISE CONTOUR INTEGRAL
>     and U+221B CUBE ROOT),
>  5. and also cases where the behaviour under a direction change is
>     undefined (I have difficulty guessing what it means to have LTR
>     Arabic or Syriac, or RTL Devanagari. Maybe there are some
>     traditions for some complex scripts, but it makes no sense to
>     invent a uniform behaviour for them).
 
> Currently BidiMirrored=N can mean any of the above, and
> BidiMirrored=Y means (1. or 4.).
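
For readers who want to check how coarse the current Y/N value is, the
property can be queried with Python's standard unicodedata module.  This
is only a sketch: the sample characters and the helper name
is_bidi_mirrored are chosen here for illustration, and the answers
depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Illustrative samples only; the labels are informal, not official names.
SAMPLES = {
    "left parenthesis": "\u0028",
    "clockwise contour integral": "\u2232",
    "cube root": "\u221B",
    "Egyptian hieroglyph U+132B9": "\U000132B9",
    "runic letter U+16A0": "\u16A0",
    "Han ideograph U+4E2D": "\u4E2D",
}


def is_bidi_mirrored(ch: str) -> bool:
    """Return True if the character has Bidi_Mirrored=Yes in the UCD."""
    return unicodedata.mirrored(ch) == 1


if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        value = "Y" if is_bidi_mirrored(ch) else "N"
        print(f"U+{ord(ch):04X} {label}: Bidi_Mirrored={value}")
```

The hieroglyph, the rune and the Han ideograph all report N, although
they fall under different cases of the list above.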

To be precise, having reread the Bidi algorithm, in particular L4 and
HL6:

1) If resolved directionality is R and Bidi_Mirrored=Yes,
mirroring is mandatory.

2) If resolved directionality is L and bidirectional type is not R
or AL, mirroring is prohibited.

3) Otherwise, mirroring is optional.
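
The three cases above can be sketched in Python as a small classifier.
This encodes only the reading of L4 and HL6 given in this message, not
a full implementation of the Bidi algorithm; the function name
mirroring_status and its string results are hypothetical, and the
resolved directionality is assumed to have been computed elsewhere.

```python
import unicodedata


def mirroring_status(ch: str, resolved_direction: str) -> str:
    """Classify mirroring of ch as 'mandatory', 'prohibited' or 'optional'.

    resolved_direction is 'L' or 'R' and is assumed to come from a
    separate run of the Bidi algorithm (not implemented here).
    """
    if resolved_direction not in ("L", "R"):
        raise ValueError("resolved_direction must be 'L' or 'R'")

    # Case 1: resolved R and Bidi_Mirrored=Yes.
    if resolved_direction == "R" and unicodedata.mirrored(ch):
        return "mandatory"

    # Case 2: resolved L and bidirectional type is neither R nor AL.
    if resolved_direction == "L" and unicodedata.bidirectional(ch) not in ("R", "AL"):
        return "prohibited"

    # Case 3: everything else.
    return "optional"
```

For example, mirroring_status("(", "R") gives "mandatory", while a
hieroglyph such as U+132B9 in a run resolved to R gives "optional".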

It's odd that a font that reverses all the Hebrew letters is compliant
with the Unicode standard.

So, I was wrong.  Not marking hieroglyphs as Bidi_Mirrored didn't stop
them being used for Ancient Egyptian in marked up text.

> By the way, I think a comment should be added to §4.7 of the
> standard to clarify that the BidiMirrored property is not intended
> for cases like hieroglyphs or Old Italic.

That is a stupid and dangerous remark.

If the hieroglyphs had had the BidiMirrored property corrected to Yes,
one could have had, in plain text, once fonts had caught up:

<U+132B9 EGYPTIAN HIEROGLYPH R008> for nṯr in normal left-to-right text
<U+202B RIGHT-TO-LEFT EMBEDDING, U+132B9, U+202C POP DIRECTIONAL
FORMATTING> for nṯr in retrograde left-to-right text

and embed whole paragraphs in <U+202B>...<U+202C> for right-to-left
text.
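
As a minimal sketch, the two plain-text sequences above can be built in
Python as follows.  The helper names embed_rtl and code_points are
hypothetical; whether the embedded glyph is actually drawn mirrored
depends on the renderer and the font, which this code does not address.

```python
RLE = "\u202B"      # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"      # POP DIRECTIONAL FORMATTING
NTR = "\U000132B9"  # the hieroglyph used for nṯr in the example above


def embed_rtl(text: str) -> str:
    """Wrap text in a right-to-left embedding."""
    return f"{RLE}{text}{PDF}"


def code_points(text: str) -> list[str]:
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


if __name__ == "__main__":
    print(code_points(NTR))             # normal left-to-right use
    print(code_points(embed_rtl(NTR)))  # the embedded sequence
```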

Once your remark has been adopted in the Unicode Standard, the only
way to get consistently oriented Ancient Egyptian in plain text is to:

a) Add a complete set of right-to-left hieroglyphs.
b) Add the retrograde hieroglyphs to each set.

One hopes that Egyptian Hieroglyphs is the only script for which
whether a glyph is mirrored or not carries meaning.

Richard.


