BidiMirrored property and ancient scripts (Was Re: Plain text custom fraction input)
kenwhistler at att.net
Fri Jul 24 11:28:05 CDT 2015
On 7/24/2015 2:59 AM, Frédéric Grosshans wrote:
> Is that better ? Once again, I agree that forbidding ancient Egyptian
> to be mirrored when “stupid and dangerous”
I can see that this thread seems to have gone off the rails a bit.
The Unicode Standard does not forbid Egyptian hieroglyphs from being
"mirrored" in a RTL layout context. The Unicode Bidirectional Algorithm
neither requires nor forbids that. It is simply out of scope.
First, there is the general issue of mirroring of body text for some
ancient scripts, which in paleographic contexts often followed conventions
(no longer seen, except in rare edge cases) of having the direction of
glyph orientation switch depending on line orientation. This is particularly
noted in epigraphic contexts in ancient scripts of the greater Mediterranean
area, but also occurs occasionally elsewhere. This general mirroring of
body text is *not* part of Unicode plain text. There are no UCD properties
defined for this, normative or informative, at either per-character or
per-script granularity. And there is no algorithm
defined in the Unicode Standard to deal with this issue of paleography.
Note that for the most part, this general mirroring is not a *bi*directional
problem at all. It is a dextroverse versus sinistroverse layout issue, as
nearly all of this kind of epigraphic text does not occur in *bi*directional
contexts at all -- but rather in text where everything goes one direction.
(Lest the nitpickers immediately cite boustrophedon -- boustrophedon is
*also* not *bi*directional text -- it is a convention that alternates
dextroverse lines with sinistroverse lines, but does not mix directions on
the same line.)
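
To make the boustrophedon point concrete, here is a minimal sketch in Python. The function name and the first_ltr parameter are hypothetical, and reversing a string is only a crude plain-text stand-in for sinistroverse layout (it does not mirror glyphs and would break combining sequences). It is meant to show that each line runs in one direction only, so nothing here involves the UBA.

```python
def boustrophedon(lines, first_ltr=True):
    """Alternate line direction: one direction per line, never mixed within a line.

    Hypothetical helper for illustration; a reversed string stands in for a
    sinistroverse (right-to-left) line.
    """
    laid_out = []
    for index, line in enumerate(lines):
        # Even-numbered lines take the starting direction, odd-numbered lines the opposite.
        left_to_right = first_ltr if index % 2 == 0 else not first_ltr
        laid_out.append(line if left_to_right else line[::-1])
    return laid_out


for row in boustrophedon(["ABC", "DEF", "GHI"]):
    print(row)
```

Note that the direction is chosen per line by the layout, not per character by any property lookup.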
Then there is the *specific* issue of bidirectional mirroring. That is
*different*. It is a normative part of the Unicode Bidirectional Algorithm,
it is controlled in applicability by specific rules and by exact specification
of the set of characters that have the Bidi_Mirrored=Y property in the UCD.
That property applies to all paired brackets (except 2 Arabic ornate
parentheses, for legacy reasons) and a set of non-symmetric mathematical
operators (but not to arrow symbols). The applicability of bidirectional
mirroring is mandatory and required by the Unicode Bidirectional
Algorithm, and is essential in the layout of *modern* text, because of
the very general problem of the interpretation of opening and closing for
directionally oriented brackets occurring in pairs, in text where mixed
directional runs may occur together on the same line of text.
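
As a rough illustration of the property described above, Python's standard unicodedata module exposes Bidi_Mirrored through unicodedata.mirrored(). The sample characters and the helper name is_bidi_mirrored are my own choices for illustration, and the values returned depend on the version of the UCD bundled with the Python build in use.

```python
import unicodedata

# Sample characters chosen for illustration only.
samples = {
    "LEFT PARENTHESIS": "\u0028",         # paired bracket
    "ORNATE LEFT PARENTHESIS": "\uFD3E",  # one of the two Arabic ornate parentheses
    "ELEMENT OF": "\u2208",               # non-symmetric mathematical operator
    "RIGHTWARDS ARROW": "\u2192",         # arrow symbol
}


def is_bidi_mirrored(ch):
    """Return True if the character has Bidi_Mirrored=Y in the bundled UCD."""
    return bool(unicodedata.mirrored(ch))


report = {name: is_bidi_mirrored(ch) for name, ch in samples.items()}

for name, flag in report.items():
    print(f"{name}: Bidi_Mirrored={'Y' if flag else 'N'}")
```

The lookup only reports membership in the Bidi_Mirrored=Y set. Whether a mirrored glyph is actually displayed is decided by the UBA from the resolved direction of the run containing the character.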
These two concerns are *not* the same and should not be confused.
They are, however, commonly confused, because they both involve
"mirroring" of glyphs and have something to do with line layout direction.
> I (maybe naively) thought that the BidiMirrored=No property for
> hieroglyphs, runes, etc. in the UCD was volunteer.
It is not "volunteer". It is out of scope.
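
For anyone who wants to see what the UCD actually records for such characters, the following sketch queries three of them with Python's unicodedata module. The choice of characters and the helper name describe are mine, and the results reflect whatever UCD version ships with the Python build. A Bidi_Mirrored value of N here only says the character is outside the set handled by bidirectional mirroring. It records nothing, one way or the other, about paleographic mirroring of body text.

```python
import unicodedata

# Three characters picked for illustration: a hieroglyph, a rune, an Old Italic letter.
chars = ["\U00013000", "\u16A0", "\U00010300"]


def describe(ch):
    """Return (Bidi_Class, Bidi_Mirrored as bool) for one character."""
    return unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch))


summary = {unicodedata.name(ch): describe(ch) for ch in chars}

for name, (bidi_class, mirrored) in summary.items():
    print(f"{name}: Bidi_Class={bidi_class}, Bidi_Mirrored={'Y' if mirrored else 'N'}")
```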
> If it was not, do you think that the unicode consortium would consider
> some (if not all) of the following actions :
> * accepting proposals to “BidiMirror” relevant ancient scripts with
> no modern usage
This will not happen.
> * changing the BidiAlgorithm and BidiMirrored property (or
> BidiMirroredv2) to take into account Mirrored RTL scripts
This will not happen.
> * Distinguish between “never mirrored” characters (Han), and “Sometimes
> mirrored, unknown mirrored” (Latin? Most Indic ? Cyrillic ?)
That is an issue for how to deal with the paleographic issues of
reversed direction body text. People can certainly head down that
direction and create databases of information about which scripts
do this, in which contexts and time periods. But it is completely
out of scope for the UBA. Note that even in scripts that have this
behavior paleographically, the occurrence of RTL versus LTR versions
may differ statistically over time and eventually die out in favor
of one direction or the other. See Old Italic. For that matter,
see ancient Greek, which had RTL, LTR, and boustrophedon, but
which eventually settled on strictly LTR layout.