Proposal for BiDi in terminal emulators

Eli Zaretskii via Unicode unicode at unicode.org
Fri Feb 1 07:26:00 CST 2019


> From: Egmont Koblinger <egmont at gmail.com>
> Date: Fri, 1 Feb 2019 13:40:48 +0100
> Cc: unicode at unicode.org
> 
> I now understand that using presentation forms isn't an ideal
> approach, and the recommendation should be improved here.
> 
> Until that happens, I'm uncertain whether using presentation form
> characters is a decent low-hanging fruit that significantly improves
> the readability in some situations (e.g. "good enough" in some sense
> for Arabic), or is a dead end we shouldn't propagate.

IMNSHO, you shouldn't try solving this problem on your own.  Instead,
use a shaping engine, such as HarfBuzz, to do that for you, since the
emulator does know which fonts it uses, and can access their
properties.  The only problem a terminal emulator does need to solve
in this regard is what to do when N codepoints yield M != N glyphs
that the shaping engine tells you to emit, or, more generally, when
the width on display after shaping is different from N times the
character cell width.
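
To make the N-versus-M mismatch concrete, here is a minimal sketch.
It does not call HarfBuzz; the cell width and the glyph advances are
assumed values standing in for what a shaping engine would report,
and cells_needed is a hypothetical helper.

```python
import unicodedata

def cells_needed(glyph_advances, cell_width):
    """Number of whole cells required to hold the shaped glyphs."""
    total = sum(glyph_advances)
    return -(-total // cell_width)  # ceiling division on integers

# LAM + ALEF: N = 2 codepoints that are commonly rendered as a single
# ligature glyph.  The compatibility decomposition of the presentation
# form U+FEFB confirms the pairing without needing a font.
codepoints = "\u0644\u0627"
assert unicodedata.normalize("NFKC", "\uFEFB") == codepoints

CELL_WIDTH = 10         # assumption: cell width in pixels
shaped_advances = [10]  # assumption: shaper returned M = 1 glyph

reserved = len(codepoints)                          # one cell per codepoint
needed = cells_needed(shaped_advances, CELL_WIDTH)  # cells after shaping
print(reserved, needed)  # prints "2 1"
```

The emulator has to decide what happens to the cell that was reserved
but is no longer needed, or, in the opposite case, where the extra
width comes from.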

> I still do not agree however that the entire responsibility can be
> shifted to the emulator. There are certain important bits of
> information that are only available to the application, and not the
> emulator – as with many other aspects, such as reordering,
> copy-pasting, searching in the data in BiDi-aware text editors using
> the terminal's explicit mode, which are all pushed to the application
> because the emulator cannot do them correctly.

As soon as you attempt to target applications that move the cursor and use
cursor addressing, you are in trouble, and should IMO refrain from
trying to support such applications.  For example, Emacs doesn't even
write whole lines to the screen, it compares the internal
representation of what's on the screen and what should be there, and
only emits the parts that should be modified.  (It does that to
minimize screen writes, which might be expensive, especially if
writing to a remote terminal.)  In such cases, the emulator doesn't
stand a chance of doing the right thing, because the application
doesn't provide enough context for it to reorder text correctly.
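
The effect of such differential updates can be sketched as follows.
This is a deliberately simplified illustration, not Emacs's actual
redisplay code; screen_update is a hypothetical helper that finds the
smallest changed span of one line.

```python
def screen_update(old, new):
    """Return (column, text) for the smallest changed span, or None."""
    width = max(len(old), len(new))
    old, new = old.ljust(width), new.ljust(width)
    if old == new:
        return None
    start = 0
    while old[start] == new[start]:
        start += 1                  # skip the common prefix
    end = width
    while old[end - 1] == new[end - 1]:
        end -= 1                    # skip the common suffix
    return start, new[start:end]

# The terminal is told only "go to column 6, write 'there'".
print(screen_update("hello world", "hello there"))  # prints (6, 'there')
```

The emulator receives the fragment and its position, but not the rest
of the line, so it cannot know the paragraph direction or the
neighbouring characters that reordering depends on.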

So I don't think a bidi-aware terminal emulator can support any
application more complex than those which write full lines to the
terminal, like 'cat', 'sed', 'diff', 'grep', etc.

> I believe we should further study the situation, e.g. see whether
> ECMA-48's SAPV (8.3.18) parameters 5..8 (to explicitly specify whether
> to use isolated/initial/medial/final form for each character) are
> flexible enough to convey all this information, or perhaps a new, more
> powerful means should be crafted.

Once again, I think it's impractical to expect applications to emit
these controls.  The emulator must do this part of the job.
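
For a sense of what emitting these controls would involve, here is a
rough sketch.  It assumes SAPV is encoded as CSI Ps SPACE ] and that
parameters 5..8 apply to the next graphic character, which is my
reading of ECMA-48; sapv and annotate are hypothetical helpers, and
no claim is made that any emulator honours the sequence.

```python
CSI = "\x1b["

# SAPV parameter values 5..8, in the order given in the quoted text.
FORMS = {"isolated": 5, "initial": 6, "medial": 7, "final": 8}

def sapv(form):
    """Assumed encoding of SAPV: CSI, parameter, SPACE, ']'."""
    return f"{CSI}{FORMS[form]} ]"

def annotate(chars_with_forms):
    """Prefix every character with the control selecting its form."""
    return "".join(sapv(form) + ch for ch, form in chars_with_forms)

# A three-letter Arabic word: BEH (initial), YEH (medial), TEH (final).
word = [("\u0628", "initial"), ("\u064A", "medial"), ("\u062A", "final")]
out = annotate(word)
print(len(word), len(out))  # prints "3 18"
```

Under these assumptions the application has to work out the joining
form of every letter itself and send six characters for each one it
wants displayed.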

