Proposal for BiDi in terminal emulators

Egmont Koblinger via Unicode unicode at
Fri Feb 1 07:16:03 CST 2019


On Thu, Jan 31, 2019 at 4:10 PM Eli Zaretskii <eliz at> wrote:

> The reordering happens before TABs are converted to cursor motion,
> does it not?

No, not at all.

You cannot "mix" handling the input and reordering, since the input is
not available in one piece but arrives continuously as a stream.

Consider a heavy BiDi text such as (I'm making up some random
gibberish, uppercase being RTL):
foo BAR FSI BAz quUX 1234 PDI whatEVer

Someone prints it to the terminal, but due to the internals, the
terminal doesn't receive this in one single step but in two
consecutive ones, broken in the middle. Maybe the app split it in half
(e.g. a shell script printed fragments one by one using printf without
a trailing newline). Maybe the emitter is a "dd" printing blocks of
let's say 4kB and this line happens to cross a boundary. Maybe a
transport layer such as ssh split it for whatever reason.

Then would you take the first half of this text, let's say
foo BAR FSI BAz quU
even with unbalanced BiDi controls, then reorder it, and continue from
it? Continue how? How to remember to reorder the second half too, but
not the first half once again in order to avoid "double BiDi"?
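
To make the problem concrete, here is a minimal sketch in Python. It
does not implement the real Unicode BiDi algorithm: toy_reorder is a
made-up stand-in that merely reverses each run of uppercase (pretend
RTL) letters, and the FSI/PDI controls are left out for simplicity.
Under these assumptions, reordering the two halves separately gives a
different result than reordering the whole line:

```python
import re

def toy_reorder(text):
    """Toy stand-in for BiDi (an assumption, not the real algorithm):
    reverse each maximal run of uppercase, i.e. pretend-RTL, letters."""
    return re.sub(r"[A-Z]+", lambda m: m.group(0)[::-1], text)

line = "foo BAR BAz quUX 1234 whatEVer"

# The stream happens to be broken in the middle of the RTL run "UX".
first = "foo BAR BAz quU"
second = line[len(first):]

whole = toy_reorder(line)
piecewise = toy_reorder(first) + toy_reorder(second)

print(whole)      # foo RAB ABz quXU 1234 whatVEer
print(piecewise)  # foo RAB ABz quUX 1234 whatVEer
```

The "UX" run comes out in a different order depending solely on where
the stream was cut.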

What to do with explicit cursor movements: would they jump to the
visual position? This would break absolutely basic principles, e.g.
jumping twice to the same location to overwrite a letter twice in a
row may actually end up overwriting two different letters, since
everything was potentially rearranged after the first overwrite
happened. Any application having any existing preconception about
cursor movement would uncontrollably fall apart.
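
A small sketch of the double overwrite, again in Python and again with
a made-up toy_reorder (reverse each uppercase run) standing in for real
BiDi. The function overwrite_then_reorder models a hypothetical design
that rearranges the stored row after every write; it is not how any
particular emulator works:

```python
import re

def toy_reorder(text):
    """Toy stand-in for BiDi (an assumption, not the real algorithm):
    reverse each maximal run of uppercase, i.e. pretend-RTL, letters."""
    return re.sub(r"[A-Z]+", lambda m: m.group(0)[::-1], text)

def overwrite_plain(row, col, ch):
    """Classic emulation: replace the cell at `col`, nothing else."""
    cells = list(row)
    cells[col] = ch
    return "".join(cells)

def overwrite_then_reorder(row, col, ch):
    """Hypothetical broken design: the stored row itself is reordered
    after every write."""
    return toy_reorder(overwrite_plain(row, col, ch))

# The app overwrites column 0 twice in a row: first with X, then with Y.
plain = overwrite_plain(overwrite_plain("ABC", 0, "X"), 0, "Y")
broken = overwrite_then_reorder(
    overwrite_then_reorder("ABC", 0, "X"), 0, "Y")

print(plain)   # YBC  (Y replaced X, as the app expects)
print(broken)  # XBY  (X survived, Y clobbered a different letter)
```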

This approach is doomed to fail big time (and was the reason I had to
drop ECMA TR/53's DCSM "presentation" mode).

The only reasonable way is if you have two layers. The bottom layer
does the emulation almost exactly as it used to do, with no BiDi
whatsoever (except for tiny additions, e.g. it tracks BiDi-related
properties such as the paragraph direction). The upper layer displays
the data, and this upper layer performs BiDi solely for display
purposes: using the lower layer's data as input, but not modifying it.
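
The two layers could be sketched roughly like this in Python. The names
EmulationLayer, feed, move_to and render are invented for illustration,
toy_reorder (reverse each uppercase run) is only a stand-in for the
real BiDi algorithm, and tracking of the paragraph direction is left
out:

```python
import re

def toy_reorder(text):
    """Toy stand-in for BiDi (an assumption, not the real algorithm):
    reverse each maximal run of uppercase, i.e. pretend-RTL, letters."""
    return re.sub(r"[A-Z]+", lambda m: m.group(0)[::-1], text)

class EmulationLayer:
    """Lower layer: one row of cells in logical order, plus a cursor.
    It knows nothing about reordering."""
    def __init__(self, width):
        self.cells = [" "] * width
        self.cursor = 0

    def feed(self, data):
        for ch in data:
            if self.cursor < len(self.cells):
                self.cells[self.cursor] = ch
                self.cursor += 1

    def move_to(self, col):
        self.cursor = col

def render(emu):
    """Upper layer: reads the cells and returns a reordered copy.
    The stored cells are never modified."""
    return toy_reorder("".join(emu.cells))

emu = EmulationLayer(8)
for chunk in ("ab", "CD", "E fg"):   # chunk boundaries don't matter
    emu.feed(chunk)
before = render(emu)                 # "abEDC fg"

# Overwrite column 2 twice; both writes hit the same stored cell.
emu.move_to(2); emu.feed("X")
emu.move_to(2); emu.feed("Y")
stored = "".join(emu.cells)          # "abYDE fg"
after = render(emu)                  # "abEDY fg"
```

Since render only produces a copy, it can be called any number of times
without the "double BiDi" problem.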

This is, by the way, also what current emulators that shuffle the
characters around do.

Let's also mention that the lower layer (emulation) should be as fast
as possible, e.g. VTE can handle input in the ballpark of 10 MB/s.
Reordering, that is, running BiDi for display purposes, needs to happen
much more rarely, maybe 20-60 times per second. It would be a
performance killer to have to run the BiDi algorithm upon every
received chunk of data – in fact, to eliminate any possible behavior
difference due to timing differences, it'd need to happen after every
printable character received.
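
Taking the ballpark figures above at face value, 10 MB/s at 20-60
repaints per second means very roughly 170 to 500 kB of input can
arrive between two repaints. A sketch of the decoupling in Python
follows; the Screen class, the dirty flag and the chunks-per-frame
figure are all made up for illustration, and toy_reorder (reverse each
uppercase run) merely stands in for the real BiDi algorithm:

```python
import re

def toy_reorder(text):
    """Toy stand-in for BiDi (an assumption, not the real algorithm):
    reverse each maximal run of uppercase, i.e. pretend-RTL, letters."""
    return re.sub(r"[A-Z]+", lambda m: m.group(0)[::-1], text)

class Screen:
    def __init__(self):
        self.logical = ""       # lower layer's data, logical order
        self.dirty = False
        self.reorder_runs = 0   # how often the expensive step ran

    def feed(self, chunk):
        # Hot path: store the data and mark the screen dirty, no BiDi.
        self.logical += chunk
        self.dirty = True

    def on_frame(self):
        # Called at the repaint rate; reorders only if something changed.
        if not self.dirty:
            return None
        self.dirty = False
        self.reorder_runs += 1
        return toy_reorder(self.logical)

screen = Screen()
CHUNKS_PER_FRAME = 100          # made-up figure
for i in range(1000):
    screen.feed("aB")
    if (i + 1) % CHUNKS_PER_FRAME == 0:
        screen.on_frame()
idle = screen.on_frame()        # nothing new arrived: no reordering

print(screen.reorder_runs)      # 10, rather than 1000
```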

There's absolutely no way we could reorder first, and then handle
TAB's cursor movement. TAB's cursor movement happens in the lower
layer, reordering happens in the upper one.
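
As a rough Python illustration of that ordering: the emulate function
below resolves TAB into plain cursor motion, so by the time display
runs there is no TAB left to reorder around. The function names, the
tab stops at every 8 columns, and toy_reorder (reverse each uppercase
run, standing in for real BiDi) are all assumptions of this sketch:

```python
import re

TAB_WIDTH = 8  # assumption: tab stops at every 8th column

def toy_reorder(text):
    """Toy stand-in for BiDi (an assumption, not the real algorithm):
    reverse each maximal run of uppercase, i.e. pretend-RTL, letters."""
    return re.sub(r"[A-Z]+", lambda m: m.group(0)[::-1], text)

def emulate(data, width):
    """Lower layer: TAB only moves the cursor, it is never stored."""
    cells = [" "] * width
    cursor = 0
    for ch in data:
        if ch == "\t":
            next_stop = (cursor // TAB_WIDTH + 1) * TAB_WIDTH
            cursor = min(next_stop, width - 1)
        elif cursor < width:
            cells[cursor] = ch
            cursor += 1
    return cells

def display(cells):
    """Upper layer: sees only the cells, returns a reordered copy."""
    return toy_reorder("".join(cells))

cells = emulate("AB\tcd", 12)
shown = display(cells)
# cells hold "AB", six blanks, "cd", two blanks; shown starts with "BA"
```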

