Proposal for BiDi in terminal emulators

Egmont Koblinger via Unicode unicode at
Thu Jan 31 03:58:54 CST 2019


On Wed, Jan 30, 2019 at 5:31 PM Adam Borowski <kilobyte at> wrote:

> The program (emacs in this case) can do arbitrary reordering of characters
> on the grid, it also has lots of information the terminal doesn't.  For
> example, what are you going to do when there's a line longer than what fits
> on the screen?  Emacs will cut and hide part of it; any attempts to reorder
> that paragraph by the terminal are outright broken as you don't _have_ the
> paragraph.  Same for a popup window on the middle of the screen partially
> obscuring some text underneath.

This is absolutely correct so far.

> And if you argue "so make emacs print your
> new code to disable formatting", so do thousands of other programs that are
> less sophisticated than emacs.

Yes, I do argue that emacs will need to print a new escape sequence.
Which is much-much-much-much-much better than having to tell users to
go into the settings of their macOS Terminal / Konsole /
gnome-terminal etc. and disable BiDi there, isn't it?

Could you please give me a brief idea about those "thousands of other
programs" that will need to be adjusted? What other apps can do BiDi?
Not even Vim/NeoVim can do it.

If an app doesn't support BiDi, it's broken anyways when encountering
RTL text. It'll still be broken, just broken differently. Did you mean
all these programs as those thousands?

For ncurses apps there's also a workaround that you could apply:
create a terminfo where the ti/te entries not only switch to/from the
alternate screen but also disable/enable BiDi. In that case all these
thousand ones will be "fixed" (that is: broken in the "old" way rather
than broken in the "new" way).

On the other hand, what you absolutely can *not* do automatically by
emitting escape sequences at the right times, is to enclose the output
of much lighter utilities like "echo", "cat", "grep", "head" and so on
with any kind of BiDi controls.

> On the other hand, all that the program can output is a sequence of Unicode
> codepoints.  These don't include shaping information

With "presentation form" characters, yes, they can, they do including
shaping information.

> and are not supposed
> to.  The shaping is explicitly meant to be done by the terminal,


> and it's
> the terminal who's equipped with _most_ of the needed data

Why? It's the app that knows the context characters, it's the app that
knows the language.

What is it that the terminal knows, but the app doesn't although
should, or what is it that the terminal doesn't know if presentation
form characters are used?

What is it that the app knows but cannot pass to the terminal?
Shouldn't we then extend the protocol so that it can pass these, too?


More information about the Unicode mailing list