Proposal for BiDi in terminal emulators

Egmont Koblinger via Unicode unicode at unicode.org
Fri Feb 1 10:42:10 CST 2019


Hi,

I'm trying to respond to every question, but I'm having a hard time
keeping up :-)

Thanks a lot for all the precious input about shaping!

Here's my suggestion, for version 0.2 of the recommendation:

- No longer encourage any use of presentation form characters.

- State that it's the terminal emulator's task to perform shaping,
both in implicit and explicit modes.

- Leave it for a future enhancement to handle trickier cases in
explicit mode, such as shaping a word that's only partially
visible, or preventing shaping when two words happen to touch each other
but are visually separated by other means (e.g. background color).
Leave it for further research whether we could use ZWJ/ZWNJ here,
whether we could use ECMA's SAPV 5-8 & 21-22, or whether we should
invent something new (perhaps even telling the terminal emulator what
neighboring previous/next characters to imagine there for the purpose
of shaping)...

Let me know if you have any remaining problems/concerns/etc.

As for the implementation in VTE: initially I'll still use
presentation form characters, solely because that's the low-hanging
fruit (low investment, high gain). I've already implemented
it in about an hour (a few further hacks will be necessary to
extend it to explicit mode, but that's still easily doable), whereas
switching to HarfBuzz is expected to take weeks of heavy work. We'll
tackle that in a subsequent version. And if anyone's happy to help,
there's already some bounty for HarfBuzz support :)
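
For illustration only, here is a minimal sketch of what shaping with
presentation form characters can look like. It is not VTE's code. It
assumes the text is in logical order, derives the form table from the
compatibility decompositions of Arabic Presentation Forms-B, and ignores
combining marks, lam-alef ligatures and ZWJ/ZWNJ. The names FORMS,
build_forms and shape are hypothetical.

```python
import unicodedata

TAGS = ("<isolated>", "<initial>", "<medial>", "<final>")


def build_forms():
    """Map each base Arabic letter to its available presentation forms.

    Scans Arabic Presentation Forms-B (U+FE70..U+FEFC) and keeps only the
    entries that decompose to a single base letter, skipping ligatures.
    """
    forms = {}
    for cp in range(0xFE70, 0xFEFD):
        decomp = unicodedata.decomposition(chr(cp)).split()
        if len(decomp) == 2 and decomp[0] in TAGS:
            base = chr(int(decomp[1], 16))
            forms.setdefault(base, {})[decomp[0].strip("<>")] = chr(cp)
    return forms


FORMS = build_forms()


def shape(text):
    """Replace Arabic letters with presentation forms based on neighbors."""
    out = []
    for i, ch in enumerate(text):
        cur = FORMS.get(ch)
        if cur is None:
            out.append(ch)  # not a letter we know how to shape
            continue
        prev = FORMS.get(text[i - 1]) if i > 0 else None
        nxt = FORMS.get(text[i + 1]) if i + 1 < len(text) else None
        # The previous letter connects forward only if it is dual-joining,
        # i.e. it has an initial form; this letter must have a final form.
        joins_prev = prev is not None and "initial" in prev and "final" in cur
        # This letter connects forward only if it has an initial form and
        # the next letter has a final form.
        joins_next = nxt is not None and "final" in nxt and "initial" in cur
        if joins_prev and joins_next:
            key = "medial"
        elif joins_prev:
            key = "final"
        elif joins_next:
            key = "initial"
        else:
            key = "isolated"
        out.append(cur.get(key, ch))
    return "".join(out)


# BEH, ALEF, BEH: initial BEH, final ALEF, isolated BEH
print([hex(ord(c)) for c in shape("\u0628\u0627\u0628")])
# ['0xfe91', '0xfe8e', '0xfe8f']
```

A real shaping engine such as HarfBuzz works on glyphs from the font
rather than on characters, which is why this kind of substitution can
only ever be a stopgap.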

Thanks again for the great guidance!

cheers,
egmont

On Tue, Jan 29, 2019 at 1:50 PM Egmont Koblinger <egmont at gmail.com> wrote:
>
> Hi,
>
> Terminal emulators are a powerful tool used by many people for various
> tasks. Most terminal emulators' bugtrackers have a request to add RTL /
> BiDi support. Unicode has supported BiDi for about 20 years now.
> Still, the intersection of these two fields isn't solved. Even some
> Unicode experts have stated over time that no one knows how to do it
> properly.
>
> The only documentation I could find (ECMA TR/53) predates the Unicode
> BiDi algorithm, so it's no surprise that it doesn't follow the
> current state of the art or best practices.
>
> Some terminal emulators decided to run the BiDi algorithm for display
> purposes on their lines (rather than paragraphs, uh), not seeing the big
> picture that such behavior turns them into a platform on top of
> which it's literally impossible to implement a proper BiDi-aware text
> editing experience (vim, emacs, whatever). In turn, vim, emacs and
> friends stand there clueless, not knowing how to do BiDi in terminals.
>
> With about 5 years of experience in terminal emulator development, and
> some prior experience developing BiDi homepages with the kind mentoring
> of one of the BiDi gurus (Aharon, if you're reading this, hi there!),
> I decided to tackle this issue. I studied and evaluated the
> aforementioned documentation and the behavior of such terminals,
> pointed out the problems, and came up with a draft proposal.
>
> My work isn't complete yet. One of the most important pending issues
> is to figure out how to track BiDi control characters (e.g. which
> character cells they belong to); this is to be addressed in a subsequent
> version. But I sincerely hope I managed to get the basics right and
> clean enough so that work can begin on implementing proper support in
> terminal emulators as well as fullscreen text applications; and as we
> gain experience and feedback, extending the spec to address the
> missing bits too.
>
> You can find this (draft) specification at [1]. Feedback is welcome –
> if it's actionable, then preferably over there in the project's
> bugtracker.
>
> [1] https://terminal-wg.pages.freedesktop.org/bidi/
>
>
> cheers,
> egmont (GNOME Terminal / VTE co-developer)


