Proposal for BiDi in terminal emulators

Adam Borowski via Unicode unicode at unicode.org
Wed Jan 30 10:31:42 CST 2019


On Wed, Jan 30, 2019 at 05:56:00PM +0200, Eli Zaretskii via Unicode wrote:
> > - It doesn't do Arabic shaping.
> 
> It doesn't do _any_ shaping.  Complex script shaping is left to the
> terminal, because it's impossible to do shaping in any reasonable way
> without controlling the fonts being used and accessing the font
> information, and this is not possible when you run on a terminal

It's the inverse of the situation with RTL reordering.  The interface
between the program and the terminal is a character cell grid (really, a
sequence of printables and \e-based codes, but that's a technical detail).
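
To make "a sequence of printables and \e-based codes" concrete, here is a
minimal sketch of what a program actually hands to the terminal.  The helper
name put() is made up for illustration; the only real piece is the standard
cursor-position sequence, ESC [ row ; col H, with 1-based coordinates.

```python
ESC = "\x1b"

def put(row, col, text):
    """Move the cursor to (row, col), 1-based, then emit the printables."""
    return f"{ESC}[{row};{col}H{text}"

# The program decides which cell each character lands in; the terminal only
# ever sees this byte stream, never the document the cells came from.
stream = put(1, 1, "abc") + put(1, 10, "xyz")
print(repr(stream))
```

Nothing in that stream says where a paragraph starts or ends, or whether
"abc" and "xyz" even belong to the same piece of text.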

The program (emacs in this case) can do arbitrary reordering of characters
on the grid, and it also has lots of information the terminal doesn't.  For
example, what are you going to do when there's a line longer than what fits
on the screen?  Emacs will cut and hide part of it; any attempt by the
terminal to reorder that paragraph is outright broken, as you don't _have_
the paragraph.  Same for a popup window in the middle of the screen partially
obscuring some text underneath.  And if you argue "so make emacs print your
new code to disable formatting", then thousands of other programs that are
less sophisticated than emacs would have to do the same.
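
A rough sketch of why the cut-off line is a problem, assuming the terminal
guesses paragraph direction from the first strong character (a simplified
version of rules P2/P3 of the Unicode BiDi algorithm, ignoring isolates).
The function names and the sample text are made up, and the sketch assumes
one cell per codepoint, which holds for this sample only.

```python
import unicodedata

def first_strong_direction(text):
    """Simplified UAX #9 P2/P3: direction of the first strong character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "ltr"  # no strong character found: fall back to LTR

def visible_part(line, hscroll, width):
    """What a horizontally scrolled program would leave on the screen."""
    return line[hscroll:hscroll + width]

# Latin label followed by Hebrew text (illustrative sample only).
paragraph = "status: " + "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd" + " (ok)"
fragment = visible_part(paragraph, hscroll=8, width=9)

print(first_strong_direction(paragraph))  # what the program knows
print(first_strong_direction(fragment))   # what the terminal would infer
```

The whole paragraph starts with a Latin letter, while the fragment that
reaches the screen starts with a Hebrew one, so a terminal working only from
the grid would pick the opposite base direction.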

On the other hand, all that the program can output is a sequence of Unicode
codepoints.  These don't include shaping information, and are not supposed
to.  The shaping is explicitly meant to be done by the terminal, and it's
the terminal that's equipped with _most_ of the needed data (it might lack
context just past the screen's edge or under an overlapping window, but
that's not specific to complex shaping -- the same can happen to the other
half of a CJK character).  You know whether the font in use supports shaping,
you can have access to a graphic view (as opposed to the array of
codepoints) -- heck, only you know that the text is rendered on a screen
rather than a Braille device.  And if you miss an opportunity to shape
something, the result is still readable to the user, merely not as good as
it could be.
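
The "other half of a CJK character" case can be sketched like this.  It is a
simplification that treats East Asian Wide and Fullwidth characters as two
cells and everything else as one (so no combining marks or other zero-width
cases); cell_width() and clip_to_cells() are made-up names.

```python
import unicodedata

def cell_width(ch):
    # Simplification: wide/fullwidth take two cells, everything else one.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def clip_to_cells(text, width):
    """Return (text that fits entirely, True if a wide char straddles the edge)."""
    used = 0
    fitted = []
    for ch in text:
        w = cell_width(ch)
        if used + w > width:
            # A two-cell character with exactly one free cell left is cut in half.
            return "".join(fitted), (w == 2 and used + 1 == width)
        fitted.append(ch)
        used += w
    return "".join(fitted), False

print(clip_to_cells("ab\u6f22\u5b57", 3))
print(clip_to_cells("ab\u6f22\u5b57", 4))
```

With three cells available the first wide character has only one cell left
to live in, which is the same kind of missing context as a shaping run that
continues past the edge of the screen.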


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Remember, the S in "IoT" stands for Security, while P stands
⢿⡄⠘⠷⠚⠋⠀ for Privacy.
⠈⠳⣄⠀⠀⠀⠀

