Bidi paragraph direction in terminal emulators

Philippe Verdy via Unicode unicode at unicode.org
Sun Feb 10 07:54:39 CST 2019


On Sat, 9 Feb 2019 at 20:55, Egmont Koblinger via Unicode <
unicode at unicode.org> wrote:

> Hi Asmus,
>
> > On quick reading this appears to be a strong argument why such emulators
> will
> > never be able to be used for certain scripts. Effectively, the model
> described works
> > well with any scripts where characters are laid out (or can be laid out)
> in fixed
> > width cells that are linearly adjacent.
>
> I'm wondering if you happen to know:
>
> Are there any (non-CJK) scripts for which a mechanical typewriter does
> not exist due to the complexity of the script?
>

Look into South and Southeast Asian scripts (Lao, Khmer, Tibetan...) and
large syllabaries (Canadian Aboriginal Syllabics, Ethiopic).
Even Arabic is challenging and does not work very well (or is very ugly)
with typewriters or monospaced fonts, except if we use "simplified" Arabic.
Hebrew is a bit better but also has issues if you need to support all its
diacritics.

Finally, even Latin is not easy to fit, with its ligatures and multiple
diacritics, some of them with complex layouts and applicable to pairs of
letters (or sometimes larger groups).
The monospace restriction is a strong limitation, but then I don't see why a
"terminal" could not handle fonts with variable metrics, and why it must be
modeled only as a regular grid of rectangular cells (all of equal size),
each containing only one "character" (or cluster?). It is perfectly possible
to have a terminal handle text as a collection of "logical lines", split
(horizontally?) into multiple spans covering one or more cells, each span
containing one or more characters (or a full cluster) rendered correctly.
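
To make the span model above concrete, here is a minimal sketch in Python.
It is only an illustration: the names Span, LogicalLine, width and span_at
are hypothetical, and the number of cells each span occupies is supplied by
hand rather than measured from a font.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Span:
    """An unbreakable run: one or more characters, or a full cluster."""
    text: str
    cells: int  # grid cells the rendered span covers (assumed >= 1, given by hand)


@dataclass
class LogicalLine:
    """A logical line stored as spans instead of one character per cell."""
    spans: List[Span] = field(default_factory=list)

    def width(self) -> int:
        """Total number of cells covered by the line."""
        return sum(span.cells for span in self.spans)

    def span_at(self, column: int) -> Optional[Tuple[int, Span]]:
        """Return (index, span) for the span covering a 0-based column, or None."""
        start = 0
        for index, span in enumerate(self.spans):
            if start <= column < start + span.cells:
                return index, span
            start += span.cells
        return None


# Hypothetical line: a plain letter, a Latin ligature drawn over two cells,
# and an Arabic lam-alef pair that is also given two cells.
line = LogicalLine([Span("a", 1), Span("ffi", 2), Span("\u0644\u0627", 2)])
```

With this layout a cursor position is no longer a bare column: span_at(2)
and span_at(1) both resolve to the same "ffi" span, so an editor would move
by span rather than by cell.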

But then you recreate the basic HTML standard: just discard the "document"
and "body" levels, which would be implicit in a terminal, keep the "block"
and "inline" elements, and flow the text (note that rendered lines could as
well have variable heights, depending on the height of their unbreakable
spans and their vertical alignment...). But then you need specific controls
to make proper vertical alignments: basically you need a "tabulator" in the
terminal with a way to define the start of a tabulator scope and its end,
and then reference tabulations by id when defining them in the middle of
the text. This tabulator would be more powerful than just the TAB control,
which only uses an implicit/predefined tabulator.
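
As a rough illustration of such a tabulator, the Python sketch below
resolves tab stops referenced by id within one scope. Everything in it is
an assumption made for the example: the function names, the (id, text)
row representation, and the use of len() as a stand-in for rendered width,
which a real terminal would have to measure from the font.

```python
from typing import Dict, List, Tuple

Row = List[Tuple[str, str]]  # (tab id, text placed at that tab stop)


def resolve_tab_stops(rows: List[Row], order: List[str], gap: int = 1) -> Dict[str, int]:
    """Compute a start position for each tab id inside one tabulator scope."""
    widths = {tab_id: 0 for tab_id in order}
    for row in rows:
        for tab_id, text in row:
            # len() stands in for the rendered width of the text (assumption).
            widths[tab_id] = max(widths[tab_id], len(text))
    stops: Dict[str, int] = {}
    position = 0
    for tab_id in order:
        stops[tab_id] = position
        position += widths[tab_id] + gap
    return stops


def render_row(row: Row, stops: Dict[str, int]) -> str:
    """Place each piece of text at the stop its id refers to."""
    out = ""
    for tab_id, text in row:
        out = out.ljust(stops[tab_id]) + text
    return out


# One scope with two hypothetical tab ids, "name" and "desc".
rows = [
    [("name", "ls"), ("desc", "list files")],
    [("name", "chmod"), ("desc", "change mode")],
]
stops = resolve_tab_stops(rows, ["name", "desc"])
rendered = [render_row(row, stops) for row in rows]
```

Unlike the plain TAB control, the stops here are not fixed in advance: they
depend on the content of the whole scope, which is why the scope needs an
explicit start and end.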

Then for editors in terminals you need a way to query the position of some
items and make "logical" moves: the simple (line/column) coordinates on a
grid are not usable. In HTML we would do that with form input elements (the
form is flowed normally but is navigable, and input elements will have
their own editable areas).

So using controls, you would try to mimic again what HTML already provides
you for free (and without complex specifications and redevelopment).

So my opinion is that all legacy terminal protocols will remain broken, and
it is more viable to work with the W3C to define a basic HTML profile
suitable for terminals. It would benefit from all the improvements made
in HTML to support i18n, including required ones (BiDi, variable-width
fonts needed for complex scripts, accessibility...), but without the extra
elements that were added in HTML5 for semantic document structures (HTML5
still speaks about the "document" level, but there's little defined for
documents that are infinite streams that you can start reading from a random
position and that possibly never terminate).

All we need is a subset of HTML5 with only a few block elements without
terminator tags ("p" would be implicit) and the inline elements for all the
rest, and this becomes a viable "terminal protocol" which would deprecate
all the legacy VT-like protocols (and would put an end to the desire of
adding many new controls or duplicate reencodings in Unicode for specific
styles).

The only block elements that would be useful on top of this are forms and
form inputs, to create editable fields, and some attributes to allow or
disallow editing. Scripting would be an option (only for local data
validation or filtering some inputs that must not be sent to the server, or
to allow accessibility features, input methods and orthographic helpers).
Then with that we are no longer blocked by the old terminal limitations
(but it will still be possible for a terminal emulator to create a
reasonable layout to map it to a grid-based terminal, and then offer some
helper tools to show a selectable popup view for things that cannot be
rendered on the basic grid).