Proposal for BiDi in terminal emulators

Richard Wordingham via Unicode unicode at unicode.org
Sat Feb 2 19:30:26 CST 2019


On Sat, 2 Feb 2019 23:02:10 +0100
Egmont Koblinger via Unicode <unicode at unicode.org> wrote:

> Hi Richard,
> 
> On Sat, Feb 2, 2019 at 9:57 PM Richard Wordingham
> <richard.wordingham at ntlworld.com> wrote:
> 
> > Seriously, you need to give a definition of 'visual order' for this
> > context.  Not everyone shares your chiralist view.  
> 
> When I look at the Unicode BiDi algorithm, or go to an online demo at
> https://unicode.org/cldr/utility/bidic.jsp, or look at the FriBidi API
> etc., their very basic functionality is that I pass the logical order
> (as the string is expected to be stored in text files etc.), and the
> result of the algorithm is the visual order.

That first reference doesn't even use the word 'visual'.  When I look
in Unicode Standard Annex #9, 'Unicode Bidirectional Algorithm', I find, 'In
combination with the following rule, this means that trailing
whitespace will appear at the visual end of the line (in the paragraph
direction)'.  Paragraph direction, of course, can be left-to-right or
right-to-left.  Your best hope there is, 'No bidirectional formatting.
This implies that the system does not visually interpret characters
from right-to-left scripts.'  It's a shame that that statement is not
true; one could build a system using N'ko decimal digits that only
visually interpreted characters from right-to-left scripts.
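
The point about N'ko digits can be checked against the Unicode Character
Database. The following is only a minimal sketch, assuming Python's
standard unicodedata module; the helper name bidi_classes is made up for
illustration. It shows that the N'Ko decimal digits carry the strong
right-to-left bidi class, unlike European digits.

```python
import unicodedata

# U+07C0..U+07C9 are the N'Ko decimal digits.
NKO_DIGITS = [chr(cp) for cp in range(0x07C0, 0x07CA)]


def bidi_classes(text):
    """Return the Bidi_Class of each character, in logical (storage) order.

    Hypothetical helper for illustration; it only looks up properties and
    does not run the bidirectional algorithm itself.
    """
    return [unicodedata.bidirectional(ch) for ch in text]


print(bidi_classes(NKO_DIGITS))  # each N'Ko digit reports 'R'
print(bidi_classes("123"))       # European digits report 'EN'
```

So a system that displays nothing but N'Ko digits would still be
interpreting strongly right-to-left characters.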

> What else do I need to further specify in the concept of "visual
> order"?

All I am saying is that your proposal should define what it means by
visual order.

<snip>
> This is the low level issue I'm trying to address, to make sure that
> letters of words are always shown in the correct order. There's no way
> you could do shaping underneath this level, it makes no sense to talk
> about shaping, zero-width (non)joining, special Khmer symbols and
> whatnot on reversed words, right?

> The order of the letters needs to be
> fixed first, which is what I'm doing, and then all the bells and
> whistles needed for shaping might come on top of this.

Shaping for RTL scripts happens on strings stored in logical order.
These are then laid out right to left, though the dominant usage of
the term 'advance width' for right-to-left glyph sequences feels
perversely different from the use for left to right glyph sequences.

Passing text in the form of characters in left-to-right order is an
annoying distraction, presumably forced on you by the attempt to
maximise compatibility with existing systems.

Casting text into grids of 'characters' requires consideration of all
types of writing elements.  The division into panes is an awkward
complication; panes in the application that are not shared with the
terminal are even worse for shaping.

Richard.

