Line wrapping of mixed LTR/RTL text

Eli Zaretskii via Unicode unicode at unicode.org
Tue Aug 28 11:28:25 CDT 2018


> Date: Tue, 28 Aug 2018 13:44:58 +0300
> From: Cosmin Apreutesei via Unicode <unicode at unicode.org>
> 
> There is this sentence in UAX#9 which provides a clue: "[...] trailing
> whitespace will appear at the visual end of the line (in the paragraph
> direction).". I'm not sure what that means, but by doing some tests
> with fribidi and libunibreak I noticed that the whitespace always
> sticks to the logical end of the word (so visually to the right for
> LTR runs and to the left for RTL runs), regardless of the base
> paragraph direction.

That is not so if the line ends after the whitespace: in that case the
whitespace is trailing, and will appear at the visual end of the
line.  Only if you add some character after the whitespace will the
whitespace "jump" to the other side of the word.

> Quick example showing the problem. The following text:
> 
> لمفاتيح ABC DEF
> 
> with RTL base direction would wrap (for a certain line width) as:
> 
> ABC  لمفاتيح
> DEF
> 
> with two spaces between the Latin and Arabic text, one from the Latin
> text and one from the Arabic text.

No, it should show the space after ABC to the left of ABC,
i.e. immediately before the line end.

What UAX#9 tells you is that you need to decide that the line will
wrap after the space that follows "ABC", then reorder the line as if it
ended after that space, which will produce this:

لمفاتيح ABC 

(with the trailing space to the left of "ABC").  Then you should
display "DEF" on the next line.

IOW, the correct order is:

  . find levels
  . wrap in logical order
  . reorder wrapped lines
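
To make the order of those three steps concrete, here is a minimal
Python sketch.  It is not a full UAX#9 implementation: toy_levels()
handles only strong L/R/AL characters and treats everything else as
neutral whitespace (no numbers, no explicit embeddings or isolates),
wrap_logical() measures width in characters rather than glyph advances,
and all the function names are made up for this illustration.  The
reordering step applies rule L1 (trailing whitespace gets the paragraph
level) and rule L2 (reverse runs from the highest level down to the
lowest odd level) separately to each wrapped line.

```python
import unicodedata

def toy_levels(text, base_level):
    """Step 1 (simplified): resolve one embedding level per character."""
    strong = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            strong.append("L")
        elif cls in ("R", "AL"):
            strong.append("R")
        else:
            strong.append(None)  # assumption: everything else is neutral
    base_dir = "R" if base_level % 2 else "L"
    levels = []
    for i, d in enumerate(strong):
        if d is None:
            # Neutrals take the surrounding direction if both sides agree,
            # otherwise the paragraph direction (rules N1/N2, simplified).
            before = next((s for s in reversed(strong[:i]) if s), base_dir)
            after = next((s for s in strong[i + 1:] if s), base_dir)
            d = before if before == after else base_dir
        # Implicit levels: text against the base direction goes up one.
        levels.append(base_level if d == base_dir else base_level + 1)
    return levels

def wrap_logical(text, width):
    """Step 2: greedy wrap in logical order; returns (start, end) spans.

    Whitespace after a word stays on that word's line and is not counted
    against the width (an assumption of this sketch).
    """
    spans, line_start, i, n = [], 0, 0, len(text)
    while i < n:
        j = i
        while j < n and not text[j].isspace():
            j += 1
        word_end = j
        while j < n and text[j].isspace():
            j += 1
        if word_end - line_start > width and i > line_start:
            spans.append((line_start, i))
            line_start = i
        i = j
    spans.append((line_start, n))
    return spans

def reorder_line(text, levels, start, end, base_level):
    """Step 3: reorder one wrapped line into left-to-right visual order."""
    chars = list(text[start:end])
    lv = list(levels[start:end])
    # Rule L1 (whitespace part only): trailing whitespace on the line is
    # reset to the paragraph level.
    k = len(chars)
    while k > 0 and chars[k - 1].isspace():
        k -= 1
        lv[k] = base_level
    if not lv:
        return ""
    # Rule L2: reverse every run of characters at `level` or higher.
    highest = max(lv)
    lowest_odd = min((l for l in lv if l % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(lv):
            if lv[i] >= level:
                j = i
                while j < len(lv) and lv[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)

def layout(text, width, base_level):
    levels = toy_levels(text, base_level)        # find levels
    spans = wrap_logical(text, width)            # wrap in logical order
    return [reorder_line(text, levels, s, e, base_level)  # reorder lines
            for s, e in spans]

# The Arabic word from the example, spelled with escapes so that the
# source line is not itself reordered by the editor.
ARABIC = "\u0644\u0645\u0641\u0627\u062a\u064a\u062d"
visual_lines = layout(ARABIC + " ABC DEF", 12, 1)
```

Each returned string is already in visual order, leftmost character
first, so it should not be passed through another bidi-aware display
layer.  For the example text with an RTL base direction and a width of
12 characters, the first line starts with the trailing space, followed
by "ABC", one space, and the reversed Arabic letters; the second line is
just "DEF".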


