Line wrapping of mixed LTR/RTL text

Cosmin Apreutesei via Unicode unicode at unicode.org
Tue Aug 28 05:44:58 CDT 2018


Hello everyone,

I'm having some trouble implementing line wrapping for bidi text, and
I would like some advice or hints on the proper way to do it.

UAX#9 section 3.4 says that bidi reordering should be done after line
wrapping. But in order to do line wrapping correctly I need to be able
to visually ignore some whitespace, and I'm not sure exactly which
whitespace must be ignored.

There is this sentence in UAX#9 which provides a clue: "[...] trailing
whitespace will appear at the visual end of the line (in the paragraph
direction)." I'm not sure what that means, but by doing some tests
with fribidi and libunibreak I noticed that the whitespace always
sticks to the logical end of the word (so visually to the right for
LTR runs and to the left for RTL runs), regardless of the base
paragraph direction. Is it safe to use this assumption and always
remove the whitespace at the logical end of the last word of the line?
Or is it more complicated than that?
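
To make the assumption concrete, here is a minimal Python sketch of
what I mean by "whitespace at the logical end of the line". It works
on the line in logical order and peels off the trailing characters
whose bidi class is one of those that rule L1 of UAX #9 resets to the
paragraph level at the end of a line (whitespace and isolate
formatting characters). The function name and the constant are my own,
not part of fribidi or libunibreak, and the sketch ignores everything
else a real implementation has to handle.

```python
import unicodedata

# Bidi classes reset at the end of a line by rule L1 of UAX #9:
# whitespace plus the isolate formatting characters.
TRAILING_CLASSES = {"WS", "FSI", "LRI", "RLI", "PDI"}


def split_trailing_whitespace(line):
    """Split a logical-order line into (content, trailing whitespace).

    Hypothetical helper for illustration; only the logical end of the
    line is inspected, regardless of run or paragraph direction.
    """
    end = len(line)
    while end > 0 and unicodedata.bidirectional(line[end - 1]) in TRAILING_CLASSES:
        end -= 1
    return line[:end], line[end:]


# First line of the example further down, in logical order
# (the Arabic word is written with escapes to keep the source unambiguous).
first_line = "\u0644\u0645\u0641\u0627\u062a\u064a\u062d ABC "
content, trailing = split_trailing_whitespace(first_line)
```

Only `content` would then be measured against the line width; the
space between the Arabic and Latin words is not at the logical end of
the line, so it is kept.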

Quick example showing the problem. The following text:

لمفاتيح ABC DEF

with RTL base direction would wrap (for a certain line width) as:

ABC  لمفاتيح
DEF

with two spaces between the Latin and Arabic text, one from the Latin
text and one from the Arabic text. Since the line logically ends with
the "C", which belongs to an LTR run, I should probably remove the
space after the "C" (and, as a rule, always remove the whitespace at
the logical end of the last word, regardless of the paragraph's
direction or the word's direction). Is this the right way to do it?
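
For reference, the wrapping step I have in mind can be sketched like
this in Python. It is only an illustration under strong assumptions:
every character is one unit wide, the only break opportunities are
after U+0020 SPACE (a real implementation would take them from
libunibreak), and `wrap_logical` is a made-up name. The text stays in
logical order throughout; reordering would happen per line afterwards.

```python
import re


def wrap_logical(text, max_width):
    """Greedy line wrap on logical-order text.

    Assumptions (illustration only): each character is one unit wide
    and lines may only break after U+0020 SPACE. Spaces at the logical
    end of a line are not counted toward its width.
    """
    # "Words" keep their trailing spaces attached, in logical order.
    words = re.findall(r"[^ ]+ *| +", text)
    lines, current = [], ""
    for word in words:
        candidate = current + word
        # Measure the candidate line without its logically trailing spaces.
        if current and len(candidate.rstrip(" ")) > max_width:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


# The example text in logical order: Arabic word, then "ABC DEF".
text = "\u0644\u0645\u0641\u0627\u062a\u064a\u062d ABC DEF"
lines = wrap_logical(text, 11)
```

With a width of 11 units, the first logical line ends with "ABC" plus
one trailing space and the second line is "DEF"; the trailing space is
kept in the string but ignored when measuring.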

Screenshots attached.

Thanks!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1.png
Type: image/png
Size: 12005 bytes
Desc: not available
URL: <http://unicode.org/pipermail/unicode/attachments/20180828/caac4976/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2.png
Type: image/png
Size: 14359 bytes
Desc: not available
URL: <http://unicode.org/pipermail/unicode/attachments/20180828/caac4976/attachment-0001.png>

