UAX #9: applicability of higher-level protocols to bidi plaintext

Richard Wordingham via Unicode unicode at
Sat Jul 14 08:15:37 CDT 2018

On Sat, 14 Jul 2018 13:09:11 +0300
Shai Berger via Unicode <unicode at> wrote:

> On Fri, 13 Jul 2018 11:22:51 +0300
> Eli Zaretskii via Unicode <unicode at> wrote:
> > 
> > Different applications will have different needs here, so there's
> > definitely a need to provide applications and users with some
> > control of paragraph direction, and the way to do this is define
> > high-level protocols controlled by some optional variables.  A
> > well-known example of that is the paragraph-direction buttons in
> > Word and similar processors (although they don't produce plain
> > text, so the analogy is limited).  
> I have no argument with this, but I do think that in such cases it is
> wrong for the app to pretend that it is still treating the text as
> plain.

The problem with your concept of 'plain text' is that there is almost no
such thing.  To display text, one has to choose a basic writing
direction - direction within lines (LTR, RTL, TTB or BTT) and direction
from line to line (TTB, BTT, LTR or RTL) - and that's ignoring
boustrophedon variants and specialised cases such as 'round robin' or
the spiral of the Phaistos disc.

If the display concept is to treat lines as being of unbounded length,
one needs a left margin, a right margin, or perhaps one centres each
line.  Centred text does not strike me as 'plain', yet centring is the
only one of these options that can handle paragraphs of different
directionality well in this concept.

Lines of unbounded length are the natural choice for editors for
programming languages - lines are often syntactically significant.
They are also syntactically relevant for emails in point-by-point
discussions.

The default BiDi rule for the basic directionality of paragraphs usually
works when there is both a left margin and a right margin, though the
buffering it needs makes it impossible to bound the amount of memory
required.  Note that several key utilities limit the number of combining
marks or the length of Indic syllables.
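
To make the default rule concrete, here is a minimal sketch of how I
read rules P2 and P3 of UAX #9: the paragraph level comes from the
first strong character (L, R or AL), skipping anything inside an
isolate.  The function name paragraph_level is my own, and the sketch
assumes the caller has already split the text into paragraphs.  It is
an illustration, not a conformant implementation of the whole
algorithm.

```python
import unicodedata

# Bidi classes of the isolate initiators (LRI, RLI, FSI) in UAX #9.
ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}


def paragraph_level(text):
    """Return 0 (LTR) or 1 (RTL) for a single paragraph.

    Sketch of rules P2/P3: look for the first character of class
    L, R or AL that is not inside an isolate.  Assumes `text` is
    one paragraph (no paragraph separators).
    """
    isolate_depth = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ISOLATE_INITIATORS:
            # Everything up to the matching PDI is skipped; an
            # unmatched initiator skips to the end of the paragraph.
            isolate_depth += 1
        elif bidi_class == "PDI":
            if isolate_depth > 0:
                isolate_depth -= 1
        elif isolate_depth == 0:
            if bidi_class == "L":
                return 0
            if bidi_class in ("R", "AL"):
                return 1
    # No strong character found: the default level is 0 (LTR).
    return 0
```

Note that the loop may have to read arbitrarily far before it meets a
strong character, for instance through a long run of digits or
punctuation, which is where the buffering comes from: nothing can be
laid out against the correct margin until the loop returns.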
