UAX #9: applicability of higher-level protocols to bidi plaintext
Shai Berger via Unicode
unicode at unicode.org
Mon Jul 16 17:51:12 CDT 2018
Hi Eli and all,
On Sat, 14 Jul 2018 14:07:50 +0300
Eli Zaretskii via Unicode <unicode at unicode.org> wrote:
> From: Shai Berger <shai at platonix.com>
> > I have no argument with this, but I do think that in such cases it
> > is wrong for the app to pretend that it is still treating the text
> > as plain.
> What is "plain text" in this context?
Plain text here is what the "Plain Text" subsection of the core Unicode
Standard describes, in Chapter 2 ("General Structure"), Section 2.2
("Unicode Design Principles"). In terms of composition, it is "a pure
sequence of character codes"; in terms of function, it is "public,
standardized, and universally readable".
> Does, for example, text with bidi formatting controls count as
> plain text?
So long as the bidi controls are Unicode characters, I'd say "yes" --
according to the definitions above. The one thing I would disagree with
is calling them "formatting controls" -- as I believe they encode
semantics, not appearance.
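
To make the "they are Unicode characters" point concrete, here is a
minimal Python sketch. It only uses the standard unicodedata module; the
BIDI_CONTROLS table and the describe() helper are names I made up for
illustration, and the sample string is arbitrary. It shows that the
bidi controls are ordinary code points inside an ordinary string. Note
that the character database files them under General_Category "Cf"
(Format); that is a classification label, and by itself it does not
settle whether they carry semantics or appearance.

```python
import unicodedata

# Bidi control characters listed in UAX #9 (abbreviation -> code point).
# The table itself is illustrative, not an API of any library.
BIDI_CONTROLS = {
    "LRM": "\u200E", "RLM": "\u200F", "ALM": "\u061C",
    "LRE": "\u202A", "RLE": "\u202B", "PDF": "\u202C",
    "LRO": "\u202D", "RLO": "\u202E",
    "LRI": "\u2066", "RLI": "\u2067", "FSI": "\u2068", "PDI": "\u2069",
}


def describe(ch):
    """Return (name, general category, bidi class) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


# A string with an RLI ... PDI isolate around three Hebrew letters.
# It is still just a sequence of character codes.
sample = "abc " + BIDI_CONTROLS["RLI"] + "\u05D0\u05D1\u05D2" + BIDI_CONTROLS["PDI"] + " def"
controls_in_sample = [ch for ch in sample if ch in BIDI_CONTROLS.values()]

if __name__ == "__main__":
    for abbrev, ch in BIDI_CONTROLS.items():
        print(abbrev, "U+%04X" % ord(ch), *describe(ch))
    print("controls found in sample:", len(controls_in_sample))
```

Nothing outside the string is needed to carry the controls: no markup,
no attributes, no out-of-band protocol.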
I should also add, in response to the other points raised in this
thread, a quote from the same page of the core standard: "If the same
plain text sequence is given to disparate rendering processes, there is
no expectation that rendered text in each instance should have the same
appearance. Instead, the disparate rendering processes are simply
required to make the text legible according to the intended reading."
That paragraph ends with the following summary, emphasized in the
original:

    Plain text must contain enough information to permit the text
    to be rendered legibly, and nothing more.

The last answer in http://www.unicode.org/faq/bidi.html violates this
dictum, as I have shown here with different examples. As long as it
stands, the Unicode standard fails its own criteria.