Directionality controls for malicious code

Wáng Yifán 747.neutron at gmail.com
Wed Dec 1 22:33:54 CST 2021


> If not, and since there are relatively few scripts of RtoL characters,
> is there any legitimate use of BiDi controls outside of script runs of
> those scripts.

I feel that in this paragraph you assume every script is either LTR or
RTL, but at least the CJK scripts can be written both LTR and RTL
(although they default to LTR).
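
As a small illustration of the "defaults to LTR" part, the Bidi_Class
property can be queried with Python's standard unicodedata module. This
is only a sketch: the sample characters and the helper name bidi_classes
are my own choices, and the property says nothing about the right-to-left
layouts that CJK text can also be set in.

```python
import unicodedata

# Hypothetical sample set, chosen only for illustration.
SAMPLES = {
    "CJK ideograph": "\u6F22",  # 漢
    "Hebrew alef": "\u05D0",    # א
    "Arabic alef": "\u0627",    # ا
    "Latin A": "A",
}

def bidi_classes(samples):
    """Map each label to the Bidi_Class of its sample character."""
    return {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

for label, bidi_class in bidi_classes(SAMPLES).items():
    print(f"{label}: {bidi_class}")
```

The CJK ideograph reports class L, the same as the Latin letter, while
the Hebrew and Arabic letters report R and AL.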

> If not, then could the Bidi control characters be made to have their scx
> property value be all the RtoL scripts, and software such as git could
> warn or forbid text of mixed scripts?

Warning is rather useful IMO, but prohibition is unrealistic,
considering that most modern rich text formats employ ASCII characters
for format control. For instance, if somebody wants to show an Arabic
snippet surrounded by HTML tags inside an otherwise English comment
(or vice versa), I bet the primitive bidi algorithm, which doesn't
understand that <...> is one contiguous HTML construct, will mess up
the visual order with 150% probability*, leaving it hardly readable
without bidi controls.

* A character has a chance to be misrendered one more time after the
first misrendering.
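
To make the "warn rather than forbid" idea concrete, here is a minimal
Python sketch of a checker that reports where bidi control characters
occur in a text. The function name find_bidi_controls and the report
format are hypothetical, not anything git or Unicode defines; the set
covers the explicit embedding, override and isolate controls plus the
marks LRM, RLM and ALM.

```python
import unicodedata

# Explicit directional formatting characters and directional marks.
BIDI_CONTROLS = {
    "\u202A", "\u202B", "\u202C", "\u202D", "\u202E",  # LRE RLE PDF LRO RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI RLI FSI PDI
    "\u200E", "\u200F", "\u061C",                      # LRM RLM ALM
}

def find_bidi_controls(text):
    """Return (line, column, character name) for each bidi control found.

    Lines and columns are 1-based; columns count code points.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits

if __name__ == "__main__":
    sample = "a = 1\nb = '\u202Ex'\n"
    for line_no, col, name in find_bidi_controls(sample):
        print(f"warning: line {line_no}, column {col}: {name}")
```

A tool could print these as warnings and leave it to the reviewer to
decide whether the controls are legitimate, as in the Arabic-in-HTML
case above.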

On Wed, Dec 1, 2021 at 3:42, Karl Williamson via Unicode <unicode at corp.unicode.org> wrote:




>
> It is possible to make text appear to be other than what it really is by
> using BiDi controls.
>
> Such text may be in the form of computer code, which could allow a
> trojan horse attack by sneaking stuff past human code reviewers.
>
> I have not studied the BiDi algorithm, so this may be naive.
>
> Is there any legitimate use of BiDi controls in text that doesn't have a
> mixture of LtoR and RtoL strings?
>
> If not, and since there are relatively few scripts of RtoL characters,
> is there any legitimate use of BiDi controls outside of script runs of
> those scripts.
>
> If not, then could the Bidi control characters be made to have their scx
> property value be all the RtoL scripts, and software such as git could
> warn or forbid text of mixed scripts?
>
> Or could a new property be created that allowed for machine detection of
> malicious use?
>
> Karl Williamson


