Directionality controls for malicious code

Mark Davis ☕️ mark at macchiato.com
Thu Dec 2 12:51:19 CST 2021


The UBA explicitly carves out room for specialized text handling in
https://unicode.org/reports/tr9/#Higher-Level_Protocols. The goal of that
section is to let editors handle bidi ordering in a sensible (and not
misleading) fashion in environments such as programming language editing,
specifically so that each token is 'self-contained' and the ordering among
tokens is clear.

(However, the UBA, #31, #36, and #39 all need more and clearer examples
and guidance on this.)
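
As a rough illustration of what 'self-contained' tokens could mean in
practice, here is a minimal sketch in Python. It is not taken from the UBA
or from any editor: naive_tokenize is a hypothetical stand-in for a real
lexer, and wrapping tokens in FSI ... PDI is just one plain-text
approximation of a higher-level protocol. It assumes the tokens themselves
contain no stray PDI, which a real implementation would have to handle.

```python
import re

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def naive_tokenize(line):
    """Hypothetical lexer: split a line into whitespace and non-whitespace runs."""
    return re.findall(r"\s+|\S+", line)


def isolate_tokens(tokens):
    """Wrap every non-whitespace token in FSI ... PDI.

    Each token then resolves its own direction from its first strong
    character, and its contents cannot reorder neighbouring tokens.
    Whitespace is left untouched.
    """
    return "".join(
        tok if tok.isspace() else f"{FSI}{tok}{PDI}"
        for tok in tokens
    )


if __name__ == "__main__":
    line = "x = \u05d0\u05d1 + y"  # a Hebrew identifier between Latin ones
    print(ascii(isolate_tokens(naive_tokenize(line))))
```

With every token isolated and only neutral whitespace left between them,
the ordering among tokens follows the paragraph direction rather than
whatever strong characters happen to sit inside a token.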

Mark


On Thu, Dec 2, 2021 at 10:33 AM Eli Zaretskii via Unicode <
unicode at corp.unicode.org> wrote:

> > Date: Thu, 2 Dec 2021 16:19:12 +0100
> > From: Daniel Bünzli <daniel.buenzli at erratique.ch>
> > Cc: unicode at corp.unicode.org
> >
> > I'm not familiar enough with the bidi algorithm but for example it seems
> > that unbounded RLO or RLI in a span should be forbidden unless they are
> > properly balanced with a matching PDI or PDF
>
> The UBA mandates that all embeddings end at paragraph end, i.e. at a
> newline.  So unterminated embeddings and isolates behave exactly as
> terminated ones do, and requiring the embeddings and isolates to be
> properly terminated will only catch sloppy malicious tinkering with
> these controls; it won't catch the non-sloppy ones.
>
> > But I'm sure the problem is much more complex than that and I'd be
> > curious if people in the know of the algorithm have an idea on how to go
> > about it.
>
> I did have some ideas, and implemented detection of suspicious
> reordering for Emacs.
>
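
To make the point about balancing concrete, here is a minimal sketch in
Python of the kind of check Daniel describes. It is not from the thread and
not how Emacs does its detection; the function names are hypothetical. It
also assumes strict nesting, which is stricter than the UBA itself (there a
PDI also closes any embeddings opened inside its isolate).

```python
# Explicit bidi formatting characters, named as in the UBA.
EMBEDDING_OPENERS = {"\u202a", "\u202b", "\u202d", "\u202e"}  # LRE, RLE, LRO, RLO
ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}              # LRI, RLI, FSI
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def has_unbalanced_controls(line):
    """Return True if a single line has unmatched or misnested bidi controls."""
    expected_closers = []
    for ch in line:
        if ch in EMBEDDING_OPENERS:
            expected_closers.append(PDF)
        elif ch in ISOLATE_OPENERS:
            expected_closers.append(PDI)
        elif ch in (PDF, PDI):
            # A closer with no opener, or the wrong kind of closer.
            if not expected_closers or expected_closers.pop() != ch:
                return True
    # Anything still open runs to the end of the line.
    return bool(expected_closers)


def lines_with_unbalanced_controls(text):
    """Return the 1-based numbers of the lines that fail the balance check."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if has_unbalanced_controls(line)
    ]
```

A line such as "a \u202e b \u202c" passes this check even though the RLO
still reverses the display of everything up to the PDF, which is exactly
the limitation Eli points out: the check flags only the sloppy cases.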