<div dir="ltr"><div class="gmail_default" style="font-family:times new roman,serif">The UBA explicitly carves out room for specialized text handling in <a href="https://unicode.org/reports/tr9/#Higher-Level_Protocols">https://unicode.org/reports/tr9/#Higher-Level_Protocols</a>. The goal is to let editors handle bidi ordering in a sensible (and not misleading) fashion in environments such as programming language editing, specifically so that tokens are 'self-contained' and the ordering among tokens is clear.</div><div class="gmail_default" style="font-family:times new roman,serif"><br></div><div class="gmail_default" style="font-family:times new roman,serif">(However, more and clearer examples and guidance are needed in the UBA, UAX #31, UTR #36, and UTS #39.)</div><div class="gmail_default" style="font-family:times new roman,serif"><br></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><font face="'times new roman', serif"><div style="background-color:transparent;margin-top:0px;margin-left:0px;margin-bottom:0px;margin-right:0px"><div></div></div><div style="background-color:transparent;margin-top:0px;margin-left:0px;margin-bottom:0px;margin-right:0px">Mark</div></font><div><div><font face="'times new roman', serif"><i><span style="font-style:normal"><i></i></span><i></i></i></font></div></div></div></div></div></div></div></div></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 2, 2021 at 10:33 AM Eli Zaretskii via Unicode <<a href="mailto:unicode@corp.unicode.org">unicode@corp.unicode.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">> Date: Thu, 2 Dec 2021 16:19:12 +0100<br>

> From: Daniel Bünzli <<a href="mailto:daniel.buenzli@erratique.ch" target="_blank">daniel.buenzli@erratique.ch</a>><br>

> Cc: <a href="mailto:unicode@corp.unicode.org" target="_blank">unicode@corp.unicode.org</a><br>

> <br>

> I'm not familiar enough with the bidi algorithm but for example it seems that unbounded RLO or RLI in a span should be forbidden unless they are properly balanced with a matching PDI or PDF<br>

<br>

The UBA mandates that all embeddings end at paragraph end, i.e. at a<br>

newline.  So unterminated embeddings and isolates behave exactly as<br>

terminated ones do, and requiring the embeddings and isolates to be<br>

properly terminated will only catch sloppy malicious tinkering with<br>

these controls; it won't catch the non-sloppy ones.<br>

<br>

> But I'm sure the problem is much more complex than that and I'd be curious if people in the know of the algorithm have an idea on how to go about it. <br>

<br>

I did have some ideas, and implemented detection of suspicious<br>

reordering for Emacs.<br>

</blockquote></div>
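<div>As a rough illustration of the quoted point about unterminated embeddings and isolates, here is a minimal sketch of a per-paragraph balance check. The function name <code>unterminated_controls</code> is hypothetical, and the matching is deliberately simplified compared with the UBA rules: embeddings and isolates are tracked on separate stacks, stray closers are ignored, and depth limits are not modelled.</div>

```python
# Explicit bidi formatting characters defined in UAX #9.
EMBEDDING_OPENERS = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}              # LRI, RLI, FSI
PDF = "\u202C"  # closes an embedding or override
PDI = "\u2069"  # closes an isolate


def unterminated_controls(paragraph):
    """Return the opening controls in `paragraph` that have no matching closer.

    Hypothetical helper, simplified: embeddings and isolates are tracked on
    separate stacks, and closers with nothing to close are ignored.
    """
    embeddings, isolates = [], []
    for ch in paragraph:
        if ch in EMBEDDING_OPENERS:
            embeddings.append(ch)
        elif ch in ISOLATE_OPENERS:
            isolates.append(ch)
        elif ch == PDF and embeddings:
            embeddings.pop()
        elif ch == PDI and isolates:
            isolates.pop()
    return embeddings + isolates


# A "sloppy" line leaves an RLO open; a "careful" one closes it with PDF.
sloppy = 'x = "\u202E abc'
careful = 'x = "\u202E abc\u202C"'

assert unterminated_controls(sloppy) == ["\u202E"]
assert unterminated_controls(careful) == []
```

<div>The second assertion shows the limitation described in the quoted message: a properly terminated override passes a balance check, so such a check alone would not flag carefully constructed reordering.</div>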