Directionality controls for malicious code

Fri Dec 3 03:00:21 CST 2021

On 3 December 2021 at 08:22:00, Eli Zaretskii via Unicode (unicode at corp.unicode.org) wrote:

> I don't see how it would help. For example, if you examine the
> examples provided in that paper, you will see that the directional
> format controls were inserted inside comments, but in a way that made
> parts of the comments to look like part of the code.

Yes. The idea is for the grammar of your language to disallow visual reordering across certain textual boundaries specific to that language.

If you take C multi-line comments /* … */, the idea is that:

1. No text logically between /* and */ should be able to appear visually to the left of /*.
2. No text logically between /* and */ should be able to appear visually to the right of */.
3. No text logically before /* should be able to appear visually to the right of /*.
4. No text logically after */ should be able to appear visually to the left of */.

In short, the text logically between /* and */ should behave like a UBA paragraph, since no reordering occurs across paragraph boundaries. Violations of that property should result in a syntax error or a warning.
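
As a rough illustration of the "warning" side, here is a minimal sketch in Python of a conservative check. It does not run the UBA and so cannot prove properties 1-4; it only flags comments whose explicit directional formatting characters are not opened and closed, properly nested, inside the comment body. That is stricter than the UBA itself, and reordering caused by plain right-to-left text without any controls is not caught. The names `unbalanced_controls` and `find_suspicious_comments` are hypothetical, and the regular expression is an assumption that ignores string literals containing /*.

```python
import re

# Explicit directional formatting characters defined by UAX #9.
EMBEDDING_OPENERS = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}              # LRI, RLI, FSI
PDF = "\u202C"  # closes an embedding or override
PDI = "\u2069"  # closes an isolate

# Simplifying assumption: every /* ... */ in the source is a comment.
COMMENT_RE = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def unbalanced_controls(body):
    """Return True if body opens or closes a bidi scope it does not own."""
    expected = []  # stack of the closing characters we still expect
    for ch in body:
        if ch in EMBEDDING_OPENERS:
            expected.append(PDF)
        elif ch in ISOLATE_OPENERS:
            expected.append(PDI)
        elif ch in (PDF, PDI):
            # A closer with no opener, or the wrong kind of closer.
            if not expected or expected.pop() != ch:
                return True
    return bool(expected)  # anything left open leaks past the comment


def find_suspicious_comments(source):
    """Return (line number, comment text) for each comment to warn about."""
    hits = []
    for match in COMMENT_RE.finditer(source):
        if unbalanced_controls(match.group(1)):
            line = source.count("\n", 0, match.start()) + 1
            hits.append((line, match.group(0)))
    return hits
```

A compiler or linter could call `find_suspicious_comments` on each file and report the returned line numbers as warnings.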

So I would like foolproof tools that make it possible to a) detect violations of these constraints and b) enforce them.

For example, to enforce them in the case above, would it be sufficient to insert an LRI (or RLI, or FSI) after /* and a PDI before */? Would that ensure that properties 1-4 are satisfied for all contexts and contents of comments?
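
To make the question concrete, here is a minimal sketch in Python of the insertion being asked about. It does not settle whether properties 1-4 then hold in every context; it only shows the mechanics under stated assumptions. Because UBA paragraphs end at line breaks, the sketch wraps each line of the body separately. It also refuses lines whose own isolates are unbalanced, since a stray PDI would close the inserted LRI early and an unclosed isolate initiator would capture the inserted PDI. Embeddings and overrides inside the body are left alone, relying on the UBA rule that a PDI also ends any embeddings or overrides opened within its isolate. The function names are hypothetical.

```python
LRI = "\u2066"
PDI = "\u2069"
ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}  # LRI, RLI, FSI


def isolate_line(line):
    """Wrap one line in LRI ... PDI, rejecting lines that could escape it."""
    depth = 0
    for ch in line:
        if ch in ISOLATE_OPENERS:
            depth += 1
        elif ch == PDI:
            depth -= 1
            if depth < 0:
                raise ValueError("stray PDI would close the inserted LRI")
    if depth != 0:
        raise ValueError("unclosed isolate would capture the inserted PDI")
    return LRI + line + PDI


def isolate_comment(body):
    """Return /* body */ with every line of the body placed in its own isolate."""
    wrapped = "\n".join(isolate_line(line) for line in body.split("\n"))
    return "/*" + wrapped + "*/"
```

For instance, `isolate_comment("hello")` returns the comment with U+2066 immediately after /* and U+2069 immediately before */.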

I hope the above makes the points of my message clearer.

Best,

Daniel