Proposing new arrow characters with Bidi_Mirrored=Yes

Joao S. O. Bueno gwidion at gmail.com
Thu Apr 10 11:45:41 CDT 2025


I don't believe that adding more homographic characters would make
homograph attacks any worse. There are already plenty of similar
characters, and the visual appearance of a character can no longer be
trusted for security. Other security measures have to be taken against
this kind of attack, and having more ways to spell a forward arrow
certainly does not change the current landscape, where a homograph, or
close homograph for that matter, of every ASCII letter can probably be
found nearly 20 times in Unicode.
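
For anyone who wants a rough feel for the numbers, here is a minimal
sketch that counts the code points which NFKC-normalize to each ASCII
letter. It only sees compatibility variants (fullwidth, mathematical
alphanumerics and so on), not cross-script lookalikes such as Cyrillic
letters, so treat it as a lower bound. The function name is my own, and
the counts depend on the Unicode version bundled with your Python.

```python
import string
import sys
import unicodedata
from collections import defaultdict


def nfkc_lookalikes():
    """Map each ASCII letter to the non-ASCII code points that NFKC-normalize to it."""
    letters = set(string.ascii_letters)
    table = defaultdict(list)
    for cp in range(0x80, sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogates, they are not characters
        ch = chr(cp)
        folded = unicodedata.normalize("NFKC", ch)
        if folded in letters:
            table[folded].append(ch)
    return table


if __name__ == "__main__":
    table = nfkc_lookalikes()
    for letter in "AEa":
        print(letter, len(table[letter]), "".join(table[letter][:8]))
```

A real confusables check would use the data in Unicode's security
mechanisms report rather than normalization alone.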

On Thu, Apr 10, 2025 at 8:50 AM Marius Spix via Unicode
<unicode at corp.unicode.org> wrote:
>
> I'm afraid that this would open Pandora's box for more homograph attacks. ∃ (U+2203) is already a confusable for E in a BiDi context. But this would add more characters like ꟼ (U+A7FC), ⅃ (U+2143) or ↋ (U+218B) to the list.
>
> Sent: Thursday, April 10, 2025 at 00:02
> From: "Nitai Sasson via Unicode" <unicode at corp.unicode.org>
> To: unicode at corp.unicode.org
> Subject: Re: Proposing new arrow characters with Bidi_Mirrored=Yes
> Okay, this tangent about ligatures is totally off-topic. There are other cases where arrows are used as operators or relations within text, so mirroring arrows are still needed even if they aren't the best solution for the specific issue of showing "->" as an arrow. Eli, we can continue in private emails if you want, I don't want to spam this thread with it.
>
> I am absolutely devastated because I just spent hours writing this email and Proton Mail just made it disappear. So until I can recover that draft, please make do with this condensed rewritten version:
>
> Firstly, you guys should know that Unicode BiDi is FANTASTIC. It makes everything work. It makes everything easy. Developers are happy to fix bidi bugs within a day because it's so easy. This is an absolute win for everyone.
>
> Have some examples:
> https://github.com/pachli/pachli-android/pull/906 - fixes linked issue, see how minor the code change is and how great the benefit!
> https://github.com/OSMCha/osmcha-frontend/pull/766 - this previously-linked issue was similarly easy
> https://github.com/openstreetmap/openstreetmap-website/pull/5835 - as was this (though it took me a while to find the right option); in the end it was just a one-line change for a HUGE improvement for all BiDi users
>
> You're wondering if it was a good move to make characters mirror at all? YES IT WAS. Templates like "%s (%s) %s" work universally. Imagine what would be required to make that template work if parentheses didn't mirror in RTL! How many developers would agree to make the effort to do that, and introduce that complexity into their codebase? Nearly none, and RTL users would be left in the dirt. Unicode's BiDi is a godsend.
>
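
The difference between parentheses and arrows described above can be
checked against the Unicode Character Database. A minimal sketch using
Python's standard unicodedata module (the helper name is mine): both
kinds of character are bidi-neutral (class ON), but only the
parentheses carry Bidi_Mirrored=Yes.

```python
import unicodedata


def is_bidi_mirrored(ch):
    """True if the character has Bidi_Mirrored=Yes in Python's Unicode data."""
    return unicodedata.mirrored(ch) == 1


if __name__ == "__main__":
    for ch in "()\u2192":  # parentheses and U+2192 RIGHTWARDS ARROW
        print(
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.bidirectional(ch),
            is_bidi_mirrored(ch),
        )
```
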
> Arrows are the only exception. They are the one thing that Unicode does not give developers an easy BiDi solution for. And they are what I want to fix.
>
>
>
> The Proposal Idea
> =================
>
> (with credit to Mark E. Shoulson)
>
> Define a new combining character:
>
> <BDM> Bi-Directional Mirror (working title)
>
> Binds to the preceding character, and effectively gives it the property Bidi_Mirrored=Yes.
> Only has an effect on characters with Neutral directionality. Does nothing to characters with strong or weak LTR or RTL directionality.
>
> Examples:
> Within RTL text:
> U+05D0 א HEBREW LETTER ALEF
> U+2192 → RIGHTWARDS ARROW
> <BDM> Bi-Directional Mirror
> U+05D1 ב HEBREW LETTER BET
>
> Renders as: א←ב
> (Without <BDM>: א→ב)
> Arrow direction is flipped because it's resolved RTL
>
> Within LTR text:
> U+0041 A LATIN CAPITAL LETTER A
> U+2192 → RIGHTWARDS ARROW
> <BDM> Bi-Directional Mirror
> U+0042 B LATIN CAPITAL LETTER B
>
> Renders as: A→B
> (Without <BDM>: A→B)
> Arrow direction is maintained because it's resolved LTR
>
>
> That's it. I honestly cannot think of any issue or potential pitfall with this solution. The only substantial negative feedback I've seen is that control characters haven't always turned out well in the past. That's a reason to be cautious, not to reject the idea outright.
>
>
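
To make the quoted <BDM> examples concrete, here is a toy sketch of the
described behaviour. Everything in it is an assumption for
illustration: <BDM> is not an encoded character, so I use the
private-use code point U+E000 as a stand-in; the glyph pairs are a
hand-written table; and the resolved direction is passed in as a flag
instead of being computed by the bidi algorithm. The result is in
logical order, as a bidi-aware renderer would receive it.

```python
import unicodedata

BDM = "\uE000"  # hypothetical stand-in (private use), NOT a real <BDM> character
NEUTRAL_CLASSES = {"B", "S", "WS", "ON"}  # bidi classes treated as neutral here
MIRROR_GLYPH = {"\u2192": "\u2190", "\u2190": "\u2192"}  # illustrative pairs only


def apply_bdm(text, resolved_rtl):
    """Drop each BDM; mirror the preceding neutral character if the run resolved RTL."""
    out = []
    for ch in text:
        if ch != BDM:
            out.append(ch)
            continue
        if out and resolved_rtl:
            prev = out[-1]
            # Strong and weak characters are left untouched, as in the proposal.
            if unicodedata.bidirectional(prev) in NEUTRAL_CLASSES:
                out[-1] = MIRROR_GLYPH.get(prev, prev)
    return "".join(out)


if __name__ == "__main__":
    print(apply_bdm("\u05D0\u2192" + BDM + "\u05D1", resolved_rtl=True))
    print(apply_bdm("A\u2192" + BDM + "B", resolved_rtl=False))
```

Passing the direction in as a parameter keeps the sketch short; a real
implementation would have to hook into the bidi algorithm's resolved
levels, which is exactly where the open questions would be.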


