Proposing new arrow characters with Bidi_Mirrored=Yes
Phil Smith III
lists at akphs.com
Tue Apr 8 12:33:41 CDT 2025
Disclaimer: I've done nothing personally with BIDI or RTL, so may be completely off-base here.
My first reaction to this was "Wow, one of those weird and super-painful edge conditions", but the more I think about it, wouldn't whoever enters the arrow just use the right^wcorrect one? Does text get converted from LTR to RTL? If so, isn't that part of the translator's responsibility?
-----Original Message-----
From: Unicode <unicode-bounces at corp.unicode.org> On Behalf Of Asmus Freytag via Unicode
Sent: Tuesday, April 8, 2025 1:25 PM
To: unicode at corp.unicode.org
Subject: Re: Proposing new arrow characters with Bidi_Mirrored=Yes
Users just type what gives them the correct appearance. (In Arabic, that infamously includes typing the wrong character, because it happens to look correct and is on your regional keyboard.)
The question then is "what software processes are unavoidable and known to interfere with this user choice" for arrows in a bidirectional context?
Any proposal would need to be based on a careful vetting of scenarios, like the one given below (where "->" is turned into an arrow character) to see whether there's enough of an issue there and whether addressing it with character coding is the right answer (or perhaps the only answer).
Without a solid body of evidence of where the current approach is failing for lack of a solution that requires new characters, the issue remains stuck in the category of "good idea". Something that looks like it might be useful, but with obvious complications that would make it a terrible idea unless these are outweighed by real, practical and demonstrable gains (and for which no other alternative exists).
Even then, the problem with encoding duplicate characters based on layout properties is that "users just type what gives them the correct appearance" at the time they enter the character. The only context a user has is the text being typed. If that happens to give the correct direction, a user wouldn't know to shift to a different character, just in case the context might change.
If replacing "->" by an arrow character can change its direction, isn't it up to the autocorrect software to analyze the bidi context and select the correct arrow? The rule should be to select whatever substitution gives the same appearance (direction) as what the user would see for the string they typed.
A./
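
[Illustration] A minimal sketch of the substitution rule described above: replace "->" or "<-" with whichever arrow character keeps the direction the user saw when typing. It assumes a much-simplified stand-in for the Bidi algorithm (nearest strong character on each side, falling back to a caller-supplied base direction), and it ignores digits, explicit embeddings, overrides and isolates. The function names are hypothetical and are not taken from any autocorrect implementation.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # strong right-to-left bidi classes


def strong_direction(ch):
    """Return 'R' or 'L' for a strong character, None otherwise."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in RTL_CLASSES:
        return "R"
    if bidi == "L":
        return "L"
    return None


def nearest_strong(chars):
    """Direction of the first strong character in chars, or None."""
    for ch in chars:
        direction = strong_direction(ch)
        if direction:
            return direction
    return None


def replace_ascii_arrows(text, base="L"):
    """Swap '->' and '<-' for arrows that look the same as the ASCII did.

    Simplification (an assumption, not the full Unicode Bidi Algorithm):
    the ASCII pair takes the direction of its surrounding strong
    characters when they agree, else the base direction.
    """
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("->", "<-"):
            before = nearest_strong(reversed(text[:i])) or base
            after = nearest_strong(text[i + 2:]) or base
            run_dir = before if before == after else base
            # In an RTL run the ASCII pair is reordered and '>' / '<' are
            # mirrored, so '->' is displayed pointing left.
            points_right = (pair == "->") != (run_dir == "R")
            out.append("\u2192" if points_right else "\u2190")
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(replace_ascii_arrows("A->B"))
print(replace_ascii_arrows("\u05d0->\u05d1"))
```

With Latin letters on both sides the result uses U+2192; with Hebrew letters on both sides it uses U+2190, which is not mirrored and so still points left on display, matching what the typed "->" looked like. The mixed case depends entirely on the base direction, which is the part that may not be available to the software.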
On 4/8/2025 9:33 AM, Mark E. Shoulson via Unicode wrote:
> My initial reaction on reading the subject was "*eyeroll* like we need
> MORE arrow characters!" But then again, there is some point to these
> arrows (sorry). I do feel like there are already _so many_ arrow
> characters that duplicating all the ones with a horizontal component
> to have a mirrored version would be a bit much, but there does seem to
> be some utility in what is being proposed here. Naturally, this makes
> me think, "well, how about we just make a _few_ such duplicates?" but
> that's a slippery slope and will only lead to people protesting "But
> there's a mirrored →, why can't I have a mirrored ⇰???" Not sure what
> the best answer is. (Unless maybe mirrored characters were a Bad Idea
> to start with?)
>
> Here's a possibly disastrous idea: arrows mirror when they are within
> the domain of a Directional Override character (U+202D, U+202E). This
> would entail creating a new category of character which is subject to
> this optional mirroring behavior, which then might be applied to other
> characters (hmm, like some emoji, to get people running to the left or
> something?) and I get the feeling that anything that touches the BiDi
> algorithm might just be asking for trouble.
>
> A similar[ly bad] idea might be to have markup-type characters,
> something like <MIRRORED SELECTOR> or some such, to indicate that an
> attached character should be mirrored (or a pair of them that indicate
> direction).
>
> I don't even want to know about handling this in TTB contexts...
>
> ~mark
>
> On 4/8/25 10:34 AM, NeatNit via Unicode wrote:
>> Hi, I hope this is the right place to bring this up. I could not find
>> any discussions on this other than the document I quote.
>>
>> Quick intro: characters with the property Bidi_Mirrored=Yes will be
>> visually mirrored within RTL text, such as Hebrew or Arabic. An easy
>> example is the Greater Than symbol: A>B and א>ב.
>>
>> Arrow characters do not have this property: A→B but א→ב.
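
[Illustration] The property difference described above can be checked with Python's standard unicodedata module. This is only a quick check, and the values reflect whichever Unicode version ships with the interpreter; the helper name is hypothetical.

```python
import unicodedata


def describe(ch):
    """Return (code point, Bidi_Mirrored flag, bidi class) for one character."""
    return (
        f"U+{ord(ch):04X}",
        bool(unicodedata.mirrored(ch)),   # Bidi_Mirrored property
        unicodedata.bidirectional(ch),    # Bidi_Class property
    )


# GREATER-THAN SIGN, LESS-THAN SIGN, RIGHTWARDS ARROW, LEFTWARDS ARROW
for ch in (">", "<", "\u2192", "\u2190"):
    print(unicodedata.name(ch), describe(ch))
```

All four characters are neutrals as far as direction goes, so they are reordered the same way; only the comparison signs carry the mirrored flag, which is why their glyphs flip in RTL text while the arrows do not.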
>>
>> I've found this discrepancy mentioned in this document:
>>
>> https://www.unicode.org/L2/L2022/22026r-non-bidi-mirroring.pdf
>>
>>> In particular, arrow and arrow-like characters
>>> each often has a mirror character. One could
>>> argue that they should have had the
>>> Bidi_Mirrored=Yes property value, but they
>>> don’t, and cannot now get that.
>> Even if it weren't for Unicode's stability policies, there are two
>> distinct usages of arrow symbols:
>>
>> To indicate directions, e.g. "Turn left (←) and then right (→)" - in
>> this case the arrow refers to the physical direction and should not
>> be mirrored in RTL. The existing arrow characters serve this purpose
>> well: "פנה שמאלה (←) ואז ימינה (→)"
>>
>> As an operator: "Convert A->B and assign C<-D" - in this case the
>> arrow direction should be mirrored if it appears in RTL text.
>> Currently this can only be emulated with ASCII "->" as I've just
>> demonstrated. Result: "המר א->ב וקבע ג<-ד".
>>
>> Therefore I think there should be new characters, "Forward Arrow" and
>> "Backward Arrow", to serve the latter case. They would use the same
>> glyphs as existing arrows, but have the Bidi_Mirrored=Yes property.
>>
>> Please let me know if this is likely to happen, and what I would have
>> to do to make a proper proposal. And if any of you are convinced
>> enough that you would like to make a proposal on my behalf, you are
>> welcome to do so!
>>
>> The same reasoning can be applied to many other characters besides
>> these basic arrows. At minimum, all arrow and arrow-like characters
>> should be included. I haven't made a thorough search to find all
>> affected characters, at least not yet.
>>
>> Note that some software, such as the Discourse forum software,
>> converts "->" to "→" in user content, obviously unaware of this issue.
>> These proposed bidi-mirrored arrow characters would be an appropriate
>> replacement in such cases. Today, that simple search-and-replace must
>> be replaced with parsing the text using the full Unicode Bidi
>> algorithm to select the correct arrow, and even then some cases would
>> be impossible to determine without knowing the base direction or more
>> context which is not always available.
>>
>> Awaiting your comments.
>>
>> Thanks,
>> Nitai
>>