<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>On 4/8/25 1:56 PM, NeatNit via Unicode wrote:</p>
<blockquote type="cite"
cite="mid:174413497595.8.8342648198990262066.669978151@sl.neatnit.net">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">Users just type what gives them the correct appearance.
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">Even then, the problem with encoding duplicate characters based on layout properties is that "users just type what gives them the correct appearance" at the time they enter the character. The only context a user has is the text being typed. If that happens to give the correct direction, a user wouldn't know to shift to a different character, just in case the context might change.
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">wouldn't whoever enters the arrow just use the right^wcorrect one? Does text get converted from LTR to RTL? If so, isn't that part of the translator's responsibility?
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">
You guys are mostly right: in a context of users typing in text and manually choosing to insert an arrow, they would choose the arrow that looks correct, and it doesn't matter if they use a mirroring or non-mirroring arrow. This is not the issue I mean to solve.
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">The question then is "what software processes are unavoidable and known to interfere with this user choice" for arrows in a bidirectional context?</pre>
</blockquote>
</blockquote>
(The quoted-quotes above are from Asmus.)<br>
<blockquote type="cite"
cite="mid:174413497595.8.8342648198990262066.669978151@sl.neatnit.net">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">The issue is with software that programmatically inserts arrows in text that comes from unpredictable sources. Developers usually never think of this case, causing the arrow to point in the wrong direction. Real world examples:
<a class="moz-txt-link-freetext" href="https://github.com/deevroman/better-osm-org/issues/241">https://github.com/deevroman/better-osm-org/issues/241</a> - solved by bidi-isolating both sides of the arrow, and programmatically selecting the correct arrow based on the layout direction
<a class="moz-txt-link-freetext" href="https://github.com/OSMCha/osmcha-frontend/issues/765">https://github.com/OSMCha/osmcha-frontend/issues/765</a> - solved by bidi-isolating both sides of the arrow, and relying on the fact that the interface is always LTR
<a class="moz-txt-link-freetext" href="https://meta.discourse.org/t/wrong-arrow-direction-in-rtl-text-contexts/360760">https://meta.discourse.org/t/wrong-arrow-direction-in-rtl-text-contexts/360760</a> - which I've already mentioned, **no simple way to solve it** without mirroring arrows!
Obviously I don't expect developers to suddenly know to switch to the mirroring arrows overnight, if they are added. But I would love to be able to tell them "all you have to do to fix it is replace this character with that one".
</pre>
</blockquote>
</blockquote>
Ah! OK, now we're talking. I see the use case. I haven't read the
details of the software in question, but I take it the point is that
you're presenting a route with a list of waypoints, shown as
"And now go from point A → point B", and it needs to be
localized/internationalized. This actually... sounds like a
reasonable use? I mean, it makes sense why this wouldn't be served
by the current situation and why people would want something
smarter.<br>
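<p>For reference, the fix described in the first quoted issue
(bidi-isolating both sides of the arrow and picking the arrow from
the layout direction) might look roughly like the Python sketch
below. The names <code>join_with_arrow</code> and
<code>rtl_layout</code> are made up for illustration and are not
taken from either project. The sketch also assumes the caller
already knows the layout direction, which is exactly the knowledge
that is often missing.</p>
<pre>
```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction is taken from the isolated text
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes the isolate

LEFT_ARROW = "\u2190"
RIGHT_ARROW = "\u2192"


def join_with_arrow(source, target, rtl_layout=False):
    """Hypothetical helper: join two strings with an arrow from source to target."""
    # In an RTL layout the source is drawn on the right, so the arrow must point left.
    arrow = LEFT_ARROW if rtl_layout else RIGHT_ARROW
    # Isolating each side keeps its own direction from affecting the arrow or the other side.
    return f"{FSI}{source}{PDI} {arrow} {FSI}{target}{PDI}"


print(join_with_arrow("point A", "point B"))
print(join_with_arrow("point A", "point B", rtl_layout=True))
```
</pre>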
<blockquote type="cite"
cite="mid:174413497595.8.8342648198990262066.669978151@sl.neatnit.net">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">If replacing "->" by an arrow character can change its direction, isn't it up to the autocorrect software to analyze the bidi context and select the correct arrow? The rule should be to select whatever substitution gives the same appearance (direction) as what the user would see for the string they typed.
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">
The problem is this replacement is done (as far as I know) outside of any rendering context, when the text is just a sequence of character codes. It's not reasonable to know which direction the text goes. Sometimes it's completely impossible, if the text direction depends on context that isn't available at the time of replacement.</pre>
</blockquote>
This gets back to the problem that some arrows should be mirrored
("and then turn left (←)") and some should not. That would require
some user-smarts.<span style="white-space: pre-wrap">
</span><span style="white-space: pre-wrap">
</span>
<blockquote type="cite"
cite="mid:174413497595.8.8342648198990262066.669978151@sl.neatnit.net">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">Here's a possibly disastrous idea: arrows mirror when they are within the domain of a Directional Override character (U+202D, U+202E).
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">
Let's say this was implemented... Would it help solve the issues linked above in some way?</pre>
</blockquote>
<p>(this quoted-quote is from me)</p>
<p>Now that I see your intended situation, I think what I was
imagining would not, in fact, help you. Just as there are
directional isolates and embeddings, there are also
directional overrides, so you can force ordinarily LTR text to
be RTL or vice versa, &#x202E;like this&#x202C;. (The last two words of the
previous sentence were typed and are encoded in the same order the
letters would be in English, but probably show up reversed for you.)
I was thinking that within a right-to-left override region, arrows
would be reversed. But that wouldn't help you here, unless you
sorta joined the two halves of your expression by having them
start and end an override region. That would be messy and would
defeat the purpose of keeping them in different spans and generally
treating the two parts as independent pieces of information that
are being joined by an arrow.</p>
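<p>A minimal Python sketch of such an override, in case it helps to
see the encoding. The helper name <code>force_rtl</code> is
hypothetical, and only the standard <code>unicodedata</code> module
is used.</p>
<pre>
```python
import unicodedata

RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def force_rtl(text):
    """Hypothetical helper: wrap text so a bidi-aware renderer displays it right to left."""
    return f"{RLO}{text}{PDF}"


sample = force_rtl("like this")
# The letters are still stored in typing order; only the display order changes.
print([c for c in sample if c.isalpha()])
# Bidi classes of the two control characters.
print(unicodedata.bidirectional(RLO), unicodedata.bidirectional(PDF))
```
</pre>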
<p>In retrospect, my original thought was a pretty stupid idea,
since it essentially winds up assuming that the writer knows when
the arrow should point this way or that... in which case they
could have used the correct arrow in the first place! The
advantage of what you're proposing is that the decision would be
handled by the BiDi/mirroring algorithm, the same algorithm that
decides which direction your parentheses face.</p>
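<p>Python's <code>unicodedata</code> module exposes the
Bidi_Mirrored property that drives this, so the current state of
affairs can be checked directly. As far as I know, parentheses are
flagged as mirrored and the plain arrows are not, which is why
arrows keep their direction today. The helper name
<code>is_bidi_mirrored</code> is just for illustration.</p>
<pre>
```python
import unicodedata


def is_bidi_mirrored(ch):
    """Return True if the character has the Bidi_Mirrored property."""
    return unicodedata.mirrored(ch) == 1


# Compare the two parentheses with the plain right and left arrows.
for ch in "()\u2192\u2190":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: mirrored={is_bidi_mirrored(ch)}")
```
</pre>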
<blockquote type="cite"
cite="mid:174413497595.8.8342648198990262066.669978151@sl.neatnit.net">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">A similar[ly bad] idea might be to have markup-type characters, something like &lt;MIRRORED SELECTOR&gt; or some such, to indicate that an attached character should be mirrored (or a pair of them that indicate direction).
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">
I actually love that idea! It would solve the issue for all arrows (and any other glyphs in ExtraMirroring.txt), while only introducing one or two new code point. Maybe also &lt;NON MIRRORED SELECTOR&gt; to disable mirroring even on character with Bidi_Mirroring=Yes.</pre>
</blockquote>
<p>And this would work better, if we take it to mean "the character
this is attached to is _subject_ to mirroring." But markup-type
characters in Unicode are a grey area, and those that exist are
not widely loved either. As Markus Scherer writes:</p>
<p>
<blockquote type="cite">Encoding characters that look the same but
behave differently is a bad idea. We have tried this, for
example with letter-behavior clones of some of the typographic
quotes (U+02BB, U+02BC). People use them inconsistently, because
they can't tell the difference while typing or reading, and so
we get problems with having to treat both equally in some
places, text search, spoofing, "why does it say I am using an
invalid character?", etc.
<div><br>
</div>
<div>Unicode also has some magic invisible control characters
that were supposed to change the behavior of affected
characters in ways that violated their identity. These control
codes are Deprecated with prejudice.</div>
</blockquote>
</p>
<p>The directionality isolates and overrides and such are in this
category of control characters, though I think not actually
deprecated because they're needed(?) but still looked at a bit
askance, and you don't want your kids playing with them...</p>
<p>And Markus's point about "Encoding characters that look the same
but behave differently is a bad idea" is an extremely good one,
too.<br>
</p>
<blockquote type="cite"
cite="mid:174413497595.8.8342648198990262066.669978151@sl.neatnit.net">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">I don't even want to know about handling this in TTB contexts...
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">
What is TTB? Couldn't quickly find it.</pre>
</blockquote>
<p>Top-To-Bottom. Vertical text. Just one more way for things to
be confused.</p>
<p>~mark<br>
</p>
<br>
</body>
</html>