Re-evaluate directionality of Arabic Forms-B characters
Egmont Koblinger
egmont at gmail.com
Wed Aug 14 05:23:48 CDT 2024
Hello,
I'd like to speak up here too: I believe this proposed change is a
bad one and should be rejected.
The directionality of Forms-B characters matters only if you run the
UBA on a string containing such characters. I believe this shouldn't
happen under normal circumstances; it signals some bigger underlying
problem, like double-BiDi'ing a piece of text. If and whenever this
happens, the root cause should be addressed, rather than mitigating
the symptoms.
In (arguably broken) contexts where this happens anyway, the proposed
fix would fix the layout of Arabic words, but would leave Hebrew text
broken (reversed). It's unacceptable for any RTL-related fix to only
address scripts that happen to have the (fundamentally unrelated)
concept of shaping and not ones that don't.
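To illustrate the asymmetry, here is a toy sketch. It is not the real
UBA: toy_reorder() just reverses each maximal run of RTL characters,
and is_rtl_proposed() is my own hypothetical model of the proposal
(Forms-B letters U+FE70..U+FEFC treated as LTR). All helper names are
made up for this example. The input strings stand for text that an
application has already put into visual order and that then gets
BiDi'ed a second time.

```python
import unicodedata

def is_rtl_current(ch):
    # Current Unicode data: R and AL are the strong right-to-left classes.
    return unicodedata.bidirectional(ch) in ("R", "AL")

def is_rtl_proposed(ch):
    # Hypothetical model of the proposal: Forms-B letters behave as LTR.
    if 0xFE70 <= ord(ch) <= 0xFEFC:
        return False
    return is_rtl_current(ch)

def toy_reorder(text, is_rtl):
    # Toy stand-in for the UBA: reverse each maximal run of RTL characters.
    out, run = [], []
    for ch in text:
        if is_rtl(ch):
            run.append(ch)
        else:
            out.extend(reversed(run))
            run = []
            out.append(ch)
    out.extend(reversed(run))
    return "".join(out)

# Already-visual text, as emitted by an application that did its own layout.
visual_arabic = "\uFE96\uFE91\uFE8D"   # three Forms-B letters
visual_hebrew = "\u05D2\u05D1\u05D0"   # gimel, bet, alef

# Second BiDi pass with today's data: both get reversed again.
arabic_today = toy_reorder(visual_arabic, is_rtl_current)
hebrew_today = toy_reorder(visual_hebrew, is_rtl_current)

# Second BiDi pass under the proposal: Arabic survives, Hebrew does not.
arabic_proposed = toy_reorder(visual_arabic, is_rtl_proposed)
hebrew_proposed = toy_reorder(visual_hebrew, is_rtl_proposed)

print(arabic_proposed == visual_arabic)  # Arabic order preserved
print(hebrew_proposed == visual_hebrew)  # Hebrew order not preserved
```

In this model the second pass leaves the Forms-B string alone under
the proposal, while the Hebrew string is still flipped, which is
exactly the partial "fix" I object to.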
The suggested change would make "shaping" do more than just shaping,
it would also influence a way more fundamental thing: the order of the
characters. "Shaping" should remain "shaping" only.
> 3. The only use case that is important is the support for Arabic in tty, old
> terminals, and terminals with no Bidi/lettershaping support.
This sentence is incorrect, for two vastly different reasons.
One is that there's absolutely no problem in old terminals, in
terminals that don't know anything about BiDi. There the
directionality of Forms-B characters is irrelevant and changing it
wouldn't change anything. It's exactly the opposite: The problem occurs
in terminals that _do_ perform BiDi-shuffling. Based on the
conversation in the VTE bugtracker, I believe this was a simple
oversight by OP who wanted to mention the other category.
The second problem is that the claim that there's no other use case is
not backed up at all. We can't know what other existing software such
a backwards-incompatible change would break.
> And when those programs use Forms-B they assume they
> have the same directionality as LTR characters
Unicode is clear that Forms-B characters have RTL directionality
(which I believe is a good thing, because this way the correct
ordering of the letters remains orthogonal to shaping). If a piece of
software assumes otherwise then that piece of software is not
Unicode/UBA-conformant. The solution is to adjust that software to
match Unicode, not the other way around.
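The current classification is easy to check. A minimal sketch, assuming
Python's standard unicodedata module (which reflects whatever Unicode
version your Python build ships with); the sample characters are just
ones I picked for illustration:

```python
import unicodedata

samples = {
    "ARABIC LETTER BEH (U+0628)": "\u0628",
    "BEH INITIAL FORM, Forms-B (U+FE91)": "\uFE91",
    "HEBREW LETTER ALEF (U+05D0)": "\u05D0",
    "LATIN CAPITAL LETTER A (U+0041)": "A",
}

# Look up the Bidi_Class of each sample character.
bidi_classes = {label: unicodedata.bidirectional(ch)
                for label, ch in samples.items()}

for label, cls in bidi_classes.items():
    print(f"{label}: {cls}")
```

The Forms-B letter reports the same class (AL) as the regular Arabic
letter, the Hebrew letter reports R, and the Latin letter reports L.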
Please see my detailed arguments in the discussion already linked by
OP (i.e. not just that particular linked comment but the entire
discussion).
I am truly hoping that after a careful analysis of the situation you
will conclude that Unicode's current behavior here is much better than
the proposed one would be.
Thanks a lot,
Egmont Koblinger
(VTE and GNOME Terminal co-developer, author and VTE-implementer of
the "BiDi in Terminal Emulators" proposal)