Different Bidirectional Character Types

Hans Åberg haberg-1 at telia.com
Sat Jul 2 14:46:52 CDT 2022

> On 2 Jul 2022, at 11:54, Richard Wordingham via Unicode <unicode at corp.unicode.org> wrote:
> On Sat, 2 Jul 2022 11:01:00 +0200
> Hans Åberg via Unicode <unicode at corp.unicode.org> wrote:
>>> On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
>>> <unicode at corp.unicode.org> wrote:
>>> Reference:
>>> https://unicode.org/reports/tr9/#Bidirectional_Character_Types
>>> Why do Hebrew letters and Arabic letters have different
>>> bidirectional character types?  
>> I cannot parse this, but in Hebrew, Arabic, and Persian, text is
>> written RTL, but numbers LTR. For example, trying A123 in a
>> translator supporting those scripts, I get: א123 أ ١٢٣
>> ا ۱۲۳
> For numbers, using natural language, you don't mean LTR, but 'with the
> most significant digit on the left'.

I asked some Arabic speakers how they think about writing numbers, and they said they do think of it as writing LTR, not as RTL with changed endianness. On that view, the digits get the same order in a file with RTL/LTR markers. I assumed this is how Unicode represents it, but a clarification would be nice.
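
One way to check the stored order is a minimal Python sketch using the standard unicodedata module. It takes the characters from the example quoted above, without the spaces, and lists their bidirectional character types and the digit values in the order they are stored. The helper names bidi_classes and stored_digit_values are made up for illustration, and the result depends on the Unicode data shipped with the Python version in use.

```python
import unicodedata

# Strings from the quoted example, written as escapes so the stored
# (logical) order is unambiguous regardless of how an editor displays them.
hebrew = "\u05D0" + "123"                  # HEBREW LETTER ALEF + ASCII digits
arabic = "\u0623" + "\u0661\u0662\u0663"   # ARABIC ALEF WITH HAMZA ABOVE + Arabic-Indic digits
persian = "\u0627" + "\u06F1\u06F2\u06F3"  # ARABIC LETTER ALEF + Extended Arabic-Indic digits


def bidi_classes(text):
    """Return the bidirectional character type of each code point, in stored order."""
    return [unicodedata.bidirectional(ch) for ch in text]


def stored_digit_values(text):
    """Return the numeric value of each decimal digit, in stored order."""
    return [unicodedata.digit(ch) for ch in text if ch.isdigit()]


for label, text in (("Hebrew", hebrew), ("Arabic", arabic), ("Persian", persian)):
    print(label, bidi_classes(text), stored_digit_values(text))
```

With the Unicode data I have, the Hebrew letter is reported as R and the Arabic letters as AL, the Arabic-Indic digits as AN, and both the ASCII and the Extended Arabic-Indic digits as EN. In all three strings the digit values come out as 1, 2, 3, so the most significant digit is stored first.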
