Different Bidirectional Character Types

Sat Jul 2 05:13:53 CDT 2022

> Date: Sat, 2 Jul 2022 10:54:46 +0100
> From: Richard Wordingham via Unicode <unicode at corp.unicode.org>
> 
> On Sat, 2 Jul 2022 11:01:00 +0200
> Hans Åberg via Unicode <unicode at corp.unicode.org> wrote:
> 
> > > On 1 Jul 2022, at 14:15, Andreas Prilop via Unicode
> > > <unicode at corp.unicode.org> wrote:
> > > 
> > > Reference:
> > > https://unicode.org/reports/tr9/#Bidirectional_Character_Types
> > > 
> > > Why do Hebrew letters and Arabic letters have different
> > > bidirectional character types?  
> > 
> > I cannot parse this, but in Hebrew, Arabic, and Persian, text is
> > written RTL, but numbers LTR. For example, trying A123 in a
> > translator supporting those scripts, I get: א123 أ ١٢٣
> > ا ۱۲۳
> > 
> > 
> 
> For numbers, using natural language, you don't mean LTR, but 'with the
> most significant digit on the left'.  It is a convention that when
> encoding 'four and twenty' using digits, the most significant digit is
> stored first.  N'ko decimal numbers have the most significant digit on
> the right, with the result that N'ko digits have bidi class
> Right_To_Left, as do N'ko letters.
> 
> As to parsing the question, at the literal level Hebrew letters have
> bidi class Right_To_Left (R) while Arabic letters have bidi class
> Arabic_Letter (AL); Moroccan decimal digits (e.g. U+0030) have bidi
> class European_Number (EN), Egyptian decimal digits have bidi class
> Arabic_Number (AN), Urdu decimal digits have bidi class European_Number
> (EN) and Hindi decimal digits (e.g. U+0966) have bidi class
> Left_To_Right (L).  When one throws dollar signs, which have bidi
> class European_Terminator (ET), into the mix, these differences matter to
> the bidi algorithm.

I think a simpler answer is that Arabic letters (bidi class AL) in
some cases make European Numbers (EN) behave like Arabic Numbers (AN);
see rule W2 of UAX#9.  And Arabic Numbers then affect how other "weak"
characters are reordered; see W6.
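
As a rough illustration of W2 only (not the full algorithm: this sketch
ignores embedding levels, isolating run sequences and rule W1, and the
helper name apply_w2 is made up for this example), Python's standard
unicodedata module can show how the preceding strong character changes
the class of ASCII digits:

```python
import unicodedata

# Strong bidi classes that W2 searches backward for.
STRONG = {"L", "R", "AL"}

def apply_w2(text):
    """Return each character's bidi class after a simplified W2 pass."""
    last_strong = None  # stands in for "sos" until a strong type is seen
    resolved = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG:
            last_strong = cls
        elif cls == "EN" and last_strong == "AL":
            cls = "AN"  # W2: a European number after an Arabic letter becomes AN
        resolved.append(cls)
    return resolved

hebrew = apply_w2("\u05D0123")  # HEBREW LETTER ALEF followed by 123
arabic = apply_w2("\u0627123")  # ARABIC LETTER ALEF followed by 123
print(hebrew)
print(arabic)
```

After Hebrew alef the digits should stay EN; after Arabic alef they
should resolve to AN, which is the difference the R/AL split exists to
express.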

IOW, these distinctions are needed to produce the expected display
order in each case.
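
For anyone who wants to check the classes mentioned in this thread, a
small sketch using Python's unicodedata module follows.  The values it
prints come from whatever Unicode version the local Python ships, and
the helper name bidi_classes and the choice of sample code points are
just for this example:

```python
import unicodedata

# Sample code points for the characters discussed in this thread.
SAMPLES = [
    "\u05D0",  # HEBREW LETTER ALEF
    "\u0627",  # ARABIC LETTER ALEF
    "\u0030",  # DIGIT ZERO
    "\u0660",  # ARABIC-INDIC DIGIT ZERO
    "\u06F0",  # EXTENDED ARABIC-INDIC DIGIT ZERO
    "\u0966",  # DEVANAGARI DIGIT ZERO
    "\u07C0",  # NKO DIGIT ZERO
    "\u0024",  # DOLLAR SIGN
]

def bidi_classes(chars):
    """Map 'U+XXXX' labels to the bidi class abbreviation of each character."""
    return {f"U+{ord(c):04X}": unicodedata.bidirectional(c) for c in chars}

for label, cls in bidi_classes(SAMPLES).items():
    name = unicodedata.name(chr(int(label[2:], 16)))
    print(f"{label}  {cls:<3} {name}")
```

The abbreviations printed (R, AL, EN, AN, L, ET) are the short forms of
the class names used in the quoted message.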