Re: Wrong sequence for Arabic ligature marks (FC5E-FC62, FCF2-FCF4)

Richard Wordingham richard.wordingham at ntlworld.com
Sat Feb 19 06:52:37 CST 2022


On Sat, 19 Feb 2022 10:20:31 +0000
Saeed Hubaishan via Unicode <unicode at corp.unicode.org> wrote:

> But we have a problem with some programs that get their data from
> unicode like "MediaWiki" and "phpBB" they reorder لَّ
> to
> لَّ

In code points, that is <U+0644 LAM, U+0651 SHADDA, U+064E FATHA> to
<U+0644, U+064E, U+0651>. No process compliant with Unicode shall
*deliberately* render them differently; the sequences are canonically
equivalent.
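
The equivalence can be checked with a minimal sketch in Python, assuming
only the standard unicodedata module. The variable names are mine, chosen
for illustration. FATHA has canonical combining class 30 and SHADDA has
33, so normalisation moves the fatha in front of the shadda:

```python
import unicodedata

# Lam + shadda + fatha (as typed) and lam + fatha + shadda (as reordered).
typed = "\u0644\u0651\u064E"
reordered = "\u0644\u064E\u0651"

# Canonical combining classes of the typed sequence: base, SHADDA, FATHA.
classes = [unicodedata.combining(c) for c in typed]

# Both spellings normalise to the same string, so they are
# canonically equivalent.
nfc_typed = unicodedata.normalize("NFC", typed)
nfc_reordered = unicodedata.normalize("NFC", reordered)

print(classes)
print(nfc_typed == nfc_reordered)
```

This is presumably what MediaWiki and phpBB are doing when they normalise
their input; I have not checked their code.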

> which may be rendered in some old Windows fonts like
> لِّ
> 
> you can try this with wikipedia

This sequence is <U+0644, U+0651, U+0650 KASRA>, which is not canonically
normalised. Using the Naskh font Amiri, kasra is by default placed below
lam. However, if I enable OpenType feature ss05, which for this font is
described (unless the labels have been scrambled) as "Kasra is placed
below Shadda instead of base glyph", the kasra is indeed placed
immediately below the shadda. Unicode allows both renderings.
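
As a rough sketch of why that sequence is not normalised, the following
Python lists the combining classes and tests the canonical ordering. The
helper names describe and in_canonical_order are hypothetical, written
for this example; the second checks only the ordering of marks, not full
normalisation:

```python
import unicodedata

def describe(seq):
    """Return (code point, name, combining class) for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.combining(c))
            for c in seq]

def in_canonical_order(seq):
    """True if no mark is followed by a mark of lower non-zero class."""
    classes = [unicodedata.combining(c) for c in seq]
    return all(b == 0 or a <= b for a, b in zip(classes, classes[1:]))

seq = "\u0644\u0651\u0650"  # lam, shadda, kasra
for entry in describe(seq):
    print(entry)

# KASRA (class 32) follows SHADDA (class 33), so the marks are out of order.
print(in_canonical_order(seq))

# Normalisation swaps the two marks: lam, kasra, shadda.
normalized = unicodedata.normalize("NFC", seq)
print(in_canonical_order(normalized))
```

Either order is canonically equivalent, so this says nothing about which
of the two renderings a font should choose.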

I'm not sure that Unicode provides any plain text mechanism to
distinguish the two renderings.

In answer to Eli, the Amiri font is the one I downloaded to
get LAM and HAH to automatically ligate; I got it from Ubuntu package
fonts-hosny-amiri.  The font is published under the SIL Open Font
Licence.

Richard.
