lorna_evans at sil.org
Mon Apr 26 16:50:40 CDT 2021
I've got a situation that I'm not sure how to handle...or even whether
Unicode or the rendering engines need an update.
In a language using Syriac there is a /rish seyame/ which can be
followed by U+0739 or U+0738
/rish/ = U+072A
/seyame/ = U+0308
In TUS, chapter 9, it says:
> In Modern Syriac usage, when a word contains a /rish/ and a /seyame/,
> the dot of the /rish/ and the /seyame/ are replaced by a /rish/ with
> two dots above it.
Then, there's a table which indicates this ligature is obligatory:
> Table 9-17. Syriac Ligatures
> Ligature Classes. As in other scripts, ligatures in Syriac vary
> depending on the font style.
> Table 9-17 identifies the principal valid ligatures for each font
> style. When applicable, these
> ligatures are obligatory, unless denoted with an asterisk (*).
> rish seyame    Right-joining    Right-joining    Right-joining    BFBS
(no asterisk, so it is obligatory)
Finally, the "Glossary" section of "Developing OpenType Fonts for Syriac
Script" says:
> *Ligature* - A combination of glyphs that join to form a single glyph.
> For example, the 'rish seyame' (U072a + U0308) combinations of glyphs
> are mandatory ligatures for Syriac. Other ligatures are optional.
So, it seems clear that 072a+0308 is a mandatory ligature. The problem
I'm seeing is that when this ligature is followed by U+0739 or U+0738
AND an application does normalization, it changes the sequence to U+072A
U+0739 U+0308 and that breaks the ligature.
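The reordering can be reproduced with a minimal sketch, assuming Python's standard unicodedata module. The names typed and codepoints are illustrative only and do not come from any of the applications in question.

```python
import unicodedata

# rish + seyame (combining diaeresis) + dotted zlama angular, in typed order
typed = "\u072A\u0308\u0739"

def codepoints(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Both composed and decomposed normalization forms apply canonical ordering,
# so the below mark is moved in front of the seyame in either case.
for form in ("NFC", "NFD"):
    print(form, codepoints(unicodedata.normalize(form, typed)))
```

Both forms should print U+072A U+0739 U+0308, the sequence that the fonts do not handle.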
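You can see why the sequence gets reordered when you look at the
canonical combining classes: U+0308 is 230, while U+0738 and U+0739
are 220, so normalization sorts the below mark ahead of the /seyame/.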
0308;COMBINING DIAERESIS;Mn;*230*;NSM;;;;;N;NON-SPACING DIAERESIS;;;;
0738;SYRIAC DOTTED ZLAMA HORIZONTAL;Mn;*220*;NSM;;;;;N;;;;;
0739;SYRIAC DOTTED ZLAMA ANGULAR;Mn;*220*;NSM;;;;;N;;;;;
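The same combining class values can be read programmatically. This is a rough sketch, assuming Python's standard unicodedata module. The helper canonical_order is hypothetical and only illustrates the stable sort by combining class that normalization applies to a run of combining marks.

```python
import unicodedata

marks = {
    "COMBINING DIAERESIS (seyame)": "\u0308",
    "SYRIAC DOTTED ZLAMA HORIZONTAL": "\u0738",
    "SYRIAC DOTTED ZLAMA ANGULAR": "\u0739",
}

# Print the canonical combining class (fourth field of UnicodeData.txt).
for name, ch in marks.items():
    print(f"U+{ord(ch):04X} {name}: ccc={unicodedata.combining(ch)}")

def canonical_order(run):
    """Stable-sort a run of combining marks by combining class.

    Hypothetical helper: it assumes every character in `run` has a
    nonzero combining class (no base letters in the run).
    """
    return "".join(sorted(run, key=unicodedata.combining))

# seyame (230) followed by zlama (220) comes out as zlama, then seyame.
reordered = canonical_order("\u0308\u0739")
print(" ".join(f"U+{ord(c):04X}" for c in reordered))
```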
All of the Syriac fonts that I see only handle this sequence *U+072A
U+0308 U+0739* and not the reordered *U+072A U+0739 U+0308*
Are the fonts wrong? Should they be able to handle U+072A U+0739 U+0308?
Or, is there a special normalization rule for this?
How should /rish seyame/ followed by a below mark like U+0738 or U+0739
be handled?