Compatibility decomposition for Hebrew and Greek final letters

Eli Zaretskii eliz at
Thu Feb 19 05:30:22 CST 2015

> From: Michael Everson <everson at>
> Date: Thu, 19 Feb 2015 11:21:19 +0000
> On 19 Feb 2015, at 10:55, Eli Zaretskii <eliz at> wrote:
> > Does anyone know why the UCD defines compatibility decompositions
> > for Arabic initial, medial, and final forms, but doesn't do the same
> > for Hebrew final letters, like U+05DD HEBREW LETTER FINAL MEM?  Or for
> > that matter, for U+03C2 GREEK SMALL LETTER FINAL SIGMA?
> > 
> > The relevant application where this would matter is text search, where
> > these letters might be folded to the same code point for the purposes
> > of comparison.
> Such comparisons happen at a different level, I think. 

Sorry, I'm not sure I follow: different from what?

In any case, regardless of the level, if there's no data to support
such "folding", how can applications implement it (except by inventing
their own data)?
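
To make the point concrete, here is a minimal Python sketch of what
"inventing its own data" looks like in practice.  The table
FINAL_FORM_FOLD and the function fold_final_forms are hypothetical
names of my own; the mappings are hand-written and do not come from
the UCD.  The checks at the end use the standard unicodedata module
to show that the UCD records no decomposition for the Hebrew and
Greek final letters, while an Arabic presentation form does have one.

```python
import unicodedata

# Hand-written folding table; an assumption for illustration, NOT UCD data.
FINAL_FORM_FOLD = {
    0x05DA: 0x05DB,  # HEBREW LETTER FINAL KAF   -> KAF
    0x05DD: 0x05DE,  # HEBREW LETTER FINAL MEM   -> MEM
    0x05DF: 0x05E0,  # HEBREW LETTER FINAL NUN   -> NUN
    0x05E3: 0x05E4,  # HEBREW LETTER FINAL PE    -> PE
    0x05E5: 0x05E6,  # HEBREW LETTER FINAL TSADI -> TSADI
    0x03C2: 0x03C3,  # GREEK SMALL LETTER FINAL SIGMA -> SIGMA
}

def fold_final_forms(text):
    """Map each final form to its non-final counterpart (hypothetical helper)."""
    return text.translate(FINAL_FORM_FOLD)

# The UCD has no decomposition mapping for these two characters ...
print(repr(unicodedata.decomposition("\u05DD")))  # FINAL MEM
print(repr(unicodedata.decomposition("\u03C2")))  # FINAL SIGMA

# ... but it does have one for U+FEE2 ARABIC LETTER MEEM FINAL FORM.
print(repr(unicodedata.decomposition("\uFEE2")))

# Search-style comparison using the hand-made table.
print(fold_final_forms("\u05E9\u05DC\u05D5\u05DD") ==
      fold_final_forms("\u05E9\u05DC\u05D5\u05DE"))
```

Whether such a fold is linguistically appropriate is a separate
question; the sketch only shows that the data has to come from
somewhere other than the decomposition mappings.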

Also, perhaps there are some deep linguistic reasons why such folding
might be inappropriate, and that's why the UCD doesn't define such

