<div dir="ltr"><div><a href="https://stackoverflow.com/questions/79104685/in-unicode-why-%e0%a5%98-is-excluded-from-composition-whereas-%c3%85-is-not">This question on Stack Overflow</a> sent me on a wild Google spree yesterday. I was trying to find out why certain characters are included in the Composition_Exclusion set, particularly the Devanagari, Bengali, Gurmukhi, and Oriya letters with nukta, but I wasn’t able to locate any relevant documents from the time those decisions were made.</div><div><br></div><div>As I understand it (and I believe this was even the wording used in previous versions of UAX #15), the script-specific exclusions exist because for a handful of characters the fully decomposed form is the preferred representation in regular usage. This makes sense to me for the precomposed Hebrew letters: with so many combining marks that each have their own CCC value, it just seems easier to deal exclusively with combining character sequences and not have some random marks
“glue” themselves to the base letter. The two-part Tibetan subjoined letters are similar in this regard.</div><div><br></div><div>However, the Indic nuktas seem entirely unproblematic and in fact not all precomposed letters with nukta are composition-excluded: Devanagari has
ऩ,
ऱ, and
ऴ for example.</div><div><br></div><div>Does anyone remember what led to these specific decisions, or know where to find the relevant documents if they exist?<br></div></div>
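<div dir="ltr"><div><br></div><div>For anyone who wants to reproduce the asymmetry, here is a minimal Python sketch using the standard <code>unicodedata</code> module. The helper name <code>recomposes</code> is my own invention for illustration, and the results depend on the Unicode data bundled with your Python build, so treat this as a quick check rather than a reference.</div></div>

```python
import unicodedata

def recomposes(ch: str) -> bool:
    """Return True if NFC reassembles the fully decomposed form of `ch`.

    Hypothetical helper: a composition-excluded character decomposes
    under NFD but is not put back together by NFC.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    return unicodedata.normalize("NFC", decomposed) == ch

# Å, three non-excluded Devanagari nukta letters, one excluded
# Devanagari nukta letter, and one excluded Hebrew letter.
samples = ["\u00C5", "\u0929", "\u0931", "\u0934", "\u0958", "\uFB2A"]

for ch in samples:
    nfc = unicodedata.normalize("NFC", ch)
    nfc_points = " ".join(f"U+{ord(c):04X}" for c in nfc)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"recomposes={recomposes(ch)}, NFC={nfc_points}")
```

<div dir="ltr"><div>With this, U+0929, U+0931, and U+0934 survive NFC as single code points just like U+00C5 does, whereas U+0958 comes out as the sequence U+0915 U+093C.</div></div>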