U+0F81 Canonical Combining Class?

mail at dzfrias.dev
Tue Jul 29 12:47:54 CDT 2025


The Tibetan Unicode block contains several characters (U+0F73, U+0F75, U+0F81) that have a canonical combining class of zero and a non-empty decomposition mapping. That alone is not unusual, but when I inspected the code points they decompose to, I found that every one of them has a canonical combining class greater than zero.

In the case of U+0F81, the decomposition mapping is: U+0F71 U+0F80. Both U+0F71 and U+0F80 have canonical combining class values greater than zero, so U+0F81 decomposes solely into combining marks, yet has a canonical combining class value of zero.
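For anyone who wants to check these values, here is a minimal sketch using Python's standard unicodedata module. The helper name describe is my own, and the values it reports come from whichever Unicode version ships with the interpreter, so treat it as a quick check rather than an authoritative reading of the data files.

```python
import unicodedata

def describe(cp):
    """Return (ccc, canonical decomposition as a list of code points)."""
    ch = chr(cp)
    ccc = unicodedata.combining(ch)
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings begin with a tag such as "<compat>";
    # only canonical mappings are of interest here.
    if fields and fields[0].startswith("<"):
        fields = []
    return ccc, [int(f, 16) for f in fields]

for cp in (0x0F73, 0x0F75, 0x0F81):
    ccc, parts = describe(cp)
    detail = " ".join(
        f"U+{p:04X}(ccc={unicodedata.combining(chr(p))})" for p in parts
    )
    print(f"U+{cp:04X} ccc={ccc} -> {detail}")
```

Each printed line shows the code point's own combining class followed by the combining class of every code point in its decomposition.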

What is the reasoning behind this discrepancy? It is my understanding that U+0F81 (TIBETAN VOWEL SIGN REVERSED II, ཱྀ) is supposed to be a combining mark. Also, the Tibetan block is the only block that contains code points with this behavior. It is likely that I'm misunderstanding the semantics of the canonical combining class system.
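The claim about other blocks can be checked with a scan along these lines. This is only a sketch: the function name is hypothetical, it relies on Python's unicodedata module, and the result depends on the Unicode version bundled with the interpreter.

```python
import sys
import unicodedata

def zero_ccc_with_nonstarter_decomposition():
    """List code points with ccc == 0 whose canonical decomposition
    consists only of code points with ccc > 0."""
    hits = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.combining(ch) != 0:
            continue
        fields = unicodedata.decomposition(ch).split()
        # Skip code points with no mapping or a compatibility mapping.
        if not fields or fields[0].startswith("<"):
            continue
        if all(unicodedata.combining(chr(int(f, 16))) > 0 for f in fields):
            hits.append(cp)
    return hits

for cp in zero_ccc_with_nonstarter_decomposition():
    print(f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}")
```

Running it prints every code point that shows the behavior described above, which makes it easy to see which blocks are affected.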


Diego Frias

