Composition / Decomposition of Tibetan oM (0F00)

Élie Roux via Indic indic at unicode.org
Mon Mar 19 04:13:39 CDT 2018


Dear Richard,

> Sacred syllable v. run of the mill syllable?

Hmm, OK, let me ask more direct questions, on two different aspects of
the problem:

1. There are a lot of sacred syllables in Tibetan, so why was this one
in particular given its own code point? Hung (U+0F67 U+0F71 U+0F74
U+0F82) is at least as sacred and as widespread...

2. Why isn't U+0F00 considered a composition of U+0F68 U+0F7C U+0F7E in
UnicodeData.txt? What I see is:

0F00;TIBETAN SYLLABLE OM;Lo;0;L;;;;;N;;;;;

while I believe it should contain

0F00;TIBETAN SYLLABLE OM;Lo;0;L;0F68 0F7C 0F7E;;;;N;;;;;

(same for 0F02 and 0F03).
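
To make the difference concrete, here is a minimal Python sketch that
reads the decomposition field out of both records quoted above and checks
what the standard unicodedata module reports. The helper name
decomposition_field and the constants CURRENT_RECORD and PROPOSED_RECORD
are made up for illustration. The proposed record is only the suggestion
from this message, not something present in UnicodeData.txt.

```python
import unicodedata

# The record as it appears in UnicodeData.txt, and the record suggested
# in this message (hypothetical, not part of the published data).
CURRENT_RECORD = "0F00;TIBETAN SYLLABLE OM;Lo;0;L;;;;;N;;;;;"
PROPOSED_RECORD = "0F00;TIBETAN SYLLABLE OM;Lo;0;L;0F68 0F7C 0F7E;;;;N;;;;;"


def decomposition_field(record: str) -> str:
    """Return field 5 (0-based) of a UnicodeData.txt record,
    which holds the decomposition mapping (empty if there is none)."""
    return record.split(";")[5]


om = "\u0F00"                      # TIBETAN SYLLABLE OM
spelled_out = "\u0F68\u0F7C\u0F7E"  # the sequence discussed above

current_field = decomposition_field(CURRENT_RECORD)
proposed_field = decomposition_field(PROPOSED_RECORD)

# What the Unicode database bundled with Python reports for U+0F00.
runtime_decomposition = unicodedata.decomposition(om)

# With no decomposition mapping, normalization cannot unify the two forms.
same_under_nfd = (
    unicodedata.normalize("NFD", om) == unicodedata.normalize("NFD", spelled_out)
)
same_under_nfkd = (
    unicodedata.normalize("NFKD", om) == unicodedata.normalize("NFKD", spelled_out)
)

print(repr(current_field), repr(proposed_field))
print(repr(runtime_decomposition), same_under_nfd, same_under_nfkd)
```

So with the data as published, the two spellings stay distinct under
normalization, which is the behaviour I am asking about.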

> For example, under the UCA default collation, U+0F00 and <U+0F68,
> U+0F7C, U+0F7E> are no more different than upper and lower case in
> English.

Hmmm thanks a lot for that! This seems to be somewhat new, but indeed I
can see

0F00  ; [.2F19.0020.0004][.2F30.0020.0004][.0000.00C4.0004] # TIBETAN SYLLABLE OM

in http://www.unicode.org/Public/UCA/10.0.0/allkeys.txt
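
For anyone who wants to look at that entry programmatically, here is a
rough Python sketch that splits the line into its collation elements. The
names ENTRY, ELEMENT and parse_entry are mine, and the regular expression
only covers the bracketed [.pppp.ssss.tttt] shape seen in this one line,
so treat it as an assumption rather than a full allkeys.txt parser. It
does not compare against the entries for U+0F68, U+0F7C and U+0F7E, since
I have not quoted those here.

```python
import re

# The allkeys.txt entry quoted above, on a single line.
ENTRY = (
    "0F00  ; [.2F19.0020.0004][.2F30.0020.0004][.0000.00C4.0004]"
    " # TIBETAN SYLLABLE OM"
)

# One collation element: a '.' or '*' marker, then three hex weights.
ELEMENT = re.compile(
    r"\[[.*]([0-9A-F]{4,})\.([0-9A-F]{4,})\.([0-9A-F]{4,})\]"
)


def parse_entry(line: str):
    """Split an allkeys.txt-style line into code points and a list of
    (primary, secondary, tertiary) weight tuples."""
    body = line.split("#", 1)[0]          # drop the trailing comment
    chars, elements = body.split(";", 1)  # code points ; collation elements
    code_points = [int(cp, 16) for cp in chars.split()]
    weights = [
        (int(p, 16), int(s, 16), int(t, 16))
        for p, s, t in ELEMENT.findall(elements)
    ]
    return code_points, weights


code_points, weights = parse_entry(ENTRY)

# Primary-level key: the non-zero primary weights, in order.
primary_key = [p for p, _, _ in weights if p != 0]

print([f"U+{cp:04X}" for cp in code_points])
print([tuple(f"{w:04X}" for w in element) for element in weights])
print([f"{p:04X}" for p in primary_key])
```

The single code point maps to three collation elements, the last of which
has a zero primary weight.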

So I guess I'm even more eager to have some clues on my question number
2: if the UCA acknowledges that the composed and decomposed forms have
the same weight, why doesn't UnicodeData.txt list them as a
composition/decomposition pair?

Thank you,
-- 
Elie

