ignoring characters in collation (for Tibetan)
Élie Roux
elie.roux at telecom-bretagne.eu
Mon Jun 8 05:54:26 CDT 2015
Dear all,
When sorting Tibetan, 0F35 and 0F37 should be completely ignored by the
collation algorithm.
An example with rules for Dzongkha in CLDR:
- on line 14 there is the rule འདན<འདབ<འདམ
- I want འད༵བ to sort with the same weight as འདབ, since 0F35 and 0F37
should be ignored
- when sorting ད འདན འད༵བ འདམ ན འ ཡ (correct order) I get ད འདན འདམ ན འ
འད༵བ ཡ (not correct)
So it seems འད༵བ is not treated as equal to འདབ. Is there any way to
specify this with the current spec/implementation? If I have to
duplicate all collation elements to give them a 0F35/0F37 variant, the
table will just explode (it's already huge).
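
To make the intended behaviour concrete, here is a minimal Python sketch.
It is only a preprocessing workaround that strips the two marks before
comparison, not CLDR tailoring syntax. The names strip_ignored, sort_key
and base_key are hypothetical, and plain code point order stands in for a
real Tibetan collator, which happens to be enough for these three words.

```python
# Hypothetical helpers: a preprocessing workaround, not CLDR rule syntax.
# Mapping a code point to None makes str.translate delete it.
IGNORED = {0x0F35: None, 0x0F37: None}


def strip_ignored(text):
    """Remove U+0F35 and U+0F37 so they cannot affect comparison."""
    return text.translate(IGNORED)


def sort_key(text, base_key=None):
    """Build a sort key from the stripped text.

    base_key is an assumed hook for a real collator's sort-key function;
    without it, plain code point order is used as a stand-in.
    """
    stripped = strip_ignored(text)
    return base_key(stripped) if base_key else stripped


words = [
    "\u0f60\u0f51\u0f53",        # འདན
    "\u0f60\u0f51\u0f58",        # འདམ
    "\u0f60\u0f51\u0f35\u0f56",  # འདབ with U+0F35 after the second letter
]
ordered = sorted(words, key=sort_key)
```

With the marks stripped, the marked word gets the same key as འདབ and
lands between འདན and འདམ, which is the result I would like the collation
rules themselves to produce.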
Thank you very much,
--
Elie