Combining Class of Thai Nonspacing_Marks
Gerriet M. Denkmann
gerrietm at icloud.com
Mon Apr 3 21:39:57 CDT 2017
> On Mon, 3 Apr 2017 14:12:51 +0700
> "Gerriet M. Denkmann" <gerrietm at icloud.com> wrote:
>> The Combining Class is used for normalisation of strings.
>> Normalisation of strings is important for filenames in filesystems.
>> As far as I know, a Thai consonant (Lo, Other_Letter) can have
>> several Nonspacing_Marks. This cluster of nonspacing marks can
>> contain at most one top/bottom vowel and at most one tone/other mark.
>> There is no syntactic meaning in the order of these nonspacing
> You're confusing the modern Thai language with the Thai script. It
> seems that the Lao-style usage of NIKHAHIT as a vowel is known from
> older Thai writing, and when used this way it could of course take a
> tone mark. It also seems that the pressure to have both MAITAIKHU and
> a tone mark on a consonant has been accepted for at least one minority
I stand corrected. I know nothing about other languages written with Thai characters.
So the rule should be:
A consonant may have zero or one tone/other marks and also zero or one top/bottom vowels.
NIKHAHIT + tone mark (no top/bottom vowel)
MAITAIKHU + tone mark (no top/bottom vowel)
The order of these has no semantic meaning.
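
To make the practical problem concrete, here is a minimal Python sketch using the standard unicodedata module. It assumes the combining classes in Python's bundled Unicode database (SARA U = 103, MAI EK = 107, SARA I = 0); the helper names ccc and same_after_nfc are mine, for illustration only. It shows that normalisation already unifies the two orders for a bottom vowel plus tone mark, but not for a top vowel plus tone mark.

```python
import unicodedata

KO_KAI = "\u0E01"  # THAI CHARACTER KO KAI (consonant)
SARA_I = "\u0E34"  # THAI CHARACTER SARA I (top vowel)
SARA_U = "\u0E38"  # THAI CHARACTER SARA U (bottom vowel)
MAI_EK = "\u0E48"  # THAI CHARACTER MAI EK (tone mark)


def ccc(ch):
    """Return the canonical combining class of a single character."""
    return unicodedata.combining(ch)


def same_after_nfc(a, b):
    """True if both strings are identical after NFC normalisation."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


for name, ch in [("SARA I", SARA_I), ("SARA U", SARA_U), ("MAI EK", MAI_EK)]:
    print(name, ccc(ch))

# Bottom vowel (class 103) and tone mark (class 107): both orders
# normalise to the same string.
print(same_after_nfc(KO_KAI + SARA_U + MAI_EK, KO_KAI + MAI_EK + SARA_U))

# Top vowel (class 0) and tone mark (class 107): a class-0 mark blocks
# reordering, so the two orders stay distinct.
print(same_after_nfc(KO_KAI + SARA_I + MAI_EK, KO_KAI + MAI_EK + SARA_I))
```

With those class values the first comparison is true and the second is false, so two filenames that differ only in the order of a top vowel and a tone mark remain different after normalisation.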
All top/bottom vowels should have Combining Class 103,
other marks should have Combining Class x (with 103 < x < 107),
tone marks should have Combining Class 107.
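
The effect of such an assignment could be sketched as below. This is not Unicode normalisation and not anything that exists today: the table PROPOSED_CLASS, the placeholder value 105 for x, the function reorder_marks, and my choice of which code points count as "top/bottom vowels" and "other marks" are all assumptions made only to illustrate the idea.

```python
# Hypothetical placeholder for "x" with 103 < x < 107.
OTHER_MARK_CLASS = 105

# Hypothetical class table; the grouping of code points is an assumption.
PROPOSED_CLASS = {}
# Top/bottom vowels: MAI HAN-AKAT, SARA I..SARA UEE, SARA U, SARA UU
for cp in (0x0E31, 0x0E34, 0x0E35, 0x0E36, 0x0E37, 0x0E38, 0x0E39):
    PROPOSED_CLASS[chr(cp)] = 103
# Other marks: MAITAIKHU, THANTHAKHAT, NIKHAHIT, YAMAKKAN
for cp in (0x0E47, 0x0E4C, 0x0E4D, 0x0E4E):
    PROPOSED_CLASS[chr(cp)] = OTHER_MARK_CLASS
# Tone marks: MAI EK, MAI THO, MAI TRI, MAI CHATTAWA
for cp in (0x0E48, 0x0E49, 0x0E4A, 0x0E4B):
    PROPOSED_CLASS[chr(cp)] = 107


def reorder_marks(text):
    """Stable-sort each run of listed marks by its proposed class."""
    out = []
    run = []
    for ch in text:
        if ch in PROPOSED_CLASS:
            run.append(ch)
        else:
            # A character outside the table ends the current run of marks.
            out.extend(sorted(run, key=PROPOSED_CLASS.get))
            run = []
            out.append(ch)
    out.extend(sorted(run, key=PROPOSED_CLASS.get))
    return "".join(out)


# Tone mark typed before the top vowel: both orders give the same result.
a = reorder_marks("\u0E01\u0E48\u0E34")  # KO KAI, MAI EK, SARA I
b = reorder_marks("\u0E01\u0E34\u0E48")  # KO KAI, SARA I, MAI EK
print(a == b)
```

Under these assumed classes, a tone mark always ends up after a vowel or other mark on the same consonant, whichever order was typed.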
Is anybody working on these things, or responsible for them?