USE Indic Syllabic Category
Richard Wordingham via Unicode
unicode at unicode.org
Sat Feb 23 08:07:53 CST 2019
On Sat, 23 Feb 2019 14:46:27 +0800
梁海 Liang Hai via Unicode <unicode at unicode.org> wrote:
> >>> once the USE acknowledges that subjoined consonants may follow
> >>> vowels
> >> I expect to update the USE spec to address this soon.
> > That seems welcome news. I still don't know what the problem with
> > supporting them has been.
> USE wasn’t designed to allow such a syllable structure. Tai Tham’s
> being supported by USE is kind of an oversight. And although it’s
> appropriate to allow conjoined consonants to follow post-base-spacing
> vowel signs, it’s not really a trivial debate whether USE should
> allow conjoined consonants to follow non-post-base-spacing (i.e., pre-base,
> above-base, and below-base) vowel signs—considering the ambiguity.
What are your thoughts on the handling of 'medial consonants'? My
best surmise is that the Unicode classification is intended for
subscript consonants that prototypically occur between a phonetically
and orthographically syllable-initial consonant and the possibly
implicit vowel. Significantly, clusters of medial consonants can occur.
However, I am not sure why they should be treated any differently from
subscript consonants. My best hypotheses are that:
1) They can lose any segmental significance in the pronunciation of a
word, e.g. being reduced to encoding features, as in Burmese.
2) Their visual positioning in the onset cluster does not relate to the
phonetic order; for example, medial RA may be written before the
cluster without any anchor in the vertical stack.
From the prototypical behaviour, the USE has deduced the rule that a
medial consonant must be followed by a vowel, albeit implicit. An
implicit vowel does not count if it is removed by a virama (as opposed
to a pure killer). You have suggested that the Indic Syllabic
Category should reflect the structure of strings in scripts more
closely. Do you agree that this deduction goes beyond the implications
of the Unicode categorisation as a medial consonant? Or do you think
that the Unicode concept of 'medial consonant' should be changed?
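To make the deduced rule concrete, here is a toy sketch of it as I have stated it above. It is not the USE grammar: the one-letter category codes and the function name `medial_rule_ok` are invented for illustration, and the only thing checked is whether a medial consonant (or a cluster of them) is immediately followed by a virama.

```python
from typing import Sequence

# Hypothetical one-letter category codes, for this toy model only:
#   "B" base consonant, "M" medial consonant, "V" explicit vowel sign,
#   "H" virama (halant), "K" pure killer.


def medial_rule_ok(cats: Sequence[str]) -> bool:
    """Return False if a run of medial consonants is directly followed
    by a virama, i.e. the implicit vowel after the medial is removed."""
    i = 0
    while i < len(cats):
        if cats[i] == "M":
            # Clusters of medial consonants can occur, so skip the whole run.
            while i < len(cats) and cats[i] == "M":
                i += 1
            # A virama right after the run removes the implicit vowel.
            if i < len(cats) and cats[i] == "H":
                return False
        else:
            i += 1
    return True


print(medial_rule_ok(["B", "M", "V"]))  # explicit vowel follows the medial
print(medial_rule_ok(["B", "M"]))       # implicit vowel is left intact
print(medial_rule_ok(["B", "M", "K"]))  # pure killer: implicit vowel still counts
print(medial_rule_ok(["B", "M", "H"]))  # virama: rule violated
```

In this toy model the outcome depends entirely on whether the killing mark is tagged "H" or "K", so retagging a single character flips the same string between accepted and rejected.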
My feeling is that I should report to Microsoft that the
characterisation of U+1A55 TAI THAM CONSONANT SIGN MEDIAL RA and U+1A56
TAI THAM CONSONANT SIGN MEDIAL LA, both with InSC=Consonant_Medial, as
medial consonants, is wrong for the USE.
There are three ways that these signs fail to correspond to the USE's
model of a medial consonant:
1. The Tai Tham sequences <SAKOT, WA> and <SAKOT, LOW YA> can act as
vowels in Tai Tham languages.
2. The implicit vowel following them can be silenced. Now normally
this should not be a problem, for the vowel killers are categorised as
'pure_killer' (U+1A7A) and 'syllable_modifier' (U+1A7C). The potential
issue revealed itself when U+1A7A was mistagged as 'halant', implying
3. MEDIAL RA can precede a resonant consonant, as in ᨲᩕ᩠ᨶᩬᨾ <HIGH TA,
MEDIAL RA, SAKOT, NA, SIGN OA BELOW, MA> /tʰanɔːm/ (MFL Rev 1 p269).
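For anyone wanting to check the encoding of the example in point 3, the following sketch lists the code points and character names using Python's standard `unicodedata` module. The escapes in `WORD` are my transcription of the sequence description above, and the helper name `describe` is made up for this note. `unicodedata` does not expose the Indic Syllabic Category, so the InSC values would still have to be looked up in the UCD data files.

```python
import unicodedata
from typing import List, Tuple

# Transcribed from <HIGH TA, MEDIAL RA, SAKOT, NA, SIGN OA BELOW, MA>;
# the escapes are an assumption based on that description.
WORD = "\u1A32\u1A55\u1A60\u1A36\u1A6C\u1A3E"


def describe(text: str) -> List[Tuple[str, str]]:
    """Pair each character's U+XXXX notation with its Unicode name."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


for code, name in describe(WORD):
    print(code, name)
```

The listing puts MEDIAL RA (U+1A55) directly before SAKOT (U+1A60) and NA, which is the order point 3 is about.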