What is the time frame for USE shapers to provide support for CV+C ?

Richard Wordingham via Unicode unicode at unicode.org
Thu May 9 14:04:38 CDT 2019


On Thu, 9 May 2019 11:55:23 -0400
Ed Trager via Unicode <unicode at unicode.org> wrote:
 
> ** A good use case is the Tai Tham word U+1A27 U+1A6A U+1A60 U+1A37 ,
> transcribed to Central Thai script as จูบ, (*to kiss*). Currently,
> people are writing this as U+1A27 U+1A60 U+1A37 U+1A6A ("จบู") which
> violates the "phonetic ordering" but is the current workaround
> because USE is still broken for TAI THAM.
> 
> REFERENCE DOCUMENT:
> http://www.unicode.org/L2/L2018/18332-tai-tham-ad-hoc-report.pdf

How is this a good test case?  The 6th preliminary recommendation
reads, "To represent a cluster, regardless of the phonetic order CCV or
CVC, a consonant sign should always be encoded before the vowel sign,
unless the vowel sign has inline advance and is apparently followed by
the consonant sign".  If this recommendation is adopted, then the
spelling "U+1A27 U+1A6A U+1A60 U+1A37" will be wrong.
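
To make the two spellings under discussion easier to compare, here is a minimal Python sketch that lists the code points of each. The code points are the ones quoted above; the names PHONETIC, WORKAROUND and describe are only illustrative, and the character names printed depend on the Unicode data shipped with the Python build.

```python
import unicodedata

# Code points as quoted in the message above; variable names are illustrative.
PHONETIC = "\u1A27\u1A6A\u1A60\u1A37"    # vowel sign written before SAKOT + BA
WORKAROUND = "\u1A27\u1A60\u1A37\u1A6A"  # vowel sign written after SAKOT + BA


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


for label, text in (("phonetic", PHONETIC), ("workaround", WORKAROUND)):
    print(label)
    for line in describe(text):
        print("  " + line)

# Both spellings use the same four characters; only the order differs.
print(sorted(PHONETIC) == sorted(WORKAROUND))
print(PHONETIC == WORKAROUND)
```

The two strings contain the same characters, so the disagreement is purely about where the vowel sign goes relative to SAKOT and the subscript consonant.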

Now, SIGN U and SIGN UU before subscript BA, HIGH PA and LOW YA aren't
always written as though they followed the subscript consonants in
phonetic order.  Sometimes the vowel sign is written in the bottom left
of the syllable.  Presumably we'll need 3 or 4 new signs:

TAI THAM UNAMBIGUOUS UB

TAI THAM UNAMBIGUOUS UUB

TAI THAM UNAMBIGUOUS UY

TAI THAM UNAMBIGUOUS UUY (?)

I'm not sure that the fourth one can occur.

An example of the contrast is shown in the attached files luynam.png,
with first orthographic syllable <LA, SIGN U, SAKOT, LOW YA>, and
yukya.png, with the first orthographic syllable <HIGH HA, SAKOT, LOW
YA, SIGN U>. 

I wonder how we'd be supposed to encode ᩉᩖᩩ᩠᩶ᨿ 'to crawl' (currently
<HIGH HA, MEDIAL LA, SIGN U, TONE-2, SAKOT, LOW YA>).  The simplest
way would be to encode it as <HIGH HA, MEDIAL LA, SAKOT, LOW YA,
SIGN U, TONE-2>, which currently encodes the unlikely ᩉᩖ᩠ᨿᩩ᩶.
Will good fonts be expected to move the vowel left
and down from the subscript LOW YA to the MEDIAL LA?  Or will we need to
encode it with *TAI THAM UNAMBIGUOUS UY?
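
As a rough illustration of the ordering question, the sketch below checks whether SIGN U or SIGN UU appears before a SAKOT in a code point sequence. This is only a simplified reading of the recommendation quoted earlier (it ignores the inline-advance exception and every other vowel sign), and the function and variable names are hypothetical.

```python
SAKOT = 0x1A60
SIGN_U, SIGN_UU = 0x1A69, 0x1A6A


def vowel_precedes_sakot(code_points):
    """Return True if SIGN U or SIGN UU occurs anywhere before a SAKOT."""
    seen_vowel = False
    for cp in code_points:
        if cp in (SIGN_U, SIGN_UU):
            seen_vowel = True
        elif cp == SAKOT and seen_vowel:
            return True
    return False


# <HIGH HA, MEDIAL LA, SIGN U, TONE-2, SAKOT, LOW YA>
CURRENT = [0x1A49, 0x1A56, 0x1A69, 0x1A76, 0x1A60, 0x1A3F]
# <HIGH HA, MEDIAL LA, SAKOT, LOW YA, SIGN U, TONE-2>
REORDERED = [0x1A49, 0x1A56, 0x1A60, 0x1A3F, 0x1A69, 0x1A76]

print(vowel_precedes_sakot(CURRENT))    # True
print(vowel_precedes_sakot(REORDERED))  # False
```

A check like this only says which order a string uses; it says nothing about whether a font will place the vowel under the MEDIAL LA or under the subscript LOW YA.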

Richard.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: luynam.png
Type: image/png
Size: 2132 bytes
Desc: not available
URL: <http://unicode.org/pipermail/unicode/attachments/20190509/13358005/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: yukya.png
Type: image/png
Size: 2406 bytes
Desc: not available
URL: <http://unicode.org/pipermail/unicode/attachments/20190509/13358005/attachment-0001.png>

