[EXTERNAL] Subscript Manual WA (was: Zawgyi Tonemarks in Latin Script)

Andrew Glass Andrew.Glass at microsoft.com
Fri Feb 19 16:49:12 CST 2021


Thank you for the nice examples, Richard.
Indeed, this is up to fonts to enable. Fonts could add a locl feature for Sanskrit to enable this example; that would depend on software passing in the appropriate OT language tag. Or fonts could simply optimize for Sanskrit by default.

Cheers,

Andrew

-----Original Message-----
From: Unicode <unicode-bounces at unicode.org> On Behalf Of Richard Wordingham via Unicode
Sent: 18 February 2021 16:16
To: unicode at unicode.org
Subject: Re: [EXTERNAL] Subscript Manual WA (was: Zawgyi Tonemarks in Latin Script)

On Thu, 18 Feb 2021 19:36:37 +0000
Andrew Glass via Unicode <unicode at unicode.org> wrote:

The lack isn't where I thought it was - it turns out that the shaper specification already supports the non-medial subscript WA!  I tweaked the OpenType lookup in Padauk to generate the glyph for <U+1039, U+101D> to check where the problem lay, but didn't realise that the HarfBuzz test program hb-view would by default use the Graphite shaping!  When I selected the OpenType rendering, I got the correct rendering from the tweaked font.

The problem is that *fonts* seem not to be including the subscript WA, because it isn't required for *Modern Burmese*.  It so happens that the major fonts' rendering of MEDIAL WA is suitable for <VIRAMA, WA> - the pain of overlapping glyph ranges!
 
> Great question Richard, can you provide some examples? Do we have an 
> agreed encoding mechanism for this?

I'll give a detailed answer, though the renderers already have the solution.

The need for a distinction was put forward by Michael Everson et al. in at least the following:

L2/06-029
L2/06-077 p2 (a.k.a. WG2 N3043)
L2/06-213


L2/06-077 p3 states, "Note that kwa with MEDIAL WA may take a teardrop or triangular WA shape, which is never the case with true subjoined WA (which is rare, though it occurs in Sanskrit)."

Martin Hosken put forward other arguments, but I'm not sure that they were found convincing.

As to examples, just look at the absolutives in
https://www.alamy.com/stock-photo-burmese-writing-pali-canon-buddhist-canon-tripitaka-library-of-stone-21244784.html .
I'd been going to say look for -itvā for both Pali and Sanskrit, but this form seems commoner in word lists than in actual text.

TUS 13.0 Section 16.3 p647 says, "In Pali and Sanskrit texts written in the Myanmar script, as well as in older orthographies of Burmese, the consonants ya, ra, wa, and ha are sometimes rendered in subjoined form.
In those cases, U+1039 ္ myanmar sign virama and the regular form of the consonant are used."

Thus, examples abound, and the encoding is defined.  The code chart currently shows a teardrop shape for U+103D MYANMAR CONSONANT SIGN MEDIAL WA - that would not be suitable for <VIRAMA, WA>.
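
For concreteness, the two encodings can be told apart at the code point level. The following is only a minimal Python sketch: the choice of KA as base consonant and the helper names describe and has_subjoined_wa are illustrative assumptions, not anything taken from the standard or from a font.

```python
import unicodedata

# KA is only an illustrative base consonant; any consonant would do.
KA = "\u1000"
SUBJOINED_WA = KA + "\u1039\u101D"  # <KA, VIRAMA, WA>: subjoined WA (Pali, Sanskrit)
MEDIAL_WA = KA + "\u103D"           # <KA, MEDIAL WA>: Modern Burmese medial


def describe(text):
    """Return 'U+XXXX NAME' for each code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def has_subjoined_wa(text):
    """True if text contains the sequence <U+1039, U+101D> (hypothetical helper)."""
    return "\u1039\u101D" in text


for label, seq in (("subjoined", SUBJOINED_WA), ("medial", MEDIAL_WA)):
    print(label, describe(seq), has_subjoined_wa(seq))
```

Whether the two sequences then render differently is a separate matter that depends on the font and the shaper, as discussed above.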

Richard.



