Zawgyi Tonemarks in Latin Script

Vinodh Rajan vinodh.vinodh at gmail.com
Wed Feb 17 15:12:37 CST 2021


>
>  If you Romanize text, why would you use marks of the original script? I
> think Romanization schemes typically map marks to some combining
> marks commonly used for Latin letters or some punctuation or special
> characters.
>

 I was composing a document literally yesterday, which required me to do
this.

[image: image.png]
I had to choose a font that does not contain the dotted circle to
circumvent the rendering engines.

Thai has three viramas (sort of) and it makes sense to use the original
marks in the romanization to retain the differentiation.  I can of course
invent three new diacritic marks that work with Latin letters. But it is a
one-off thing. It doesn't make sense to include a note explaining my ad-hoc
conventions just for that one word. It's just too laborious.
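
For anyone who wants to inspect the kind of sequences discussed in this
thread, here is a minimal Python sketch. It only lists code points and
flags marks that look foreign to their base letter; it does not reproduce
what any shaping engine actually does. The helper names (describe,
foreign_marks) are made up for illustration, and comparing the first word
of the character names is a crude stand-in for the real Script and
Script_Extensions properties, which the standard unicodedata module does
not expose.

```python
import unicodedata


def describe(text):
    """Return (code point, name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))
        for ch in text
    ]


def foreign_marks(text):
    """List marks whose name prefix differs from that of the preceding base.

    Assumption: the first word of a character name (LATIN, MYANMAR, ...)
    approximates its script. Marks named COMBINING ... are treated as
    script-neutral. This is a heuristic, not the Unicode script property.
    """
    flagged = []
    base_word = None
    for ch in text:
        name = unicodedata.name(ch, "")
        first = name.split(" ")[0] if name else ""
        if unicodedata.category(ch).startswith("M"):
            if base_word and first not in (base_word, "COMBINING"):
                flagged.append(f"U+{ord(ch):04X}")
        else:
            base_word = first
    return flagged


# Sequences mentioned in the quoted message below.
for sample in ("k\u1038", "\u1E45\u1038", "l\u0310"):
    print(describe(sample), foreign_marks(sample))
```

A mark flagged this way is a likely candidate for dotted-circle insertion,
but whether that happens depends on the renderer and the font.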

Vinodh

On Wed, Feb 17, 2021 at 6:45 PM Richard Wordingham via Unicode <
unicode at unicode.org> wrote:

> On Wed, 17 Feb 2021 05:40:54 +0000
> James Kass via Unicode <unicode at unicode.org> wrote:
>
> > Unable to repro this here.  The string "kး" does not display with the
> > dotted circle.  Tried this on Windows 7 with both BabelPad and
> > LibreOffice.  (And now in the compose panel of Mozilla Thunderbird.)
>
> That is curious.  Which font were you using?
>
> In Word on Windows 10, using the font Myanmar Text for the whole
> string, in LibreOffice and Firefox on Ubuntu 16.04 (so at least one of
> them falls back to HarfBuzz Version 1.2.7), and with the Padauk font
> using HarfBuzz Version 2.7.2, I get a dotted circle even for an ASCII
> letter plus U+1038 MYANMAR SIGN VISARGA.
>
> Of course, there's no problem with HarfBuzz if one uses the Zawgyi-One
> font, which is one of the few to support the sequence <U+1E45,
> U+1038>.
>
> > Maybe file a bug with the renderer developer?
>
> They could argue that it's not the sort of sequence that they will
> support.  (Am I right in thinking that a Unicode-compliant renderer may
> deliberately misrender unsupported sequences?) Unfortunately, the
> Unicode technical annexes support the principle of separating a base
> character from its marks when the extended script property doesn't
> support their combination. (I've already complained to Mark Davis about
> this.)  After all, if you want a candrabindu on the Latin letter 'l',
> or 'v', or 'y', you use U+0310 COMBINING CANDRABINDU.
>
> Richard.
>
>

-- 
http://www.virtualvinodh.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 103982 bytes
Desc: not available
URL: <https://corp.unicode.org/pipermail/unicode/attachments/20210217/a3811644/attachment-0001.png>

