Long standing problem with Vedic tone markers and post-base visarga/anusvara
James Kass via Unicode
unicode at unicode.org
Tue Dec 31 17:01:15 CST 2019
On 2019-12-21 6:27 AM, Shriramana Sharma via Unicode wrote:
> However, even the simplest Vedic sequence (not involving Sama Vedic or
> multiple tone marker combinations) like दे॒वेभ्य॑ः throws up a dotted
> circle, and one is expected (see developer feedback in that bug
> report) to input the visarga before tone markers, hoping the software
> is intelligent enough to skip over the visarga (or anusvara) and place the
> tone marker over the preceding syllable correctly. Why it is necessary
> to put the visarga first in input only to have to skip over it in
> shaping is beyond me.
य॔ः -- visarga last
यॆ॔ः -- "
यः॔ -- visarga before accent (U+0954)
यॆः॔ -- "
य॑ः -- visarga last
यॆ॑ः -- "
यः॑ ---- visarga before svarita (U+0951)
यॆः॑ ---- "
U+0951 and U+0954 have a canonical combining class of 230. Putting
VISARGA (CCC=0) after those CCC=230 marks generates the dotted circle
for VISARGA. Putting VISARGA before those CCC=230 marks generates the
dotted circle for U+0954 but drops the dotted circle for U+0951. In
both cases where VISARGA comes before, the mark positioning is broken.
(Mangal font, Win 7)
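The combining classes cited above can be checked against the Unicode Character Database. What follows is a minimal sketch in Python using only the standard unicodedata module; the helper name ccc_profile and the constant names are made up for illustration. It reports character properties only and says nothing about how any particular shaping engine or font behaves.

```python
import unicodedata

# Code points discussed in this message (constant names are illustrative).
YA = "\u092F"        # DEVANAGARI LETTER YA
ANUSVARA = "\u0902"  # DEVANAGARI SIGN ANUSVARA
VISARGA = "\u0903"   # DEVANAGARI SIGN VISARGA
SVARITA = "\u0951"   # the svarita mark, U+0951
ACUTE = "\u0954"     # the accent mark, U+0954


def ccc_profile(text):
    """Return (code point label, canonical combining class) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]


# The two storage orders compared above, for the svarita case.
visarga_last = YA + SVARITA + VISARGA
visarga_first = YA + VISARGA + SVARITA

for label, seq in (("visarga last", visarga_last),
                   ("visarga first", visarga_first)):
    print(label, ccc_profile(seq))

# Canonical reordering only moves marks with a non-zero combining class,
# so normalization should leave both orders exactly as entered.
for form in ("NFC", "NFD"):
    print(form,
          unicodedata.normalize(form, visarga_last) == visarga_last,
          unicodedata.normalize(form, visarga_first) == visarga_first)
```

Because VISARGA and ANUSVARA have a combining class of 0, normalization never swaps them with the tone marks. The two orders therefore stay distinct in stored text, and it is left entirely to the shaping engine to decide which one it accepts.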
As far as I can tell, the simplest solution would be for the Indic
shaping engines to suppress the dotted circle for VISARGA (or ANUSVARA)
where appropriate. Entering/storing VISARGA or ANUSVARA at the end of
the syllable makes sense since that's where it goes, visually and logically.