Tengwar on a general purpose translation site
Mark E. Shoulson
mark at kli.org
Tue Mar 15 16:51:04 CDT 2022
On 3/15/22 17:07, Richard Wordingham via Unicode wrote:
> On Sun, 13 Mar 2022 17:41:20 -0600
> Doug Ewell via Unicode <unicode at corp.unicode.org> wrote:
>> Richard Wordingham wrote:
>>> Under the 2001 scheme, which proposes an encoding in the SMP, not
>>> in a PUA, the tehtar would merit being letters, just like the
>>> non-spacing letter U+0D4E MALAYALAM LETTER DOT REPH.
>> The section “Rendering” in the 2001 document seems to me to make the
>> same statements about modes and tehtar as the CSUR proposal.
> Under the former, cons1-tehta-cons2 has tehta displayed on cons1. In
> the 2001 proposal, a Sindarin font would display the tehta on cons2.
If you ask me, it's pretty clear that tehtar are/should be combining
characters, like accents or Hebrew vowels. And yes, then Sindarin gets
encoded with a non-obvious ordering. But in the context of all the
input-method pain people get put through for other scripts, is that
really so terrible? Even Hebrew encodes the furtive PATAH after
the letter even though it's pronounced before it. (That's only one
vowel, and not a very common one at that, but still.)
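
To make the furtive PATAH comparison concrete, here is a small sketch using Python's standard unicodedata module. The example word (ruach, "spirit/wind") and the helper names describe and base_of_each_mark are my own choices for illustration and are not taken from any proposal. It only shows that the PATAH is a nonspacing mark stored after the HET it sits under, even though it is read before it.

```python
import unicodedata

# "ruach": resh, vav + dagesh (shuruk), het + patah.
# The patah under the final het is furtive: it is pronounced before
# the het, but it is stored after it like any other vowel point.
RUACH = "\u05E8\u05D5\u05BC\u05D7\u05B7"


def describe(text):
    """Return (code point, name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


def base_of_each_mark(text):
    """Pair each nonspacing mark (Mn) with the nearest preceding non-mark."""
    pairs = []
    base = None
    for ch in text:
        if unicodedata.category(ch) == "Mn":
            base_name = unicodedata.name(base) if base is not None else None
            pairs.append((unicodedata.name(ch), base_name))
        else:
            base = ch
    return pairs


for row in describe(RUACH):
    print(row)
print(base_of_each_mark(RUACH))
```

The last pair printed should be the PATAH attached to the HET, which is the stored order regardless of how the word is pronounced. A Sindarin-style tehta ordering would be the same kind of mismatch between storage order and reading order.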
But you didn't ask me (which was probably a smart move), and it's far
too soon to be actually concerned about this anyway. Next time the
proposal is updated for serious consideration we can drag this all out.