Tengwar on a general purpose translation site

Mark E. Shoulson mark at kli.org
Tue Mar 15 16:51:04 CDT 2022

On 3/15/22 17:07, Richard Wordingham via Unicode wrote:
> On Sun, 13 Mar 2022 17:41:20 -0600
> Doug Ewell via Unicode <unicode at corp.unicode.org> wrote:
>> Richard Wordingham wrote:
>>> Under the 2001 scheme, which proposes an encoding in the SMP, not
>>> in a PUA, the tehtar would merit being letters, just like the
>>> non-spacing letter U+0D4E MALAYALAM LETTER DOT REPH.
>> The section “Rendering” in the 2001 document seems to me to make the
>> same statements about modes and tehtar as the CSUR proposal.
> Under the former, cons1-tehta-cons2 has tehta displayed on cons1.  In
> the 2001 proposal, a Sindarin font would display the tehta on cons2.

If you ask me, it's pretty clear that tehtar are/should be combining 
characters, like accents or Hebrew vowels.  And yes, then Sindarin gets 
encoded with a non-obvious ordering.  But given all the input-method 
pain people are put through for other scripts, is that really so 
terrible?  Even Hebrew encodes the furtive PATAH after 
the letter even though it's pronounced before it.  (That's only one 
vowel, and not a very common one at that, but still.)
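
To make the Hebrew case concrete, here is a small sketch using Python's 
standard unicodedata module.  The sample word (ruach, "spirit") and the 
helper name describe() are just my choices for illustration; nothing here 
comes from either Tengwar proposal.  It lists the characters in stored 
order and shows the PATAH as a nonspacing mark following HET, even though 
it is pronounced before it.

```python
import unicodedata

# "ruach": RESH, VAV, DAGESH, HET, PATAH (the patah here is furtive).
RUACH = "\u05E8\u05D5\u05BC\u05D7\u05B7"


def describe(text):
    """Return (code point, name, category, combining class) per character,
    in the order the characters are stored."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),   # "Lo" = letter, "Mn" = nonspacing mark
            unicodedata.combining(ch),  # nonzero for combining marks
        )
        for ch in text
    ]


if __name__ == "__main__":
    for row in describe(RUACH):
        print(*row)
```

A tehta encoded as a combining character would presumably behave the same 
way: stored after its base tengwa, whatever the reading order in a given 
mode.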

But you didn't ask me (which was probably a smart move), and it's far 
too soon to be actually concerned about this anyway.  Next time the 
proposal is updated for serious consideration we can drag this all out.

