Tengwar on a general purpose translation site

Richard Wordingham richard.wordingham at ntlworld.com
Tue Mar 15 16:07:32 CDT 2022


On Sun, 13 Mar 2022 17:41:20 -0600
Doug Ewell via Unicode <unicode at corp.unicode.org> wrote:

> Richard Wordingham wrote:
> > It's a possible case where untrammelled permission to use new
> > letters may not have been given.  

> By whom? Nobody owns Cyrillic; nobody has claimed IP rights to it.
> It’s used to write several dozen languages. Russia has no more claim
> to it than the US or UK has to the Latin script.

By whoever or whatever added the *new* letters.

> > The description at
> > https://www.evertype.com/standards/csur/tengwar.html implies that
> > tehta codepoints are applied to the previous consonant, which
> > implies a visual order encoding, as opposed to the 2001 phonetic
> > order encoding.  While a phonetic order encoding seems appealing
> > for a language with two modes mostly differing as CV v. VC
> > ligaturing, the scheme does seem to need language tagging for
> > tolerable rendering.  
> 
> That seems clear enough.
> 
> > Under the 2001 scheme, which proposes an encoding in the SMP, not
> > in a PUA, the tehtar would merit being letters, just like the
> > non-spacing letter U+0D4E MALAYALAM LETTER DOT REPH.  
> 
> The section “Rendering” in the 2001 document seems to me to make the
> same statements about modes and tehtar as the CSUR proposal.

Under the CSUR proposal, cons1-tehta-cons2 has the tehta displayed on
cons1.  In the 2001 proposal, a Sindarin font would display the tehta
on cons2.
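
To make the difference concrete, here is a rough sketch in Python of the two
attachment rules as I read them.  It is an illustration only, not taken from
either proposal: the token names "cons1", "tehta" and "cons2" and the function
attach_tehtar are placeholders I have made up, and no real codepoints or font
behaviour are modelled.

```python
# Hypothetical placeholder tokens; these are not real Tengwar code points.
TEHTAR = {"tehta"}


def attach_tehtar(sequence, attach_to):
    """Group a flat token sequence into (base, tehtar) clusters.

    attach_to="previous": a tehta is displayed on the preceding consonant
                          (my reading of the CSUR description).
    attach_to="next":     a tehta is displayed on the following consonant
                          (what a Sindarin font would do under the 2001 scheme).
    A base of None stands for a tehta with no consonant to sit on.
    """
    if attach_to not in ("previous", "next"):
        raise ValueError("attach_to must be 'previous' or 'next'")

    clusters = []  # list of [base, [tehtar]]
    pending = []   # tehtar waiting for a following consonant ("next" only)

    for item in sequence:
        if item in TEHTAR:
            if attach_to == "next":
                pending.append(item)
            elif clusters:
                clusters[-1][1].append(item)
            else:
                clusters.append([None, [item]])
        else:
            clusters.append([item, pending])
            pending = []

    if pending:
        # Trailing tehtar with no following consonant.
        clusters.append([None, pending])

    return [(base, tuple(marks)) for base, marks in clusters]


text = ["cons1", "tehta", "cons2"]
csur_style = attach_tehtar(text, "previous")
sindarin_2001_style = attach_tehtar(text, "next")
```

The same stored sequence yields the tehta on cons1 in the first case and on
cons2 in the second, which is why the two schemes are not interchangeable.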

> >> The Tengwar proposal, like many CSUR proposals (but unlike most
> >> “real” Unicode proposals in recent years), lacks a list of Unicode
> >> properties in UnicodeData.txt format. But in general, the
> >> distinction between an “encoding” and a “provisional encoding”
> >> seems overly pedantic for CSUR, which was always a fun, part-time
> >> project, and on which most work ended almost 20 years ago.  
> >
> > Nothing to do with interoperability, then?  

I was referring to the next best thing to proper encoding.  If I
encounter Ewellic text, and have a font that supports Ewellic, it
should support the text.  It's rather disappointing to see that the
CSUR doesn't have a single mapping from codepoints to Tengwar
characters.

Richard.


