Latin Letters Capital and Small Theta

Sat Jun 11 17:20:12 CDT 2016

The idea keeps recurring that the Greek theta used to write the 
Rromani language in International Standard orthography, as well as 
a number of other languages, will be or ought to be encoded as a 
separate casing pair in Unicode.

LATIN CAPITAL LETTER THETA and LATIN SMALL LETTER THETA
were part of Michael Eversonʼs 2012 proposal at
http://www.unicode.org/L2/L2012/12138-n4262-unifon.pdf
with U+A7B0 and U+A7B1 as the intended code points. Some characters
from that proposal were retained; others, among them the Latin Theta
pair, were rejected, yet the Non-Approval Notices make no mention of
this rejection.

Two years later this proposal was supported by
Denis Moyogo Jacqueryeʼs additional proposal at
http://www.unicode.org/L2/L2014/14202-latin-theta-delta.pdf
with a new rationale: the pair is required by the writing systems of
several natural languages.

On the sole criterion of glyphic resemblance, two matching characters 
already exist in Unicode:
03F4 GREEK CAPITAL THETA SYMBOL
03B8 GREEK SMALL LETTER THETA

Does the UTC consider it feasible to address the issue by implementing 
a tailored casing pair for the locales concerned and adding an 
annotation somewhere for the information of font designers, or can 
people expect to see one day a successful proposal for LATIN CAPITAL 
LETTER THETA and LATIN SMALL LETTER THETA? To date, neither character 
is found in the Pipeline. (Experience has shown, though, that the 
rejection of a character in one proposal is without prejudice to its 
acceptance as part of a later proposal. That happened with LATIN 
CAPITAL LETTER SMALL CAPITAL I, already found in Mr Eversonʼs 2012 
proposal and added to Unicode in 2016.)

The Greek Theta as an IPA character was incidentally discussed already in 
the following thread:
Unicode Mail List Archive: gamma as a phonetic symbol. 
(Sat Sep 27 2008 - 11:43:57 CDT). Retrieved June 10, 2016, from 
http://www.unicode.org/mail-arch/unicode-ml/y2008-m09/0072.html

According to Mr Everson in this thread, «Theta is perhaps the 
hardest to argue for» disunification:
http://www.unicode.org/mail-arch/unicode-ml/y2008-m09/0076.html

Why that should be so is not obvious to me, however, because the 
capital does not match the glyphic expectations for the Romani 
International Standard Latin script subset as referred to in
https://en.wikipedia.org/wiki/Romani_alphabets#International_Standard
and in more detail in
https://fr.wikipedia.org/wiki/Th%C3%AAta_latin
(so far available in French only, but one may still wish to look at 
the picture).

Consequently, as far as I know, the Greek Capital Theta Symbol is 
currently preferred as the uppercase form, not the Greek Capital Theta. 
Using the Symbol variant causes some oddities in data processing, 
because its case mapping does not round-trip: U+03F4 lowercases to 
U+03B8, but U+03B8 uppercases to U+0398. This adds to the overall 
problem of cross-script usage. Using several scripts to write one 
language contradicts one of the design principles of Unicode.
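The missing round trip can be checked with the default case mappings 
as exposed by Python. This is only a small sketch; the helper name 
round_trips is mine, and the result depends on the Unicode data 
shipped with the Python version in use.

```python
import unicodedata

CAPITAL_THETA_SYMBOL = "\u03F4"  # GREEK CAPITAL THETA SYMBOL
CAPITAL_THETA = "\u0398"         # GREEK CAPITAL LETTER THETA
SMALL_THETA = "\u03B8"           # GREEK SMALL LETTER THETA


def round_trips(upper: str) -> bool:
    """True if lowercasing and then uppercasing gives back the input."""
    return upper.lower().upper() == upper


lowered = CAPITAL_THETA_SYMBOL.lower()
restored = lowered.upper()

# Show each step of the chain: symbol -> lowercase -> uppercase again.
for ch in (CAPITAL_THETA_SYMBOL, lowered, restored):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print("U+03F4 round-trips:", round_trips(CAPITAL_THETA_SYMBOL))
print("U+0398 round-trips:", round_trips(CAPITAL_THETA))
```

Text capitalised with U+03F4 therefore comes back with U+0398 after a 
lowercase and uppercase pass, unless the application tailors the 
mapping itself.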

I also note that, in its International Standard Alphabet form, Romany 
is not supported by the blocks up to Latin Extended-A, contrary to what 
TUS 8.0 states on page 296. It is also worth underscoring that Unicode 
added the H with háček (U+021E, U+021F) for Finnish Romany in the 
Latin Extended-B block.

However, U+03F4 ( ϴ ) GREEK CAPITAL THETA SYMBOL was listed among 
the potentially obsolete characters in the following e-mail from the 
archives of this list:
http://www.unicode.org/mail-arch/unicode-ml/y2009-m01/0558.html

Solving this issue now is important because the French Standard 
Keyboard Layout will support the Rromani Standard Latin script (along 
with all European languages written in the Latin script). Since this 
topic is about plain character encoding, Iʼve finally decided to 
submit it for your kind advice.

Marcel