Latin Letters Capital and Small Theta

Philippe Verdy verdy_p at
Mon Jun 13 12:52:14 CDT 2016

This is general: characters may initially be encoded with a single case
when the only demonstrated use is IPA (which is single-cased). To
get dual-cased letters, we need to find examples of use in the orthography
of a language where all other letters are dual-cased. Well, this was true for
the German sharp S, but for a long time it was not demonstrated that the
lowercase and uppercase forms were different.
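
As a side illustration of the sharp S case (a minimal sketch, assuming
Python 3 and its bundled Unicode data; the variable names are mine), the
capital form now exists as its own character, but the default uppercase
mapping of the small letter still expands to "SS":

```python
import unicodedata

small_sharp_s = "\u00DF"    # LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1E9E"  # LATIN CAPITAL LETTER SHARP S

# Default full case mapping: the small letter still uppercases to "SS",
# not to the separately encoded capital letter.
default_upper = small_sharp_s.upper()

# The capital letter, however, lowercases to the small sharp S.
lower_of_capital = capital_sharp_s.lower()

print(unicodedata.name(capital_sharp_s))
print(repr(default_upper), repr(lower_of_capital))
```

So the pair is only "half" linked in the default mappings, which reflects
its history as a late addition to an already encoded single-case letter.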

With Rromany (which has multiple orthographies in multiple scripts), the
problem is that there's no formal standard, and the Rromany communities
in various countries have adapted their orthography to usages found in other
national languages. There's no real academy and in fact the language is
very fragmented, and its tradition is in fact more oral than written. There
are authors of written texts, but each one has adopted a convention more or
less based on the standard orthography of another language where they live.
So there are variants of the orthography in multiple scripts, at least Latin,
Cyrillic, Greek, Devanagari (probably also Arabic in North-Eastern India,
Pakistan, Iran; maybe also Georgian: the Rromany people are spread over a
very large area from Southern Asia, Central Asia, Western Asia, to Europe
and North Africa). The orthographies are more or less adaptations of the
phonetics of the oral tradition.

For those authors who want to better represent the language's phonetics, it's
natural that they'll want to borrow the IPA theta symbol when choosing the
Latin script (and in the Greek-based orthography they'll correctly
differentiate the Greek Tau and Theta letters for the same purpose). I
wonder which letters they choose to differentiate Tau and Theta in Cyrillic
(there's a sizeable Rromany community in Bulgaria, Macedonia, Serbia...).
But in the Latin script, authors have also used digraphs (T vs. TH) for a
long time (just like other European languages, including English or French,
even if French does not differentiate the phonetics and the H in TH is in
fact completely silent!).

There are actually no stable transliterators because there are competing
orthographies depending on authors, no formal agreements between
authors, and no academic institution which is widely recognized (there are
several local communities that may have authored some writing guides, but I
don't think these are strong enough to be authoritative: the tradition is
still strongly oral and what is important is not the way the language is
written but how it is pronounced and sung: music and songs are an essential
part of the Rromany culture, and what unites them across countries, even if
there are some religious splits).

It's normal for Unicode to accept the existence of Latin orthographies that
use the Theta letter as a normal dual-cased letter if we can
demonstrate that authors need it and that publications were made and are
relatively easy to find. Those publications are part of our world cultures
and need to be preserved and correctly represented, even if we don't have
any formal academy. It is even more important than encoding many new emojis
for fun (which are recent inventions and don't have the same level of
historic background).

Being able to write all languages, even if their historic tradition is oral,
is an important and respectable goal, notably when these are living
languages with a large speaking community. It's not something new: various
native African languages have also adopted IPA symbols in their Latin
orthography, and wanted to have dual case. So now we also have dual-cased
Latin letters Alpha, Epsilon, Open O... It does not matter that IPA only
needs lowercase: it has become a strong common base used for
orthographies of languages with oral traditions, and it is natural for them
to expand the IPA set with capital letters for the Latin script (and another
proof that IPA is not a separate script but a subset of the Latin script).
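
To make the dual-casing concrete, here is a small sketch (assuming Python 3
and its bundled Unicode data; the helper name describe_case_pair is only
illustrative) that looks up the capital partners of the Latin letters
mentioned above. The "Epsilon" is encoded under the name OPEN E.

```python
import unicodedata

# Small Latin letters of IPA origin that now have encoded capital partners.
small_letters = {
    "alpha": "\u0251",    # LATIN SMALL LETTER ALPHA
    "epsilon": "\u025B",  # LATIN SMALL LETTER OPEN E
    "open o": "\u0254",   # LATIN SMALL LETTER OPEN O
}

def describe_case_pair(small):
    """Return (small name, capital name, round-trip flag) for one letter."""
    capital = small.upper()
    return (
        unicodedata.name(small),
        unicodedata.name(capital),
        capital.lower() == small,  # True when the pair maps both ways
    )

for label, small in small_letters.items():
    print(label, describe_case_pair(small))

# For comparison, the Greek theta already has a regular case pair.
greek_small_theta = "\u03B8"
greek_capital_theta = greek_small_theta.upper()
print(unicodedata.name(greek_capital_theta))
```

A Latin theta pair, if it were encoded, would be expected to behave like the
three Latin pairs above rather than reuse the Greek code points.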

2016-06-13 14:41 GMT+02:00 Frédéric Grosshans <frederic.grosshans at>

> On 12/06/2016 02:20, Doug Ewell wrote:
>> Marcel Schneider wrote:
>>> While some characters were retained, others were rejected, among which
>>> the Latin Theta pair, but no mention is found of this rejection in the
>>> Non-Approval Notices.
>> Lots of characters in proposals are rejected without rising to the level
>> of explicit disapproval: "Look, we said NO, and don't ask us again." The
>> Non-Approval Notices page starts with an extensive description of the
>> difference.
>> At the same time, note that a few proposals, such as LATIN CAPITAL LETTER
>> SHARP S, have risen phoenix-like from the ranks of non-approvaldom to
>> become genuine encoded characters.
> And, if I remember correctly, no proposal for the Latin letter theta
> has yet given an example of the current usage of theta in Latin orthography,
> like in Rromani. I guess a proposal based on the Rromani orthography (and
> with input from the user community, of course!) would easily be accepted.
>    Cheers,
>         Frédéric