Origin of the digital encoding of accented characters for Esperanto

Ken Whistler kenwhistler at att.net
Mon Mar 23 11:58:45 CDT 2015



On 3/23/2015 8:35 AM, William_J_G Overington wrote:
> Origin of the digital encoding of accented characters for Esperanto
>
> Twelve accented characters (uppercase versions and lowercase versions 
> of six accented letters) used for Esperanto are encoded in Unicode.

WJO is referring to U+0109, U+011D, U+0125, U+0135, U+015D, U+016D (and
their uppercase pairs).
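
For anyone who wants to check that list mechanically, here is a minimal
sketch in Python. It assumes only the standard unicodedata module and
takes the Latin Extended-A range to be U+0100..U+017F, as shown in the
Unicode code charts. The names LOWER, LATIN_EXTENDED_A and
esperanto_pairs are mine, purely for illustration.

```python
import unicodedata

# Lowercase code points listed in the message above.
LOWER = [0x0109, 0x011D, 0x0125, 0x0135, 0x015D, 0x016D]

# Latin Extended-A block, assumed here to span U+0100..U+017F.
LATIN_EXTENDED_A = range(0x0100, 0x0180)


def esperanto_pairs():
    """Return (uppercase, lowercase) pairs for the six accented letters."""
    return [(chr(cp).upper(), chr(cp)) for cp in LOWER]


for upper, lower in esperanto_pairs():
    # Both members of each case pair should fall inside the block.
    assert ord(upper) in LATIN_EXTENDED_A
    assert ord(lower) in LATIN_EXTENDED_A
    print(f"U+{ord(upper):04X} {upper}  U+{ord(lower):04X} {lower}  "
          f"{unicodedata.name(lower)}")
```

The uppercase forms come from Python's own case mapping rather than
from a hand-typed list, so the sketch also confirms that each uppercase
pair sits one code point below its lowercase letter.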

>
> These may well be in Unicode as legacy encoded characters from one or 
> more earlier standards.

No.

>
> Does anyone know please how Esperanto characters first became encoded 
> digitally?

In the Unicode Standard, the fact that these all occur in the Latin
Extended-A block is a clue. The Latin Extended-A block dates back to
Unicode 1.0. You can easily verify that by referring to the archival
record. See:

http://www.unicode.org/versions/Unicode1.0.0/

And in fact, the exact set in the Latin Extended-A block can be traced
even further back than the publication of Unicode 1.0 in 1991. That same
repertoire was included in the charts distributed for public review in
the Unicode 1.0 final review draft in December 1990. So we know that the
inclusion of the 12 accented characters for Esperanto in that set dates
back at least that far -- which should eliminate a lot of fruitless
speculative theories about their origins in Unicode.

>
> For example, was it that someone who was interested in Esperanto 
> happened to be a member of a committee that was working on encoding 
> accented characters?

Well, sort of. See further explanation below.

>
> Or did one or more people, or a group of people, or an Esperanto 
> society, lobby for the characters to become included?

No.

>
> Or what?

Well, the answer is sort of "or what". The repertoire of accented
characters included in the Latin Extended-A block for the final review
draft of Unicode 1.0 in December 1990 was largely culled from the even
earlier list of Latin letters proposed for encoding in the 2nd DP
(Draft Proposal) for ISO/IEC 10646-1. Their inclusion in the Unicode
Standard 1.0 repertoire was one of the early compatibility decisions,
to ensure that repertoire that national bodies had thought important
enough to be included in the early 10646 balloting was accounted for
in some way in the first Unicode Standard draft.

The list of accented Latin letters in the Latin Extended-A block
consisted of the union of all of the then-extant ISO 8859 8-bit standard
repertoires for various Latin alphabets, *plus* the additional letters
culled from the 2nd DP 10646-1.

For the record, the 2nd DP 10646 was JTC1/SC2 N2066 (=WG2 N551), dated
December 1, 1989. In that era, documents were distributed only on paper,
and I don't know of an extant online copy, so it is rather difficult to
track down!

<speculation>
In any event, I consider it likely that the person who originally
assembled the lists of various European language alphabets in that
1989 document and included them in the drafts for balloting was Hugh
McGregor Ross, the then British editor of 10646 and a person with a
passion for details about lesser-used writing systems. Mr. Ross is,
unfortunately, recently deceased, so we cannot ask him directly. But I
suspect that examination of the early drafts of 10646 and papers
related to it would confirm this speculation on my part.
</speculation>

--Ken

>
> It does not seem axiomatic that accented characters for Esperanto 
> would necessarily be included in a digital encoding of the accented 
> characters needed for the languages of Europe.
>
> William Overington


