Janusz S. Bień jsbien at
Tue Sep 27 23:59:24 CDT 2016

As I already wrote:

On Mon, Sep 19 2016 at  8:40 CEST, jsbien at writes:


> Searching the Unicode site I found only one use of 'grapheme' alone:

Is anybody aware of any other occurrences?

On Tue, Sep 27 2016 at 16:28 CEST, christoph.paeper at writes:
> Janusz S. Bień <jsbien at>:
>     On Sun, Sep 18 2016 at 12:26 CEST, jsbien at writes:
>     Quote - Christoph Päper <christoph.paeper at> (Fri, 16 Sep 2016,
>         23:51:38):
>                 Janusz S. Bień <jsbien at>:
>                 1. Graphemes, if I understand correctly, are language
>                 dependent, …
>             That’s true in linguistic terminology – … –, but not in
>             technical (i.e. Unicode) jargon.
>     And what is "grapheme" in "technical (i.e. Unicode) jargon"?
> It depends on the script (hence Unicode block), but not the writing
> system or language. The line is not always drawn consistently.

Please prove this claim by explicit quotations from the standard.

In my opinion there is no such thing as "grapheme" in "technical
(i.e. Unicode) jargon".

>     From the Unicode glossary:
>         Grapheme. […] (2) What a user thinks of as a character.
>         User-Perceived Character. What everyone thinks of as a
>         character in their script.
>     Does 'Grapheme' (2) make sense with "a (single?) user"? 
> No linguistic term makes sense with only a *single* user
> (“Privatsprache”).

That's obvious.

> It’s a very vague definition, but not quite
> incorrect for “a typical user”.

Exactly - "a typical user" is quite different from "a user". Do we agree
that the wording of "grapheme" (2) should be corrected?

>     BTW, it is rather well known that the term "phoneme" was first
>     proposed by the Polish linguist Jan Niecisław Ignacy Baudouin de
>     Courtenay (…). It is much less known that he also proposed the term
>     "grapheme".
> Yes, he introduced both terms, but the definitions have changed quite
> a bit through history and among schools. Entire books have been
> published about that, e.g. (in German) Manfred Kohrt (1985):
> “Problemgeschichte des Graphembegriffs und des frühen Phonembegriffs”
> (ISBN 3-484-31061-8) – I wish I knew a more recent one.

The question is whether all these linguistic discussions are relevant to

>     Alexander Berg's "English Historical Linguistics vol. I" page 230
>     […]:
>     […] the available definitions [of “grapheme”]
>     can be divided into two groups, corresponding to two main senses,
>     and reflecting "conflicting linguistics views of the status of
>     writing" (Henderson 1985:142):
>     1. a letter or cluster of letters referring to or corresponding
>     with a single phoneme;
>     2. the minimal distinctive unit of a writing system.
>     For me the first meaning (…) is the primary, i.e. more useful,
>     meaning, as it has some practical applications e.g. for describing
>     Polish hyphenation rules.
> Type 1 has also been called “phono-graphemes” (with or without the
> hyphen). 

That seems a good term; I was not aware of it. Do you happen to remember
who introduced it?

> The conflicting views quoted from the 30-year-old work by Henderson
> still exist.

There is no doubt about it.

> Many scholars – yourself included, it seems – infer a
> structural primacy of spoken language over written language from its
> historic primacy.

I do not, but that is completely irrelevant to the problem of Unicode's
use of the term "grapheme".

Best regards


Prof. dr hab. Janusz S. Bien -  Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
