A last missing link for interoperable representation

Ken Whistler via Unicode unicode at unicode.org
Tue Jan 8 15:43:08 CST 2019


James,

On 1/8/2019 1:11 PM, James Kass via Unicode wrote:
> But we're still using typewriter kludges to represent stress in Latin 
> script because there is no Unicode plain text solution.

O.k., that one needs a response.

We are still using kludges to represent stress in the Latin script 
because *orthographies* for most languages customarily written with the 
Latin script don't have clear conventions for indicating stress as a 
part of the orthography.

When an orthography has a well-developed convention for indicating 
stress, then we can look at how that convention is represented in the 
plain text representation of that orthography. An obvious case is 
notational systems for the representation of pronunciation of English 
words in dictionaries. Those conventions *do* then have plain text 
representations in Unicode, because, well, they just have various 
additional characters and/or combining marks to clearly indicate lexical 
stress. But standard written English orthography does *not*. (BTW, that 
is in part because marking stress in written English would usually 
*decrease* legibility and the usefulness of the writing, rather than 
improving it.)
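
As an illustration of the point about dictionary-style notations, here is a minimal Python sketch. The sample strings and the helper names `describe` and `stress_marks` are assumptions made up for this example, not taken from any particular dictionary. It only shows that stress marking of this kind is ordinary plain text built from characters already encoded, such as U+02C8 and U+0301:

```python
import unicodedata

# Hypothetical sample strings, chosen only for illustration.
ipa_style = "\u02C8r\u025Bk\u0259rd"   # IPA-like: primary stress mark before the syllable
respelling = "re\u0301cord"            # respelling: combining acute over the stressed vowel

# Characters commonly used to mark stress in pronunciation notations.
STRESS_CHARS = {"\u02C8", "\u02CC", "\u0301", "\u0300"}

def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

def stress_marks(text):
    """Return the stress-marking characters found in text, in order."""
    return [ch for ch in text if ch in STRESS_CHARS]

if __name__ == "__main__":
    for sample in (ipa_style, respelling):
        for code_point, name in describe(sample):
            print(code_point, name)
        print()
```

Nothing here requires markup or a new character: the stress indication travels with the text itself, which is all that "plain text representation" means in this context.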

Furthermore, there is nothing inherent about *stress* per se in the 
Latin script (or any other script, for that matter). Lexical stress is a 
phonological system, not shared or structured the same way in all 
languages. And there are *thousands* of languages written with the Latin 
script -- with all kinds of phonological systems associated with them. 
Some have lexical tones, some do not. Some have other kinds of 
phonological accentuation systems that don't count as lexical stress, 
per se.

And there are differences between lexical stress (and its indication), 
and other kinds of "stress". Contrastive stress, which is way more 
interesting to consider as a part of writing, IMO, than lexical stress, 
is a *prosodic* phenomenon, not a lexical one. (And I have been using 
the email convention of asterisks here to indicate contrastive stress in 
multiple instances.) And contrastive stress is far from the only kind of 
communicatively significant pitch phenomenon in speech that typically 
isn't formally represented in standard orthographies. There are numerous 
complex scoring systems for linguistic prosody that have been developed 
by linguists interested in those phenomena -- which include issues of 
pace and rhythm, and not merely pitch contours and loudness.

It isn't the job of the Unicode Consortium or the Unicode Standard to 
sort that stuff out or to standardize characters to represent it. When 
somebody brings to the UTC written examples of established orthographies 
using character conventions that cannot be clearly conveyed in plain 
text with the Unicode characters we already have, *then* perhaps we will 
have something to talk about.

--Ken



