A last missing link for interoperable representation

James Kass via Unicode unicode at unicode.org
Wed Jan 9 00:58:51 CST 2019


Ken Whistler wrote,

 > It isn't the job of the Unicode Consortium or the Unicode Standard
 > to sort that stuff out or to standardize characters to represent it.

Agreed, it isn’t.

 > When somebody brings to the UTC written examples of established
 > orthographies using character conventions that cannot be clearly
 > conveyed in plain text with the Unicode characters we already have,
 > *then* perhaps we will have something to talk about.

If a text is published in all italics, that’s style/font choice.  If a 
text is published using italics and roman contrastively and 
consistently, and everybody else is doing it pretty much the same way, 
that’s a convention.

Typewriting is mechanical writing.  Computer keyboards, input methods, 
and Unicode are technological advances in mechanical writing.  
Typesetting for publishing is mechanical writing for the purpose of mass 
production and distribution of texts.

From a printed Webster’s,
lexicon (lek´ si kən) [ < Gr. 𝑙𝑒𝑥𝑖𝑠, word. ]  1.  a dictionary  2.  
a special vocabulary

There’s a convention in English writing to express foreign words using 
italics.  Not just in published dictionaries, but also in running text 
where foreign words and phrases are deployed.

Other italics conventions include ship names such as the SS ������’�� 
��������, or titles such as 𝐶𝑎 𝑝𝑙𝑎𝑛𝑒 𝑝𝑜𝑢𝑟 𝑚𝑜𝑖, which is 
properly spelled with a “Ç” in “Ça”.  (Math kludge fail.)  Of course, 
since that song title is in a foreign language, it should be italicized 
anyway.
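
The “math kludge fail” can be sketched in a few lines of Python.  This is 
only an illustration under stated assumptions: the helper name 
to_math_italic is made up here, and it handles only the unaccented ASCII 
letters, which are the only Latin letters with counterparts in the 
Mathematical Italic range.  Small h needs special handling because its 
slot in that range is reserved and U+210E is used instead.

```python
# Hypothetical helper, for illustration only: the "math kludge" borrows
# Mathematical Italic letters to fake italics in plain text.
MATH_ITALIC_CAPITAL_A = 0x1D434  # MATHEMATICAL ITALIC CAPITAL A
MATH_ITALIC_SMALL_A = 0x1D44E    # MATHEMATICAL ITALIC SMALL A
PLANCK_CONSTANT = "\u210E"       # used for italic h; U+1D455 is reserved


def to_math_italic(text: str) -> str:
    """Map ASCII letters to Mathematical Italic; pass everything else through."""
    out = []
    for ch in text:
        if ch == "h":
            out.append(PLANCK_CONSTANT)
        elif "A" <= ch <= "Z":
            out.append(chr(MATH_ITALIC_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(MATH_ITALIC_SMALL_A + ord(ch) - ord("a")))
        else:
            # No math italic counterpart (e.g. "Ç"): the character stays upright.
            out.append(ch)
    return "".join(out)


print(to_math_italic("Ça plane pour moi"))
```

The “Ç” falls through the last branch unchanged, so the only way to get 
every letter of the title into faux italics is to misspell it with a 
plain “C”.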

Quoting from,
http://navalmarinearchive.com/research/ship_names.html
“Names of specific ships and other vessels are both capitalized and 
italicized (or capitalized entirely - "all caps" - in text documents 
denying italics such as email, use of a mechanical typewriter.)”

There were technological constraints denying italics in mechanical 
typewriters.  There’s a technical consortium denying italics in Latin 
computer plain text, for better or worse.  (Trying to state the obvious 
here without being judgmental.)

The use of italics in English writing to mark stress is another existing 
convention.  Italics don’t interfere with legibility in English fiction 
when used to indicate stress in dialogue between the characters.  
Rather, the italics add information enabling the reader to approximate 
how the author intended the dialogue to be *spoken*.  And 𝑡ℎ𝑎𝑡 
information cannot be preserved in Unicode plain text without the math 
kludge or using asterisks and slashes as 𝑑𝑒 𝑓𝑎𝑐𝑡𝑜 mark-up.

“𝑆𝑡𝑟𝑒𝑠𝑠 is important” vs. “Stress 𝑖𝑠 important”.

I look forward to the continuing evolution of plain text and would 
welcome the ability to use italics in plain text without kludges. <i>But 
I’m not holding my breath.</i>

Anybody making a formal proposal for italics encoding can be assured 
that the proposal would be received with something less than 
enthusiasm.  But stranger things have happened.

Many of us here are old enough to remember when something like <PICTURE 
OF A COW> was a non-starter because in-line pictures were out of scope 
for a computer plain text standard.  But now I could plop a picture of a 
cow (or worse) right into this plain text e-mail, if I were so 
inclined.  That’s progress for you.

It’s too bad they called it 𝑇ℎ𝑒 𝐶ℎ𝑖𝑐𝑎𝑔𝑜 𝑀𝑎𝑛𝑢𝑎𝑙 𝑜𝑓 𝑆𝑡𝑦𝑙𝑒 
instead of “The Chicago Manual of Correct American English 
Orthographic Conventions for Text Publishing”, eh?  Maybe “Style” 
sounded more classy.  But it *does* tend to make it simpler for people 
to dismiss such distinctions as being merely stylistic.

But if the distinctions were merely stylistic, we wouldn’t have needed 
to develop typewriter or computer plain text kludges for them in order 
to express ourselves properly.

(Apologies for length and Happy New Year!)


