A last missing link for interoperable representation

David Starner via Unicode unicode at unicode.org
Mon Jan 14 16:58:24 CST 2019

On Mon, Jan 14, 2019 at 2:09 AM Tex via Unicode <unicode at unicode.org> wrote:
> The arguments against italics seem to be:
> ·        Unicode is plain text. Italics is rich text.
> ·        We haven't had it until now, so we don't need it.
> ·        There are many rich text solutions, such as html.
> ·        There are ways to indicate or simulate italics in plain text including using underscore or other characters, using characters that look italic (e.g., math), etc.
> ·        Adding italicization might break existing software
> ·        The examples of existing Unicode characters that seem to represent rich text (emoji, interlinear annotation, et al) have justifications.

There generally shouldn't be multiple ways of doing things. For
example, if you think that searching for certain text in italics is
important, then having both HTML italics and Unicode italics is going
to cause searches to fail or succeed unexpectedly, unless the
underlying software unifies the two systems (an extra complexity).
Searching for certain italicized text could be done today in rich text
applications, were there actual demand for it.
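
A rough sketch of that extra complexity, in Python. There is no
Unicode italics encoding today, so the Mathematical Italic letters
stand in for one here; that substitution, and the helper names
to_math_italic and find_italic, are my own assumptions for
illustration, not anything proposed on the list.

```python
import re

def to_math_italic(s):
    """Map ASCII letters to Mathematical Italic letters (stand-in encoding)."""
    out = []
    for ch in s:
        if ch == "h":
            # Italic small h is not in the math block; it is U+210E.
            out.append("\u210e")
        elif "a" <= ch <= "z":
            out.append(chr(0x1D44E + ord(ch) - ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr(0x1D434 + ord(ch) - ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def find_italic(doc, word):
    """True if word appears italicized in either representation."""
    in_html = re.search("<i>" + re.escape(word) + "</i>", doc) is not None
    in_unicode = to_math_italic(word) in doc
    return in_html or in_unicode

doc_html = "I am reading <i>Emma</i> this week."
doc_unicode = "I am reading " + to_math_italic("Emma") + " this week."

# A search written for one system silently misses the other.
html_only_hit = "<i>Emma</i>" in doc_unicode
```

Every search tool that cares about italics would need something like
find_italic, one branch per representation.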

> ·        Plain text still has tremendous utility and rich text is not always an option.

Where? Twitter has the option of doing rich text, as does any closed
system. In fact, Twitter is rich text, in that it hyperlinks web
addresses. That Twitter has chosen not to support italics is a choice.
If users don't like this, they could go to another system, or use
third-party tools to transmit rich text over Twitter. The use of
underscores or <i> </i> markings for italics would be mostly
compatible with human twitterers using the normal interface.

Source code is an example of plain text, and yet adding italics into
comments would require but a trivial change to editors. If the user
audience cared, it would have been done. In fact, I suspect there
exist editors and environments where an HTML subset is put into
comments and rendered by the editors; certainly active links would be
more useful in source code comments than italics.

Lastly, the places where I still find massive use of plain text are
the places this would hurt the most. GNU Grep's manpage shows no sign
that it supports searching under any form of Unicode normalization.
Same with GNU Less. Adding italics would just make searching plain
text documents more complex for their users. The domain name system
would just add them to the ban list, and they'd be used for spoofing
in filenames and other less controlled but still sensitive
environments.
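
To make the searching problem concrete, here is a small Python sketch.
It again uses the Mathematical Italic letters as a stand-in for
italic characters, which is an assumption on my part; plain_search
and folded_search are hypothetical names, and plain_search only
imitates a tool that compares code points without any normalization.

```python
import unicodedata

def plain_search(needle, haystack):
    """Code-point comparison with no normalization."""
    return needle in haystack

def folded_search(needle, haystack):
    """Apply NFKC to both sides before comparing."""
    def fold(s):
        return unicodedata.normalize("NFKC", s)
    return fold(needle) in fold(haystack)

# "italic" spelled with Mathematical Italic small letters.
text = "an \U0001D456\U0001D461\U0001D44E\U0001D459\U0001D456\U0001D450 word"

missed = plain_search("italic", text)   # the plain tool finds nothing
found = folded_search("italic", text)   # NFKC maps the letters to ASCII
```

The second function is what every plain-text tool would have to grow
before users could search such documents reliably.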

Where there is life, there is hope.
