A last missing link for interoperable representation
Mark E. Shoulson via Unicode
unicode at unicode.org
Mon Jan 14 19:56:37 CST 2019
In some of this discussion, I'm not sure what is being proposed or
forbidden here... I don't know that anyone is advocating removing the
"don't use these for words!" warning sticker on the mathematical
italics. The closest-to-sensible suggestions I've heard are things like
a VS (variation selector) to italicize a letter, a combining italicizer
so to speak (this is actually very similar to the emoji-style vs.
text-style VS sequences).
*If* the VS is ignored by searches, as apparently it should be and some
have reported that it is, then VS-type solutions would NOT be a problem
when it comes to searches (and don't go whining about legacy software.
If Unicode had to be backward-compatible with everything we wouldn't
have gone beyond ASCII). So I'm not sure what you mean when you speak
of "Unicode italics". Do you mean using the mathematical italics as
we've been seeing? Or having a whole new plane of italic characters for
everything that could conceivably be italicized? Those would probably
both be mistakes, I agree.
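
To make the "VS is ignored by searches" point concrete, here is a minimal
sketch in Python of a search that drops variation selectors before
matching. Everything named here is an assumption for illustration: no
italic variation sequence is defined in Unicode today, so VS1 (U+FE00)
stands in as a hypothetical "italicizer", and the helper names are made
up. It only shows why a VS-based scheme need not break searching.

```python
# Sketch only: a search that treats variation selectors as ignorable.
# Covers VS1-VS16 (U+FE00..U+FE0F) and VS17-VS256 (U+E0100..U+E01EF).
# Mongolian free variation selectors are left out for brevity.

def is_variation_selector(ch: str) -> bool:
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def strip_variation_selectors(text: str) -> str:
    """Return text with every variation selector removed."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


def vs_insensitive_find(haystack: str, needle: str) -> bool:
    """Substring test that ignores variation selectors on both sides."""
    return strip_variation_selectors(needle) in strip_variation_selectors(haystack)


# ASSUMPTION: VS1 is a stand-in; Unicode defines no italic variation selector.
HYPOTHETICAL_ITALIC_VS = "\uFE00"


def italicize(word: str) -> str:
    """Append the hypothetical italic selector after each character."""
    return "".join(ch + HYPOTHETICAL_ITALIC_VS for ch in word)


text = "a " + italicize("very") + " good idea"

print("very" in text)                            # naive search misses it
print(vs_insensitive_find(text, "very good"))    # VS-ignoring search finds it
```

A naive byte-for-byte search fails on the "italicized" word, while the
VS-ignoring one succeeds, which is the behavior the argument above
depends on.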
On 1/14/19 5:58 PM, David Starner via Unicode wrote:
> On Mon, Jan 14, 2019 at 2:09 AM Tex via Unicode <unicode at unicode.org> wrote:
>> The arguments against italics seem to be:
>> · Unicode is plain text. Italics is rich text.
>> · We haven't had it until now, so we don't need it.
>> · There are many rich text solutions, such as html.
>> · There are ways to indicate or simulate italics in plain text, including using underscores or other characters, using characters that look italic (e.g. math), etc.
>> · Adding italicization might break existing software.
>> · The examples of existing Unicode characters that seem to represent rich text (emoji, interlinear annotation, et al) have justifications.
> There generally shouldn't be multiple ways of doing things. For
> example, if you think that searching for certain text in italics is
> important, then having both HTML italics and Unicode italics is going
> to cause searches to fail or succeed unexpectedly, unless the
> underlying software unifies the two systems (an extra complexity).
> Searching for certain italicized text could be done today in rich text
> applications, were there actual demand for it.
>> · Plain text still has tremendous utility and rich text is not always an option.
> Where? Twitter has the option of doing rich text, as does any closed
> system. In fact, Twitter is rich text, in that it hyperlinks web
> addresses. That Twitter has chosen not to support italics is a choice.
> If users don't like this, they could go to another system, or use
> third-party tools to transmit rich text over Twitter. The use of
> underscores or <i> </i> markings for italics would be mostly
> compatible with human twitterers using the normal interface.
> Source code is an example of plain text, and yet adding italics into
> comments would require but a trivial change to editors. If the user
> audience cared, it would have been done. In fact, I suspect there
> exist editors and environments where an HTML subset is put into
> comments and rendered by the editors; certainly active links would be
> more useful in source code comments than italics.
> Lastly, the places where I still find massive use of plain text are
> the places this would hurt the most. GNU Grep's manpage shows no sign
> that it supports searching under any form of Unicode normalization.
> Same with GNU Less. Adding italics would just make searching plain
> text documents more complex for their users. The domain name system
> would just add them to the ban list, and they'd be used for spoofing
> in filenames and other less controlled but still sensitive