Encoding italic

David Starner via Unicode unicode at unicode.org
Mon Jan 21 02:29:42 CST 2019


On Sun, Jan 20, 2019 at 11:53 PM James Kass via Unicode
<unicode at unicode.org> wrote:
>  Even though /we/ know how to do
> it and have software installed to help us do it.

You're emailing from Gmail, which has support for italics in email.
The world has, in general, solved this problem.

>  > How do you envision this working?
>
> Splendidly!  (smile)  Social platforms, plain-text editors, and other
> applications do enhance their interfaces based on user demand from time
> to time.  User demand, at least on Twitter, seems established.

Then it would take six months, tops, for Twitter to produce and
release a rich-text interface. Far less time than waiting for
Unicode to get around to it.

> When corporate
> interests aren't interested, third-party developers develop tools.

Where are these tools? As I said, third-party developers could develop
tools to convert _underscore_ or /slash/ style italics to real
italics and back without waiting on Twitter or Unicode.
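
A minimal sketch of such a converter, assuming "real italics" means
HTML <i> elements and that a marker only counts when it hugs a word on
both sides. The function names and the regular expression are
illustrative assumptions, not an existing tool:

```python
import re

# Hypothetical pattern: an opening "_" or "/" that is not preceded by a
# word character, a run of text that starts and ends with a word
# character, then the same marker, not followed by a word character.
_MARKED = re.compile(r"(?<!\w)([_/])(\w(?:[^_/\n]*\w)?)\1(?!\w)")
_ITALIC_TAG = re.compile(r"<i>(.*?)</i>")


def markers_to_html(text: str) -> str:
    """Turn _word_ or /word/ into <i>word</i>."""
    return _MARKED.sub(r"<i>\2</i>", text)


def html_to_markers(text: str, marker: str = "_") -> str:
    """Turn <i>word</i> back into plain-text markers."""
    return _ITALIC_TAG.sub(lambda m: f"{marker}{m.group(1)}{marker}", text)


print(markers_to_html("Even though /we/ know how to do it"))
print(html_to_markers("Even though <i>we</i> know", marker="/"))
```

The word-boundary checks are a design choice meant to leave things like
snake_case_name alone; a real tool would need more care than this.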

> Copy/pasting from a web page into a plain-text editor removes any
> italics content, which is currently expected behavior.  Opinions differ
> as to whether that represents mere format removal or a loss of meaning.
> Those who consider it as a loss of meaning would perceive a problem with
> interoperability.

Copy/pasting from a web page into a plain-text editor removes any
pictures and destructures tables, which definitely loses meaning.

It also removes strike-out markup, which can have an even more
dramatic effect on meaning than removing italics. As you pointed out
below, it removes superscripts and subscripts; unless you wish to
press for automatic conversion of those to Unicode, that's going to
continue happening. It drops bold and font changes, and any number of
other things that can carry meaning.

> Copy/pasting an example from the page into plain-text results in “ma1,
> ma2, ma3, ma4”, although the web page displays the letters as italic and
> the digits as (italic) superscripts.  IMO, that’s simply wrong with
> respect to the superscript digits and suboptimal with respect to the
> italic letters.

The superscripts show a problem with multiple encoding; even if you
think they should be Unicode superscripts, and they look like Unicode
superscripts, they might be HTML superscripts. The same thing would
happen with italics if they were encoded in Unicode.
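
To make the multiple-encoding point concrete, here is a small sketch
using only the Python standard library. The sample strings and the
helper names are assumptions for illustration; the tag stripper is
deliberately crude and is not how any particular browser or editor
handles copy/paste:

```python
import re
import unicodedata


def strip_tags(html: str) -> str:
    """Crude tag removal, standing in for a paste into plain text."""
    return re.sub(r"<[^>]+>", "", html)


def plain(text: str) -> str:
    """Drop markup, then fold compatibility characters with NFKC."""
    return unicodedata.normalize("NFKC", strip_tags(text))


unicode_sup = "ma\u00b9"        # U+00B9 SUPERSCRIPT ONE
html_sup = "ma<sup>1</sup>"     # the same appearance, done in markup

# Two encodings of what a reader sees as the same thing:
print(unicode_sup == strip_tags(html_sup))    # the raw strings differ
print(plain(unicode_sup) == plain(html_sup))  # equal only after folding

# Mathematical italic letters behave the same way under NFKC.
math_italic = "\U0001d45a\U0001d44e"  # MATHEMATICAL ITALIC SMALL M, A
print(plain(math_italic))
```

Any search or comparison would have to fold both forms to treat them
as equal, which is the cost of having two ways to encode one thing.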

-- 
Kie ekzistas vivo, ekzistas espero.