Encoding italic

David Starner via Unicode unicode at unicode.org
Thu Jan 31 04:03:57 CST 2019


On Thu, Jan 31, 2019 at 12:56 AM Tex <textexin at xencraft.com> wrote:
>
> David,
>
> "italics has never been considered part of plain text and has always been considered outside of plain text. "
>
> Time to change the definition if that is what is holding you back.

That's not a definition; that's a fact. Again, it's like the 8-bit
byte; there are systems with other sizes of byte, but you usually
shouldn't worry about it. Building systems that don't have 8-bit bytes
are possible, but it's likely to cost more than it's worth.

> As has been said before, interlinear annotation, emoji and other features of Unicode which  are now considered plain text were not in the original definition.

https://www.w3.org/TR/unicode-xml/#Interlinear (which used to be
Unicode Technical Report #20) says "The interlinear annotation
characters were included in Unicode only in order to reserve code
points for very frequent application-internal use. ... Including
interlinear annotation characters in marked-up text does not work
because the additional formatting information (how to position the
annotation,...) is not available. ... The interlinear annotation
characters are also problematic when used in plain text, and are not
intended for that purpose."

Emoji, as have been pointed out several times, were in the original
Unicode standard and date back to the 1980s; the first DOS character
page has similes at 0x01 and 0x02.

> If Unicode encoded an italic mechanism it would be part of plain text, just as the many other styled spaces, dashes and other characters have become plain text despite being typographic.

If Unicode encoded an italic mechanism, then some "plain text" would
include italics. Maybe it would be successful, and maybe it would join
the interlinear annotation characters as another discouraged poorly
supported feature.

> As with the many problems with walls not being effective, you choose to ignore the legitimate issues pointed out on the list with the lack of italic standardization for Chinese braille, text to voice readers, etc.

Text to voice readers don't have problems with the lack of italic
standardization; they have problems with people using mathematical
characters instead of actual letters.

> The choice of plain text isn't always voluntary.

The choice of using single-byte character sets isn't always voluntary.
That's why we should use ISO-2022, not Unicode. Or we can expect
people to fix their systems. What systems are we talking about, that
support Unicode but compel you to use plain text? The use of Twitter
is surely voluntary.

-- 
Kie ekzistas vivo, ekzistas espero.


More information about the Unicode mailing list