Italics get used to express important semantic meaning, so unicode should support them

Zach Lym indolering at gmail.com
Tue Dec 15 21:18:41 CST 2020


> Finally, what I'm envisioning — and I'm not sure how closely this
> matches Christian Kleineidam's intention (where did he go, anyway?) —
> is not Yet Another Presentation Layer or a Shiny New Toy for people to
> use in their tweets, but more of a sombre hint that "in the original
> source document, this text had an alternative presentation; indicate
> this to the user in an appropriate way, if applicable". It's meant for
> preservation, not decoration. That's why I hear the "spirit of
> Unicode".

For those of us who can recall the exuberance of the XHTML movement,
<i>, <b> and friends were all deemed to be insufficiently semantic and
slated to be replaced by <em> and <strong>.  Of course, this was a
distinction without a difference and now we just have extra tags that
are more verbose and less literal.

But that raises the question: if the authors of a rich text standard
can't agree on what counts as semantic, how would Unicode decide?
What about <mark>, <strike>, or, as I previously suggested,
<blink>?  <blink> was added to HTML because it was the only
styling that could be displayed in plaintext console environments.
So if <blink> doesn't make your cutoff, then I guess the bar is personal
taste?

The line between semantics and styling is inherently fuzzy, but every
attempt at encoding similarly fuzzy semantics within Unicode is
something humanity must deal with for the rest of all time.  Take the
line and paragraph separators (U+2028 and U+2029), a noble attempt at
encoding what essentially amounts to the plaintext/typewriter hack of
using \n\n to insert whitespace after a paragraph.  No one uses either
of them, not even Markdown (which does use <em> and <strong>), because
most plain text doesn't make the distinction, users can't input them
via a keyboard, and no one else supports them.  Yet a colleague and I
had to spend waaaay too much of our short lives figuring out what to
support as breaking separators in WASI text streams.
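
To make the ambiguity concrete, here is a minimal Python sketch. It is
only an illustration, not what WASI does; the sample string and the
variable names are made up for this example.  It shows two common ways
of splitting text disagreeing about whether U+2028 and U+2029 count as
breaks at all.

```python
# Hypothetical sample: a line separator, a paragraph separator,
# and the usual "\n\n" blank-line convention in one string.
text = "first line\u2028second line\u2029new paragraph\n\nlast paragraph"

# Splitting on "\n" alone ignores the dedicated Unicode separators.
by_newline = text.split("\n")

# str.splitlines() treats U+2028 and U+2029 (among other characters)
# as line boundaries, so it finds more pieces in the same text.
by_splitlines = text.splitlines()

print(len(by_newline), by_newline)
print(len(by_splitlines), by_splitlines)
```

Neither result recovers the "paragraph" distinction the separators
were meant to carry, which is roughly the problem described above.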

What puzzles me is why this discussion wasn't moderated to the null
bin.  This *exact* question is answered in the FAQ and is regularly
shot down.


-Zach Lym


