Encoding italic (was: A last missing link)
James Kass via Unicode
unicode at unicode.org
Sat Jan 19 19:18:19 CST 2019
Victor Gaultney wrote,
> If however, we say that this "does not adequately consider the harm done
> to the text-processing model that underlies Unicode", then that exposes a
> weakness in that model. That may be a weakness that we have to accept for
> a variety of reasons (technical difficulty, burden on developers, UI
> cost, maturity).
Unicode's character encoding principles and underlying text-processing
model remain robust. They are the foundation of modern computer text
processing. The goal of 𝑛𝑒 𝑝𝑙𝑢𝑠 𝑢𝑙𝑡𝑟𝑎¹ needs to accommodate
the best expectations of the end users and the fact that the consistent
approach of the model eases the software people's burdens by ensuring
that effective programming solutions to support one subset or range of
characters can be applied to the other subsets of the Unicode
repertoire, and that those solutions can be shared with other
developers in a standard fashion.
Assigning properties to characters gives any conformant application
clear instructions as to what exactly is expected as the app encounters
each character in a string. In simpler times, the only expectation was
that the application would splat a glyph onto a screen (and/or sheet of
paper) and store a binary string for later retrieval. We've moved forward.
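
To make the point about properties concrete, here is a minimal sketch of
how an application might ask for a few of them. It uses Python's standard
unicodedata module; the helper name describe() and the three sample
characters are only illustrative choices, not anything prescribed by the
Standard.

```python
import unicodedata

def describe(ch):
    """Return a few standard Unicode properties for a single character."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),   # General_Category, e.g. 'Ll'
        "bidi": unicodedata.bidirectional(ch),  # Bidi_Class, e.g. 'L' or 'R'
        "combining": unicodedata.combining(ch), # Canonical_Combining_Class
    }

# A Latin letter, a combining accent, and a Hebrew letter (sample input).
for ch in "a\u0301\u05d0":
    print(f"U+{ord(ch):04X}", describe(ch))
```

The same lookup works for any character in the repertoire, which is the
sense in which a solution written for one range carries over to the others.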
'Unicode encodes characters, not glyphs' is a core principle. There's a
legitimate concern whenever anyone is perceived as heading into the
general direction of turning the character encoding into a glyph
registry, as it suggests a possible step backwards and might lead to a
slippery slope. For example, if italics are encoded, why not fraktur?²
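
As a rough illustration of the character/glyph distinction, the sketch
below looks at the mathematical alphanumeric symbols, which are not
discussed above and are brought in here only as an example. Their
compatibility decompositions carry a <font> tag pointing back at the plain
letter. The helper name plain_form() is hypothetical.

```python
import unicodedata

def plain_form(text):
    """Fold compatibility variants (including <font> variants) via NFKC."""
    return unicodedata.normalize("NFKC", text)

italic_a = "\U0001D44E"   # MATHEMATICAL ITALIC SMALL A
fraktur_a = "\U0001D51E"  # MATHEMATICAL FRAKTUR SMALL A

for ch in (italic_a, fraktur_a):
    # Print the character name, its decomposition mapping, and whether
    # compatibility normalization folds it back to the plain letter.
    print(unicodedata.name(ch),
          unicodedata.decomposition(ch),
          plain_form(ch) == "a")
```

Both symbols fold to the ordinary letter under NFKC, so a plain-text
search for "a" only finds them if the application normalizes first.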
The notion that any given system can't be improved is static.³ ("System"
refers to Unicode's repertoire and coverage rather than its core
principles. Core principles are rock solid by nature.)
¹ /ne plus ultra/
² "Conversely, significant differences in writing style for the same
script may be reflected in the bibliographical classification—for
example, Fraktur or Gaelic styles for the Latin script. Such stylistic
distinctions are ignored in the Unicode Standard, which treats them as
presentation styles of the Latin script." Ken Whistler,
³ "Static" can be interpreted as either virtually catatonic or radio
noise. Either is applicable here.