Characters that should be displayed?
Jukka K. Korpela
jkorpela at cs.tut.fi
Mon Jun 30 00:00:53 CDT 2014
2014-06-30 0:48, David Starner wrote:
> On Sun, Jun 29, 2014 at 2:02 PM, Jukka K. Korpela <jkorpela at cs.tut.fi> wrote:
>> They might be seen as “not displayable by normal rendering”, so yes. On the
>> practical side, although Private Use characters should not be used in public
>> information interchange, they are increasingly popular in “icon font”
> Since when is HTML necessarily public information interchange?
Since 1990. ☺
Seriously, HTML was designed for public information interchange, and
this is still its dominant use and regularly implied when discussing
HTML. Besides, even when the use is not public in a strict sense, it is
generally based on client technologies that have no provisions for
private agreements, in the sense of agreeing on meanings for Private Use
codepoints. Web browsers and other HTML renderers have special
interpretations for some characters (markup-significant characters,
special treatment of some input characters, etc.) but no mechanism for
adding rules that say something about Private Use characters.
The reason why “icon font” tricks mostly work is that browsers treat most
codepoints so that they try to render them using some fonts, under the
influence of CSS, and in CSS you can nowadays pretty reliably, but not
100% reliably, use @font-face to specify a specific font to be used.
The issue here, however, is what happens when the trick fails, for one
reason or another. Private Use codepoints are mostly attempts at
presenting some glyphs, rather than accidental occurrences of data that
is best ignored (like control characters mostly are, e.g. NUL inserted
by server-side software or authoring tool).
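To illustrate the distinction, here is a minimal Python sketch (Python is used only for illustration; the function names are made up for this example). It relies on the standard unicodedata module, where Private Use characters have General Category "Co" and control characters have "Cc".

```python
import unicodedata

def classify(ch: str) -> str:
    """Return a coarse label for one character, based on its General Category."""
    cat = unicodedata.category(ch)
    if cat == "Co":
        return "private-use"   # most likely an attempt at presenting a glyph
    if cat == "Cc":
        return "control"       # usually best ignored, e.g. a stray NUL
    return "other"

def private_use_chars(text: str) -> list:
    """List the Private Use code points found in text, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text if classify(ch) == "private-use"]

# Sample data invented for this example: one Private Use character and one NUL.
sample = "Price: \ue000 42\x00"
print(private_use_chars(sample))  # prints ['U+E000']
```

A renderer could use such a check to decide that a failed Private Use character deserves a visible indication, whereas the NUL can be silently dropped.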
> I can't
> imagine where you would better use private use characters then in HTML
> where a font can be named but you don't have enough control over the
> format to enter the data in some other format.
Applications that operate on plain text and use one fixed but
configurable font are a much better example. If you need to use, say, a
currency symbol that has not yet been added to Unicode but can be
included in the font, then a Private Use codepoint is the only good way
(and the only other way is to put the glyph into a code position
allocated for some defined character, like “¤”—this would work in
practice, but it’s really not recommended).
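As a rough sketch of what such a private agreement could look like in a plain text application, consider the following Python fragment. Everything in it is hypothetical: the assignment of U+E000 to a currency symbol, the table name, and the font_has_glyphs flag are assumptions made up for this example, not part of any real application.

```python
# Hypothetical private agreement: this assignment is invented for illustration.
PRIVATE_AGREEMENT = {
    "\ue000": "[new currency sign]",
}

def with_fallback(text: str, font_has_glyphs: bool) -> str:
    """Return text unchanged if the configured font covers the agreed
    Private Use code points; otherwise substitute a readable description."""
    if font_has_glyphs:
        return text
    return "".join(PRIVATE_AGREEMENT.get(ch, ch) for ch in text)

print(with_fallback("Total: 10 \ue000", font_has_glyphs=False))
# prints: Total: 10 [new currency sign]
```

The point is that the application knows both the font and the agreement, so it can do something sensible when the font is missing. A web browser has no comparable table to consult.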
In HTML, on the other hand, you can instead use images, and CSS lets you
scale the images to the font size if desired.