Characters that should be displayed?

Jukka K. Korpela jkorpela at cs.tut.fi
Mon Jun 30 00:00:53 CDT 2014


2014-06-30 0:48, David Starner wrote:

> On Sun, Jun 29, 2014 at 2:02 PM, Jukka K. Korpela <jkorpela at cs.tut.fi> wrote:
>> They might be seen as “not displayable by normal rendering”, so yes. On the
>> practical side, although Private Use characters should not be used in public
>> information interchange, they are increasingly popular in “icon font”
>> tricks.
>
> Since when is HTML necessarily public information interchange?

Since 1990. ☺

Seriously, HTML was designed for public information interchange, and 
this is still its dominant use and regularly implied when discussing 
HTML. Besides, even when the use is not public in a strict sense, it is 
generally based on client technologies that have no provisions for 
private agreements, in the sense of agreeing on meanings for Private Use 
codepoints. Web browsers and other HTML renderers have special 
interpretations for some characters (markup-significant characters, 
special treatment of some input characters, etc.) but no mechanism for 
adding rules that say something about Private Use characters.

The reason why “icon font” tricks mostly work is that browsers treat most 
codepoints so that they try to render them using some fonts, under the 
influence of CSS, and in CSS you can nowadays pretty reliably, but not 
100% reliably, use @font-face to specify a specific font to be used.
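To make the mechanism concrete, here is a minimal sketch in Python that writes out the kind of CSS an “icon font” setup typically relies on: an @font-face rule naming the font, plus a rule that inserts a Private Use codepoint as generated content. The font name "DemoIcons", the file "demo-icons.woff" and the class "icon-home" are made up for illustration, and the format("woff") hint assumes the file really is a WOFF font.

```python
import unicodedata


def icon_font_css(family, font_url, css_class, codepoint):
    """Return CSS text for one icon mapped to a Private Use codepoint.

    All names passed in are hypothetical; nothing here is taken from a
    real icon font.
    """
    # Refuse codepoints outside the Private Use areas (general category Co).
    if unicodedata.category(chr(codepoint)) != "Co":
        raise ValueError("expected a Private Use code point")
    # CSS escape: a backslash followed by the hexadecimal code point.
    escaped = "\\" + format(codepoint, "x")
    lines = [
        "@font-face {",
        '  font-family: "' + family + '";',
        '  src: url("' + font_url + '") format("woff");',
        "}",
        "." + css_class + "::before {",
        '  font-family: "' + family + '";',
        '  content: "' + escaped + '";',
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(icon_font_css("DemoIcons", "demo-icons.woff", "icon-home", 0xE001))
```

If the font fails to load, the browser is left with U+E001 and no agreed meaning for it, which is exactly the failure case discussed next.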

The issue here, however, is what happens when the trick fails, for one 
reason or another. Private Use codepoints are mostly attempts at 
presenting some glyphs, rather than accidental occurrences of data that 
is best ignored (like control characters mostly are, e.g. NUL inserted 
by server-side software or authoring tool).
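The distinction can be checked mechanically. The following is only a sketch, using Python's standard unicodedata module, of how a program might tell Private Use codepoints (general category Co) apart from control characters (Cc); the function names are mine, and what a renderer should then do with each class is a separate question.

```python
import unicodedata


def classify(ch):
    """Classify one character by its Unicode general category."""
    cat = unicodedata.category(ch)
    if cat == "Co":
        return "private-use"   # probably an attempt at presenting a glyph
    if cat == "Cc":
        return "control"       # e.g. a stray NUL, usually best ignored
    return "other"


def private_use_chars(text):
    """List the Private Use codepoints in text, in U+XXXX notation."""
    return [
        "U+{:04X}".format(ord(ch))
        for ch in text
        if unicodedata.category(ch) == "Co"
    ]


if __name__ == "__main__":
    sample = "a\ue001b\x00"
    print([classify(ch) for ch in sample])
    print(private_use_chars(sample))
```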

> I can't
> imagine where you would better use private use characters then in HTML
> where a font can be named but you don't have enough control over the
> format to enter the data in some other format.

Applications that operate on plain text and use one fixed but 
configurable font are a much better example. If you need to use, say, a 
currency symbol that has not yet been added to Unicode but can be 
included in the font, then a Private Use codepoint is the only good way 
(and the only other way is to put the glyph into a code position 
allocated for some defined character, like “¤”—this would work in 
practice, but it’s really not recommended).

In HTML, on the other hand, you can instead use images, and CSS lets you 
scale the images to the font size if desired.
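As a rough sketch of that alternative, the Python function below produces an img element whose height follows the font size of the surrounding text (height: 1em) and which carries alternative text for when the image is not shown. The file name and the alt text in the example are invented, and 1em is just one plausible choice of size.

```python
from html import escape


def inline_image(src, alt):
    """Return an <img> tag scaled to the surrounding font size.

    The alt text gives a textual fallback, which a bare Private Use
    codepoint cannot provide.
    """
    return '<img src="{}" alt="{}" style="height: 1em; width: auto">'.format(
        escape(src, quote=True), escape(alt, quote=True)
    )


if __name__ == "__main__":
    # Hypothetical image of a currency symbol not yet encoded in Unicode.
    print(inline_image("new-currency.png", "new currency sign"))
```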

Yucca




