Dealing with Unencodeable Characters

Doug Ewell doug at ewellic.org
Thu Oct 6 14:06:07 CDT 2016


Charlotte Buff wrote:

> Private use characters are an obvious choice but of course their
> meaning is user-defined, so while all other emoji in my Shift JIS
> document would receive an unambiguous Unicode mapping, Shibuya 109
> would remain vague and very limited in interchange options.

But that's exactly what private-use characters were invented for: so you
can represent, within a given character encoding framework, characters
that are not encoded in it for some reason.

Of course you need a private agreement of some kind, but it can be as
simple as "Hey, everybody, in the attached document (or in any documents
I create) U+FF109 means SHIBUYA 109." Private agreements don't have to
be secret or limited-distribution, and they don't have to be excessively
formal.
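
To make that concrete, here is a minimal sketch in Python of what honoring
such an agreement might look like on the receiving end. The names
PRIVATE_AGREEMENT and describe are made up for illustration, and the
single table entry is just the example agreement above; nothing here is
defined by Unicode itself.

```python
import unicodedata

# Hypothetical private agreement: code point -> agreed meaning.
# Unicode assigns no meaning to these code points; this table does.
PRIVATE_AGREEMENT = {0xFF109: "SHIBUYA 109"}


def describe(ch: str) -> str:
    """Describe one character, consulting the private agreement if needed."""
    cp = ord(ch)
    # General category "Co" marks private-use code points.
    if unicodedata.category(ch) == "Co":
        meaning = PRIVATE_AGREEMENT.get(cp, "no agreement on record")
        return f"U+{cp:04X} private use: {meaning}"
    # Everything else falls back to the standard character name, if any.
    return f"U+{cp:04X} {unicodedata.name(ch, 'unnamed')}"


print(describe("\U000FF109"))
print(describe("A"))
```

A private-use character that the table doesn't cover is reported as
having no agreement on record, which is the situation a recipient is in
without the agreement.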

Unicode rejected the "compatibility symbols" because they would have
amounted to private-use characters defined by Unicode, where the formal
names and definitions of the characters were not specified but, shhh, we
all know what they REALLY mean. This would have been the Wrong Thing to
Do on many levels.

 
--
Doug Ewell | Thornton, CO, US | ewellic.org


