Dealing with Unencodeable Characters

Doug Ewell doug at
Thu Oct 6 14:06:07 CDT 2016

Charlotte Buff wrote:

> Private use characters are an obvious choice but of course their
> meaning is user-defined, so while all other emoji in my Shift JIS
> document would receive an unambiguous Unicode mapping, Shibuya 109
> would remain vague and very limited in interchange options.

But that's exactly what private-use characters were invented for: so you
can represent characters in a given character encoding framework which
are not encoded for some reason.

Of course you need a private agreement of some kind, but it can be as
simple as "Hey, everybody, in the attached document (or in any documents
I create) U+FF109 means SHIBUYA 109." Private agreements don't have to
be secret or limited-distribution, and they don't have to be excessively
formal.
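
As a rough illustration only, such an agreement could be written down as
a small lookup table. The sketch below is Python; the names
PRIVATE_AGREEMENT, is_private_use, and describe are made up for this
example, and only the U+FF109 assignment comes from the text above.

```python
import unicodedata

# Hypothetical private agreement. The single entry is the one proposed
# above; the table itself is an illustration, not part of any standard.
PRIVATE_AGREEMENT = {
    0xFF109: "SHIBUYA 109",
}


def is_private_use(cp: int) -> bool:
    """Return True if the code point has General_Category Co (private use)."""
    return unicodedata.category(chr(cp)) == "Co"


def describe(ch: str) -> str:
    """Name a character, consulting the private agreement for private-use code points."""
    cp = ord(ch)
    if is_private_use(cp):
        # Unicode assigns no meaning here, so only the agreement can answer.
        return PRIVATE_AGREEMENT.get(cp, f"<private use U+{cp:04X}, no agreement>")
    return unicodedata.name(ch, f"<unnamed U+{cp:04X}>")


if __name__ == "__main__":
    for ch in (chr(0xFF109), "\uE000", "A"):
        print(f"U+{ord(ch):04X}: {describe(ch)}")
```

Anyone who receives the table along with the document can resolve
U+FF109; anyone who doesn't sees only an uninterpreted private-use code
point, which is the interchange limitation being discussed.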

Unicode rejected the "compatibility symbols" because they would have
amounted to private-use characters defined by Unicode, where the formal
names and definitions of the characters were not specified but, shhh, we
all know what they REALLY mean. This would have been the Wrong Thing to
Do on many levels.

Doug Ewell | Thornton, CO, US
