Regarding Unicode for new Symbol

Richard Wordingham richard.wordingham at
Sun May 24 04:25:53 CDT 2015

On Sun, 24 May 2015 10:25:50 +0530
baskar raj <baskar115 at> wrote:

> i just gave "and" as an example (verdy), i am just curious to know if
> we propose a symbol for a word does Unicode encode it or accept when
> it is already used by a small community of users, shall we claim in
> letter like symbols (00–4F). (any possibility)
> or we can only implement in private use area until it is recognized -
> which is not possible for small mediums to get widely recognized
> other than bigger names like Microsoft or Apple proposing.

In general, a private use character can be promoted by including it in a
generally useful font and providing soft keyboards that allow its use.

There are two major exceptions to this - combining marks and characters
that require a rendering engine.  It might even be possible to get round
these problems in many cases with a *lot* of ingenuity in the soft
keyboards.  I believe AAT fonts are a solution for the Apple
world, but OpenType may be more difficult, and may need tackling
application by application and renderer by renderer, even with open
source software.  Another possible method would be to subvert the
rendering engine.
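
To illustrate why combining marks are an exception, here is a minimal
sketch that compares the default properties of a private use code point
with those of an encoded combining mark.  It assumes Python's standard
unicodedata module; the helper name rendering_properties is made up for
this example and is not part of any library.

```python
import unicodedata

def rendering_properties(ch):
    """Return two properties a renderer commonly consults for a character."""
    return {
        "category": unicodedata.category(ch),          # e.g. 'Co', 'Mn', 'Lo'
        "combining_class": unicodedata.combining(ch),  # 0 means not reordered
    }

pua = rendering_properties("\uE000")   # first private use code point in the BMP
mark = rendering_properties("\u0301")  # COMBINING ACUTE ACCENT

print("U+E000:", pua)
print("U+0301:", mark)
```

The private use code point reports general category Co and combining
class 0, whereas the encoded mark reports Mn and a non-zero class.  A
renderer that relies on these defaults has no reason to treat a PUA
character as a mark, so any mark behaviour has to come from the font or
the rendering technology.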

For open source applications, fonts using (SIL) Graphite often work.
While Tai Tham was being encoded, I successfully used the PUA for
generating word lists and successfully converted them to Unicode once
the encoding was approved.  My viewing tools were limited, and I was
delighted when OpenOffice started supporting Graphite and when a
version of Firefox appeared that also supported Graphite.
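
A conversion of this kind can be sketched in a few lines of Python.
The mapping below is purely hypothetical (it is not the interim
assignment actually used for the word lists); it only shows the shape of
the job: translate each PUA code point to its encoded counterpart and
report any private use characters left over.

```python
import unicodedata

# Hypothetical interim assignments: PUA code point -> encoded code point.
PUA_TO_UNICODE = {
    0xE000: 0x1A20,  # TAI THAM LETTER HIGH KA
    0xE001: 0x1A21,  # TAI THAM LETTER HIGH KHA
}

def convert_pua(text, mapping=PUA_TO_UNICODE):
    """Replace mapped PUA characters and list any that remain unmapped."""
    converted = text.translate(mapping)
    leftovers = sorted(
        {ch for ch in converted if unicodedata.category(ch) == "Co"}
    )
    return converted, leftovers

converted, leftovers = convert_pua("\uE000\uE001\uE002")
print([f"U+{ord(ch):04X}" for ch in converted])
print([f"U+{ord(ch):04X}" for ch in leftovers])
```

Reporting the leftovers rather than silently dropping them makes it easy
to spot PUA characters that were missed when the mapping table was drawn
up.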

There is another solution, which is *bad* but can work well for a short
period.  That solution is for a font to hijack a code point that already
has the properties the renderer needs.  One variant along these
lines, which may not yet be usable, would be to use a character with the
right properties and then use a variation sequence to substitute one's
own unrelated glyph.  Gaps in character assignments tend to be used for
these purposes (Lao is a good example), but renderer support varies.  I
remember that Windows XP initially didn't support U+0BB6 TAMIL LETTER
SHA when using its native rendering stack.
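
For anyone wanting to see where such gaps lie, the following is a rough
sketch that lists the unassigned code points in a range, again assuming
Python's unicodedata module.  The helper name unassigned_in_range is
invented for this example, and the result depends on the version of the
Unicode Character Database bundled with the Python in use, so a gap
reported by an older Python may since have been filled.

```python
import unicodedata

def unassigned_in_range(first, last):
    """Return the code points in [first, last] with general category Cn."""
    return [
        cp for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) == "Cn"
    ]

# The Lao block occupies U+0E80..U+0EFF.
lao_gaps = unassigned_in_range(0x0E80, 0x0EFF)

print("Unicode data version:", unicodedata.unidata_version)
print("Unassigned in Lao block:", [f"U+{cp:04X}" for cp in lao_gaps])
```

This only identifies the gaps; it says nothing about whether a given
renderer will shape or even display a glyph placed at such a code point.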

