Private Use areas

Philippe Verdy via Unicode unicode at
Thu Aug 23 10:39:15 CDT 2018

You are confusing two things: I am not proposing to "hack" existing codes,
but to add new codes for private variations. It is then up to the authors of
PUV sequences to choose an appropriate base character whose properties they
want the private-use variation sequence to inherit, or to choose a base
character that gives a reasonable reading if rendered as is (by renderers or
fonts that do not implement the private variation sequence, given that they
will also append a symbol for the PUV itself after the standard character).
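
To make the inheritance and fallback points concrete, here is a minimal
Python sketch. Private-use variation selectors do not exist in Unicode today,
so the sketch uses the existing VARIATION SELECTOR-256 (U+E01EF) purely as a
stand-in; the helper names are hypothetical and only illustrate the idea.

```python
import unicodedata

# Standard variation selector ranges: VS1..VS16 and VS17..VS256.
VS1, VS16 = 0xFE00, 0xFE0F
VS17, VS256 = 0xE0100, 0xE01EF


def is_variation_selector(ch: str) -> bool:
    """True if ch lies in one of the two standard variation selector ranges."""
    cp = ord(ch)
    return VS1 <= cp <= VS16 or VS17 <= cp <= VS256


def fallback_text(text: str) -> str:
    """What a renderer that ignores the selectors would effectively show."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


def inherited_properties(sequence: str) -> tuple[str, str]:
    """General category and bidi class of the sequence, taken from its base."""
    base = sequence[0]
    return unicodedata.category(base), unicodedata.bidirectional(base)


# Stand-in for a private variation sequence: base "A" followed by VS256.
sequence = "A" + chr(VS256)

print(unicodedata.name(chr(VS256)))
print(inherited_properties(sequence))
print(fallback_text(sequence))
```

The choice of base character decides both results: the properties reported
for the sequence are those of "A", and a process that drops the selector is
left with "A" alone.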

Also, I do not want to change anything about the existing variation sequences
(using VS1 and so on) or their encoding policies, which require prior
registration and standardisation.

On Thu, 23 Aug 2018 at 11:42, Richard Wordingham via Unicode <
unicode at> wrote:

> On Wed, 22 Aug 2018 11:58:58 +0200
> Philippe Verdy via Unicode <unicode at> wrote:
> > For now there's still no way to have variant sequences unless they are
> > registered and standardized by Unicode but registration should be not
> > needed (forbidden) for sequences containing PUV.
> I believe this scheme is no worse than hack encodings that use Latin
> character codes for other characters.  These schemes often work.
> (Indeed, the currently best method of getting Tai Tham displayed as rich
> text that I can find is to use a transliteration-type encoding and a
> special font, though I can now get pretty close using the proper
> character codes in the order laid down in the proposals.)
> The major problems I can see with appropriating variation sequences
> are:
> (1) It might be restricted to base characters - I have no
> experimental evidence on whether this would happen.  Fonts can happily
> convert base characters to combining characters, though this works
> best if Latin line-breaking rules take effect.
> (2) The appropriated variation sequence might be assigned a meaning -
> but this is no worse than the general ambiguity of PUA characters.
> (3) Some base characters get special treatment.  For example, I had
> to change my transliteration scheme because hyphen-minus is treated
> specially by MS Edge - I was using it as a digraph disjunctor - and
> so clusters were not being formed.  In this case, I would have come
> unstuck as soon as line-wrapping started, so it was a bad choice anyway.
> Or are there significant renderers that deliberately ignore variation
> selectors in unregistered, unstandardised variation sequences?  I don't
> recall any problems from when we were discussing variation
> sequences for chess pieces.
> For supplementing a script, it might be best to start at
> VARIATION-SELECTOR-256, and work down if need be with specialist
> characters.
> Richard.