Question about Perl5 extended UTF-8 design

Richard Wordingham richard.wordingham at
Thu Nov 5 13:19:10 CST 2015

On Thu, 5 Nov 2015 18:25:05 +0100
Philippe Verdy <verdy_p at> wrote:

> But these extra code points could be used to represent something else
> such as unique object identifier for internal use in your
> application, or virtual object pointers, or shared memory block
> handles, file/pipe/stream I/O handles, service/API handles, user ids,
> security tokens, 64-bit content hashes plus some binary flags,
> placeholders/references for members in an external unencoded
> collection or for URIs, or internal glyph ids when converting text
> for rendering with one or more fonts, or some internal serialization
> of geometric shapes/colors/styles/visual effects...)

No-one's claiming it is for a Unicode Transformation Format (UTF).  A
possibly relevant example of something else is a non-precomposed
grapheme cluster, as in Perl6's NFG.  (This isn't a PUA encoding, as
the precomposed characters are created on the fly.)
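
To make the "created on the fly" point concrete, here is a minimal sketch in Python of how such a scheme could work. It is not Perl6's actual implementation: the class name SyntheticTable, its methods, and the choice of negative integers for the synthetic values are assumptions made for illustration, and the caller is assumed to have already split the text into grapheme clusters.

```python
import unicodedata


class SyntheticTable:
    """Hypothetical NFG-style table: one integer per grapheme cluster."""

    def __init__(self):
        self._by_cluster = {}  # NFC cluster string -> synthetic id
        self._clusters = []    # index (-id - 1) -> NFC cluster string

    def intern(self, cluster):
        """Return one integer standing for a single grapheme cluster."""
        composed = unicodedata.normalize("NFC", cluster)
        if len(composed) == 1:
            # A precomposed character exists: use its real code point.
            return ord(composed)
        if composed not in self._by_cluster:
            # No precomposed form: allocate a synthetic id on the fly.
            # Negative values are an assumption of this sketch; they
            # cannot collide with any real code point.
            self._clusters.append(composed)
            self._by_cluster[composed] = -len(self._clusters)
        return self._by_cluster[composed]

    def expand(self, unit):
        """Map an integer back to the text it stands for."""
        return chr(unit) if unit >= 0 else self._clusters[-unit - 1]


table = SyntheticTable()
e_acute = table.intern("e\u0301")   # composes to U+00E9
x_acute = table.intern("x\u0301")   # no precomposed form exists
```

The synthetic values mean something only together with the table that allocated them, which is why such a scheme is an internal representation and neither a PUA assignment nor a UTF.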

