Encoding/Use of potential unpaired UTF-16 surrogate pair specifiers
Doug Ewell
doug at ewellic.org
Sat Jan 30 15:46:39 CST 2016
Chris Jacobs wrote:
>>> UTF16 has no way to define a code point that is D800-DFFF; this is
>>> an issue if I want to apply some sort of encryption algorithm and
>>> still have the result treated as text for transmission and encoding
>>> to other string systems.
>
> This is not an issue at all. You don't have to restrict the input to
> text to be able to generate an output that can be treated as text.
I gathered that J wanted to generate arbitrary output that could be
interpreted as UTF-16 code units. I admit to being less than 100% sure
of this.
Certainly there is no shortage of algorithms to map arbitrary byte input
to text output, usually limited to some subset of ASCII. One interesting
approach for the Unicode era was Markus Scherer's "Base16k" concept, at
https://sites.google.com/site/markusicu/unicode/base16k .
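
As a rough illustration of the general idea (arbitrary bytes in, well-formed
text out), here is a minimal Python sketch that packs 14 bits of input into
each UTF-16 code unit. The function names, the 0x5000 offset (which keeps
every output character inside the CJK Unified Ideographs block and well away
from D800-DFFF), and the convention of passing the byte length separately
are all assumptions of this sketch; it is not claimed to be compatible with
the actual Base16k format.

```python
BASE = 0x5000   # assumed offset: output stays in U+5000..U+8FFF, no surrogates
BITS = 14       # payload bits carried by each output code unit
MASK = (1 << BITS) - 1


def encode14(data: bytes) -> str:
    """Pack arbitrary bytes into BMP characters, 14 bits per character."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= BITS:
            nbits -= BITS
            out.append(chr(BASE + ((acc >> nbits) & MASK)))
        acc &= (1 << nbits) - 1          # keep only the unconsumed bits
    if nbits:
        # Left-align the leftover bits and pad the rest with zeros.
        out.append(chr(BASE + ((acc << (BITS - nbits)) & MASK)))
    return "".join(out)


def decode14(text: str, nbytes: int) -> bytes:
    """Inverse of encode14; nbytes is the original length in bytes."""
    acc = 0
    nbits = 0
    out = bytearray()
    for ch in text:
        value = ord(ch) - BASE
        if not 0 <= value <= MASK:
            raise ValueError(f"character U+{ord(ch):04X} is outside the encoding range")
        acc = (acc << BITS) | value
        nbits += BITS
        while nbits >= 8 and len(out) < nbytes:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1
    if len(out) != nbytes:
        raise ValueError("text is too short for the stated byte length")
    return bytes(out)
```

Input bytes such as D8 00 DC 00, which would be troublesome if reinterpreted
directly as UTF-16 code units, come out as ordinary BMP characters, and
decode14(encode14(data), len(data)) returns the original bytes.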
--
Doug Ewell | http://ewellic.org | Thornton, CO