Is the binaryness/textness of a data format a property?

Martin J. Dürst via Unicode unicode at
Sun Mar 22 18:29:03 CDT 2020

On 23/03/2020 03:56, Markus Scherer via Unicode wrote:
> On Sat, Mar 21, 2020 at 12:35 PM Doug Ewell via Unicode <unicode at>
> wrote:
>> I thought the whole premise of GB18030 was that it was Unicode mapped into
>> a GB2312 framework. What characters exist in GB18030 that don't exist in
>> Unicode, and have they been proposed for Unicode yet, and why was none of
>> the PUA space considered appropriate for that in the meantime?
> My memory of GB18030 is that its code space has 1.6M code points, of which
> 1.1M are a permutation of Unicode. For the rest you would have to go beyond
> the Unicode code space for 1:1 round-trip mappings.

This matches my recollection. What's more, there are no characters
allocated in the parts of the GB 18030 codespace that don't map to
Unicode, and there is, as far as I understand, no plan to use that space.
It's just there because that was the most straightforward way to extend
GB 2312/GBK.
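
As a rough check of the 1.6M and 1.1M figures quoted above, here is a
back-of-the-envelope sketch. It assumes the commonly described GB 18030
byte ranges (single bytes 0x00-0x7F; two-byte lead 0x81-0xFE with trail
0x40-0x7E or 0x80-0xFE; four-byte sequences alternating 0x81-0xFE and
0x30-0x39). The variable names are illustrative only and are not taken
from the standard.

```python
# Sketch only: byte ranges below are the commonly cited GB 18030 structure
# (an assumption here, not quoted from the standard's text).

def span(lo: int, hi: int) -> int:
    """Number of byte values in the inclusive range lo..hi."""
    return hi - lo + 1

single_byte = span(0x00, 0x7F)                      # ASCII range
lead = span(0x81, 0xFE)                             # lead / third byte values
two_byte_trail = span(0x40, 0x7E) + span(0x80, 0xFE)
digit = span(0x30, 0x39)                            # second / fourth byte values

two_byte = lead * two_byte_trail
four_byte = lead * digit * lead * digit

gb18030_total = single_byte + two_byte + four_byte
unicode_total = 0x110000                            # U+0000..U+10FFFF, incl. surrogates
unmapped = gb18030_total - unicode_total

print(f"GB 18030 code space: {gb18030_total:,}")    # roughly 1.6M
print(f"Unicode code space:  {unicode_total:,}")    # roughly 1.1M
print(f"No Unicode mapping:  {unmapped:,}")
```

Under these assumptions, roughly half a million GB 18030 code positions
have no Unicode counterpart, which is the space referred to above.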

Regards,   Martin.
