Is the binaryness/textness of a data format a property?
Eli Zaretskii via Unicode
unicode at unicode.org
Sat Mar 21 15:26:24 CDT 2020
> From: "Doug Ewell" <doug at ewellic.org>
> Cc: <unicode at unicode.org>
> Date: Sat, 21 Mar 2020 13:33:18 -0600
>
> > Emacs uses some of that for supporting charsets that cannot be mapped
> > into Unicode. GB18030 is one example of such charsets. The internal
> > representation of characters in Emacs is UTF-8, so it uses 5-byte
> > UTF-8 like sequences to represent such characters.
>
> When 137,468 private-use characters aren't enough?
Why is that relevant to the issue at hand?
> I thought the whole premise of GB18030 was that it was Unicode mapped into a GB2312 framework. What characters exist in GB18030 that don't exist in Unicode, and have they been proposed for Unicode yet
I don't remember offhand, but last time I looked at GB18030, it had
a lot of characters that were not in Unicode.
> and why was none of the PUA space considered appropriate for that in the meantime?
Because many fonts already use them? I don't really know why it was
decided to use codepoints above 0x1FFFFF; it's just how Emacs has
worked for quite some time. You asked for examples of usage, and
I provided one.
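
To make "5-byte UTF-8 like sequences" concrete, here is a minimal Python sketch of the generic UTF-8 bit layout extended to five bytes, which is what codepoints above 0x1FFFFF require. The function name encode_utf8_like is made up for illustration, and this is not Emacs's actual code; it only shows the byte pattern, assuming the usual lead-byte/continuation-byte scheme carries over unchanged.

```python
def encode_utf8_like(cp: int) -> bytes:
    """Encode a code point using the UTF-8 bit layout, extended to 5 bytes.

    Hypothetical helper for illustration only; it does not reject
    surrogates or values above 0x10FFFF, unlike strict UTF-8.
    """
    if cp < 0:
        raise ValueError("code point must be non-negative")
    if cp < 0x80:            # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:           # 2 bytes: 110xxxxx 10xxxxxx
        lead, trailing = 0xC0, 1
    elif cp < 0x10000:       # 3 bytes: 1110xxxx + 2 continuation bytes
        lead, trailing = 0xE0, 2
    elif cp < 0x200000:      # 4 bytes: 11110xxx + 3 continuation bytes
        lead, trailing = 0xF0, 3
    elif cp < 0x4000000:     # 5 bytes: 111110xx + 4 continuation bytes
        lead, trailing = 0xF8, 4
    else:
        raise ValueError("code point too large for a 5-byte sequence")

    # Lead byte carries the top bits; each continuation byte carries 6 bits.
    out = [lead | (cp >> (6 * trailing))]
    for i in range(trailing - 1, -1, -1):
        out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    # The first codepoint that no longer fits in 4 bytes.
    print(encode_utf8_like(0x200000).hex(" "))
```

For codepoints inside the Unicode range the result matches standard UTF-8; only values from 0x200000 upward produce the 5-byte form with a lead byte of 0xF8 or higher.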