21-bit codepoints versus JIS?

suzuki toshiya mpsuzuki at hiroshima-u.ac.jp
Fri Nov 8 02:05:09 CST 2024


Although I appreciate the huge efforts people are making to communicate,
I do not have enough time to take part in that - so please let me change
the subject for this off-topic discussion...

--

Dear Jim,

> I haven't had any occasion to poke around at 21-bit Unicode
> codepoints. The JIS standards only have 303 kanji with them; all added
> in the JIS X 0213 standard introduced in 2000.

I understand your background is the academic study of the Japanese language,
but is there any special reason to mention JIS X 0213 in a discussion of
UTF-8, a general-purpose encoding scheme?

In Japan, many systems in production are still restricted to JIS X 0208,
especially in the public sector. Also, the customers of Japanese printing
houses are often expected to prepare their data with JIS X 0208 plus PUA
glyphs, instead of the full repertoire of CJK Unified Ideographs.

On the other hand, young people in Japan post many non-JIS characters to
their SNS accounts, such as so-called "emoji", or Indic and Arabic characters
used to compose their favorite face symbols.

I think the prevalence of "21-bit Unicode codepoints" in Japanese text
depends heavily on the category of the text.
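
As a rough illustration of that dependence, one could measure, per text,
the share of codepoints above U+FFFF (the ones that need 4 bytes in UTF-8).
The Python below is only a sketch: the function name and the two sample
strings are made up for illustration and are not real corpus data.

```python
def supplementary_ratio(text: str) -> float:
    """Return the fraction of codepoints in text that lie above U+FFFF."""
    if not text:
        return 0.0
    # Python 3 strings are sequences of codepoints, so ord() sees
    # supplementary-plane characters as single items.
    beyond_bmp = sum(1 for ch in text if ord(ch) > 0xFFFF)
    return beyond_bmp / len(text)


# Hypothetical samples (assumptions, chosen only to show the contrast).
official_style = "広島大学"          # kanji only, all inside the BMP
sns_style = "今日は楽しい😀😀"      # six BMP characters plus two emoji

print(supplementary_ratio(official_style))
print(supplementary_ratio(sns_style))
```

A real comparison would of course need actual samples from each category
of text; the point is only that the ratio is a property of the text, not
of the language.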

Regards,
mpsuzuki


On 2024/11/08 15:04, Jim Breen via Unicode wrote:
> On Fri, 8 Nov 2024 at 11:37, Markus Scherer <markus.icu at gmail.com> wrote:
>> On Thu, Nov 7, 2024 at 3:03 PM Jim Breen via Unicode <unicode at corp.unicode.org> wrote:
>>>
>>> On rare occasions, I need to dig into UTF-8 at the bit level. I have a
>>> note pinned near my desk as an aide memoire. It has 3 lines:
>>>
>>> UTF-8
>>> zzzzyyyyyyxxxxxx
>>> 1110zzzz 10yyyyyy 10xxxxxx
>>
>> 11110nnn 10zzzzzz 10yyyyyy 10xxxxxx
> 
> I haven't had any occasion to poke around at 21-bit Unicode
> codepoints. The JIS standards only have 303 kanji with them; all added
> in the JIS X 0213 standard introduced in 2000.
> 
> [As I wrote in my "A Brief History of Japanese Character Set
> Standards" (https://www.edrdg.org/~jwb/paperdir/kanjicomp.html) "the
> main lasting impact of the JIS X 0213 standard will probably be the
> additional 303 kanji it contributed to Unicode."]
> 
> Jim
> 
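
For reference, the 4-byte pattern quoted above (11110nnn 10zzzzzz 10yyyyyy
10xxxxxx) can be checked by hand with a few lines of Python. This is a
minimal sketch, not a full encoder: the function name is made up, and it
deliberately handles only the range U+10000..U+10FFFF.

```python
def encode_4byte(cp: int) -> bytes:
    """Encode a supplementary-plane codepoint using the 4-byte UTF-8 pattern."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF take 4 bytes in UTF-8")
    return bytes([
        0xF0 | (cp >> 18),           # 11110nnn: top 3 of the 21 bits
        0x80 | ((cp >> 12) & 0x3F),  # 10zzzzzz: next 6 bits
        0x80 | ((cp >> 6) & 0x3F),   # 10yyyyyy: next 6 bits
        0x80 | (cp & 0x3F),          # 10xxxxxx: lowest 6 bits
    ])


# U+20B9F is a kanji in CJK Unified Ideographs Extension B.
cp = 0x20B9F
print(encode_4byte(cp).hex(" "))
# Cross-check against Python's built-in encoder.
print(encode_4byte(cp) == chr(cp).encode("utf-8"))
```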


