Unicode Teaching in Universities

Doug Ewell doug at ewellic.org
Mon Sep 6 23:34:50 CDT 2021

Martin J. Dürst wrote:

>> Do they still want to use out-of-band character-set designators as
>> font selection hints? Are there still objections to CJK unification?
>> And so on.
> People in Japan are very pragmatic. Unicode works, and so they don't
> see any reasons to complain or object. The fear that "Unicode will
> destroy Japanese kanji", spread in the 1990s by some, clearly hasn't
> come true (as quite some people knew already then).

It would appear, then, that Phake Nick is an outlier: he himself (not some unnamed people he has heard from) does seem to have issues with CJK unification, and feels that glyph encoding, or at least separate encoding of "Chinese characters" and "Japanese characters" as in the ISO 2022-based encodings, would be preferable.

I won't try to go through his entire reply, but here are some selected comments.

First and foremost, https://www.unicode.org/faq/han_cjk.html . The FAQ on Chinese and Japanese reflects what has been agreed upon by numerous experts in China and Japan, as well as experts in the writing systems who live elsewhere. In short, there is wide agreement that differences in preferred font styles do not constitute differences in character identity. Ninety-five percent of Phake Nick's post is about preferred font styles.

Unicode is a character encoding, JUST LIKE the GB and CNS (Chinese) and JIS (Japanese) encodings. NONE of these is an encoding of glyphs. The only difference is that people used to use character-set signaling, whether in-band or out-of-band, as a hint to display text in a Chinese-type or Japanese-type font. (It had nothing to do with explicit selection of font styles or sizes via "quasi-control characters," whatever those are.) With a universal character set, it is no longer possible to overload the choice of character encoding as a language tag or font selection hint.

It is true that you can't tell, with no context, whether a given Unicode code point represents a "Chinese character" or a "Japanese character." As the FAQ says, "It's the equivalent of asking if 'a' is an English letter or a French one" or whether "chat" is an English word or a French one.
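As a minimal illustration in Python, assuming only the standard unicodedata module: the character database gives a Han ideograph a code point and a name, just as it does for "a", and neither says anything about language. U+76F4 is picked here only as a sample ideograph; the describe() helper is a hypothetical name for this sketch.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and character name for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# A Han ideograph used in both Chinese and Japanese text (sample choice).
han = "\u76f4"
# A Latin letter used in both English and French text.
latin = "a"

# Neither description mentions a language: identity is not language.
print(describe(han))
print(describe(latin))
```

Both lines of output name a character, not a language, which is the point of the FAQ's comparison.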

If you really need language tagging, to choose a font or render punctuation or perform spell-checking or text-to-speech or some other process, then use language tagging.
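One common way to do that is markup. The sketch below assumes HTML as the carrier and BCP 47 tags such as "ja" and "zh-Hant" as the language values; tag_span() is a hypothetical helper for illustration, not part of any library. The same code point is wrapped twice, and it is the lang attribute, not the character, that lets a renderer pick a Japanese-style or Chinese-style font.

```python
from html import escape


def tag_span(text: str, lang: str) -> str:
    """Wrap text in an HTML span carrying a language tag (assumed BCP 47)."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


same_character = "\u76f4"  # one code point, identical in both spans

japanese = tag_span(same_character, "ja")
chinese = tag_span(same_character, "zh-Hant")

# The text content is the same; only the out-of-band tag differs.
print(japanese)
print(chinese)
```

Other carriers (XML's xml:lang, document-format metadata, and so on) would serve the same purpose; HTML is used here only because it is familiar.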

It is not at all true that Han unification erases a distinction between simplified and traditional characters which other encodings preserved, or that Unicode discards mappings between simplified and traditional which other encodings provided. I must have misunderstood those passages.
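A small Python check of the first point, using a single well-known pair as an example (it is one pair, not a survey): simplified U+56FD and traditional U+570B are separate code points in Unicode, and each round-trips through a legacy encoding via Python's built-in "gb2312" and "big5" codecs.

```python
# Simplified and traditional forms of "country" (example pair only).
simplified = "\u56fd"
traditional = "\u570b"

# Han unification did not merge them: the code points differ.
distinct = ord(simplified) != ord(traditional)

# Each form survives a round trip through a legacy national encoding.
simplified_round_trip = simplified.encode("gb2312").decode("gb2312")
traditional_round_trip = traditional.encode("big5").decode("big5")

print(distinct)
print(simplified_round_trip == simplified)
print(traditional_round_trip == traditional)
```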

> In the analog era anyone could just write new characters in ways they
> desired and spread them around, and if the usage picked up then it would
> become part of the language, but it's impossible to do the same
> through Unicode.

Nor through any of the Chinese or Japanese national standards. This is a fact of life with standardized character sets in general, and has nothing to do with Han unification.

Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
