New CJK characters

john_h_jenkins john_h_jenkins at
Wed Nov 3 17:18:53 CDT 2021

> On Nov 3, 2021, at 3:22 PM, Mark E. Shoulson via Unicode <unicode at> wrote:
> I don't know if IDS sequences can really represent "all" han characters; I'd guess probably not, but there are probably more sophisticated systems that can do better.  There'll probably always be corner cases, though.

They do not. Even more sophisticated systems like CDL don’t. (See L2/21-118.)

I should point out that even sophisticated systems that draw characters based on their IDS (or CDL) are not going to match the quality of a commercial CJK font.

> But at any rate, it's my understanding that that particular ship has already sailed, and atomic CJK characters is how Unicode does stuff.  Changing that now would be rather more disrupting than just saying "no more precomposed accented letters."

This is actually touched on in TUS (§ 18.2) and the FAQ ("Why doesn't the Unicode Standard adopt a compositional model for encoding Han ideographs? Wouldn't that save a large number of code points?"). Beyond the momentum issue you mention, compositional methods don’t work for two reasons. One is “spelling” ambiguity: the same character can be described by more than one sequence. The other is that they fail to address collation, text-to-speech, searching, and semantic analysis; basically, everything you want to use text for *other* than rendering. Even in rendering, you aren’t covering the region-specific shapes, at least not with IDS.
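
To make the “spelling” ambiguity concrete, here is a minimal Python sketch. The three IDS strings for 湖 (U+6E56) are illustrative spellings chosen for this example, not taken from any published IDS data, and the helper name distinct_after_nfc is hypothetical. It only shows that the strings stay distinct under ordinary Unicode normalization, so a compositional encoding would need an equivalence mechanism of its own.

```python
import unicodedata

# Three plausible IDS "spellings" of the single character U+6E56.
# These are illustrative assumptions, not entries from an IDS database.
SPELLINGS = [
    "\u2ff0\u6c35\u80e1",              # left-right: water radical + U+80E1
    "\u2ff0\u6c35\u2ff0\u53e4\u6708",  # left-right, right half decomposed again
    "\u2ff2\u6c35\u53e4\u6708",        # left-middle-right with three components
]


def distinct_after_nfc(strings):
    """Return the set of distinct strings that remain after NFC normalization."""
    return {unicodedata.normalize("NFC", s) for s in strings}


if __name__ == "__main__":
    for s in SPELLINGS:
        # Show each spelling alongside its code points.
        print(s, " ".join(f"U+{ord(c):04X}" for c in s))
    # Normalization does not merge the spellings, and none equals U+6E56.
    print(len(distinct_after_nfc(SPELLINGS)), "distinct spellings after NFC")
```

A search for any one of these strings would miss text that used either of the other two, which is the searching and collation problem in miniature.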
