New CJK characters

James Kass jameskass at code2001.com
Wed Nov 3 21:20:13 CDT 2021


Take a Han character already encoded and call it “𝓎”.  Since 𝓎 is 
encoded, it can be entered in plain text and The Standard serves us 
well.  Rendering (a higher-level protocol) checks available fonts for 
coverage.  If 𝓎 is covered, that’s the end of it.  But if 𝓎 isn’t 
covered, the application /could/ query the IDS (Ideographic Description 
Sequence) database and construct a glyph on the fly.
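
A rough sketch of that fallback order, in Python.  Everything here is a 
stand-in: FONT_COVERAGE, IDS_DATABASE and render_plan are hypothetical 
names, and a real application would read the fonts’ cmap tables and a 
full IDS database rather than these two tiny tables.

```python
# Hypothetical stand-ins for real font coverage and a real IDS database.
FONT_COVERAGE = {"木", "林"}      # characters the active font can draw
IDS_DATABASE = {"森": "⿱木林"}    # encoded character -> its IDS


def render_plan(ch):
    """Decide how one encoded character would be displayed."""
    if ch in FONT_COVERAGE:
        # The font covers it, so nothing more is needed.
        return ("font", ch)
    ids = IDS_DATABASE.get(ch)
    if ids is not None:
        # No font coverage, but the IDS tells us how to build a glyph.
        return ("compose", ids)
    # Neither source helps; the usual missing-glyph box would appear.
    return ("missing", ch)
```

The sketch only picks the strategy.  Actually drawing a glyph from the 
IDS is the hard part and is left out.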

If there’s an unencoded character, “𝔃”, it can’t be entered in 
plain text directly.  IDCs (Ideographic Description Characters) and the 
IDSs built from them form a notational system which can serve as 
placeholders in plain text.  Maybe 𝔃 will be encoded someday, maybe 
not.  Meanwhile The Standard serves us well because this notational 
system is encoded.  Rendering /could/ construct an /ad hoc/ glyph for 𝔃 
which would be exo-Unicode.  The underlying data wouldn’t be altered.
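
Before a renderer can do anything with such a placeholder it has to 
find where the IDS starts and stops in the text.  A minimal sketch in 
Python, assuming only the twelve IDCs at U+2FF0..U+2FFB (later 
additions are left out) and treating every non-IDC character as a 
valid operand, which is more lenient than the real syntax.  The names 
IDC_ARITY and ids_end are made up for illustration.

```python
# Operand count for each IDC in U+2FF0..U+2FFB.
# U+2FF2 and U+2FF3 take three operands; the others take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3
IDC_ARITY["\u2FF3"] = 3


def ids_end(text, start=0):
    """Return the index just past the IDS that begins at text[start].

    Raises ValueError if text[start] is not an IDC or the IDS is cut short.
    Simplification: any non-IDC character is accepted as an operand.
    """
    if start >= len(text) or text[start] not in IDC_ARITY:
        raise ValueError("no IDS at this position")
    pending = 1  # operands still needed to complete the sequence
    i = start
    while pending:
        if i >= len(text):
            raise ValueError("truncated IDS")
        # An IDC fills one slot and opens `arity` new ones;
        # any other character just fills one slot.
        pending += IDC_ARITY.get(text[i], 0) - 1
        i += 1
    return i
```

For example, text[:ids_end(text)] gives the whole placeholder, nested 
IDCs included, while the text itself is left untouched.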

Any application sophisticated enough to generate reasonable glyphs on 
the fly based on IDSs should be sophisticated enough to check any opened 
files for IDSs which have since become encoded and offer the user the 
option of replacing IDSs with Unicode characters as appropriate.
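
One way such a check might look, sketched in Python.  NEWLY_ENCODED, 
find_upgrades and apply_upgrades are hypothetical names, and the single 
table entry is only a stand-in (that character has of course been 
encoded all along).  The sketch assumes the table entries don’t overlap 
one another in the text.

```python
# Hypothetical table: IDS -> the character that has since been encoded.
NEWLY_ENCODED = {"⿱木林": "森"}


def find_upgrades(text, table=NEWLY_ENCODED):
    """List (index, ids, replacement) for every known IDS found in text."""
    hits = []
    for ids, ch in table.items():
        pos = text.find(ids)
        while pos != -1:
            hits.append((pos, ids, ch))
            pos = text.find(ids, pos + len(ids))
    return sorted(hits)


def apply_upgrades(text, accepted):
    """Replace only the hits the user accepted.

    Works right to left so earlier indices stay valid.
    """
    for pos, ids, ch in sorted(accepted, reverse=True):
        text = text[:pos] + ch + text[pos + len(ids):]
    return text
```

The two steps are kept apart on purpose: find_upgrades only reports 
candidates, and nothing in the file changes until the user picks which 
ones to pass to apply_upgrades.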

The document linked by John H. Jenkins earlier, L2/21-118, shows that 
efforts are underway to enhance the IDSs by adding missing IDCs as well 
as presently unencoded components.  The current level of support already 
covers the vast majority of encoded characters. When the enhancements 
are accomplished, only the most bizarre edge cases will remain 
inexpressible as IDSs, AFAICT.

We shouldn’t expect Unicode to say that any conformant application must 
substitute glyphs on the fly for IDSs.  But many users would probably 
welcome sophisticated applications which can do it.


