<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>I'm waiting for some of the old-timers here to give a proper
answer, Unicode history-wise.</p>
<p>As I understood it, the idea of using IDS or something similar
for CJK characters was considered (probably more than once), and it
was decided to encode each character atomically instead, so that's
the way we're doing them.</p>
<p>A font wouldn't necessarily have to be able to generate new hanzi
dynamically from IDS descriptions; it could have all the 100,000
or however many glyphs already there, and just render the known
sequences the way it renders ligatures. That means it's still up to
font designers to add characters when they're needed, but the list
of characters is then open-ended and it's up to font designers to
decide what they want to support.</p>
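<p>To make the ligature idea concrete, here's a rough sketch of the
kind of lookup I have in mind. Everything in it is an assumption for
illustration: the <code>LIGATURES</code> table, the glyph names and
the <code>shape</code> function are hypothetical, and this is not how
any real font format or shaping engine is specified.</p>

```python
# Hypothetical glyph table: IDS sequence mapped to a made-up glyph name.
# This is an illustration only, not a real font format.
LIGATURES = {
    "\u2ff0\u5973\u5b50": "glyph_597d",  # ⿰女子, drawn as 好
    "\u2ff0\u6728\u6728": "glyph_6797",  # ⿰木木, drawn as 林
}


def shape(text):
    """Greedy longest-match substitution, similar in spirit to a ligature lookup.

    Returns a list of glyph names; characters that do not start a known
    sequence are passed through unchanged.
    """
    out = []
    i = 0
    longest = max(len(key) for key in LIGATURES)
    while i != len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(longest, len(text) - i), 0, -1):
            glyph = LIGATURES.get(text[i:i + n])
            if glyph is not None:
                out.append(glyph)
                i += n
                break
        else:
            # No known sequence starts here: fall back to the single character.
            out.append(text[i])
            i += 1
    return out
```

<p>A sequence the table doesn't know, such as ⿱木林, just falls
through as its separate pieces, which is the "it's up to the font"
part.</p>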
<p>OTOH, as is well known, IDS descriptions are not unique. There's
frequently more than one way to slice a character up. Should
*all* of them be supported? Should there be some way to decide the
"canonical" decomposition? I guess if we're leaving it up to
fonts, it's then up to the font designers again, but that would
break all the non-font uses of Unicode (searching, comparing, etc.)
unless there is some canonical representation.</p>
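<p>Here's a small sketch of why comparison breaks. It takes two
plausible IDS spellings of 森 (U+68EE) and shows that they differ as
strings until both are pushed through the same normalization. The
<code>EXPANSIONS</code> table and <code>fully_expand</code> are
hypothetical names of mine, and "expand everything as far as
possible" is just one form you could pick, not anything Unicode
defines.</p>

```python
# Two illustrative IDS spellings of the same character, U+68EE (森).
IDS_SHALLOW = "\u2ff1\u6728\u6797"           # ⿱木林
IDS_DEEP = "\u2ff1\u6728\u2ff0\u6728\u6728"  # ⿱木⿰木木

# Hypothetical table of components that have their own IDS.
EXPANSIONS = {
    "\u6797": "\u2ff0\u6728\u6728",  # 林 expands to ⿰木木
}


def fully_expand(ids):
    """Recursively replace each component that has a known expansion.

    IDS is a prefix notation, so substituting a component with its own
    IDS in place keeps the sequence well formed.
    """
    return "".join(
        fully_expand(EXPANSIONS[ch]) if ch in EXPANSIONS else ch
        for ch in ids
    )


# A plain string comparison treats the two spellings as different.
spellings_differ = IDS_SHALLOW != IDS_DEEP

# After expanding both the same way, they compare equal.
same_after_expansion = fully_expand(IDS_SHALLOW) == fully_expand(IDS_DEEP)
```

<p>This only handles the "how deep do you decompose" kind of
ambiguity. Two spellings that slice the character differently (say a
two-part left/right split versus a three-part one) would still come
out different, so a real canonical form would need more than this.</p>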
<p>I don't know if IDS sequences can really represent "all" Han
characters; I'd guess probably not, but there are probably more
sophisticated systems that can do better. There'll probably
always be corner cases, though.</p>
<p>But at any rate, it's my understanding that that particular ship
has already sailed, and atomic CJK characters are how Unicode does
it. Changing that now would be rather more disruptive than
just saying "no more precomposed accented letters."</p>
<div class="moz-cite-prefix">On 11/2/21 21:03, Abraham Gross via
Unicode wrote:<br>
</div>
<blockquote type="cite"
cite="mid:06a21f1247e942ea71dec7178a8ebe22@disroot.org">
<div data-html-editor-font-wrapper="true" style="font-family:
arial, sans-serif; font-size: 13px;">I have a proposal regarding the
future of encoding new Unihan characters into Unicode that I'd like
to float by this group to see if it makes any sense. ....<br>
</div>
</blockquote>
~mark<br>
</body>
</html>