<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>I'm waiting for some of the old-timers here to give a proper

      answer, Unicode history-wise.</p>

    <p>As I understand it, the idea of using IDS (Ideographic

      Description Sequences) or something similar for CJK characters

      was considered (probably more than once), and it was decided to

      encode each character atomically instead, so that's the way

      we're doing them.</p>

    <p>A font wouldn't necessarily have to be able to generate new hanzi

      dynamically from IDS descriptions; it could have all the 100,000

      or however many glyphs already there, and just render the known

      ones like ligatures or something.  It means it's still up to

      font designers to add characters when they're needed, but the list

      of characters is then open-ended and it's up to font designers to

      decide what they want to support.</p>

    <p>OTOH, as is well known, IDS descriptions are not unique.  There's

      frequently more than one way to slice a character up.  Should

      *all* be supported?  Should there be some way to decide the

      "canonical" decomposition?  I guess if we're leaving it up to

      fonts, it's then up to the font designers again, but that would

      break all the non-font uses of Unicode (searching, comparing, etc)

      unless there is some canonical representation.</p>
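
    <p>To make the searching/comparing problem concrete, here is a
      minimal Python sketch. The two IDS spellings of 街 (U+8857) are
      illustrative assumptions rather than official data, and the
      lookup table and function names are hypothetical. It only shows
      that ordinary normalization does nothing to unify two
      descriptions of the same character, so some extra canonical
      mapping would be needed:</p>

<pre>
```python
import unicodedata

# Two plausible IDS spellings of the same character (U+8857).
# Both are illustrative assumptions, not official Unicode data.
IDS_THREE_PART = "\u2FF2\u5F73\u572D\u4E8D"  # left/middle/right: U+5F73, U+572D, U+4E8D
IDS_SURROUND = "\u2FF4\u884C\u572D"          # U+884C surrounding U+572D

# Hypothetical table that picks one "canonical" spelling per character.
CANONICAL_IDS = {
    IDS_THREE_PART: IDS_THREE_PART,
    IDS_SURROUND: IDS_THREE_PART,
}


def same_text(a: str, b: str) -> bool:
    """Compare as ordinary Unicode processing would: NFC, then code points."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def same_character(a: str, b: str) -> bool:
    """Compare after mapping each sequence through the hypothetical table."""
    return CANONICAL_IDS.get(a, a) == CANONICAL_IDS.get(b, b)
```
</pre>

    <p>With these definitions, <code>same_text</code> treats the two
      spellings as different strings, while <code>same_character</code>
      matches them only because the table says so. Somebody would have
      to define and maintain that table.</p>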

    <p>I don't know if IDS sequences can really represent "all" Han

      characters; I'd guess probably not, but there are probably more

      sophisticated systems that can do better.  There'll probably

      always be corner cases, though.</p>

    <p>But at any rate, it's my understanding that that particular ship

      has already sailed, and atomic CJK characters are how Unicode does

      things.  Changing that now would be rather more disruptive than

      just saying "no more precomposed accented letters."<br>

    </p>

    <div class="moz-cite-prefix">On 11/2/21 21:03, Abraham Gross via

      Unicode wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:06a21f1247e942ea71dec7178a8ebe22@disroot.org">


      <div data-html-editor-font-wrapper="true" style="font-family:

        arial, sans-serif; font-size: 13px;">I have a proposal regarding

        the future of encoding new Unihan characters into Unicode that

        I'd like to float by this group to see if it makes any sense.

        ....<br>

      </div>

    </blockquote>

    ~mark<br>

  </body>

</html>