<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">On 12/21/2020 1:08 AM, Martin J. Dürst

      via Unicode wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:e68d936e-8158-0d47-fbb2-c713b3264563@it.aoyama.ac.jp">Hello

      David, others,

      <br>

      <br>

      On 20/12/2020 16:23, David Starner via Unicode wrote:

      <br>

      <blockquote type="cite">On Sat, Dec 19, 2020 at 4:49 AM Otto Stolz

        via Unicode

        <br>

        <a class="moz-txt-link-rfc2396E" href="mailto:unicode@unicode.org">&lt;unicode@unicode.org&gt;</a> wrote:

        <br>

        <blockquote type="cite">A notorious German example:

          <br>

              Er hat in Moskau liebe Genossen. (= He’s got dear comrades

          in Moscow)

          <br>

              Er hat in Moskau Liebe genossen. (= He has enjoyed love at

          Moscow)

          <br>

              (And I assure you, the prosody varies accordingly, hence

          the

          <br>

              difference is quite clear in speech, and must be preserved

          <br>

              in writing.)

          <br>

        </blockquote>

        <br>

        She _loves_ him !?! (= I can't believe her emotion towards him

        is love.)

        <br>

        She loves _him_ !?! (= I can't believe that he is the one she

        loves,

        <br>

        and not someone else.)

        <br>

        <br>

        And the prosody varies accordingly, and any accurate

        preservation in

        <br>

        writing would need to record the difference.

        <br>

      </blockquote>

      <br>

      I think the above "and must be preserved in writing" is easy to

      misunderstand, as it is a bit too strong. It wouldn't have been

      preserved on very early computers (or earlier, in telegrams) that

      only used upper case. But there was a very strong expectation that

      it would be preserved on things as simple as a typewriter, and

      definitely also in handwriting.

      <br>

      <br>

      On the other hand, there is no such expectation for your example.

      If prosody has to be reconstructed, that might happen e.g. from

      context (e.g. in a playscript), or the sentences might have been

      rewritten for clarity in the first place.

      <br>

      <br>

      I don't think there is a single writing system that is able to

      denote every aspect of spoken language. When compared with spoken

      language, most writing systems leave something out. (Some may also

      add something, e.g. distinction of some homonyms.)

      <br>

    </blockquote>

    <p><br>

    </p>

    <p>The difference in the examples is rather fundamental. In the

      first, the two meanings are utterly unrelated. In the second, they

      differ less in the fundamental statement than in the speaker's

      presumed attitude. Stressing a word can sometimes disambiguate how

      a sentence is to be interpreted, but not always. There are many

      more situations where we rely on context to know how to "read" a

      statement where there isn't a simple way to annotate that.</p>

    <p>German can, of course, be written in all lowercase (if you care

      to) and yes, you will come across statements where you don't know

      what is being said, let alone "how" it is being said (or

      intended). There are periodic discussions that have a slightly

      circular quality to them: because of the way German orthography

      uses case, you can write things in a certain way (and readers

      expect the cues and are used to the style of written language it

      allows).<br>

      <br>

      If all-lowercase use were to be enforced, you would see people

      avoid ambiguous wording; the written language would change, to

      some degree. It's unclear, ahead of actually carrying out the

      experiment, how intrusive these changes would turn out to be.

      Outright pairs of sentences that differ only in case are not

      common, but German has a rather flexible word order, so visible

      cues about which words are nouns may contribute in more

      significant ways to readability than would be the case for

      languages with a stricter word order.</p>

    <p>As things stand, case is necessarily part of the orthography and,

      without deep semantic analysis, cannot be "computed" as someone had

      suggested. Technically, it could be captured as an attribute (and

      if all implementations supported that seamlessly, users wouldn't

      care). However, it can be argued that the use of uppercase letters is

      just as much a part of "spelling" as letter choice, and in some

      applications users intuitively treat the uppercase letters as

      extensions of a set, rather than a style. <br>

    </p>
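    <p>To make the "cannot be computed" point concrete, here is a minimal
      sketch in Python, using the German pair quoted earlier in this
      thread. It only shows that removing case is lossy: both sentences
      collapse to the same string, so nothing short of knowing the
      intended meaning can restore the original. The helper names
      <code>fold</code> and <code>ambiguous_after_folding</code> are made
      up for this illustration.</p>

```python
from collections import defaultdict

# The two sentences from the example quoted earlier in this thread.
COMRADES = "Er hat in Moskau liebe Genossen."  # "dear comrades"
LOVE = "Er hat in Moskau Liebe genossen."      # "enjoyed love"


def fold(text: str) -> str:
    """Remove case distinctions using Unicode case folding."""
    return text.casefold()


def ambiguous_after_folding(sentences):
    """Return groups of distinct sentences that share one case-folded form."""
    groups = defaultdict(set)
    for sentence in sentences:
        groups[fold(sentence)].add(sentence)
    # Only groups with more than one member lost information when folded.
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # The originals differ, but their folded forms are identical.
    print(COMRADES != LOVE, fold(COMRADES) == fold(LOVE))
    print(ambiguous_after_folding([COMRADES, LOVE, "Guten Tag."]))
```

    <p>Any procedure that tries to restore the capitals from the folded
      string alone has to pick one of the two readings, which is exactly
      where the semantic analysis comes in.</p>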

    <p>Disambiguating prosody in some exceptional cases to settle the

      desired interpretation of a statement is not in the same (basic

      orthographic) category; relegating it to a layer (rich text)

      that is designed to handle many similar tasks is and remains the

      proper approach.<br>

    </p>

    <blockquote type="cite"

      cite="mid:e68d936e-8158-0d47-fbb2-c713b3264563@it.aoyama.ac.jp">

      <br>

      <br>

      <blockquote type="cite">

        <blockquote type="cite">As only the author (and no other stage,

          be it human or automatic) can

          <br>

          know the intended meaning, Unicode is quite right when

          encoding the case

          <br>

          distinction.

          <br>

        </blockquote>

        <br>

        Meh. I could come up with similar examples, though probably a

        bit more

        <br>

        contrived, for just about every bit of markup. Italics/emphasis

        has a

        <br>

        bunch of pretty clear meaning changes, like the example above,

        <br>

        possibly more than casing in English. Fraktur/Antiqua mixing

        allows

        <br>

        for any number of examples; "&lt;fraktur&gt;Er

        was&lt;/fraktur&gt; clever." is

        <br>

        different from "&lt;fraktur&gt;Er was clever&lt;/fraktur&gt;".*

        Casing certainly

        <br>

        had more of an argument to be encoded in the character set than

        <br>

        italics, historically,

        <br>

      </blockquote>

      <br>

      Exactly.

      <br>

      <br>

      <br>

      <blockquote type="cite">but I can imagine an alternate history,

        maybe

        <br>

        one where the leaders in computing history used a non-casing script,

        where

        <br>

        casing was relegated to markup, and a lot of issues would be

        <br>

        easier--no more problems with case-insensitive matching, and the

        <br>

        Turkish i would be a font difference under markup.

        <br>

      </blockquote>

      <br>

      An alternate history indeed. The history we followed gave us

      italics relegated to markup, and avoided the problems with

      italic-insensitive matching. And please note that your alternate

      history does NOT lead to technology that encodes italics

      separately. [And that I was perfectly able to put stress on a word

      in the previous sentence without italics, even if the main purpose

      of that was just to make a point.] Also, it's not clear that

      encoders starting with a non-casing script would have decided to

      relegate casing to markup. It's pretty annoying to mark up single

      letters, and to change the markup when a word moves to the start

      of a sentence, and these are the main uses for upper case.

      <br>

    </blockquote>

    <br>

    <p>While users on some level don't care how something is

      implemented, but only about how it is exposed to them, there are

      side effects of making an underlying choice that tend to "leak".

      It is at those points that a technical solution that obeys the

      "least astonishment" principle will ultimately be superior, all

      things being equal. <br>

    </p>

    <p>A./<br>

    </p>

    <blockquote type="cite"

      cite="mid:e68d936e-8158-0d47-fbb2-c713b3264563@it.aoyama.ac.jp">

      <br>

      <br>

      <blockquote type="cite">* Italics marking in English could serve

        the same role in making a

        <br>

        bunch of examples; e.g. "The French man said to stop at the

        coin" and

        <br>

        "The French man said to stop at the <i>coin</i>."

        mean different

        <br>

        things.

        <br>

      </blockquote>

      <br>

      The important thing here is "could". Unicode doesn't invent

      writing systems. And I have to admit that I don't understand the

      difference between these two sentences even with your italic

      markup. But that may be only me.

      <br>

      <br>

      Regards,   Martin.

      <br>

    </blockquote>

    <p><br>

    </p>

  </body>

</html>