<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">It sounds to me that a general

      principle ought to be that passwords should be limited to

      sequences of "keystrokes", not specific characters. The problem is

      that what counts as a keystroke is becoming device-dependent. But we

      don't really want device-dependent password rules, do we?<br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">A./<br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">On 4/7/2022 5:37 PM, Martin J. Dürst

      via Unicode wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:88843b04-3809-dc47-0f1c-caebdbab0b28@it.aoyama.ac.jp">Hello

      Tex,

      <br>

      <br>

      I'm surprised I haven't seen any answers to your post yet; I think

      it's a very interesting and important topic.

      <br>

      <br>

      On 2022-04-05 08:23, Tex via Unicode wrote:

      <br>

      <blockquote type="cite">What is the modern recommendation for

        globalization of passwords?

        <br>

        <br>

          <br>

        1)      If your application (web, mobile, desktop, etc.) is used

        worldwide, which characters do you allow or restrict?

        <br>

      </blockquote>

      <br>

      I don't have an example of an application of my own where I made such

      decisions (in most cases, such decisions are made at a

      framework/library level). But in Japan at least, nobody expects to

      use anything other than ASCII in passwords. There are two

      interrelated reasons for this:

      <br>

      1) Kanji, Hiragana, and Katakana would require conversion, which

      would mean users have to visually check whether they got the right

      character. That's not a good idea for passwords.

      <br>

      2) Conversion choices get stored on the user's system to make

      future choices easier, but that would establish a side channel. An

      attacker may get access to that data, and when comparing

      before/after, can narrow down the choices for passwords

      considerably.

      <br>

      I'd expect this to apply at least to Chinese, too.

      <br>

      <br>

      I'd also guess that many password-related libraries restrict input

      to ASCII. But with the deep penetration of smartphones around the

      world, the need for non-ASCII passwords is definitely increasing.

      As we are working on giving people fully non-ASCII email

      addresses, we shouldn't ignore passwords.

      <br>

      <br>

      <br>

      <blockquote type="cite">2)      How do you deal with writing

        direction?

        <br>

        <br>

        My concerns are that confirming and displaying a password might

        look different depending on how well the browser or OS

        implements RTL writing direction or features like dir=auto. A

        user may then not be able to log in because they are instructed

        to type it in a way that is inconsistent with what they have

        seen on the screen.

        <br>

      </blockquote>

      <br>

      This is definitely a problem, but maybe not such a serious one. On

      such a system, the user may be used to such inconsistencies. The

      user knows what characters they intended to type, in what order.

      When they do a visual check, they don't need to verify the order,

      they only need to verify character identity. On smartphones, there

      are also many password input methods that only show the last

      character.

      <br>

      <br>

      <br>

      <blockquote type="cite">

        <br>

        3)      Do you allow control or other invisible characters that

        a user may be used to typing in certain phrases? If these are

        allowed, how to indicate to the user that they have been used?

        <br>

      </blockquote>

      <br>

      I'd just say the less allowed, the better.

      <br>

      <br>
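      As a rough illustration of how an application could at least detect
      such characters, here is a minimal Python sketch using the standard
      unicodedata module. The function name find_invisible and the choice
      of general categories (Cc for control, Cf for format) are assumptions
      made for this example, not an established rule.
      <br>
      <pre>
```python
import unicodedata

# Assumption for illustration: treat control (Cc) and format (Cf)
# characters as "invisible". Cf covers ZWJ, ZWNJ, and bidi marks.
INVISIBLE_CATEGORIES = {"Cc", "Cf"}

def find_invisible(password: str) -> list[str]:
    """Return the Unicode names of invisible characters in the password."""
    return [
        unicodedata.name(ch, f"U+{ord(ch):04X}")
        for ch in password
        if unicodedata.category(ch) in INVISIBLE_CATEGORIES
    ]

# U+200C is ZERO WIDTH NON-JOINER, which occurs in typed Persian text.
print(find_invisible("pass\u200cword"))  # ['ZERO WIDTH NON-JOINER']
```
      </pre>
      Whether such characters are then rejected, or only reported to the
      user while the password is being set, remains a policy decision.
      <br>
      <br>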

      <br>

      <blockquote type="cite">4)      Also, should passwords be Unicode

        normalized? Seems damned if you do and if you don’t. Do text

        input methods generate text the same way or is it possible for a

        user to create a password on one system and then not be able to

        log in on another device?

        <br>

      </blockquote>

      <br>

      The Mac used to do decomposition (NFD), and Windows uses

      composition (NFC), at least for file systems. I'm not sure this is

      still the case.

      <br>

      <br>

      And there are other issues. In Arabic/Persian for example, there

      are different forms of the letter YEH, with different encodings,

      for things that may look the same on screen. An Arabic keyboard

      and a Farsi keyboard may produce different character codes.

      <br>

      <br>
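      Both points can be checked with a short Python sketch using the
      standard unicodedata module. The sample strings are chosen only for
      illustration. It shows that NFC and NFD spellings of the same text
      compare unequal until normalized, and that the two YEH letters stay
      distinct even under NFKC.
      <br>
      <pre>
```python
import unicodedata

# "é" as one precomposed code point (NFC form) and as "e" plus a
# combining acute accent (NFD form). They look the same on screen.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True

# ARABIC LETTER YEH versus ARABIC LETTER FARSI YEH: normalization
# does not unify them, not even the compatibility form NFKC.
arabic_yeh = "\u064a"
farsi_yeh = "\u06cc"
print(unicodedata.normalize("NFKC", arabic_yeh)
      == unicodedata.normalize("NFKC", farsi_yeh))           # False
```
      </pre>
      So normalization helps with composed versus decomposed input, but it
      does not help when two keyboards produce different letters that
      merely look alike.
      <br>
      <br>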

      <br>

      <blockquote type="cite">(Not normalization related, but I have

        experienced difficulty logging in to foreign systems, in hotels

        etc., when the keyboard is different and it takes a while to

        realize I have to abandon muscle memory and remember the actual

        password and look for the keys on the keyboard.)

        <br>

      </blockquote>

      <br>

      The most important point is not "damned if you do and damned if

      you don't", but "whatever you do, make sure you always do exactly

      the same thing".

      <br>

      <br>

      This starts way before you get into normalization. For example, do

      you remove leading/trailing white space? (The user may have copied

      the password from some text file. That's not very good security,

      but some people still do it.)

      <br>

      <br>
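      One way to "always do exactly the same thing" is to route every code
      path through a single preparation function. The following Python
      sketch assumes a policy of stripping leading/trailing white space and
      then normalizing to NFC; the function name prepare_password and the
      policy itself are hypothetical, chosen only to illustrate the idea.
      <br>
      <pre>
```python
import unicodedata

def prepare_password(raw: str) -> str:
    """Apply exactly the same preparation when setting and when checking.

    Assumed policy for this sketch: strip leading/trailing white space,
    then normalize to NFC. Other policies are possible; what matters is
    that every code path goes through this one function.
    """
    return unicodedata.normalize("NFC", raw.strip())

stored = prepare_password("  cafe\u0301 ")   # value entered at sign-up
typed = prepare_password("caf\u00e9")        # value entered at login
print(stored == typed)                        # True
```
      </pre>
      A real system would compare hashes rather than the prepared strings,
      but the preparation step would sit in the same place for both.
      <br>
      <br>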

      Another example: Do you always have the same length restriction? I

      remember a case where I had set a password for a site, and on a

      sister site, it only worked after I tried to shorten it. What had

      happened was that when I set it, it got accepted but truncated

      without telling me, which worked well on the same site because the

      same truncation happened again. But the sister site didn't

      truncate, and this produced a mismatch. Make sure you tell people

      about such issues when they are setting a password; don't just

      'fix' things behind the scenes.

      <br>

      <br>
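      A minimal Python sketch of the alternative to silent truncation:
      refuse an over-long password with a message the user can see. The
      limit of 72 bytes, the exception name PasswordTooLong, and the
      function name check_length are hypothetical and chosen only for this
      example.
      <br>
      <pre>
```python
MAX_PASSWORD_BYTES = 72  # hypothetical limit, chosen only for this sketch

class PasswordTooLong(ValueError):
    """Raised instead of silently truncating the password."""

def check_length(password: str) -> bytes:
    """Encode the password and refuse it if it exceeds the limit.

    The password is never truncated; the caller is expected to show
    the error to the user while they are setting the password.
    """
    data = password.encode("utf-8")
    if len(data) > MAX_PASSWORD_BYTES:
        raise PasswordTooLong(
            f"password is {len(data)} bytes; the limit is {MAX_PASSWORD_BYTES}"
        )
    return data

try:
    check_length("\u00e9" * 40)   # 40 characters, but 80 bytes in UTF-8
except PasswordTooLong as err:
    print(err)
```
      </pre>
      Note that with non-ASCII passwords a limit counted in bytes and a
      limit counted in characters are not the same thing, which is one more
      place where two sites can end up disagreeing.
      <br>
      <br>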

      Also remember that password hashing algorithms work on binary

      data (bytes), not on characters. For ASCII-only passwords, that

      doesn't usually cause problems, but when working with Unicode, you

      want to make sure you convert to a single encoding before hashing.

      <br>

      <br>
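      To make the encoding point concrete, here is a small Python sketch
      using hashlib from the standard library. PBKDF2-HMAC-SHA256, the
      iteration count, the fixed salt, and the function name hash_password
      are all assumptions made for this illustration, not a recommendation.
      <br>
      <pre>
```python
import hashlib

def hash_password(password: str, salt: bytes, encoding: str = "utf-8") -> bytes:
    """Derive a key from the password bytes with PBKDF2-HMAC-SHA256.

    PBKDF2 and the iteration count are assumptions of this sketch.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode(encoding), salt, 100_000)

salt = b"fixed-salt-for-demo-only"  # real systems use a random per-user salt
pw = "p\u00e4ssword"

# Same characters, different encodings, different bytes, different hashes.
print(hash_password(pw, salt, "utf-8")
      == hash_password(pw, salt, "utf-16-le"))     # False

# For ASCII-only passwords, UTF-8 and Latin-1 bytes coincide.
print(hash_password("password", salt, "utf-8")
      == hash_password("password", salt, "latin-1"))  # True
```
      </pre>
      This is why an ASCII-only deployment can run for years without
      noticing that two components disagree about the encoding.
      <br>
      <br>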

      Please also note that "whatever you do, make sure you always do

      exactly the same thing" and using libraries or frameworks may not

      work well together, because different libraries/frameworks may do

      different things.

      <br>

      <br>

      Regards,   Martin.

      <br>

    </blockquote>

    <p><br>

    </p>

  </body>

</html>