<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">It sounds to me that a general
principle ought to be that passwords should be limited to
sequences of "keystrokes", not specific characters. The problem is
that what that means is becoming device-dependent. But we don't
really want device-dependent password rules, do we?<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">A./<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 4/7/2022 5:37 PM, Martin J. Dürst
via Unicode wrote:<br>
</div>
<blockquote type="cite"
cite="mid:88843b04-3809-dc47-0f1c-caebdbab0b28@it.aoyama.ac.jp">Hello
Tex,
<br>
<br>
I'm surprised I haven't seen any answers to your post yet; I think
it's a very interesting and important topic.
<br>
<br>
On 2022-04-05 08:23, Tex via Unicode wrote:
<br>
<blockquote type="cite">What is the modern recommendation for
globalization of passwords?
<br>
<br>
<br>
1) If your application (web, mobile, desktop, etc.) is used
worldwide, which characters do you allow or restrict?
<br>
</blockquote>
<br>
I don't have an example from an application of my own where I made
such decisions (in most cases, such decisions are made at a
framework/library level). But in Japan at least, nobody expects to
use anything other than ASCII in passwords. There are two
interrelated reasons for this:
<br>
1) Kanji, Hiragana, and Katakana would require conversion, which
would mean users have to visually check whether they got the right
character. That's not a good idea for passwords.
<br>
2) Conversion choices get stored on the user's system to make
future choices easier, but that would establish a side channel. An
attacker may get access to that data, and when comparing
before/after, can narrow down the choices for passwords
considerably.
<br>
I'd expect this to apply at least to Chinese, too.
<br>
<br>
I'd also guess that many password-related libraries restrict input
to ASCII. But with the deep penetration of smartphones around the
world, the need for non-ASCII passwords is definitely increasing.
As we are working on giving people fully non-ASCII email
addresses, we shouldn't ignore passwords.
<br>
<br>
<br>
<blockquote type="cite">2) How do you deal with writing
direction?
<br>
<br>
My concerns are that confirming and displaying a password might
look different depending on how well the browser or OS
implements RTL writing direction or features like dir=auto. A
user may then not be able to log in because they are instructed
to type it in a way that is inconsistent with what they have
seen on the screen.
<br>
</blockquote>
<br>
This is definitely a problem, but maybe not such a serious one. On
such a system, the user may be used to such inconsistencies. The
user knows what characters they intended to type, and in what order.
When they do a visual check, they don't need to verify the order,
they only need to verify character identity. On smartphones, there
are also many password input methods that only show the last
character.
<br>
<br>
<br>
<blockquote type="cite">
<br>
3) Do you allow control or other invisible characters that
a user may be used to typing in certain phrases? If these are
allowed, how to indicate to the user that they have been used?
<br>
</blockquote>
<br>
I'd just say the less allowed, the better.
<br>
<br>
<br>
<blockquote type="cite">4) Also, should passwords be Unicode
normalized? Seems damned if you do and if you don’t. Do text
input methods generate text the same way, or is it possible for a
user to create a password on one system and then not be able to
log in on another device?
<br>
</blockquote>
<br>
The Mac used to do decomposition (NFD), and Windows uses
composition (NFC), at least for file systems. I'm not sure this is
still the case.
<br>
<br>
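As a minimal sketch of the composition/decomposition difference, assuming Python's standard unicodedata module (the variable and function names are illustrative only): the same visible character can arrive as different code point sequences, and the two compare equal only after both are normalized to one form.
<br>
<pre>
```python
import unicodedata

# The same visible character in two encodings.
composed = "\u00e9"      # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two code points: "e" + COMBINING ACUTE ACCENT


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw comparison sees two different strings of different lengths.
raw_equal = composed == decomposed

# After normalization, the two inputs are treated as the same password text.
normalized_equal = same_after_nfc(composed, decomposed)
```
</pre>
<br>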
And there are other issues. In Arabic/Persian for example, there
are different forms of the letter YEH, with different encodings,
for things that may look the same on screen. An Arabic keyboard
and a Farsi keyboard may produce different character codes.
<br>
<br>
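A short sketch of the YEH case, again assuming Python's unicodedata module and using U+064A and U+06CC as the two forms in question: these are distinct code points, and Unicode normalization does not unify them, so normalization alone would not make passwords typed on the two keyboards match.
<br>
<pre>
```python
import unicodedata

arabic_yeh = "\u064a"  # ARABIC LETTER YEH, typical of Arabic keyboards
farsi_yeh = "\u06cc"   # ARABIC LETTER FARSI YEH, typical of Persian keyboards


def normalized_forms(ch: str) -> dict:
    """Return the character under each of the four normalization forms."""
    return {form: unicodedata.normalize(form, ch)
            for form in ("NFC", "NFD", "NFKC", "NFKD")}


arabic_forms = normalized_forms(arabic_yeh)
farsi_forms = normalized_forms(farsi_yeh)

# No normalization form maps one letter onto the other.
still_different = all(arabic_forms[f] != farsi_forms[f] for f in arabic_forms)
```
</pre>
<br>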
<br>
<blockquote type="cite">(Not normalization related, but I have
experienced difficulty logging in to foreign systems, in hotels
etc., when the keyboard is different and it takes a while to
realize I have to abandon muscle memory and remember the actual
password and look for the keys on the keyboard.)
<br>
</blockquote>
<br>
The most important point is not "damned if you do and damned if
you don't", but "whatever you do, make sure you always do exactly
the same thing".
<br>
<br>
This starts way before you get into normalization. For example, do
you remove leading/trailing white space? (The user may have copied
the password from some text file. That's not very good security,
but some people still do it.)
<br>
<br>
Another example: Do you always have the same length restriction? I
remember a case where I had set a password for a site, and on a
sister site, it only worked after I tried to shorten it. What had
happened was that when I set it, it got accepted but truncated
without telling me, which worked well on the same site because the
same truncation happened again. But the sister site didn't
truncate, and this produced a mismatch. Make sure you tell people
about such issues when they are setting a password; don't just
'fix' things behind the scenes.
<br>
<br>
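The truncation problem can be sketched as follows. This is only an illustration: MAX_LENGTH, PasswordTooLong, and check_length are hypothetical names, and the limit of 64 is arbitrary. The point is that an over-long password is refused with a message at the time it is set, instead of being shortened behind the scenes.
<br>
<pre>
```python
MAX_LENGTH = 64  # hypothetical limit, chosen only for illustration


class PasswordTooLong(ValueError):
    """Raised when a password exceeds the documented length limit."""


def check_length(password: str) -> str:
    """Return the password unchanged, or refuse it; never truncate silently.

    len() counts Unicode code points here, so the limit must be documented
    in the same unit on every site that shares the password.
    """
    if len(password) > MAX_LENGTH:
        raise PasswordTooLong(
            f"password has {len(password)} characters; "
            f"the limit is {MAX_LENGTH}"
        )
    return password
```
</pre>
<br>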
Also remember that password hashing algorithms (often loosely called
password encryption) work on binary data, not on characters. For
ASCII-only, that doesn't usually cause problems, but when working
with Unicode, you want to make sure you have a single encoding
before the hashing.
<br>
<br>
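One way to apply "a single encoding" together with "always do exactly the same thing" is to route both password setting and password checking through one preparation function. The sketch below assumes NFC followed by UTF-8, and uses PBKDF2 from Python's standard hashlib only as a stand-in; the function names and the iteration count are illustrative, not recommendations.
<br>
<pre>
```python
import hashlib
import hmac
import unicodedata

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def prepare(password: str) -> bytes:
    """Single place where text becomes bytes: NFC, then UTF-8 (assumed)."""
    normalized = unicodedata.normalize("NFC", password)
    return normalized.encode("utf-8")


def hash_password(password: str, salt: bytes) -> bytes:
    """Hash the prepared bytes; used when the password is set."""
    return hashlib.pbkdf2_hmac("sha256", prepare(password), salt, ITERATIONS)


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-run exactly the same preparation and hash, then compare."""
    return hmac.compare_digest(hash_password(password, salt), stored)
```
</pre>
In real use the salt would be random per user, for example from os.urandom(16), and stored alongside the hash.
<br>
<br>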
Please also note that "whatever you do, make sure you always do
exactly the same thing" and using libraries or frameworks may not
work well together, because different libraries/frameworks may do
different things.
<br>
<br>
Regards, Martin.
<br>
</blockquote>
<p><br>
</p>
</body>
</html>