global password strategies

Thu Apr 7 22:48:50 CDT 2022

Aren’t keystrokes device dependent, since keyboards vary, physically and virtually?

We would have to restrict passwords to the minimal set of keys that is universal; that is the same problem, just with a smaller character set.

From: Unicode [mailto:unicode-bounces at corp.unicode.org] On Behalf Of Asmus Freytag via Unicode
Sent: Thursday, April 7, 2022 8:19 PM
To: unicode at corp.unicode.org
Subject: Re: global password strategies

It sounds to me as if a general principle ought to be that passwords should be limited to sequences of "keystrokes", not specific characters. The problem is that what that means is becoming device-dependent. But we don't really want device-dependent password rules, do we?

A./

On 4/7/2022 5:37 PM, Martin J. Dürst via Unicode wrote:

Hello Tex, 

I'm surprised I haven't seen any answers to your post yet; I think it's a very interesting and important topic.

On 2022-04-05 08:23, Tex via Unicode wrote: 

What is the modern recommendation for globalization of passwords? 

1) If your application (web, mobile, desktop, etc.) is used worldwide, which characters do you allow or restrict?

I don't have an example of an application of my own where I made such decisions (in most cases, such decisions are made at a framework/library level). But in Japan at least, nobody expects to use anything other than ASCII in passwords. There are two interrelated reasons for this:
1) Kanji, Hiragana, and Katakana would require conversion, which would mean users have to visually check whether they got the right character. That's not a good idea for passwords. 
2) Conversion choices get stored on the user's system to make future choices easier, but that would establish a side channel. An attacker may get access to that data, and when comparing before/after, can narrow down the choices for passwords considerably. 
I'd expect this to apply to Chinese as well, at least.

I'd also guess that many password-related libraries restrict input to ASCII. But with the deep penetration of smartphones around the world, the need for non-ASCII passwords is definitely increasing. As we are working on giving people fully non-ASCII email addresses, we shouldn't ignore passwords. 

2) How do you deal with writing direction?

My concerns are that confirming and displaying a password might look different depending on how well the browser or OS implements RTL writing direction or features like dir=auto. A user may then not be able to log in because they are instructed to type it in a way that is inconsistent with what they have seen on the screen. 

This is definitely a problem, but maybe not such a serious one. On such a system, the user may be used to such inconsistencies. The user knows what characters they intended to type, and in what order. When they do a visual check, they don't need to verify the order, they only need to verify character identity. On smartphones, there are also many password input methods that only show the last character.

3) Do you allow control or other invisible characters that a user may be used to typing in certain phrases? If these are allowed, how to indicate to the user that they have been used?

I'd just say the less allowed, the better. 
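As a rough illustration of "the less allowed, the better", a password form could report invisible characters to the user rather than silently accept them. This is only a sketch: the function name `find_disallowed` and the set of rejected general categories are assumptions, not something proposed in the thread. Note that rejecting format characters (Cf) also rejects ZWJ/ZWNJ, which some users type as part of normal words.

```python
import unicodedata

# Hypothetical policy: general categories treated as "invisible or unsafe".
# Cc = control, Cf = format (ZWJ, ZWNJ, directional marks), Cs = surrogate,
# Co = private use, Cn = unassigned.
REJECTED_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn"}


def find_disallowed(password):
    """Return (index, 'U+XXXX') pairs for characters the policy rejects."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(password)
        if unicodedata.category(ch) in REJECTED_CATEGORIES
    ]
```

Returning the positions and code points lets the UI tell the user exactly what was typed, instead of dropping characters behind the scenes.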

4) Also, should passwords be Unicode normalized? Seems damned if you do and if you don’t. Do text input methods generate text the same way, or is it possible for a user to create a password on one system and then not be able to log in on another device?

The Mac used to do decomposition (NFD), and Windows uses composition (NFC), at least for file systems. I'm not sure this is still the case. 
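To make the composition/decomposition difference concrete, here is a small Python sketch using the standard `unicodedata` module. The helper name `same_after` is hypothetical. It shows that the same visible "é" can arrive as two different code point sequences (and therefore different bytes), and that either normalization form makes them compare equal.

```python
import unicodedata


def same_after(form, a, b):
    """True if a and b are identical once both are normalized to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Without normalization the two strings differ, and so do their UTF-8 bytes.
raw_equal = composed == decomposed
utf8_equal = composed.encode("utf-8") == decomposed.encode("utf-8")

# After normalization (NFC or NFD, as long as it is the same one) they match.
nfc_equal = same_after("NFC", composed, decomposed)
nfd_equal = same_after("NFD", composed, decomposed)
```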

And there are other issues. In Arabic/Persian for example, there are different forms of the letter YEH, with different encodings, for things that may look the same on screen. An Arabic keyboard and a Farsi keyboard may produce different character codes. 
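The YEH case can be checked with a short Python sketch. The constant and function names below are hypothetical. It compares U+064A (typical of Arabic keyboards) with U+06CC (typical of Persian keyboards) and shows that Unicode normalization does not unify them, so a normalization step alone would not help here.

```python
import unicodedata

ARABIC_YEH = "\u064a"
FARSI_YEH = "\u06cc"


def describe(ch):
    """Return the code point and Unicode character name, e.g. for logging."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def unified_by(form, a, b):
    """True if normalization form `form` maps a and b to the same string."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
```

Here `unified_by("NFC", ARABIC_YEH, FARSI_YEH)` and `unified_by("NFKC", ARABIC_YEH, FARSI_YEH)` are both false; treating the two letters as equivalent would need an additional, application-defined mapping.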

(Not normalization related, but I have experienced difficulty logging in to foreign systems, in hotels etc., when the keyboard is different and it takes a while to realize I have to abandon muscle memory and remember the actual password and look for the keys on the keyboard.) 

The most important point is not "damned if you do and damned if you don't", but "whatever you do, make sure you always do exactly the same thing". 

This starts way before you get into normalization. For example, do you remove leading/trailing white space? (The user may have copied the password from some text file. (That's not very good security, but some people still do it.)) 
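One way to keep whitespace handling consistent is to put the decision in a single function that both the "set password" and the "log in" paths call. The sketch below assumes, purely for illustration, a policy of stripping leading and trailing whitespace while keeping interior spaces; the function name is hypothetical and the opposite policy would be just as valid if applied everywhere.

```python
def apply_whitespace_policy(password):
    """Single shared policy: drop leading/trailing whitespace only.

    str.strip() with no arguments removes Unicode whitespace as well
    (for example U+3000 IDEOGRAPHIC SPACE), not just ASCII spaces.
    Interior whitespace is left untouched.
    """
    return password.strip()


def password_was_changed(password):
    """True if the policy altered the input, so the UI can tell the user."""
    return apply_whitespace_policy(password) != password
```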

Another example: Do you always have the same length restriction? I remember a case where I had set a password for a site, and on a sister site, it only worked after I tried to shorten it. What had happened was that when I set it, it got accepted but truncated without telling me, which worked well on the same site because the same truncation happened again. But the sister site didn't truncate, and this produced a mismatch. Make sure you tell people about such issues when they are setting a password, don't just 'fix' things behind the scenes. 
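The truncation problem can be avoided by rejecting an over-long password with a clear message instead of shortening it. A minimal Python sketch follows; the limit of 128 code points and all names are assumptions chosen for illustration.

```python
MAX_PASSWORD_CODE_POINTS = 128  # hypothetical limit, shared by all sites


class PasswordTooLong(ValueError):
    """Raised instead of silently truncating the password."""


def check_length(password):
    """Return the password unchanged, or raise if it exceeds the limit."""
    length = len(password)  # counts code points, not bytes or keystrokes
    if length > MAX_PASSWORD_CODE_POINTS:
        raise PasswordTooLong(
            f"Password has {length} characters; "
            f"the maximum is {MAX_PASSWORD_CODE_POINTS}."
        )
    return password
```

Because the function never alters its input, two sites that share the same limit cannot end up with different stored passwords for the same user input.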

Also remember that password hashing algorithms (often loosely called password encryption) work on binary data, not on characters. For ASCII-only passwords, that doesn't usually cause problems, but when working with Unicode, you want to make sure you have a single encoding before hashing.
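The point about bytes versus characters can be sketched in Python as follows. The choice of NFC, UTF-8, and PBKDF2-HMAC-SHA256 from the standard `hashlib` module is an assumption made for the example, as are the function names and the iteration count; what matters is that one fixed conversion from characters to bytes is used both when the password is set and when it is checked.

```python
import hashlib
import hmac
import unicodedata


def password_bytes(password):
    """Fixed character-to-byte conversion: NFC, then UTF-8 (assumed policy)."""
    return unicodedata.normalize("NFC", password).encode("utf-8")


def hash_password(password, salt, iterations=600_000):
    """Derive a hash from the canonical bytes of the password."""
    return hashlib.pbkdf2_hmac(
        "sha256", password_bytes(password), salt, iterations
    )


def verify_password(password, salt, expected, iterations=600_000):
    """Re-derive with the same conversion and compare in constant time."""
    candidate = hash_password(password, salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

In real use the salt would be random per user (for example `os.urandom(16)`) and stored next to the hash. With this conversion, a password typed as precomposed "é" on one device and as "e" plus a combining accent on another yields the same hash.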

Please also note that "whatever you do, make sure you always do exactly the same thing" and using libraries or frameworks may not work well together, because different libraries/frameworks may do different things. 

Regards,   Martin. 
