Unicode in passwords

Marc Durdin marc at keyman.com
Thu Oct 1 00:19:35 CDT 2015

That’s a good list. A few other things I’ve seen:

1. Even if the user sees the character for an instant, complex script characters can be very puzzling as they appear differently and “out of order” when isolated.

2. The number of dots corresponds to the number of code points, which is misleading with complex scripts or advanced input methods: you won’t necessarily see one dot per keystroke; in some cases, typing a character may replace a dot with another dot or even delete a dot.

3. Directionality can be frustrating.
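
To illustrate the second point, here is a minimal Python sketch of how the code point count (what the dots track) can differ from what the user perceives as characters. The helper name `code_point_count` and the sample strings are illustrative assumptions, not taken from any particular password field implementation.

```python
import unicodedata

def code_point_count(text: str) -> int:
    """Number of Unicode code points; this is what len() counts for a Python str."""
    return len(text)

# "Schlüssel" with a precomposed ü (NFC) and with u + combining diaeresis (NFD).
nfc = unicodedata.normalize("NFC", "Schl\u00fcssel")
nfd = unicodedata.normalize("NFD", nfc)

# Devanagari KA + VIRAMA + SSA + VOWEL SIGN I: four code points that are
# typically displayed as one conjunct cluster.
kshi = "\u0915\u094d\u0937\u093f"

print(code_point_count(nfc))   # 9
print(code_point_count(nfd))   # 10
print(code_point_count(kshi))  # 4
```

The same visible word yields 9 or 10 dots depending on how the input method composes it, and a single perceived cluster can occupy several dots.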

I’ve had to assist in situations where a user has set a new Windows password using a custom keyboard, and then been unable to log in, e.g. with Remote Desktop, or even with the standard Windows login screen.

iOS, for example, doesn’t even allow the user to select a different input method for password boxes – it seems to always be Latin script only (even if you’ve removed all your Latin script keyboards from Settings).


From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Mark Davis
Sent: Thursday, 1 October 2015 3:01 PM
To: Jonathan Rosenne <jonathan.rosenne at gmail.com>
Cc: Unicode Public <unicode at unicode.org>
Subject: Re: Unicode in passwords

I've heard some concerns, mostly around the UI for people typing in passwords; that they get frustrated when they have to type their password on different devices:

  1.  A device may not have keyboard mappings with all the keys for their language.
  2.  The keyboard mappings across devices vary in where they put keys, especially for minority script characters that use some pattern of shift/alt/option/etc. So the pattern of keys that they use on one device may be different from that on another.
  3.  People are often 'blind' to the characters being entered: they just see a dot, for example. If the keyboards for their language are not standard, then that makes it difficult.
  4.  Even if they see, for an instant, the character they type, if the device doesn't have a font for their language's characters, it may be just a box.
  5.  Even if those are not true, the glyph may not be distinctive enough if the size is too small.


— Il meglio è l’inimico del bene —

On Thu, Oct 1, 2015 at 6:11 AM, Jonathan Rosenne <jonathan.rosenne at gmail.com> wrote:

For languages such as Java, passwords should be handled as byte arrays rather than strings. This may make it difficult to apply normalization.
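
A rough Python sketch of why normalization is awkward when the password lives in a mutable buffer: the normalization call works on strings, so a transient immutable copy appears anyway. The function name `normalize_password_bytes` is hypothetical, and NFC is an assumed choice of form; the thread does not name one.

```python
import unicodedata

def normalize_password_bytes(raw: bytearray) -> bytearray:
    """Return NFC-normalized UTF-8 bytes and zero out the input buffer.

    Caveat: unicodedata.normalize() only accepts str, so an immutable
    transient copy of the password exists until it is garbage collected.
    That copy cannot be wiped, which is the difficulty described above.
    """
    text = raw.decode("utf-8")  # transient immutable copy
    normalized = bytearray(unicodedata.normalize("NFC", text).encode("utf-8"))
    # Overwrite the caller's buffer so the un-normalized bytes do not linger.
    for i in range(len(raw)):
        raw[i] = 0
    return normalized

# Usage: NFD input ("u" followed by a combining diaeresis) comes out as NFC bytes.
buf = bytearray("Schlu\u0308ssel".encode("utf-8"))
out = normalize_password_bytes(buf)
```

After the call, `buf` contains only zero bytes and `out` holds the normalized password, which the caller is expected to wipe once it has been hashed.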

Jonathan Rosenne

From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Clark S. Cox III
Sent: Thursday, October 01, 2015 2:16 AM
To: Hans Åberg
Cc: unicode at unicode.org; John O'Conner
Subject: Re: Unicode in passwords

On 2015/09/30, at 13:29, Hans Åberg <haberg-1 at telia.com> wrote:

On 30 Sep 2015, at 18:33, John O'Conner <jsoconner at gmail.com> wrote:

Can you recommend any documents to help me understand potential issues (if any) for password policies and validation methods that allow characters from more "exotic" portions of the Unicode space?

On UNIX computers, one computes a hash (like SHA-256), which is then used to authenticate the password up to a high probability. The hash is stored in the open, but it is not known how to compute the password from the hash, so knowing the hash does not easily allow authentication.

So if the password is

… normalized and then …

encoded in say UTF-8 and then hashed, it would seem to take care of most problems.

You really wouldn’t want “Schlüssel” and “Schlüssel” being different passwords, would you? (assuming that my mail client and/or OS is not interfering, the first is NFC, while the second is NFD)
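
The normalize, encode, hash sequence discussed above might look like the following Python sketch. The function name `password_digest` and the choice of NFC are assumptions for illustration. Plain SHA-256 is used only because the thread mentions it; a real system would normally use a salted, deliberately slow password hashing scheme instead.

```python
import hashlib
import unicodedata

def password_digest(password: str, form: str = "NFC") -> str:
    """Normalize, encode as UTF-8, then hash (unsalted SHA-256, for illustration only)."""
    normalized = unicodedata.normalize(form, password)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

nfc = "Schl\u00fcssel"    # precomposed ü
nfd = "Schlu\u0308ssel"   # u + combining diaeresis

# Without normalization, the two spellings hash differently.
raw_nfc = hashlib.sha256(nfc.encode("utf-8")).hexdigest()
raw_nfd = hashlib.sha256(nfd.encode("utf-8")).hexdigest()
print(raw_nfc == raw_nfd)                            # False

# With normalization, both spellings give the same digest.
print(password_digest(nfc) == password_digest(nfd))  # True
```

Whatever form is chosen, it has to be applied identically when the password is set and every time it is checked.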
