Unicode in passwords

Mon Oct 5 22:37:58 CDT 2015

Some additional concerns:

- Input methods for Chinese, Japanese, and similar languages need visual 
feedback so the user can check that the correct Han character was 
selected. That feedback may expose (parts of) the password to bystanders.

- Length limitations of 8 bytes are few and far between these days, but 
they still exist. Even where they are gone, they may have been replaced 
with "safe" limitations, say 50 bytes. That can still be quite 
restrictive for scripts whose characters take three bytes each in UTF-8: 
50 bytes is then only 16 characters.

- Different kinds of access with the same password may occasionally have 
different length limitations. That can create very difficult situations 
where a limit cuts a multi-byte UTF-8 sequence in half, so that the 
truncated password is no longer valid UTF-8.

- Some interfaces try to estimate the 'quality' of a password on 
password creation. Short passwords, or passwords with only lower-case 
Latin letters, may be rejected, others labeled as 'medium safe', and so 
on. A password with lots of bytes may be labeled as 'excellent' even 
though its characters are all taken from the same small script, and it 
thus has rather low entropy. Of course, for a while at least, the bad 
guys may find it too bothersome to try non-ASCII passwords, which may 
temporarily make such passwords somewhat safer.
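
To make the point about quality meters concrete, here is a rough sketch 
in Python. It assumes characters are picked uniformly at random, which 
real users do not do, so the figures are upper bounds at best. The 
function name and the alphabet sizes (86 hiragana code points in 
U+3041..U+3096, 95 printable ASCII characters) are my own choices for 
illustration, not taken from any real meter.

```python
import math

def upper_bound_bits(password: str, alphabet_size: int) -> float:
    """Upper bound on entropy: number of code points times log2(alphabet size).

    Assumes each code point is drawn uniformly from an alphabet of the
    given size (an idealization; real passwords have less entropy).
    """
    return len(password) * math.log2(alphabet_size)

# Eight hiragana characters, written with escapes to avoid mail damage.
kana = "\u3072\u3089\u304c\u306a\u3072\u3089\u304c\u306a"
kana_bytes = len(kana.encode("utf-8"))      # 3 bytes per character in UTF-8

# Assumed alphabet: 86 hiragana code points.
kana_bits = upper_bound_bits(kana, 86)

# An ASCII password with the same byte count, assumed alphabet of 95.
# Only the length matters to the bound, so a placeholder string is enough.
ascii_bits = upper_bound_bits("x" * kana_bytes, 95)

print(kana_bytes, round(kana_bits, 1), round(ascii_bits, 1))
```

A meter that scores by byte count would rate both passwords the same, 
although under these assumptions the kana one has far fewer bits.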

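On the length-limit point above, a minimal Python sketch of the problem 
and of one possible mitigation. The function name and the 7-byte limit 
are hypothetical. Rejecting an over-long password outright is probably 
the better policy; the sketch only shows how a byte limit can at least 
avoid splitting a UTF-8 sequence.

```python
def truncate_utf8(password: str, max_bytes: int) -> bytes:
    """Encode as UTF-8 and cut to at most max_bytes on a character boundary."""
    data = password.encode("utf-8")
    if len(data) <= max_bytes:
        return data
    cut = max_bytes
    # data[cut] is the first byte that would be dropped. If it is a
    # continuation byte (bit pattern 10xxxxxx), the cut falls inside a
    # multi-byte sequence, so back up to the start of that sequence.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]

password = "p\u00e4ssw\u00f6rd"             # 8 characters, 10 bytes in UTF-8
naive = password.encode("utf-8")[:7]        # ends with half of U+00F6

try:
    naive.decode("utf-8")
    naive_is_valid = True
except UnicodeDecodeError:
    naive_is_valid = False

safe = truncate_utf8(password, 7)
print(naive_is_valid, safe.decode("utf-8"))
```

Note that this only respects code point boundaries. A base letter 
followed by combining marks can still be separated from its marks.
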
Regards,   Martin.

On 2015/10/01 14:01, Mark Davis ☕️ wrote:
> I've heard some concerns, mostly around the UI for people typing in
> passwords; that they get frustrated when they have to type their password
> on different devices:
>
>     1. A device may not have keyboard mappings with all the keys for their
>     language.
>     2. The keyboard mappings across devices vary in where they put keys,
>     especially for minority script characters that need some pattern of
>     shift/alt/option/etc. So the pattern of keys that they use on one
>     device may be different from the pattern on another.
>     3. People are often 'blind' to the characters being entered: they just
>     see a dot, for example. If the keyboards for their language are not
>     standard, then that makes it difficult.
>     4. Even if they see, for an instant, the character they type, if the
>     device doesn't have a font for their language's characters, it may be just
>     a box.
>     5. Even if those are not true, the glyph may not be distinctive enough
>     if the size is too small.
>
>
>
> Mark <https://google.com/+MarkDavis>
>
> *— Il meglio è l’inimico del bene —*
>
> On Thu, Oct 1, 2015 at 6:11 AM, Jonathan Rosenne <jonathan.rosenne at gmail.com> wrote:
>
>> For languages such as Java, passwords should be handled as byte arrays
>> rather than strings. This may make it difficult to apply normalization.
>>
>>
>>
>> Jonathan Rosenne
>>
>>
>>
>> *From:* Unicode [mailto:unicode-bounces at unicode.org] *On Behalf Of *Clark
>> S. Cox III
>> *Sent:* Thursday, October 01, 2015 2:16 AM
>> *To:* Hans Åberg
>> *Cc:* unicode at unicode.org; John O'Conner
>> *Subject:* Re: Unicode in passwords
>>
>> On 2015/09/30, at 13:29, Hans Åberg <haberg-1 at telia.com> wrote:
>>
>> On 30 Sep 2015, at 18:33, John O'Conner <jsoconner at gmail.com> wrote:
>>
>> Can you recommend any documents to help me understand potential issues (if
>> any) for password policies and validation methods that allow characters
>> from more "exotic" portions of the Unicode space?
>>
>>
>> On UNIX computers, one computes a hash (like SHA-256), which is then used
>> to authenticate the password up to a high probability. The hash is stored
>> in the open, but it is not known how to compute the password from the hash,
>> so knowing the hash does not easily allow authentication.
>>
>> So if the password is
>>
>>
>>
>> … normalized and then …
>>
>>
>>
>> encoded in say UTF-8 and then hashed, it would seem to take care of most
>> problems.
>>
>>
>>
>> You really wouldn’t want “Schlüssel” and “Schlüssel” being different
>> passwords, would you? (assuming that my mail client and/or OS is not
>> interfering, the first is NFC, while the second is NFD)
>>
>
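
The "normalize, then encode in UTF-8, then hash" sequence from the 
quoted messages could look roughly like this in Python. This is a sketch 
under assumptions of mine: NFC as the normalization form, and a salted 
PBKDF2-HMAC-SHA256 in place of the bare SHA-256 mentioned in the thread. 
The function names and the iteration count are illustrative only.

```python
import hashlib
import hmac
import os
import unicodedata

ITERATIONS = 100_000  # illustrative value, not a recommendation

def prepare(password: str) -> bytes:
    """Normalize to NFC (assumed form), then encode as UTF-8."""
    return unicodedata.normalize("NFC", password).encode("utf-8")

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest) for the normalized password."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", prepare(password), salt, ITERATIONS)
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-hash the candidate and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, digest)

nfc = "Schl\u00fcssel"    # precomposed u with diaeresis
nfd = "Schlu\u0308ssel"   # u followed by combining diaeresis

salt, stored = hash_password(nfc)
print(nfc == nfd, verify(nfd, salt, stored))
```

Without the normalization step the two spellings would encode to 
different bytes and so produce different hashes.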