Unicode password mapping for crypto standard

Sat Jan 9 17:30:45 CST 2016

On 1/5/2016 8:26 AM, Markus Scherer wrote:
> I would specify that UTF-8 must be used, without mapping.
> US-ASCII is a proper subset, so need not be mentioned explicitly, nor 
> distinguished in the protocol.
> Mappings would require that all implementations carry relevant data, 
> and are up to date to recent versions of Unicode, or else 
> previously-unassigned code points will cause failures.
> As long as a user types the same password the same way, or with IMEs 
> that produce the same output, they are fine. Strange variants might 
> improve password security.

Right.

In PRECIS, UTF-8 is enforced. However, as you point out, "strange 
variants" exist, as do different IMEs and different keyboard/keystroke 
combinations. A case in point: 0xFF is never a valid octet in UTF-8. 
Yet nothing prevents the underlying technology from using 0xFF, so 
there should be a way for a user (or process) to force the use of 
specific octet strings as inputs. That is why the "password-mapping" 
parameter is proposed as a hint rather than a strict rule.
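
To make the two points above concrete, here is a minimal Python sketch. 
The helper name is_valid_utf8 and the sample password are illustrative 
assumptions, not part of any proposal; NFC is used only to show what a 
mapping step could change, not as the mapping the parameter would name.

```python
import unicodedata


def is_valid_utf8(octets: bytes) -> bool:
    """Return True if the octet string decodes as UTF-8 (hypothetical helper)."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same visible password, produced by two different input methods.
composed = "caf\u00e9"      # 'e with acute' as one code point, U+00E9
decomposed = "cafe\u0301"   # 'e' followed by combining acute accent, U+0301

# Without any mapping, the key derivation function sees different octets.
composed_octets = composed.encode("utf-8")
decomposed_octets = decomposed.encode("utf-8")

# With a mapping step (NFC here, purely as an example) the octets agree.
mapped_octets = unicodedata.normalize("NFC", decomposed).encode("utf-8")

# An octet string containing 0xFF can never be UTF-8, yet it may still be
# a perfectly usable password input for the underlying primitive.
raw_octets = b"secret\xff"
```

Here composed_octets and decomposed_octets differ, mapped_octets equals 
composed_octets, and raw_octets fails the UTF-8 check. The last case is 
the one that a strict "must be UTF-8" rule would lock out.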

Also, as pointed out, PKCS #8 encrypted blobs are used within PKCS #12, 
which has its own Unicode mapping (based on big-endian UTF-16, i.e. a 
BMPString).
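
For comparison, a rough sketch of how the PKCS #12 password mapping 
differs from plain UTF-8. As I read RFC 7292 (Appendix B.1), the password 
is a BMPString, two octets per character with the most significant octet 
first, followed by a two-octet null terminator. The function name is an 
assumption made for this example.

```python
def pkcs12_password_octets(password: str) -> bytes:
    """Sketch of the PKCS #12 password encoding (hypothetical helper).

    Each character becomes two octets, most significant first, and the
    result ends with a two-octet null terminator.
    """
    # A BMPString cannot carry code points beyond the Basic Multilingual Plane.
    if any(ord(ch) > 0xFFFF for ch in password):
        raise ValueError("password contains characters outside the BMP")
    return password.encode("utf-16-be") + b"\x00\x00"


# The same password yields different octet strings under the two mappings.
utf8_octets = "ab".encode("utf-8")            # 2 octets
pkcs12_octets = pkcs12_password_octets("ab")  # 6 octets, including terminator
```

So a PKCS #8 blob protected with the UTF-8 octets of a password is not 
interchangeable with one protected under the PKCS #12 mapping, even when 
the user typed the same characters.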

Sean