Unicode in passwords

Norbert Lindenberg unicode at lindenbergsoftware.com
Tue Oct 6 12:39:08 CDT 2015


> On Oct 6, 2015, at 6:04 , Philippe Verdy <verdy_p at wanadoo.fr> wrote:
> 
> In those conditions, normalizing the Java string will leave those lone surrogates (and non-characters) as is, or will throw an exception, depending on the API used. Java strings do not have any implied encoding (their "char" members are also unrestricted 16-bit code units, they have some basic properties but only in BMP, defined in the builtin Character class API: properties for non-BMP characters require using a library to provide them, such as ICU4J).

The Java Character class was enhanced in J2SE 5.0 to support supplementary characters. The String class was specified to be based on UTF-16, and string processing throughout the platform was updated to support supplementary characters based on UTF-16. These changes have been available to the public since 2004. For a summary, see
http://www.oracle.com/technetwork/articles/java/supplementary-142654.html
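
As a rough illustration of the point above, here is a minimal sketch that uses only the code point methods added to String and Character in J2SE 5.0, with no external library. The class name SupplementaryDemo, its helper methods, and the sample string are made up for this example; they are not taken from the article linked above.

```java
public class SupplementaryDemo {
    // Hypothetical sample: 'a', U+1D538 MATHEMATICAL DOUBLE-STRUCK CAPITAL A
    // (written as a surrogate pair), then 'b'.
    static final String SAMPLE = "a\uD835\uDD38b";

    // Number of UTF-16 code units (what String.length() reports).
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a well-formed surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // True if the code point starting at index is outside the BMP and is a letter.
    // Character.isLetter(int) accepts supplementary code points directly.
    static boolean isSupplementaryLetter(String s, int index) {
        int cp = s.codePointAt(index);
        return Character.isSupplementaryCodePoint(cp) && Character.isLetter(cp);
    }

    // True if the string contains a surrogate code unit that is not part of a pair.
    // codePointAt returns the surrogate value itself when it cannot form a pair.
    static boolean hasLoneSurrogate(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("code units:  " + codeUnits(SAMPLE));
        System.out.println("code points: " + codePoints(SAMPLE));
        System.out.println("supplementary letter at index 1: "
                + isSupplementaryLetter(SAMPLE, 1));
        System.out.println("lone surrogate in sample: " + hasLoneSurrogate(SAMPLE));
    }
}
```

The check for lone surrogates is included only because the quoted message raises them; what an application should do with such a string (reject it, replace the code unit, or pass it through) is a separate decision that this sketch does not make.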

Norbert

