Counting Codepoints
Richard Wordingham
richard.wordingham at ntlworld.com
Mon Oct 12 15:23:09 CDT 2015
On Mon, 12 Oct 2015 17:29:13 +0200
Philippe Verdy <verdy_p at wanadoo.fr> wrote:
> But between two implementations
> the result of the scanner could still be different because the
> replacement character is not specified. If the resulting "sanitized"
> string is then used to generate a URI, the URI is also unpredictable
> and will vary between implementations, as well as its effective
> length. If it is used to generate an identifier granting some new
> access, such as a user name, several new user names could be
> generated from the same input.
TUS 8.0, Chapter 3, conformance clause C10 has the following wise words
in its final paragraph:
"However, such repair of mangled data is a special case, and it must
not be used in circumstances where it would cause security problems."
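
As a rough illustration of why repair is risky for identifiers, here is a
minimal Python sketch. The function name decode_username and the sample
bytes are hypothetical and not taken from any implementation discussed in
this thread; the only library call assumed is the built-in bytes.decode
with its standard error handlers. It shows two repair policies producing
two different strings from the same ill-formed input, and a strict
decoder that refuses to repair at all.

```python
def decode_username(raw: bytes) -> str:
    """Decode a user-supplied name, rejecting ill-formed UTF-8 outright.

    Hypothetical helper for illustration only.
    """
    # errors="strict" raises UnicodeDecodeError instead of repairing the input.
    return raw.decode("utf-8", errors="strict")


# Sample input: 0xFF can never occur in well-formed UTF-8.
mangled = b"admin\xff"

# Two "repair" policies applied to the same bytes give different strings.
replaced = mangled.decode("utf-8", errors="replace")  # substitutes U+FFFD
dropped = mangled.decode("utf-8", errors="ignore")    # silently drops the byte

print(repr(replaced))
print(repr(dropped))

# The strict path refuses the input rather than guessing at a repair.
try:
    decode_username(mangled)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Under the "ignore" policy the mangled input becomes indistinguishable from
a plain "admin", which is the sort of outcome the quoted sentence warns
against; rejecting the input avoids having to pick a policy at all.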
Richard.