Ways to detect that XXXX in JSON \uXXXX does not correspond to a Unicode character?
verdy_p at wanadoo.fr
Fri May 8 06:48:38 CDT 2015
replaced characters, no deleted characters even if there are unpaired
surrogates or non-characters like '\uFFFF').
The RFC deviates from the implementations that are currently deployed.
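
As a rough illustration of that gap (a sketch only, not taken from any
implementation discussed in this thread): Python 3's standard json module
accepts both kinds of escape without complaint, so any check has to run on
the decoded string. The helper name find_suspect_code_points is made up for
this example, and it assumes the parser has already merged valid surrogate
pairs into single code points, which Python's json module does.

```python
import json


def find_suspect_code_points(s):
    """Return (index, code point, kind) for each lone surrogate or
    noncharacter found in the decoded string s (hypothetical helper)."""
    suspects = []
    for i, ch in enumerate(s):
        cp = ord(ch)
        if 0xD800 <= cp <= 0xDFFF:
            # Valid pairs were already combined by the parser, so any
            # surrogate still present here is unpaired.
            suspects.append((i, cp, "lone surrogate"))
        elif 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
            # U+FDD0..U+FDEF plus the last two code points of every plane.
            suspects.append((i, cp, "noncharacter"))
    return suspects


# The parser itself raises no error on either escape.
decoded = json.loads('"a\\ud800b\\uffff"')
for index, cp, kind in find_suspect_code_points(decoded):
    print(f"index {index}: U+{cp:04X} ({kind})")
```

Whether to reject such a string, replace the offending code points with
U+FFFD, or pass it through unchanged is a policy decision for the receiving
application.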
2015-05-08 13:04 GMT+02:00 Daniel Bünzli <daniel.buenzli at erratique.ch>:
> On Friday, 8 May 2015 at 05:08, Philippe Verdy wrote:
> > The RFC is just informative, not normative,
> RFC 7159 is not informational, it is a proposed standard.
> > Try it yourself: you can perfectly well send JSON text containing '\uFFFF'
> > (non-character) or '\uD800' (unpaired surrogate), and I've not seen any JSON
> > implementation complaining about one or the other,
> Well now you have (mine). The RFC is very clear that we are dealing with
> *text-based* data not *binary* data. Maybe programming languages that
> represent their Unicode strings as possibly invalid UTF-16 sequences will
> happily input this but as section 8.2 mentions that may not be the case
> everywhere, software receiving these values "might return different values
> for the length of a string value or even suffer fatal runtime exceptions".