Ways to detect that XXXX in JSON \uXXXX does not correspond to a Unicode character?
Daniel Bünzli
daniel.buenzli at erratique.ch
Fri May 8 06:04:08 CDT 2015
On Friday, 8 May 2015 at 05:08, Philippe Verdy wrote:
> The RFC is just informative, not normative,
RFC 7159 is not informational; it is a Proposed Standard.
> Try it yourself: you can perfectly well send JSON text containing '\uFFFF' (a noncharacter) or '\uD800' (an unpaired surrogate), and I've not seen any JSON implementation complain about either,
Well, now you have (mine). The RFC is very clear that we are dealing with *text-based* data, not *binary* data. Programming languages that represent their Unicode strings as possibly invalid UTF-16 sequences may happily accept this input, but as section 8.2 mentions, that may not be the case everywhere: software receiving these values "might return different values for the length of a string value or even suffer fatal runtime exceptions".
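
As an illustration of the kind of check under discussion, here is a minimal sketch in Python (not taken from any particular JSON implementation). It scans the raw content of a JSON string literal, before unescaping, and reports \uXXXX escapes that are unpaired surrogates and therefore do not denote a Unicode character. The helper name `lone_surrogate_escapes` is hypothetical, and the sketch assumes the escapes are otherwise syntactically valid. Note that U+FFFF is a noncharacter but still a Unicode scalar value, and U+F800 is a private-use code point, so neither is flagged here.

```python
import re

# Matches any backslash escape; only \uXXXX escapes capture their hex digits.
# Consuming "\\" as one escape keeps an escaped backslash followed by "uD800"
# from being misread as a \u escape.
_ESCAPE = re.compile(r'\\(?:u([0-9A-Fa-f]{4})|.)', re.DOTALL)


def lone_surrogate_escapes(raw):
    """Return the offsets in `raw` of \\uXXXX escapes that are unpaired surrogates.

    `raw` is the text between the quotes of a JSON string literal,
    still in escaped form. (Hypothetical helper, for illustration only.)
    """
    bad = []
    pending = None  # match of a high surrogate still waiting for its low half
    for m in _ESCAPE.finditer(raw):
        cp = int(m.group(1), 16) if m.group(1) else None
        is_high = cp is not None and 0xD800 <= cp <= 0xDBFF
        is_low = cp is not None and 0xDC00 <= cp <= 0xDFFF
        if pending is not None:
            if is_low and m.start() == pending.end():
                pending = None  # well-formed surrogate pair
                continue
            bad.append(pending.start())  # high surrogate not followed by a low one
            pending = None
        if is_high:
            pending = m
        elif is_low:
            bad.append(m.start())  # low surrogate with no preceding high one
    if pending is not None:
        bad.append(pending.start())  # high surrogate at the end of the string
    return bad
```

For example, `lone_surrogate_escapes(r'\uD800')` returns `[0]`, while `lone_surrogate_escapes(r'\uD83D\uDE00')` and `lone_surrogate_escapes(r'\uFFFF')` both return `[]`. Whether a decoder should reject the flagged input, replace it with U+FFFD, or pass it through is a policy decision this sketch does not make.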
Best,
Daniel