Ways to detect that XXXX in JSON \uXXXX does not correspond to a Unicode character?

Daniel Bünzli daniel.buenzli at erratique.ch
Fri May 8 06:04:08 CDT 2015


On Friday, 8 May 2015 at 05:08, Philippe Verdy wrote:
> The RFC is just informative, not normative,

RFC 7159 is not informational; it is a Proposed Standard.

> Try it yourself: you can perfectly well send JSON text containing '\uFFFF' (non-character) or '\uF800' (unpaired surrogate), and I've not seen any JSON implementation complain about either,

Well, now you have (mine). The RFC is very clear that we are dealing with *text-based* data, not *binary* data. Programming languages that represent their Unicode strings as possibly invalid UTF-16 sequences may happily accept such input, but as section 8.2 mentions, that may not be the case everywhere: software receiving these values "might return different values for the length of a string value or even suffer fatal runtime exceptions".
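
As an illustration only, here is a minimal Python sketch of one way a decoder could detect \uXXXX escapes that do not denote a Unicode scalar value. It is not the implementation referred to above. The function name lone_surrogate_escapes is hypothetical, and the sketch assumes its input is the raw, still-escaped body of a syntactically valid JSON string. It flags only unpaired surrogates (D800..DFFF); whether to also reject noncharacters such as U+FFFF is a separate policy choice that it does not make.

```python
from typing import List


def lone_surrogate_escapes(raw: str) -> List[int]:
    """Return the offsets of \\uXXXX escapes that are unpaired surrogates.

    Assumption: `raw` is the still-escaped body of a well-formed JSON
    string, so every backslash starts a valid escape sequence.
    """
    bad = []
    i = 0
    n = len(raw)
    while i < n:
        if raw[i] != "\\":
            i += 1
            continue
        if raw[i + 1:i + 2] != "u":
            # Two-character escape such as \n, \" or \\ : skip it whole.
            i += 2
            continue
        code = int(raw[i + 2:i + 6], 16)
        if 0xD800 <= code <= 0xDBFF:
            # High surrogate: valid only if a low surrogate escape
            # follows immediately.
            nxt = raw[i + 6:i + 12]
            if (len(nxt) == 6 and nxt[:2] == "\\u"
                    and 0xDC00 <= int(nxt[2:], 16) <= 0xDFFF):
                i += 12  # well-formed pair, consume both escapes
                continue
            bad.append(i)
        elif 0xDC00 <= code <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            bad.append(i)
        i += 6
    return bad
```

Under these assumptions, a properly paired escape such as r'\uD83D\uDE00' yields an empty list, while r'a\uD800b' yields [1], the offset of the offending escape. A decoder could report that offset as an error instead of passing the value on.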

Best,

Daniel




