Ways to detect that XXXX in JSON \uXXXX does not correspond to a Unicode character?
daniel.buenzli at erratique.ch
Thu May 7 15:29:27 CDT 2015
On Thursday, 7 May 2015 at 21:59, Markus Scherer wrote:
> Some code stores binary data (sequence of arbitrary 16-bit unsigned integers) in a "string", just because it is easy and fairly efficient to transport.
> You should "validate" *text* only when you are certain that it is indeed text.
Section 8.2 of the spec specifically says that only strings that represent sequences of Unicode scalar values (they say "characters") are interoperable, and that strings that do not represent such sequences, like "\uDEAD", can lead to unpredictable behaviour.
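As a rough illustration of the check this implies, here is a minimal Python sketch. It relies on the fact that Python's standard `json` module combines a valid escaped surrogate pair into one code point but passes a lone surrogate escape through unchanged, so any surrogate code point left in a decoded string was unpaired in the input. The function names `has_lone_surrogate` and `contains_lone_surrogate` are made up for this example and are not part of any library.

```python
import json


def has_lone_surrogate(s: str) -> bool:
    """Return True if s holds a code point in the surrogate range U+D800..U+DFFF."""
    return any(0xD800 <= ord(c) <= 0xDFFF for c in s)


def contains_lone_surrogate(value) -> bool:
    """Walk a decoded JSON value and report whether any string in it
    (member names included) is not a sequence of Unicode scalar values."""
    if isinstance(value, str):
        return has_lone_surrogate(value)
    if isinstance(value, list):
        return any(contains_lone_surrogate(v) for v in value)
    if isinstance(value, dict):
        return any(
            has_lone_surrogate(k) or contains_lone_surrogate(v)
            for k, v in value.items()
        )
    # Numbers, booleans and null carry no text.
    return False


if __name__ == "__main__":
    # "\uDEAD" is a single unpaired low surrogate.
    print(contains_lone_surrogate(json.loads('"\\uDEAD"')))
    # "\uD83D\uDE00" is a valid pair and decodes to one scalar value.
    print(contains_lone_surrogate(json.loads('"\\uD83D\\uDE00"')))
```

A decoder that works directly on the escape sequences would do the same test while parsing: a high surrogate (D800..DBFF) must be followed immediately by a low surrogate (DC00..DFFF), and a low surrogate must never appear on its own.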
If you want to transmit binary data reliably in JSON, you must apply some form of binary-to-Unicode-scalar-value encoding (as in most text-based interchange formats).
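One possible such encoding is Base64, which is my choice for illustration here and not something the spec mandates. The sketch below packs a sequence of 16-bit unsigned integers into bytes (little-endian, an arbitrary assumption) and carries them in a JSON string made only of ASCII characters. The helper names `encode_u16` and `decode_u16` are hypothetical.

```python
import base64
import json
import struct


def encode_u16(values):
    """Pack 16-bit unsigned integers (little-endian) and Base64-encode them."""
    raw = struct.pack(f"<{len(values)}H", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_u16(text):
    """Inverse of encode_u16: Base64-decode, then unpack 16-bit integers."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


if __name__ == "__main__":
    # Values that would be lone surrogates if stored directly as \uXXXX escapes.
    data = [0xDEAD, 0x0000, 0xFFFF, 0xD800]
    doc = json.dumps({"payload": encode_u16(data)})
    restored = decode_u16(json.loads(doc)["payload"])
    print(restored == data)
```

The resulting JSON text contains only Unicode scalar values, so it stays within the interoperable subset whatever the 16-bit values are.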