Ways to detect that XXXX in JSON \uXXXX does not correspond to a Unicode character?
daniel.buenzli at erratique.ch
Fri May 8 07:32:51 CDT 2015
On Friday, 8 May 2015 at 13:48, Philippe Verdy wrote:
But not *only* for a long time now.
> The RFC is deviating from the currently running implementations.
Well, did you test them all? There's quite a big list at http://www.json.org. Taking a random one mentioned on that page leads me to http://golang.org/pkg/encoding/json/, whose documentation says that invalid UTF-16 surrogate pairs are replaced by U+FFFD. This is really not very surprising: Go's strings, when treated as text, are apparently UTF-8 encoded, and a lone surrogate has no well-formed UTF-8 encoding, so when you need to produce your results as UTF-8 you don't have a lot of solutions: error and/or U+FFFD.
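For anyone who wants to check this behaviour rather than take the documentation's word for it, here is a minimal sketch. The helper name decodeJSONString and the three sample literals are mine, purely for illustration; only json.Unmarshal comes from the package itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeJSONString is a hypothetical helper (not part of encoding/json):
// it unmarshals a single JSON string literal into a Go string.
func decodeJSONString(lit string) (string, error) {
	var s string
	err := json.Unmarshal([]byte(lit), &s)
	return s, err
}

func main() {
	samples := []string{
		`"\ud83d\ude00"`, // well-formed surrogate pair
		`"\ud83d"`,       // lone high surrogate
		`"\ude00x"`,      // lone low surrogate followed by ordinary text
	}
	for _, lit := range samples {
		s, err := decodeJSONString(lit)
		// %+q prints the decoded string with non-ASCII characters escaped,
		// so a substituted U+FFFD is easy to spot.
		fmt.Printf("%s -> %+q (err: %v)\n", lit, s, err)
	}
}
```

If the documentation is accurate, the last two literals should decode without an error and with U+FFFD in place of the unpaired surrogate, which means the caller cannot tell from the result alone that the input was ill-formed.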