Last Updated: February 25, 2016
· scott2b

Python's json module does not always produce valid JSON

File this under good-to-know, especially if you are doing numeric calculations and writing the results to JSON files.

Consider the following testjson.py file:

import json

# A dict holding the three non-finite float values
d = {
    'a': float('inf'),
    'b': float('-inf'),
    'c': float('nan'),
}
print(d['a'], d['b'], d['c'])
print(json.dumps(d))

python testjson.py

inf -inf nan
{"a": Infinity, "b": -Infinity, "c": NaN}

The trouble here is that Infinity, -Infinity, and NaN are not valid JSON values. Python's own json module will parse them without complaint, but if you don't know for sure that your JSON files will only be handled with Python, beware: strict JSON parsers in other languages are likely to reject these values.
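To see how a strict consumer would react without leaving Python, here is a minimal sketch that uses the parse_constant hook of json.loads. The function name reject_constant is made up for illustration; the hook itself is part of the standard json module and is called with the strings 'NaN', 'Infinity', or '-Infinity'.

```python
import json

def reject_constant(name):
    # Hypothetical helper: json.loads passes 'NaN', 'Infinity' or '-Infinity' here.
    raise ValueError("invalid JSON constant: " + name)

text = '{"a": Infinity, "b": -Infinity, "c": NaN}'

# Default behaviour: Python accepts the non-standard constants.
lenient = json.loads(text)
print(lenient['a'])

# Strict behaviour: the same text is rejected, as a strict parser would do.
try:
    json.loads(text, parse_constant=reject_constant)
    strict_ok = True
except ValueError as exc:
    strict_ok = False
    print("rejected:", exc)
```

This only approximates what a strict parser in another language does, but it is a cheap way to check files you are about to hand off.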

4 Responses

You could file this as a bug on the official Python bug tracker. That tracker handles not just core Python but the standard library as well.

over 1 year ago ·

@fjohnson2

Thanks for the comment. I was inclined to think that this was a design decision and not a bug, so I did a bit more digging on the issue. It turns out that this is documented pretty clearly in the json module documentation. The solution is to pass allow_nan=False into the call to dump/dumps:

"If allow_nan is False (default: True), then it will be a ValueError to serialize out of range float values (nan, inf, -inf) in strict compliance of the JSON specification, instead of using the JavaScript equivalents (NaN, Infinity, -Infinity)."
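As a quick sketch of what that looks like in practice, using the same dict as the original example (the variable name error is only for illustration):

```python
import json

d = {'a': float('inf'), 'b': float('-inf'), 'c': float('nan')}

try:
    # With allow_nan=False the encoder refuses non-finite floats.
    json.dumps(d, allow_nan=False)
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)
```

The exact wording of the error message may differ between Python versions, so it is safer to catch ValueError than to match on the text.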

So, this is a feature, not a bug. I'm not sure that this is the design choice I would have made, but it is what it is. If you want a JSON-valid representation that allows for NaN and the infinities, you will need to roll your own encoding step to convert those values to strings, or to however you want to represent them.
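One possible way to do that conversion is to clean the data before handing it to json.dumps. The sketch below is only an assumption about how you might want to represent the values: the helper name sanitize is hypothetical, and it maps non-finite floats to None (JSON null), though strings such as "NaN" would work the same way. Pre-processing is used here because, as far as I can tell, the default() hook of json.JSONEncoder is not called for floats, which the encoder already knows how to serialize.

```python
import json
import math

def sanitize(value):
    """Recursively replace NaN and the infinities with None (hypothetical helper)."""
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return None
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value

d = {'a': float('inf'), 'b': 1.5, 'c': [float('nan'), 2]}

# allow_nan=False acts as a safety net in case anything slips through.
encoded = json.dumps(sanitize(d), allow_nan=False, sort_keys=True)
print(encoded)  # {"a": null, "b": 1.5, "c": [null, 2]}
```

Note that mapping all three values to null loses the distinction between them; if the consumer needs to tell them apart, distinct strings are the safer choice.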

The json module documentation is here: http://docs.python.org/2/library/json.html

over 1 year ago ·

To further clarify: the underlying problem here, I believe, is a fault in the JSON specification, which does not allow for the valid JavaScript values NaN, Infinity, and -Infinity. The Python team made a design decision to allow for those values since they are, after all, valid JavaScript. There is probably a good reason that these values were left out of the JSON spec, but I do not know what it is, and in my ignorance I am calling it a fault.

Now, when I say that the Python choice is not the design decision I would have made, I mean that my inclination would be to have the default calls to functions in the json module produce only valid JSON (despite my belief that those values should have been included in the JSON spec). I think it's great that there is an option to produce this handy valid-JavaScript version; I'm just not sure it should be the default.

However, the current default happens to be simpler for most use cases. In particular, if you are producing JSON with Python's json module and then consuming it in a JavaScript environment that evaluates it as an object literal, all is likely to go well (though JSON.parse itself rejects these values). Problems start, rather, when you consume that JSON with a library that expects valid JSON, which is stricter than JavaScript. In my case, this was happening with Java.

over 1 year ago ·

Yep. I feel you. I read over the JSON spec before your first reply, and it did indeed say that Infinity, -Infinity, and NaN are not valid JSON. That prompted the same curiosity in me as to why the Python json module was designed to output invalid JSON. So I agree with you when you say that your inclination would have been for the json module to emit valid JSON by default rather than JavaScript.

over 1 year ago ·