Skip to content

Error caching datetime - Object is not JSON serializable #3231

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jaraco opened this issue Feb 16, 2018 · 4 comments
Closed

Error caching datetime - Object is not JSON serializable #3231

jaraco opened this issue Feb 16, 2018 · 4 comments
Labels
plugin: cache related to the cache builtin plugin type: enhancement new feature or API change, should be merged into features branch

Comments

@jaraco
Copy link
Contributor

jaraco commented Feb 16, 2018

Today, I wished to cache a datetime (the last time I queried http://ifconfig.co because they throttle to .016Hz). When I tried to store the datetime in the cache, I ran into this error:

tests/test_main.py:37: in throttle_ifconfig
    request.config.cache.set('last ifconfig', datetime.datetime.now())
.tox/python/lib/python3.6/site-packages/_pytest/cacheprovider.py:98: in set
    json.dump(value, f, indent=2, sort_keys=True)
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py:179: in dump
    for chunk in iterable:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py:437: in _iterencode
    o = _default(o)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <json.encoder.JSONEncoder object at 0x10d3a3438>
o = datetime.datetime(2018, 2, 16, 10, 34, 3, 50186)

    def default(self, o):
        """Implement this method in a subclass such that it returns
            a serializable object for ``o``, or calls the base implementation
            (to raise a ``TypeError``).
    
            For example, to support arbitrary iterators, you could
            implement default like this::
    
                def default(self, o):
                    try:
                        iterable = iter(o)
                    except TypeError:
                        pass
                    else:
                        return list(iterable)
                    # Let the base class default method raise the TypeError
                    return JSONEncoder.default(self, o)
    
            """
        raise TypeError("Object of type '%s' is not JSON serializable" %
>                       o.__class__.__name__)
E       TypeError: Object of type 'datetime' is not JSON serializable

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py:180: TypeError

Obviously, the current behavior expects that only JSON-serializable types are allowed in the cache, which is rather limiting. It would be nice if the cache module allowed for hooks to encode/decode values to allowable JSON.

@jaraco
Copy link
Contributor Author

jaraco commented Feb 16, 2018

I worked around the issue by adding jsonpickle as a dependency and adding this fixture to the test suite:

@pytest.fixture(scope='session', autouse=True)
def use_jsonpickle_in_cache():
	pickler = jsonpickle.pickler.Pickler()
	unpickler = jsonpickle.unpickler.Unpickler()

	class JSON:
		@staticmethod
		def dump(value, *args, **kwargs):
			json.dump(pickler.flatten(value), *args, **kwargs)

		@staticmethod
		def load(*args, **kwargs):
			return unpickler.restore(json.load(*args, **kwargs))

	cache = pytest.config.cache
	cacheprovider = importlib.import_module(cache.__class__.__module__)
	json = cacheprovider.json
	cacheprovider.json = JSON

This approach works, and I now the cache can accept any jsonpickle-serializable object, including datetime:

$ cat .pytest_cache/v/last\ ifconfig
{
  "__reduce__": [
    {
      "py/type": "datetime.datetime"
    },
    [
      "B+ICEAsOMwRN6A=="
    ]
  ],
  "py/object": "datetime.datetime"
}

Obviously, this approach is brittle, as it relies on implementation details of the cache module.

I imagine pytest could adopt this technique natively, though that might be undesirable, as it adds a dependency on jsonpickle and also exposes the cache to possible security vulnerabilities (if untrusted content can reach the cache) plus I can see that the project previously explicitly set out to avoid broader encoding, though arguably jsonpickle is less invasive than execnet.

What I would recommend instead is the Cache class expose a hook whereby a codec can be supplied for the content, such that one could use a supported interface to enable more robust cache support:

#conftest.py

def pytest_configure():
  pytest.config.cache.use_codec(
    jsonpickle.pickler.Pickler().flatten,
    jsonpickle.unpickler.Unpickler().restore,
  )

I'm not familiar with the order of operations, so not sure if this hook point is the right one. It seems desirable that the hook should happen as early as possible and not earlier, such that for the configured project, the hook is activated before values are loaded or stored.

@RonnyPfannschmidt
Copy link
Member

for sanity, security, and controll of those affected, i'd like to leave the structuring/restructuring of the processing data vs the serialized format up to the users of the cache,

its simply too easy to slip in a security mistake or compatibility break on the framework level thats bad for everyone,

for your example i would propose to store the data as a timestamp

additional i believe it may be sensible to allow users to pass in own serializers in some way

it would be interesting/beneficial to talk about the layering of such an abstraction (the current approach has various shortcomings that might be nicely addressable)

@nicoddemus nicoddemus added type: enhancement new feature or API change, should be merged into features branch plugin: cache related to the cache builtin plugin labels Feb 26, 2018
@nicoddemus
Copy link
Member

Duplicate of the discussion in #2899

@jaraco
Copy link
Contributor Author

jaraco commented Mar 26, 2018

Agreed it's a dupe.

@jaraco jaraco closed this as completed Mar 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plugin: cache related to the cache builtin plugin type: enhancement new feature or API change, should be merged into features branch
Projects
None yet
Development

No branches or pull requests

3 participants