Skip to content

hook for cache encoding #2899

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
g-k opened this issue Nov 6, 2017 · 20 comments
Closed

hook for cache encoding #2899

g-k opened this issue Nov 6, 2017 · 20 comments
Labels
plugin: cache related to the cache builtin plugin type: proposal proposal for a new feature, often to gather opinions or design the API around the new feature

Comments

@g-k
Copy link

g-k commented Nov 6, 2017

It'd be handy to have a way to set JSON encoding options for the default cacheprovider.

I resorted to patching cache.set from pytest_configure to encode datetime.datetimes.

Is there a cleaner way to do this? Would -p no:_cacheprovider and a custom cache provider be recommended instead?

@RonnyPfannschmidt
Copy link
Member

you should never ever need that

whats your use case

@g-k
Copy link
Author

g-k commented Nov 6, 2017

For testing our AWS configs we want to cache responses from botocore and use them as fixtures. The responses can include datetimes.

@RonnyPfannschmidt
Copy link
Member

the cache plugin has support for creating directories and using them to store file that use a more rich storage than json

for the pytest internals we limit the capabilities of the storage mechanisms to prevent edge-cases

the basic idea is - if you need really fancy storage, go for a own file

while it would be easy to say support adding a encoder type, we'd suddenly be api locked
while going for "only json supported" we are pretty much fine in any case

what we could consider is providing a better usable vriant of the loader/dumper mechanism in order for people to easily create richer setups without needing to lock ourselves in

@nicoddemus
Copy link
Member

Perhaps we should just go ahead and change our encoding to support datatime objects? They are builtin, seems reasonable to support that out of the box.

This does not replace the general idea of a more general save/load mechanism, but would probable be enough for most users.

@nicoddemus nicoddemus added the type: proposal proposal for a new feature, often to gather opinions or design the API around the new feature label Nov 9, 2017
@RonnyPfannschmidt
Copy link
Member

json does not support datetime natively, and at least for me thats currently the end of the discussion, because if we invent a own pytest-json-with-datetimes is no longer json

also there is the much richer encoder from execnet, which also supports more data-types like frozensets, sets, tuples, bytes and so on - all of them native, fundamental python types that json also has no native support for

@nicoddemus
Copy link
Member

json does not support datetime natively, and at least for me thats currently the end of the discussion, because if we invent a own pytest-json-with-datetimes is no longer json

I could swear there was a flag for that, but I'm probably thinking of YAML. Yeah then it probably is a no go.

So I think we are reaching the consensus of creating new hooks for encoding/decoding of the cache contents?

@RonnyPfannschmidt
Copy link
Member

i believe its reasonable to add this via a new keyword argument for example - a whole hook seems the wrong way around

@nicoddemus
Copy link
Member

Rough idea:

def pytest_make_cache_codec(rootdir, config, ...):
     """
     Return an object with get/set signature identical to the internal cache codec. 
     Return None to use the default cache codec.
     """

@nicoddemus
Copy link
Member

i believe its reasonable to add this via a new keyword argument for example

Hmm keyword argument to what function?

@RonnyPfannschmidt
Copy link
Member

@nicoddemus i beleive we should take a look at providing a api via the cache plugin to get cache sections with a dedicated serializer/deserializer pair

the default serializer/deserializer being the json module with a .json extension

from my pov the pytest hook system is entirely unsuitable for managing them since it does not support the required flows of controll and data

@RonnyPfannschmidt
Copy link
Member

an example would be cache.set('someplugin/foo', value, serializer=date_json)`

as well as cache.make_serializer(extension='json+dates', encoder=DateAwareJsonEncoder(), decoder=DateAwareJsonDecoder)

@nicoddemus
Copy link
Member

Hmm sorry, what would be the advantage over what I proposed?

@RonnyPfannschmidt
Copy link
Member

a) control - your proposal gave control over what codec to use to a piece of code that has literally no idea and is first serve first come
b) locality - the pieces that know what they need can get what they need

@nicoddemus
Copy link
Member

I see, thanks for clarifying; it was not clear to me at first sight.

a) control - your proposal gave control over what codec to use to a piece of code that has literally no idea and is first serve first come
b) locality - the pieces that know what they need can get what they need

This two points make sense, but IMHO in practice most uses of codecs just want to same some simple Python objects, nothing too fancy. Our limitations of what can be currently saved by the cache plugin are purely because we chose the JSON backend; if we had chosen a more complete serializer (for example pickle) which can save more objects, we probably wouldn't need this discussion. Note that I'm not criticizing the choice of JSON, in anyway, given that is adequate most of the time and is in the standard library.

Also this poses all new complexity which will have to be implemented from scratch, as you rightfully commented that the current plugin system is not adequate for it. I wonder how different serializers would be installed and by which plugins, how to register them etc.

IMHO this seems overly complex in two aspects, both in implementation (which incurs the risk of never happening) and over engineering (I don't think we need all that flexibility).

I think examples of reasonable implementation of a cache-provider plugin might be one for execnet and pickle backends, while another which saves the data to a shared drive so it can be reused by different computers on a CI farm (plugins which track tests and dependencies and only run the affected tests come to mind).

Again, I think most usages of the cache plugin is for very simple objects and does not require a new multi-layer of cache encoders/decoders, which can be registered multiple times and in different flavors.

Also we can consider that def pytest_make_cache_codec(rootdir, config, ...): is a simple extension of what we have and it provides the opportunity for someone to implement a pytest-flex-cache plugin, where the object returned by def pytest_make_cache_codec(rootdir, config, ...) supports the make_serializer solution you propose.

But of course I may being simplistic now and I have to eat my foot later 😝

@RonnyPfannschmidt
Copy link
Member

the main reason i did split the specification of the serializer in such a way is to allow multiple definitions of a compatible serializer in different kinds of plugins, a more ideal way would be to let a plugin get something like a "sub-cache" that is configured with a namespace , a encoder/decoder and a file extension, the default just being json with .json as exposed in the global cache object

but i dont see a hook being a sane tool for that since the hooks simply have the wrong arity - for more arities that plugin systems can have see stevedore

@RonnyPfannschmidt
Copy link
Member

(bascially the arity i see needed is specifying a "driver" for a key or a name-space - hooks are fundamentally unable to provide that

@Zac-HD
Copy link
Member

Zac-HD commented Feb 25, 2019

Closing this issue as the proposal has been inactive for over a year.

@nicoddemus
Copy link
Member

Reopening as I still see this as relevant. 👍

@nicoddemus nicoddemus reopened this Feb 25, 2019
@RonnyPfannschmidt
Copy link
Member

@nicoddemus given the lack of initiative i think its reasonable to go for "write your own file" until a reasonable proposal gets delivered

and based on that i'd like to close this

@Zac-HD
Copy link
Member

Zac-HD commented Jul 6, 2024

Closing again as per #12465 (comment).

@Zac-HD Zac-HD closed this as not planned Won't fix, can't repro, duplicate, stale Jul 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plugin: cache related to the cache builtin plugin type: proposal proposal for a new feature, often to gather opinions or design the API around the new feature
Projects
None yet
Development

No branches or pull requests

4 participants