Skip to content

Add CLI options for enhancing requests with HTTP headers #7

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 11, 2020

Conversation

wisechengyi
Copy link

Cherry pick pypa#8078

Fixes: pypa#8109

Copy link

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be ideal to have some feedback on the upstream PR before merging this (in particular: I suspect that they might request a more limited feature), but it looks well tested and fairly isolated.

If Benjy or John are fine with landing it, I am.

@jsirois
Copy link
Member

jsirois commented May 3, 2020

I don't understand the purpose of this except as a hack around a misconfigured server. In other words, pip uses requests and installs a cache aware adapter that looks kosher:
https://github.com/pypa/pip/blob/6385f02b1aa7716fef3e914646abbd873b6349d8/src/pip/_internal/network/session.py#L286-L299
https://github.com/pypa/pip/blob/6385f02b1aa7716fef3e914646abbd873b6349d8/src/pip/_vendor/cachecontrol/controller.py
So if the server - whether an index server or a find-links server, responds to GETs with proper cache control headers, the pip client should honor them fwict.

@wisechengyi
Copy link
Author

wisechengyi commented May 4, 2020

Hi @jsirois that's a good question. In fact it requires both client and server to be configured correctly, excerpt from pypa#8109 (comment):

Having experimented with the code, to summarize it in a more understandable way:

non-zero max-age in request header allows/turns on caching for a particular url
max-age in the response determines how long pip is going to keep it valid

That said, there's no rush. We can wait till pypa#8078 gets merged upstream.

@jsirois
Copy link
Member

jsirois commented May 4, 2020

Thanks @wisechengyi. I guess I still don't believe that can be true. If it is true it means there is a semi-substantial amount of code in pip dedicated to caching for all HTTP requests (for both simple index requests and find-links requests), that's broken. Since the code is there, it should work. Having to manually add headers to make it work sounds broken. I'll carve out time to do my own experiments.

@wisechengyi
Copy link
Author

The below change could be useful when experimenting against a http server instead of https server, as the http adapter will just ignore cache.

diff --git a/src/pip/_vendor/requests/sessions.py b/src/pip/_vendor/requests/sessions.py
index 2845880..0242255 100644
--- a/src/pip/_vendor/requests/sessions.py
+++ b/src/pip/_vendor/requests/sessions.py
@@ -720,7 +720,10 @@ class Session(SessionRedirectMixin):
         :rtype: requests.adapters.BaseAdapter
         """
         for (prefix, adapter) in self.adapters.items():
-
+            print(adapter)
+            from pip._vendor.cachecontrol import CacheControlAdapter
+            if isinstance(adapter, CacheControlAdapter):
+                return adapter
             if url.lower().startswith(prefix.lower()):
                 return adapter

@jsirois
Copy link
Member

jsirois commented May 4, 2020

@wisechengyi is the HTTP case your case? If so, can your findlinks server be converted to HTTPS?

@wisechengyi
Copy link
Author

Right, I experimented against a http server locally. On pip master, http won't be cached, but since I was able to use the diff above to get around it, I won't necessarily need a https server.

@jsirois
Copy link
Member

jsirois commented May 4, 2020

Right, I experimented against a http server locally. On pip master, http won't be cached, but since I was able to use the diff above to get around it, I won't necessarily need a https server.

I'm confused. In this PR you present a patch that allows working around non-caching. In your HTTP vs HTTPS comment you do the same. Either way, you're patching Pex's vendored pip to work around non-caching. What I'm trying to get at is, does caching actually work as things stand - no patches - if the find-links server is:

  1. Serving via HTTPS
  2. Setting proper cache control headers.

@wisechengyi
Copy link
Author

wisechengyi commented May 4, 2020

Sorry about the confusion.

does caching actually work as things stand - no patches - if the find-links server is:
Serving via HTTPS
Setting proper cache control headers.

The answer is no from my experiement, because max-age is always 0 in the request header (https://github.com/pypa/pip/blob/master/src/pip/_internal/index/collector.py#L152-L166), and pip needs max-age > 0 to turn on caching.

@jsirois
Copy link
Member

jsirois commented May 11, 2020

Thanks Yi. I do not like what Pip did there with max-age, but that's not your problem and this gives us a fix for now.

@jsirois jsirois merged commit f9dde7c into pex-tool:master May 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow cache TTL override for --find-links urls
4 participants