CDN issue causing 00-index.tar to be out of sync with available tarballs #537
Comments
CC'ing @davean who manages our Fastly CDN configuration. More generally, we may be able to reduce the time window in which a client can temporarily experience an inconsistent view of the Hackage repository, but we will never be able to fully eliminate that window unless the CDN or any cloud storage provides stronger guarantees. IOW, there's no way around Hackage clients needing to be able to cope with transient object access failures as long as the communication path is made up of potentially unreliable components. When |
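As a rough illustration of what "coping with transient object access failures" could look like on the client side, here is a minimal retry-with-backoff sketch. Everything in it is an assumption for illustration: `fetch_with_retry`, the `fetch` callable, and the delay values are hypothetical and are not taken from cabal, Hackage, or any mirroring tool.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], Optional[bytes]],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Call fetch() until it returns data or the attempts run out.

    `fetch` is a hypothetical callable that returns the object's bytes,
    or None when the object is temporarily unavailable (for example,
    because a CDN node is still serving a cached 404).
    """
    for attempt in range(attempts):
        data = fetch()
        if data is not None:
            return data
        # Wait longer after each failure: base_delay, 2x, 4x, ...
        # No sleep after the final attempt, since we are giving up.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays; a real client would also want an upper bound on the total wait.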
Yes, can't be eliminated completely but reduced caching of 404s sounds like a good idea. |
Just to give a little more detail of what I'm doing in case it helps: instead of running through the full mirroring scripts every 1/5/10 minutes, I'm moving over to a watch script which respects the Due to this issue, I've included an arbitrary 10-run forced synchronization in case the index.tar.gz file got out-of-sync with the sdist tarballs. |
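The watch-plus-forced-sync idea described above might be sketched roughly as follows. This is only one reading of "an arbitrary 10-run forced synchronization": it assumes a sync happens whenever the index is seen to change, and otherwise at the latest after 10 runs without one. The names `should_sync` and `run_watch` are hypothetical and are not from the actual watch script.

```python
from typing import Iterable, List


def should_sync(index_changed: bool, runs_since_sync: int, force_every: int = 10) -> bool:
    """Sync when the index changed, or when too many runs passed without a sync."""
    return index_changed or runs_since_sync >= force_every


def run_watch(index_changes: Iterable[bool], force_every: int = 10) -> List[int]:
    """Simulate the watch loop.

    `index_changes` yields one boolean per run: True if the index was
    observed to change on that run. Returns the 0-based run numbers on
    which a synchronization would be performed.
    """
    synced = []
    runs_since_sync = 0
    for run, changed in enumerate(index_changes):
        runs_since_sync += 1
        if should_sync(changed, runs_since_sync, force_every):
            synced.append(run)
            runs_since_sync = 0
    return synced
```

With no index changes at all, this simulation still syncs on every tenth run, which is the safety net for an index that has silently drifted out of sync with the tarballs.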
One last detail: it may seem like it would be reasonable to just confirm that all tarballs are available instead of using the arbitrary 10-run cutoff. Unfortunately, there's another issue that prevents that from being possible: #436. Since there are some tarballs which legitimately fail the download (because they have been deleted for copyright purposes, but not removed from the index), and other tarballs that fail due to CDN caching issues, I don't see a way to detect that we should ignore the |
Re removed package tarballs, we try to return the appropriate HTTP code "410" (rather than a vague 404), e.g.:
|
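To illustrate how a mirror could use the 410 versus 404 distinction mentioned above, here is a small sketch. It assumes, for illustration only, that a 410 means the tarball was deliberately removed and can be skipped, while any other non-200 status is treated as possibly transient. `Outcome`, `classify_status`, and `index_fully_mirrored` are hypothetical names.

```python
from enum import Enum
from typing import Dict


class Outcome(Enum):
    AVAILABLE = "available"  # tarball downloaded fine
    REMOVED = "removed"      # deliberately removed upstream; safe to skip
    RETRY = "retry"          # possibly transient (e.g. a cached 404); try again later


def classify_status(status: int) -> Outcome:
    """Map an HTTP status code for a tarball download to a mirroring decision."""
    if status == 200:
        return Outcome.AVAILABLE
    if status == 410:
        return Outcome.REMOVED
    return Outcome.RETRY


def index_fully_mirrored(statuses: Dict[str, int]) -> bool:
    """True when every tarball is either available or known to be removed."""
    return all(classify_status(s) is not Outcome.RETRY for s in statuses.values())
```

Under these assumptions, a mirror run could count as complete once no tarball is left in the retry state.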
I never noticed that. That could be very useful, thanks! |
@snoyberg since you're running a mirror, it'd be perfectly reasonable to bypass the CDN entirely. Then you get to choose if/how to respect the cache-control hints etc. If you'd like to do that, let us know and we can give you the details (ie IP address etc). Also, if you'd like to take part in the public mirroring of hackage (ie serving in the same original format) then you may like to use https://github.com/hvr/hackage-mirror-tool and optionally have your mirror added to the public mirror list http://hackage.haskell.org/mirrors.json . If so, just let us know. |
Update: @snoyberg has set up a new mirror and it is now listed as an official public mirror in the upstream http://hackage.haskell.org/mirrors.json . Since the out-of-sync caching/proxying issue does not appear to be a problem for |
Just to give one last note on all of this: I put a new page on stackage.org to track the relative up-to-dateness of Hackage vs mirrors and Git repos, you can see it at: https://ci.stackage.org/status/mirror I've configured the page to return a status 500 if the lag time is ever more than an hour, so using normal HTTP monitoring tools can give an alert if the mirroring functionality ever stops working. |
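The "return a 500 when the lag exceeds an hour" rule can be sketched as below. How the status page actually measures lag is not stated here; this sketch assumes it is the difference between the upstream and mirror last-modified timestamps, and `mirror_status_code` and `MAX_LAG` are hypothetical names.

```python
from datetime import datetime, timedelta

MAX_LAG = timedelta(hours=1)  # threshold taken from the comment above


def mirror_status_code(
    upstream_modified: datetime,
    mirror_modified: datetime,
    max_lag: timedelta = MAX_LAG,
) -> int:
    """Return 500 if the mirror trails upstream by more than max_lag, else 200.

    Assumption: lag is measured as the difference between the two
    last-modified timestamps.
    """
    lag = upstream_modified - mirror_modified
    return 500 if lag > max_lag else 200
```

Because the failure is expressed as an ordinary HTTP status, any monitoring tool that alerts on non-2xx responses can watch the page without custom parsing.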
@snoyberg that's coincidentally something similar to something half-finished (sans the Git repos status) that I've been hacking on as well, as we needed that for haskell.org as well... except less html'y, just a plain/text .cgi script which validates the TUF meta-data for freshness :-) |
If it would be helpful to add a few more URLs to that table, just say so. It's no big deal for me to track the last-modified of a few more files. |
@snoyberg it may be interesting to add "http://objects-us-west-1.dream.io/hackage-mirror/01-index.tar.gz" there, as well as the |
Cool, commit pushed, should be live in a few minutes. On Wed, Sep 21, 2016 at 11:13 PM, Herbert Valerio Riedel <
|
The mirror I've been running went down about 8 hours ago (see: commercialhaskell/all-cabal-hashes#13). AFAICT, the problem is that the privately provided IP address for the upstream server (behind the CDN) changed. I've switched the mirror to use hackage-origin.haskell.org, is that correct? |
That should be correct, yes. |
I've experienced this personally in running the all-cabal-hashes mirror, and have received user reports. Relevant links:
The idea is: you download the 00-index.tar.gz file from Hackage (e.g., via cabal update), and it includes a .cabal file for a certain package/version combo (like yaml-0.8.18.6.cabal). But when you try to download yaml-0.8.18.6.tar.gz from Hackage, you get a 404 for a while, which eventually corrects itself. I've experienced situations where two different build servers - both in the US - returned a 404, while downloading from my house in Israel worked. This leads me to believe it's a regional caching issue with the CDN.

Just a complete guess here: perhaps it's worth disabling CDN caching for non-200 responses?