Utilize S3 as the backing file store #373
Conversation
Minio is a mostly feature-complete S3 clone; running it with Docker locally should be painless.
For now, when fetching a file, look on S3 first, but if that request fails, return the content column in the database.
The endpoint always falls back to AWS, but this allows easy use of minio and the like for local development.
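For concreteness, here is a minimal sketch of that read path, assuming a rusoto version with the blocking `.sync()` API (the diff excerpt further down appears to use rusoto); the bucket name, function signature, and error handling are illustrative, not this PR's actual code:

```rust
use rusoto_s3::{GetObjectRequest, S3, S3Client};
use std::io::Read;

// Sketch: try S3 first, falling back to the `content` column that was
// already loaded from the database row if anything goes wrong.
// The bucket name and signature are assumptions for illustration.
fn get_file(client: &S3Client, path: &str, db_content: Vec<u8>) -> Vec<u8> {
    let req = GetObjectRequest {
        bucket: "rust-docs-rs".to_owned(),
        key: path.to_owned(),
        ..Default::default()
    };
    match client.get_object(req).sync() {
        Ok(output) => match output.body {
            Some(body) => {
                let mut buf = Vec::new();
                body.into_blocking_read()
                    .read_to_end(&mut buf)
                    .expect("failed to read S3 response body");
                buf
            }
            None => db_content,
        },
        // Any S3 failure (missing object, bad credentials, network error)
        // falls through to the database copy.
        Err(_) => db_content,
    }
}
```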
The latest commit implements support for setting the S3 endpoint. This makes it possible to use minio (I've tested it).
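As a rough sketch of what endpoint support can look like with rusoto; the `S3_ENDPOINT` variable name and the default region here are assumptions, not necessarily what this PR uses:

```rust
use rusoto_core::Region;
use rusoto_s3::S3Client;
use std::env;

// Sketch: point the client at a custom endpoint (e.g. a local minio at
// http://localhost:9000) when S3_ENDPOINT is set; otherwise default to AWS.
// The variable name and default region are illustrative assumptions.
fn s3_client() -> S3Client {
    let region = match env::var("S3_ENDPOINT") {
        Ok(endpoint) => Region::Custom {
            name: "custom-endpoint".to_owned(),
            endpoint,
        },
        Err(_) => Region::UsWest1,
    };
    S3Client::new(region)
}
```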
I'm going to hold off on working on the upload script until this is merged; my current plan is that we can simply run a query that goes something like…

At this point I think this PR is ready to go.
Thanks for putting this together! This looks great, and I'm glad you were able to get this working.

I'm slightly against forcing the assumption that you'll either have the S3 connection or a minio server available. Even if we bundle it into the Vagrant container for the dev build, I still want a way to be independent of the S3 protocol. The simplest way to do that is to keep the database fallback and allow the use of S3/minio as an optional optimization. To that end, what happens right now if there are no S3 credentials in the environment? It looks like the "read" code tries to run the rusoto code, which (I assume) returns an error, so it reads out the database value in the error case?

Unless/until we reconfigure the virtual machine used with Vagrant to include a minio instance and the proper configuration to make the development instance save to it, I can't accept this. At the very least, I would want to see directions written into the README so that people who want to work on docs.rs can set it up themselves. (I certainly know nothing about minio and vanishingly little about setting up an S3 bucket, and I would need this information to be able to keep working on docs.rs after this change.)
I can definitely provide some setup instructions for minio and try to get a patch for Vagrant put together, though I don't have Vagrant installed currently (I'm using qemu for docs.rs development instead). With that said, I'll also implement the database fallback to ease the transition.
This avoids developers needing to install/configure minio. While the installation process is somewhat simple, it is still additional overhead that is too much to ask of developers at this point.
Okay, installing minio and configuring it in Vagrant without having Vagrant installed locally proved a little too ambitious for me, so I scaled back. We no longer require minio (or any S3-compatible service); all reads/writes to the file DB will continue to use it if AWS access keys are not present in the environment. Note, however, that if we detect AWS_ACCESS_KEY_ID in the environment, we will error on writes to the DB, though not on reads, to maintain the ability to gradually migrate files to S3.
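A minimal sketch of that behavior, assuming the credential check is a plain environment lookup; `upload_to_s3` and `write_to_database` are hypothetical helpers, not functions from this codebase:

```rust
use std::env;

// Sketch: when AWS credentials are present, S3 is authoritative for writes,
// so a failed upload is a hard error rather than a silent database write.
// Reads keep their database fallback either way, so files can migrate to
// S3 gradually.
fn store_file(path: &str, content: &[u8]) -> Result<(), String> {
    if env::var_os("AWS_ACCESS_KEY_ID").is_some() {
        upload_to_s3(path, content).map_err(|e| format!("S3 write failed: {}", e))
    } else {
        write_to_database(path, content)
    }
}

// Hypothetical stand-ins for the real rusoto and postgres calls.
fn upload_to_s3(_path: &str, _content: &[u8]) -> Result<(), String> {
    unimplemented!()
}
fn write_to_database(_path: &str, _content: &[u8]) -> Result<(), String> {
    unimplemented!()
}
```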
src/db/file.rs
Outdated
```rust
} else {
    let client = s3_client();
    let row = rows.get(0);
    let content = client.and_then(|c| c.get_object(GetObjectRequest {
```
I wonder if we could use the hard-coded `in-s3` value to gate this check, so that we don't generate erroneous calls to S3.
That's a great idea; it'll definitely help with the common case for now. I've pushed up a commit that does this.
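For illustration, a sketch of such a gate, assuming the hard-coded `in-s3` marker is stored in the row's `content` column for migrated files; the column layout and the `fetch_from_s3` helper are assumptions:

```rust
// Sketch: only issue an S3 request for rows carrying the hard-coded
// "in-s3" marker; un-migrated rows still hold their bytes directly,
// so no pointless S3 call is made for them.
fn read_content(db_content: Vec<u8>, path: &str) -> Vec<u8> {
    if db_content == b"in-s3" {
        // Migrated row: the real bytes live in S3. Returning the marker
        // on failure is a placeholder for real error handling.
        fetch_from_s3(path).unwrap_or(db_content)
    } else {
        db_content
    }
}

// Hypothetical stand-in for the rusoto get_object call shown earlier.
fn fetch_from_s3(_path: &str) -> Option<Vec<u8>> {
    unimplemented!()
}
```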
I think this is good to go! Thanks so much for bearing with the review (and for being much faster at responding to comments than I was at making them >_>). Let's get this merged, so we can start offloading our file storage from the database.
I've included the relevant access keys on the docs.rs server already, so this can be freely deployed. It should not cause any regressions, other than possibly in performance, as all reads fall back to the database. We'll want to work on a script that moves the existing rows in the database to S3, but that can be done later (and possibly not in this repository at all).
r? @QuietMisdreavus