PR: Rewrite ls to use prefixes #46
Conversation
Designed to get around the case where some parts of a bucket are not accessible
@jseabold can you recommend a few tests to ensure that the behavior you're looking for checks out?
Will have a look. Working on getting the ACLs for you that would allow you to make a role with these settings. It sounds like our typical settings were best practices from AWS, but I'm still trying to chase that down.
ls('') should list buckets, but return [] if there are no buckets or the user is anonymous. ls('nonexistent') should raise if the bucket can't be accessed (whether or not it actually exists).
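The contract above can be sketched as plain assertions against a stand-in class; FakeS3FileSystem, its constructor arguments, and the key layout are all illustrative, not the project's real API:

```python
class FakeS3FileSystem:
    """Toy stand-in capturing the ls() contract discussed above."""

    def __init__(self, buckets=None, anon=False):
        # Map of bucket name -> set of keys the caller may list.
        self.buckets = buckets or {}
        self.anon = anon

    def ls(self, path):
        if path == '':
            # Anonymous callers can't list buckets: return [], don't raise.
            if self.anon:
                return []
            return sorted(self.buckets)
        bucket = path.split('/', 1)[0]
        if bucket not in self.buckets:
            # Inaccessible and nonexistent buckets look the same to the caller.
            raise OSError("cannot access %r" % bucket)
        return sorted(self.buckets[bucket])


fs = FakeS3FileSystem({'mybucket': {'a.csv', 'b.csv'}})
assert fs.ls('') == ['mybucket']
assert FakeS3FileSystem(anon=True).ls('') == []
try:
    fs.ls('nonexistent')
except OSError:
    pass
else:
    raise AssertionError("ls('nonexistent') should raise")
```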
    path = path[len('s3://'):]
    path = path.rstrip('/')
    bucket, key = split_path(path)
    key = key + '/' if key else ""
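The hunk above normalises an s3:// path into a bucket and a trailing-slash prefix. A self-contained sketch of that logic (split_path here is a guessed one-liner for illustration, not necessarily the project's implementation):

```python
def split_path(path):
    # Hypothetical helper: 'bucket/key/parts' -> ('bucket', 'key/parts').
    if '/' not in path:
        return path, ''
    return tuple(path.split('/', 1))


def normalise(path):
    # Mirror of the diff hunk: strip the scheme, drop trailing slashes,
    # and turn the key into a prefix ending in '/' when non-empty.
    if path.startswith('s3://'):
        path = path[len('s3://'):]
    path = path.rstrip('/')
    bucket, key = split_path(path)
    prefix = key + '/' if key else ""
    return bucket, prefix


assert normalise('s3://mybucket/data') == ('mybucket', 'data/')
assert normalise('mybucket/') == ('mybucket', '')
```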
Maybe raise an informative error here if the path doesn't end in a forward slash? Also, key is really a prefix here, if things are called correctly. Maybe use that nomenclature?
I would rather not raise here; I think it's reasonable to do ls('bucket/path') and expect to get keys below bucket/path/.
OK to calling it prefix.
In the list_objects call, these are not really keys.
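For reference, listing by prefix means passing Prefix (and usually Delimiter='/') to list_objects rather than enumerating every key. A pure-Python simulation of that semantics, with no boto3 call and made-up keys, just to show how deeper keys collapse into common prefixes:

```python
def list_prefix(keys, prefix, delimiter='/'):
    # Simulate S3 list_objects semantics: return the keys under `prefix`,
    # split into direct "contents" and collapsed "common prefixes".
    contents, common = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Collapse deeper keys into a single pseudo-directory entry.
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            contents.append(key)
    return sorted(contents), sorted(common)


keys = ['logs/2016/a.gz', 'logs/2016/b.gz', 'logs/readme.txt', 'other.txt']
assert list_prefix(keys, 'logs/') == (['logs/readme.txt'], ['logs/2016/'])
```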
@jseabold, I'm sorry, I still can't reproduce your situation, where you can't even list some bucket/prefix. The link is about user-level access policies, but I don't have the privileges to create those. Is there any chance we can set up a little bucket which encapsulates this problem for testing? Have you checked whether my changes above solve the problem for you?
This didn't work on the first thing I tried, but there's a lot going on here, so I'm trying to pin down why. Fundamentally, there's something I don't yet understand about how you're instantiating the client. This works
This doesn't
It's because you're calling ls('') in the instantiation to make sure that everything worked, I guess. I can't do this, but I still need to sign my requests.
Does this list and cache the entire bucket on instantiation? Again, I'm still trying to understand what's going on here, but, if so, that can be a super expensive operation. Assuming that's right, I would start from the premise that everyone has a key-value store with hundreds of thousands of keys (if not more) and err on the side of avoiding listing those at all costs.
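One way to avoid listing a huge bucket up front is to fetch and cache listings per prefix, only on first access. A sketch of that idea, assuming a caller-supplied `fetch(prefix)` callable that would do the real (paginated) S3 listing; the class and names are illustrative, not the project's code:

```python
class PrefixCache:
    """Cache listings per prefix, fetched lazily on first access.

    `fetch` is a caller-supplied callable doing the real S3 listing;
    nothing is listed at construction time.
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}
        self.calls = 0  # instrumentation for the demo below

    def ls(self, prefix):
        if prefix not in self._cache:
            self.calls += 1
            self._cache[prefix] = self._fetch(prefix)
        return self._cache[prefix]


# Demo with a fake fetcher; no listing happens at construction.
cache = PrefixCache(lambda p: [p + 'a', p + 'b'])
assert cache.calls == 0                    # nothing listed eagerly
assert cache.ls('x/') == ['x/a', 'x/b']
assert cache.ls('x/') == ['x/a', 'x/b']
assert cache.calls == 1                    # second ls served from cache
```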
Other than that, yes, this seems to fix my issues in some quick checks. Thanks! Will let you know if I run into anything else.
Oh! Thank you for finding the root cause. That can certainly be avoided - it exists for the …
I don't follow this. It seems like a heavy way to check whether credentials are valid. Given that the check will happen later anyway, why check at all? If you really want to handhold, you could fail gracefully later and tell users to check their credentials or use …. If you attach the session, users can call S3FileSystem.session.get_credentials() and look themselves. I don't see how to get these from the service clients.
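The "fail gracefully later" idea could look like wrapping the first real request and translating an auth error into an actionable message. The exception names below are placeholders, not botocore's real hierarchy, and `call_s3` is a hypothetical helper:

```python
class AccessDenied(Exception):
    # Placeholder for whatever auth error the underlying client raises.
    pass


def call_s3(operation, *args):
    # Defer any credential check to the first real request; if it fails
    # with an auth error, point the user at their credentials instead of
    # probing eagerly with an expensive ls('') at construction time.
    try:
        return operation(*args)
    except AccessDenied as exc:
        raise PermissionError(
            "S3 request failed (%s); check your credentials, e.g. via "
            "session.get_credentials(), or connect anonymously" % exc
        ) from exc


def denied(_bucket):
    # Fake operation standing in for a real S3 call that gets a 403.
    raise AccessDenied("403")


try:
    call_s3(denied, 'mybucket')
except PermissionError as exc:
    assert 'credentials' in str(exc)
```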
Excellent idea! Also, you may have seen in #48 that there may be a simpler way coming to access single files one at a time (so you don't get filesystem methods, but you do get file-like behaviour, buffering, etc.).
lgtm |
Fixes #38
@jseabold , please see if this fixes things for you.