-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-19681: Fix S3A failing to initialize S3 buckets having namespace with dot followed by number #7942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
Test |
@steveloughran : Could you please review the changes. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this complicates things. I've been happily treating all issues related to buckets with . in them as WONTFIX. Storediag tells people off too.
- make sure that FileSystem.get(URI, config) always returns the same instance from cache (and a different one from that with a a .2 as a suffix)
- could this be moved to a unit test, rather than an ITest?
- you will need to add something in the documentation, especially about path style access -and warn that support is not guaranteed across other versions/applications.
Configuration config = new Configuration(); | ||
Path path = new Path("s3a://test-bucket-v1.1"); | ||
try (FileSystem fs = path.getFileSystem(config)) { | ||
assertThat(fs instanceof S3AFileSystem) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use whatever assertj assertion is about instanceof, so you get a better error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
@steveloughran, in HADOOP-17241, you referenced this announcement as one of the reasons to not support this buckets with dot in the name: https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/ However, AWS have since revised their stance. AWS has confirmed they will continue supporting buckets with dots in their names through virtual hosted-style URLs due to customer feedback and compatibility requirements.
So I guess it makes sense to add support for it in S3A. |
…ce with dot followed by number
ea1308a
to
5014c32
Compare
Thanks @steveloughran for the review. I have addressed your comments. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Description of PR
S3A fails to initialize when S3 bucket namespace is having dot followed by a number.
Specific Problem: URI parsing fails when S3 bucket names contain a dot followed by a number (like bucket-v1.1-us-east-1). Java's
URI.getHost() method incorrectly interprets the dot-number pattern as a port specification, causing it to return null.
How was this patch tested?
Tested in us-east-1 with bucket having namespace with dot followed by a number.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?