-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-18565. Completes outstanding items for the SDK V2 upgrade. #5421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-18565. Completes outstanding items for the SDK V2 upgrade. #5421
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
I added
to hadoop-tools/hadoop-aws/dev-support/findbugs-exclude.xml, but that hasn't suppressed the spotbug for some reason. not sure what I did wrong.. |
it looks exactly the same as the one above, doesn't it? Sometimes with findbugs is simplest to just surrender: create some setter which is synchronized and set it through that. If you look at BTW, if you don't know already, it is possible to run the findbugs on the command line. I normally only do this and the checkstyle calls when trying to fix things they've reported through yetus
|
couldn't get |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good, if spotbugs is happy, minor nits and then its merge time
transferManager = clientFactory.createS3TransferManager(s3AsyncClient); | ||
} | ||
|
||
// set in synchronized method to suppress spotbugs error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- give it a name like
createS3AsyncClient
as it is more than a simple setter - comment should be a javadoc so IDEs preview it.
💔 -1 overall
This message was automatically generated. |
spotbugs is happy; deprecation warnings file...all that is left is to get the name of the new method right |
💔 -1 overall
This message was automatically generated. |
ok, let's give up on spotbugs. this is one of its enternal losing battles, right? |
here's what I propose: merge then fix |
sounds good. I've to follow up with a few other fixes anyway. For fixing the spotbugs issue, do you mean reverting the changes made for spotbugs (moving to a separate method)? |
new method is fine, we just need to add in whatever shuts spotbugs up. I would have hoped that was it |
oh, and based some other changes, spotbugs warning about extant spotbugs doesn't mean you've added a new one, just that there is one to get rid of. yetus can still warn there even if the pr fixes it, which is what you have just done. |
I'm going to merge this in and then play with myself; see if there are any more final-final-final changes to worry about. then pull into trunk. |
sounds good. I'm currently working on rebasing, there are couple of things that went into trunk for MPU which are causing conflicts, I should be able to complete the rebase and test by tomorrow. also need to open a PR with a couple of outstanding issues mentioned in this PR and the S3 encryption client, that can also be done after trunk merge though |
so you want me to merge this as is? oh, here's my list for the next iteration
|
did you mean to trunk or somewhere else? I was thinking I'll complete the rebase and we can update this feature branch, and then I'll open a PR against trunk to initiate the merge process. Fix any yetus issues and also open a PR for the above list + S3 encryption client. Will we be ok to merge to trunk even though there is a regression in the rename() performance? The SDK team is looking into implementing multipart copy for the java async client in the TM which should solve that. |
…pache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail
…pache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail
sorry, i'm creating confusion. I was going to merge this into the feature branch. I hadn't realised that the sdk was going to hurt rename; yes, that would be an issue. I think if we do add our implementation, we just need the ability to disable it and fall back to the single COPY in case it turns out that we've gone and broken it... |
did a local build and test with -Dparallel-tests -DtestsThreadCount=8 -Dscale against s3 london; failures in ITestS3ABucketExistence were seen, but I've not looked at the details there. |
…pache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail
See aws_sdk_v2_changelog.md for details. Co-authored-by: Ahmar Suhail <[email protected]> Co-authored-by: Alessandro Passaro <[email protected]> HADOOP-18073. Address review comments. (apache#31) addresses review comments + yetus errors Co-authored-by: Ahmar Suhail <[email protected]> Move MultiObjectDeleteException to impl Reinstate old constants Move TransferManager initialization to ClientFactory Add unit tests for BlockingEnumeration Add unit tests for SelectEventStreamPublisher updates new providers in TestS3AAWSCredentialsProvider to V2 update GET range referrer header logic to V2 adds in unit check for bytes HADOOP-18565. Complete outstanding items for the AWS SDK V2 upgrade. (apache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail fixes issues after rebase Change-Id: I1d4b658979c725a88f6b54832bb7d106ffc2347d
See aws_sdk_v2_changelog.md for details. Co-authored-by: Ahmar Suhail <[email protected]> Co-authored-by: Alessandro Passaro <[email protected]> HADOOP-18073. Address review comments. (apache#31) addresses review comments + yetus errors Co-authored-by: Ahmar Suhail <[email protected]> Move MultiObjectDeleteException to impl Reinstate old constants Move TransferManager initialization to ClientFactory Add unit tests for BlockingEnumeration Add unit tests for SelectEventStreamPublisher updates new providers in TestS3AAWSCredentialsProvider to V2 update GET range referrer header logic to V2 adds in unit check for bytes HADOOP-18565. Complete outstanding items for the AWS SDK V2 upgrade. (apache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail fixes issues after rebase Change-Id: I1d4b658979c725a88f6b54832bb7d106ffc2347d
See aws_sdk_v2_changelog.md for details. Co-authored-by: Ahmar Suhail <[email protected]> Co-authored-by: Alessandro Passaro <[email protected]> HADOOP-18073. Address review comments. (apache#31) addresses review comments + yetus errors Co-authored-by: Ahmar Suhail <[email protected]> Move MultiObjectDeleteException to impl Reinstate old constants Move TransferManager initialization to ClientFactory Add unit tests for BlockingEnumeration Add unit tests for SelectEventStreamPublisher updates new providers in TestS3AAWSCredentialsProvider to V2 update GET range referrer header logic to V2 adds in unit check for bytes HADOOP-18565. Complete outstanding items for the AWS SDK V2 upgrade. (apache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail
See aws_sdk_v2_changelog.md for details. Co-authored-by: Ahmar Suhail <[email protected]> Co-authored-by: Alessandro Passaro <[email protected]> HADOOP-18073. Address review comments. (apache#31) addresses review comments + yetus errors Co-authored-by: Ahmar Suhail <[email protected]> Move MultiObjectDeleteException to impl Reinstate old constants Move TransferManager initialization to ClientFactory Add unit tests for BlockingEnumeration Add unit tests for SelectEventStreamPublisher updates new providers in TestS3AAWSCredentialsProvider to V2 update GET range referrer header logic to V2 adds in unit check for bytes HADOOP-18565. Complete outstanding items for the AWS SDK V2 upgrade. (apache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail
See aws_sdk_v2_changelog.md for details. Co-authored-by: Ahmar Suhail <[email protected]> Co-authored-by: Alessandro Passaro <[email protected]> HADOOP-18073. Address review comments. (apache#31) addresses review comments + yetus errors Co-authored-by: Ahmar Suhail <[email protected]> Move MultiObjectDeleteException to impl Reinstate old constants Move TransferManager initialization to ClientFactory Add unit tests for BlockingEnumeration Add unit tests for SelectEventStreamPublisher updates new providers in TestS3AAWSCredentialsProvider to V2 update GET range referrer header logic to V2 adds in unit check for bytes HADOOP-18565. Complete outstanding items for the AWS SDK V2 upgrade. (apache#5421) Changes include * use bundled transfer manager * adds transfer listener to upload * adds support for custom signers * don't set default endpoint * removes v1 sdk bundle, only use core package * implements region caching + many more Note: spotbugs is warning about inconsistent synchronization in accessing a new s3a FS field. This will be fixed in a follow-up patch. Contributed by Ahmar Suhail
Description of PR
Jira: https://issues.apache.org/jira/browse/HADOOP-18565
This PR:
aws-java-sdk-core
nowSIGNING_ALGORITHM_STS
identifier, I don’t think this was getting picked up correctly until now.getRegion()
logic. Also, if no endpoint is specified, don’t set the default endpoint. Let the SDK figure it out from the region.S3ABlockOutputStream
.S3AUtils
to use generics, so it can be used by V1 Cred provider, V2 cred provider, and signers.copyFile()
. No meaningful way to add these in V2, as it the transfer listener no longer tracksTransfer_Part_Completed
. Only bytes transferred. (We could maybe check if bytes transferred == partSize, but not sure if we want to do this.)getRegion()
logic, which will throw anUnknownStoreException
, we no longer need this test. It currently fails, because it tries to intialise a FS for a bucket which does not exist.Still TODO
fs.s3a.audit.execution.interceptors
. However can also be added using Global interceptors (however this affects the order in which they are executed).How was this patch tested?
Tested in
eu-west-1
by running mvn -Dparallel-tests -DtestsThreadCount=16 clean verify.All tests pass now.
Note, to get all tests to pass, we should now configure our config with the bucket specific region. Eg:
This is so tests that rely on public buckets don't fail. They will do a bucket probe to get the bucket region and configure the region. If the region was set using
fs.s3a.endpoint.region
, tests which use public buckets will also end up using this region, and since there is no cross region support, they end up failing.