
Conversation

davidh44
Contributor

@davidh44 commented Jun 26, 2023

Motivation and Context

Implements synchronous streaming request compression.

The ContentStreamProvider of the original RequestBody is wrapped by CompressionContentStreamProvider, which wraps the underlying InputStream with AwsCompressionInputStream. AwsCompressionInputStream overrides read(byte[] b, int off, int len) to return compressed chunks.
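
For illustration only, a minimal sketch of the wrapping described above. The class name, chunk size, and per-chunk gzip framing are assumptions for this sketch; the real AwsCompressionInputStream is more involved and may add additional chunk framing not shown here.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.GZIPOutputStream;

    final class GzipChunkingInputStream extends InputStream {
        private static final int CHUNK_SIZE = 128 * 1024; // illustrative chunk size
        private final InputStream source;
        private byte[] compressed = new byte[0];
        private int pos;

        GzipChunkingInputStream(InputStream source) {
            this.source = source;
        }

        @Override
        public int read() throws IOException {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
        }

        // Serves compressed bytes, refilling one compressed chunk at a time.
        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (pos >= compressed.length && !refill()) {
                return -1; // underlying stream exhausted
            }
            int n = Math.min(len, compressed.length - pos);
            System.arraycopy(compressed, pos, b, off, n);
            pos += n;
            return n;
        }

        // Reads the next block from the source and gzips it as one member;
        // concatenated gzip members still form a valid gzip stream.
        private boolean refill() throws IOException {
            byte[] chunk = new byte[CHUNK_SIZE];
            int n = source.read(chunk);
            if (n == -1) {
                return false;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
                gzip.write(chunk, 0, n);
            }
            compressed = out.toByteArray();
            pos = 0;
            return true;
        }
    }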

Modifications

  • Added new class AwsCompressionInputStream to compress requests in chunks
  • Added overloaded compress() methods to Compressor and GzipCompressor to compress byte[] and ByteBuffer (see the sketch after this list)
  • Renamed DecodedStreamBuffer to UnderlyingStreamBuffer for more general use
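
As referenced above, a hedged sketch of what the compress() overloads could look like; the exact signatures in the PR may differ:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.nio.ByteBuffer;
    import java.util.zip.GZIPOutputStream;

    interface Compressor {
        byte[] compress(byte[] content);

        // The ByteBuffer overload can delegate to the byte[] overload.
        default ByteBuffer compress(ByteBuffer content) {
            byte[] bytes = new byte[content.remaining()];
            content.duplicate().get(bytes);
            return ByteBuffer.wrap(compress(bytes));
        }
    }

    final class GzipCompressor implements Compressor {
        @Override
        public byte[] compress(byte[] content) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
                gzip.write(content);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return out.toByteArray();
        }
    }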

Testing

Added tests.

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@davidh44 requested a review from a team as a code owner June 26, 2023 20:53
@joviegas
Contributor

Could you please update Motivation and Context with the specs, the why, and the benefits of chunked encoding for gzip?

@davidh44
Contributor Author

> Could you please update Motivation and Context with the specs, the why, and the benefits of chunked encoding for gzip?

Updated to add a description.

Contributor

@joviegas left a comment


Comment on lines 99 to 101
CompressionContentStreamProvider streamProvider =
new CompressionContentStreamProvider(requestBody.contentStreamProvider(), compressor);
return Optional.of(RequestBody.fromContentProvider(streamProvider, requestBody.contentType()));
Contributor


It looks like this enables chunked encoding? Are we missing the Transfer-Encoding: chunked header? Are we making sure that there is no Content-Length header?

Contributor Author


Adding the Transfer-Encoding: chunked header.
Modeling tools should throw an error for operations with both streaming and requiresLength, but we will remove the Content-Length header just in case.
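
A sketch of the header adjustment being described; the helper is an assumption rather than the PR's exact code, though SdkHttpRequest.Builder does expose putHeader and removeHeader:

    import software.amazon.awssdk.http.SdkHttpRequest;

    final class ChunkedHeaders {
        // Mark the request as chunked and drop any Content-Length,
        // since the compressed length is not known up front.
        static SdkHttpRequest markAsChunked(SdkHttpRequest request) {
            return request.toBuilder()
                          .putHeader("Transfer-Encoding", "chunked")
                          .removeHeader("Content-Length")
                          .build();
        }
    }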

input = updateContentEncodingHeader(input, compressor);
return updateContentLengthHeader(input);
// non-streaming OR transfer-encoding chunked
if (!isStreaming(context) || isTransferEncodingChunked(input)) {
Contributor


In which case do we use isTransferEncodingChunked for non-streaming? I was assuming that Transfer-Encoding: chunked is always for streaming.
How is isTransferEncodingChunked() set on the input? Is it set by the user or on the model itself?

Contributor Author


Correct, TE: chunked is only for streaming. This is checking for the two separate cases, not the combined case.

The specification states that when the request is already TE: chunked, we should compress the entire stream first, instead of compressing in chunks.

It is set in the marshallers: AbstractStreamingRequestMarshaller and JsonProtocolMarshaller.
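
Spelled out as a sketch (the parameters are stand-ins for the interceptor's checks on the context and the input request):

    final class CompressionDecision {
        // Mirrors the condition in the snippet above.
        static boolean compressEntirePayload(boolean isStreaming, boolean isTransferEncodingChunked) {
            // Non-streaming requests are compressed in full; so are streaming
            // requests that are already Transfer-Encoding: chunked, per the
            // specification quoted above. Everything else is compressed in chunks.
            return !isStreaming || isTransferEncodingChunked;
        }
    }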

@davidh44 changed the base branch from feature/master/request-compression to master July 6, 2023 16:15
@davidh44 changed the base branch from master to feature/master/request-compression July 6, 2023 16:15
@joviegas
Contributor

joviegas commented Jul 6, 2023

https://sonarcloud.io/component_measures?id=aws_aws-sdk-java-v2&pullRequest=4135&metric=new_coverage&view=list
It is showing only 23%; can we increase this to 80% and update the test cases to cover the new code?

@davidh44
Contributor Author

davidh44 commented Jul 6, 2023

> https://sonarcloud.io/component_measures?id=aws_aws-sdk-java-v2&pullRequest=4135&metric=new_coverage&view=list It is showing only 23%; can we increase this to 80% and update the test cases to cover the new code?

This was run before the latest commits. Looks like a GitHub issue, as the commits were missing from the PR (had to change the base branch to fix). Retried the checks through CodeBuild; currently running.

@davidh44 closed this Jul 6, 2023
@davidh44 reopened this Jul 6, 2023

long contentLength = Long.parseLong(input.firstMatchingHeader("Content-Length").orElse("0"));
return contentLength >= minimumCompressionThreshold;
int requestSize = SdkBytes.fromInputStream(input.contentStreamProvider().newStream()).asByteArray().length;
Contributor


We are reading the entire stream here to get the request size.
We should avoid intermediate reading of the streams, since it can cause a performance impact. Can we discuss this with the team?

Contributor Author


The Content-Length header isn't set yet for non-streaming operations. Not sure how we would get the content length other than by reading the stream; can discuss with the team.

Contributor Author


Updated the method to check for the Content-Length header first and, if it is not present, read the stream to determine the length.
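
A sketch of the updated check, with simplified stand-in types (the SDK code operates on the request object instead):

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Optional;

    final class CompressionThreshold {
        static boolean exceedsThreshold(Optional<String> contentLengthHeader,
                                        InputStream freshPayloadStream,
                                        long minimumCompressionThreshold) throws IOException {
            long size;
            if (contentLengthHeader.isPresent()) {
                // Cheap path: trust the Content-Length header when present.
                size = Long.parseLong(contentLengthHeader.get());
            } else {
                // Fallback: drain a fresh stream to measure the payload.
                size = 0;
                byte[] buf = new byte[8192];
                int n;
                while ((n = freshPayloadStream.read(buf)) != -1) {
                    size += n;
                }
            }
            return size >= minimumCompressionThreshold;
        }
    }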

}

@Override
public int hashCode() {
Contributor

@joviegas Jul 9, 2023


We can improve the coverage of newly added classes by adding an EqualsVerifier test; this will make sure equals and hashCode are tested.
(Generic comment for newly added classes as part of this feature.)

import nl.jqno.equalsverifier.EqualsVerifier;
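
The shape of the suggested test, using one of the new classes named in the reply below; depending on a class's fields, EqualsVerifier may need extra configuration such as suppressed warnings:

    import nl.jqno.equalsverifier.EqualsVerifier;
    import org.junit.jupiter.api.Test;

    class RequestCompressionConfigurationTest {
        @Test
        void equalsHashCodeContract() {
            // Verifies the equals/hashCode contract for the class under test.
            EqualsVerifier.forClass(RequestCompressionConfiguration.class).verify();
        }
    }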

Contributor Author


Added tests for CompressionType and RequestCompressionConfiguration.

@sonarqubecloud

SonarCloud Quality Gate failed.

Bugs: 13 (rating C)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 45 (rating A)

Coverage: 70.3%
Duplication: 4.3%


@davidh44 mentioned this pull request Jul 21, 2023
@davidh44 closed this Jul 21, 2023
@davidh44 deleted the hdavidh/sync-streaming-compression branch August 29, 2023 23:44
aws-sdk-java-automation added a commit that referenced this pull request Jul 18, 2025

Pull request: release <- staging/06b7e1ba-f147-418e-94ce-8504bbc94a63