Description
Checkboxes for prior research
- I've gone through the Developer Guide and API reference.
- I've checked AWS Forums and StackOverflow.
- I've searched for previous similar issues and didn't find any solution.
Describe the bug
Use case: we have a folder on disk whose contents we want to stream into a tar stream and then stream that to S3. For performance reasons we don't want to create the tar file on disk. We want to upgrade our AWS S3 SDK from 3.282.0 to 3.782.0 to take advantage of a newer S3 feature where checksums are calculated for us, which is a requirement for enabling Object Lock by default. Currently we do not have Object Lock enabled by default.
We now get "Unable to calculate hash for flowing readable stream" errors, which seem to come from https://github.com/smithy-lang/smithy-typescript/blob/2d5a06af634c243bf2469566176dd17afedb1058/packages/hash-stream-node/src/readableStreamHasher.ts#L11C22-L11C38
New logic:
- First create a tar stream from the data on disk.
- Create a buffered transform (needed because with the latest AWS SDK version you cannot push chunks smaller than 8 KB).
- Use pipeline to push the data to S3.
Old logic that works with older versions of the SDK (sketched below):
- First create a tar stream from the data on disk.
- Use a PassThrough and pipeline to push the data to S3.
Old logic that doesn't work with the newer version of the SDK:
- Generates an "Only the last chunk is allowed to have a size less than 8192 bytes" error when, instead of using the custom transform, we simply use a PassThrough.
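For reference, a minimal sketch of the old PassThrough-based approach; the function and parameter names are assumptions, not the original code:

import { PassThrough } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import type { Pack } from 'tar-stream'; // assuming the Pack type comes from tar-stream

// Hypothetical sketch of the old approach: pipe the tar Pack stream through a
// PassThrough that is used as the PutObjectCommand Body.
async function uploadTarViaPassThrough(
  client: S3Client, // assumed to be configured elsewhere
  tarStream: Pack, // tar stream built from the folder on disk
  size: number, // total tar size, assumed to be known up front
  bucket: string,
  key: string
): Promise<void> {
  const passThrough = new PassThrough();
  const sendPromise = client.send(
    new PutObjectCommand({ Bucket: bucket, Key: key, Body: passThrough, ContentLength: size })
  );
  await Promise.all([pipeline(tarStream, passThrough), sendPromise]);
}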
Regression Issue
- Select this option if this issue appears to be a regression.
SDK version number
"@aws-sdk/client-s3": "3.782.0"
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
node version: v20.17.0
Reproduction Steps
import { Transform, TransformCallback } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { PutObjectCommand, PutObjectCommandInput } from '@aws-sdk/client-s3';
import { Pack } from 'tar-stream'; // assuming the Pack type comes from tar-stream

// the min amount of data to send to S3 at once
const CHUNK_SIZE = 8 * 1024;

export class BufferedTransform extends Transform {
  private _bufferChunks: Buffer[] = [];
  private _bufferLength = 0;

  constructor(startPaused = true) {
    super();
    if (startPaused) {
      this.pause(); // Start paused to avoid immediate flowing
    }
  }

  _transform(chunk: Buffer, _encoding: BufferEncoding, callback: TransformCallback): void {
    // Defensive copy of the incoming chunk for safety; I don't want to assume the caller
    // will not modify or destroy it before we push it
    const safeChunk = Buffer.from(chunk);
    this._bufferChunks.push(safeChunk);
    this._bufferLength += safeChunk.length;
    if (this._bufferLength >= CHUNK_SIZE) {
      this.pushDataAndClearBuffer();
    }
    callback();
  }

  override resume(): this {
    // logging_line1
    return super.resume();
  }

  _flush(callback: TransformCallback): void {
    this.pushDataAndClearBuffer();
    callback();
  }

  // a helper function that pushes and resets the buffer
  pushDataAndClearBuffer(): void {
    if (this._bufferLength > 0) {
      const combined = Buffer.concat(this._bufferChunks, this._bufferLength);
      // logging_line2
      if (!this.push(combined)) {
        // Pause the stream if the downstream is not ready
        this.once('drain', () => this.resume());
        this.pause();
      }
      // Reset internal buffer state
      this._bufferChunks.length = 0;
      this._bufferLength = 0;
    }
  }
}
  // Method of the uploader class that owns this.client (an S3 client configured elsewhere)
  private async uploadSingleFile(
    stream: Pack,
    size: number,
    params: Omit<PutObjectCommandInput, 'Body'>
  ): Promise<void> {
    const bufferedStream = new BufferedTransform(true);
    const sendCommandPromise = this.client.send(
      new PutObjectCommand({ ...params, Body: bufferedStream, ContentLength: size })
    );
    /*
    if (stream.readableFlowing === true) {
      logging_line3
    } else {
      logging_line4
    }
    if (bufferedStream.readableFlowing === true) {
      logging_line5
    } else {
      logging_line6
    }
    */
    const pipelinePromise = pipeline(stream, bufferedStream);
    /*
    if (stream.readableFlowing === true) {
      logging_line7
    } else {
      logging_line8
    }
    if (bufferedStream.readableFlowing === true) {
      logging_line9
    } else {
      logging_line10
    }
    */
    await Promise.all([pipelinePromise, sendCommandPromise]);
  }
Observed Behavior
Error "Unable to calculate hash for flowing readable stream" is generated with the exception
Expected Behavior
Be able to upload to S3 using streams and pipeline.
Possible Solution
No response
Additional Information/Context
I added logging in many places, with these results:
- The logging line added to override resume() (logging_line1) is never logged.
- The logging line added to pushDataAndClearBuffer (logging_line2) is never logged.
- The logging added to uploadSingleFile indicates that readableFlowing is never true, neither before nor after the pipeline call.
Activity
aBurmeseDev commented on May 7, 2025
Hey @CWMark - thanks for reaching out.
To resolve this error, "Only the last chunk is allowed to have a size less than 8192 bytes", add a client config like the one sketched below. Refer to this comment for the culprit: #6859 (comment)
Hope that helps!
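A minimal sketch of the kind of client config being referred to, assuming it is the flexible-checksums setting discussed in #6859 (the exact snippet from the comment may differ):

import { S3Client } from '@aws-sdk/client-s3';

// Sketch (assumption): only calculate request checksums when the operation requires them,
// which avoids the aws-chunked encoding that enforces the 8192-byte minimum chunk size.
const client = new S3Client({
  region: 'us-east-1', // hypothetical region
  requestChecksumCalculation: 'WHEN_REQUIRED',
});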
CWMark commented on May 7, 2025
Hi @aBurmeseDev, thanks for the response.
I am aware of that setting to allow chunks of less than 8192 bytes. However, if I use it and revert to my original code (no use of my buffering class BufferedTransform) for the function that handles small files, then I get a lot more errors, all of which so far seem to be "An error was encountered in a non-retryable streaming request". This seems to be the same thing reported in #6770.
Another thing I tried was to keep the use of my buffering class BufferedTransform as-is, but change the function that handles small files to delay the pipeline, in the hope of giving AWS time to set up the checksum:
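A hypothetical illustration of that variant; the delay duration and placement are my assumptions, not the original change:

  // Hypothetical variant of uploadSingleFile with a delay before the pipeline is wired up
  private async uploadSingleFile(
    stream: Pack,
    size: number,
    params: Omit<PutObjectCommandInput, 'Body'>
  ): Promise<void> {
    const bufferedStream = new BufferedTransform(true);
    const sendCommandPromise = this.client.send(
      new PutObjectCommand({ ...params, Body: bufferedStream, ContentLength: size })
    );
    // Hypothetical delay: give the SDK a moment to set up its checksum/hash handling
    // on the Body stream before the pipeline starts feeding it data.
    await new Promise((resolve) => setTimeout(resolve, 100));
    await Promise.all([pipeline(stream, bufferedStream), sendCommandPromise]);
  }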
This gives fewer errors; however, I have still seen some "An error was encountered in a non-retryable streaming request" errors and one "Unable to calculate hash for flowing readable stream".
Any suggestions/ideas? It really seems like the latest AWS S3 SDK is having issues with streaming data.
DMCTowns commented on Jul 23, 2025
I'm also getting the "Unable to calculate hash for flowing readable stream" error when trying to stream the response from a fetch request to S3:
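A hypothetical example of that pattern (the one that produces the error); the URL, bucket, and key are placeholders, not from the original comment:

import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

// Hypothetical: pass the fetch response body (a web ReadableStream) straight
// to S3 as the PutObjectCommand Body.
async function uploadFetchResponse(url: string, bucket: string, key: string): Promise<void> {
  const client = new S3Client({});
  const response = await fetch(url);
  if (!response.body) {
    throw new Error('response has no body');
  }
  await client.send(
    new PutObjectCommand({
      Bucket: bucket,
      Key: key,
      Body: response.body,
      ContentLength: Number(response.headers.get('content-length') ?? 0),
    })
  );
}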
marcesengel commented on Aug 4, 2025
@DMCTowns what I've found is that the typings are a little off, since they are shared between web and Node.js. Looking at https://github.com/smithy-lang/smithy-typescript/blob/b15137dba0091b26b3dd5d6efaac58545cf1c18a/packages/hash-stream-node/src/readableStreamHasher.ts#L11-L13, the field readableFlowing is checked to be null. This works for Node.js readable streams but not for web streams returned by fetch. This can be fixed by the following conversion.
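A sketch of such a conversion, assuming it is based on Node's Readable.fromWeb (my reconstruction, not the exact snippet from the comment):

import { Readable } from 'node:stream';
import type { ReadableStream as WebReadableStream } from 'node:stream/web';

// Hypothetical sketch: convert the web ReadableStream returned by fetch into a
// Node.js Readable, whose readableFlowing starts out as null, so the SDK's
// readableStreamHasher check passes.
async function fetchAsNodeReadable(url: string): Promise<Readable> {
  const response = await fetch(url);
  if (!response.body) {
    throw new Error('response has no body');
  }
  // The cast works around the web/Node typing mismatch mentioned above.
  return Readable.fromWeb(response.body as unknown as WebReadableStream);
}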
However, at that point uploading still fails for me, now with the warning "An error was encountered in a non-retryable streaming request." (see #6770), so for now I'm collecting the stream into a buffer and then sending that...
Edit: uploading works using @aws-sdk/lib-storage.