Allow Support for Uploading Byte Arrays and Strings in S3 TransferManager #964
Comments
To give some context, the application ingests information using Kafka/Spark Streaming, so I operate on an RDD to get my byte arrays. I then want to upload those byte arrays to S3. If I didn't need to create a temp file, the application would operate completely in memory, which would also make it more portable because I would only have to worry about memory in my production environment.
…tribution and fix the totsuki implementation to actually work (with better performance)
Hi @pradyuman, would wrapping your byte array in a |
The SDK also has |
Those are both good options, but the arrays are 100-200 MB in size, so I imagine that I would benefit from the multipart optimizations.
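For illustration, an in-memory array of that size can be cut into part-sized streams without copying the data, which is the shape a multipart upload needs. The following is a minimal sketch using only the JDK; `PartSlicer`, `slice`, and `partSize` are hypothetical names, not part of the SDK, and the part size is assumed to be chosen by the caller (S3 requires at least 5 MB for every part except the last).

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an SDK class: splits one in-memory buffer into
// per-part streams that could each back a single multipart upload part.
public class PartSlicer {

    /**
     * Returns one stream per part. Each stream reads a window of {@code data}
     * in place; the underlying array is shared, not copied.
     */
    public static List<ByteArrayInputStream> slice(byte[] data, int partSize) {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        List<ByteArrayInputStream> parts = new ArrayList<>();
        int offset = 0;
        while (offset < data.length) {
            // The last part may be shorter than partSize.
            int length = Math.min(partSize, data.length - offset);
            parts.add(new ByteArrayInputStream(data, offset, length));
            offset += length;
        }
        return parts;
    }
}
```

Because every stream has a known length (`available()` on a fresh `ByteArrayInputStream` returns the window size), the caller can set the content length of each part up front rather than buffering.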
I would be happy for an in-memory version of |
I would love to see this implemented. Has there been any progress?
No progress has been made. We are considering supporting InputStreams in TransferManager (which would make it easy to support other types such as strings and byte arrays), but that's quite a ways off; see aws/aws-sdk-java-v2#139. For now we recommend using an overload of put or get object (one takes/returns a string) and submitting to an executor if you need parallelization.
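The executor approach suggested above might look roughly like the sketch below. It uses only the JDK so that it runs on its own: `ParallelPut`, `Uploader`, and `uploadAll` are hypothetical names, and `Uploader.put` is a stand-in for whichever put-object overload you actually call, not an SDK type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not SDK code: fan in-memory payloads out to a thread pool.
public class ParallelPut {

    /** Stand-in for a put-object call (assumption; replace with the real client call). */
    public interface Uploader {
        void put(String key, byte[] body) throws Exception;
    }

    /** Uploads every entry in parallel and returns the number of completed uploads. */
    public static int uploadAll(Uploader uploader, Map<String, byte[]> items, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (Map.Entry<String, byte[]> item : items.entrySet()) {
                Callable<Void> task = () -> {
                    uploader.put(item.getKey(), item.getValue());
                    return null;
                };
                pending.add(pool.submit(task));
            }
            // Wait for every upload; the first failure is rethrown to the caller.
            for (Future<Void> f : pending) {
                f.get();
            }
            return pending.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while uploading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("upload failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass a lambda that wraps the real client, for example `(key, body) -> client.putObject(...)` with whatever arguments that overload expects. This parallelizes across objects only; it does not split a single large object into parts.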
Hey @pradyuman @mzapletal @rmilejcz, the SDK team has reviewed the feature request list for V1, and since they're concentrating their efforts on new V2 features, they decided not to implement this one in V1. It's still being considered for the TransferManager refactor in V2; see the referenced issue above. I'll go ahead and close this one. Please feel free to comment on the V2 issue with your use case, and reach out if you have further questions.
Currently, the only way to upload data to S3 via TransferManager is through an InputStream or a File. I'm creating a service that needs to upload data that is in memory. This means I need to write that Array[Byte] to a temp file and then upload that temp file, which introduces other variables that need to be tuned for performance (I now need to make sure the temporary file system is optimized for our use case). It would be great if I could just pass in the byte array (I can also convert it to a string if that's easier). This would mean I could keep everything in memory, and it would keep my performance benchmarking and production environment tuning simpler.
I'm sure this is a use case many other people may run into (or have already run into) so I think it's definitely worth taking a look at.