Blob trigger with large file results in timeout #42

Open
JonathanGiles opened this issue Jul 30, 2018 · 13 comments
Comments

@JonathanGiles
Member

I have a BlobTrigger Java function which, when deployed to Azure, I believe is timing out after 5 minutes because the function takes too long to complete. I say this because I see the first log message, Executing 'Functions.JavaDocBlobWatcher' (Reason='New blob detected: incoming/output.zip', Id=0f54292e-460e-4c2b-8e10-f9489fe2f34b), followed by a second message: Timeout value of 00:05:00 exceeded by function 'Functions.JavaDocBlobWatcher' (Id: '0f54292e-460e-4c2b-8e10-f9489fe2f34b'). Initiating cancellation.

The file is 38.1 MB and I believe it is all loaded into the byte array that the BlobTrigger is associated with. From a brief bit of testing I believe that large blobs of data take a very long time to be sent into the function, and I wonder whether something other than a byte[] should be used (e.g. some kind of input stream).

@brunoborges
Member

Timeout aside, an InputStream should be used anyway. There is no way to adjust the memory of the JVM used behind the scenes on Azure Functions, so an OutOfMemoryError could easily happen for large files.
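To make the byte[] versus stream point concrete, here is a minimal sketch in plain Java. It does not use the Azure Functions API (stream bindings are not available to Java functions per this thread); the class name BlobStreamSketch, the method countBytes, and the 8 KB buffer size are all assumptions chosen for illustration. It only shows that a consumer reading from an InputStream needs to hold one small buffer in memory, whatever the blob size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative only: NOT part of the Azure Functions Java API.
class BlobStreamSketch {

    // Hypothetical buffer size; only this much data is held in memory at once.
    static final int BUFFER_SIZE = 8 * 1024;

    /** Reads the stream in fixed-size chunks and returns the total number of bytes seen. */
    static long countBytes(InputStream in) {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        try {
            while ((read = in.read(buffer)) != -1) {
                // Real code would process buffer[0..read) here (unzip, hash, copy, ...).
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a blob. A real stream would be network-backed, so the
        // whole payload would never need to sit in a single array like this.
        byte[] fakeBlob = new byte[1_000_000];
        try (InputStream in = new ByteArrayInputStream(fakeBlob)) {
            System.out.println("bytes processed: " + countBytes(in));
        }
    }
}
```

With a byte[] parameter, the host has to materialise and transfer the entire blob before the function body runs; with a stream, memory use in the function is bounded by the buffer.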

@brunoborges
Member

Might be related: #23

@JonathanGiles
Member Author

I suspect #23 is related, but the file size isn't noted there. In my experience the time it takes to call the function is proportional to the blob file size, so I expect they are related. I would suggest that this is a critical issue that should be resolved before 1.0 is released.

@brunoborges
Member

I think this would require significant change and delay GA a lot.

My recommendation is to add a file size limit on the trigger side, or a warning, and figure this out in another release.

@JonathanGiles
Member Author

The issue with delaying is that it will then be a breaking API change. That's fine if it is understood and accepted.

As an aside, my use case is such that I want to be notified of a new file, but I don't actually need to be sent the byte array of its content (I can grab that myself). I wish there was a way to opt in to being sent the content (again, a stream would help here).

@pragnagopa
Member

We only allow 32 MB on the consumption plan (see Host/ScriptHost.cs#L76).
@JonathanGiles - can you try on a dedicated app?

@pragnagopa
Member

We are tracking stream support for out-of-process languages in Azure/azure-functions-host#1361, but not for GA.

@JonathanGiles
Member Author

That's an interesting insight, thanks for sharing Pragna. If that limit is to be hardcoded then we should definitely improve the messaging in the logs to clarify why the function has timed out (more precisely, why it was never even attempted), and I wonder whether we could have an API that allows the user to opt out of being sent the content. One approach I can imagine is an extra attribute on @BlobTrigger along the lines of sendContent, which is true by default and can be set to false, in which case the content array will be empty, but the trigger will handle any file size.
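A rough sketch of how the sendContent idea might look, in plain Java. Everything here is hypothetical: HypotheticalBlobTrigger is a stand-in annotation defined in the snippet, not the real @BlobTrigger, and the toy dispatch method is not how the Functions host works. It only illustrates the proposed contract, where content is delivered by default and an empty array is passed when the handler opts out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Illustrative only: none of these types exist in the Azure Functions Java library.
class SendContentSketch {

    // HYPOTHETICAL annotation modelling the proposed sendContent attribute.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface HypotheticalBlobTrigger {
        String path();
        boolean sendContent() default true; // content is sent unless the handler opts out
    }

    static class Handlers {
        // Wants only the notification; should receive an empty array.
        @HypotheticalBlobTrigger(path = "incoming/{name}", sendContent = false)
        public static int notifyOnly(byte[] content, String name) {
            return content.length;
        }

        // Default behaviour; receives the full content.
        @HypotheticalBlobTrigger(path = "incoming/{name}")
        public static int withContent(byte[] content, String name) {
            return content.length;
        }
    }

    /** Toy dispatcher: invokes the named handler and returns the content length it saw. */
    static int dispatch(String handlerName, byte[] blobBytes, String blobName) {
        try {
            Method handler = Handlers.class.getMethod(handlerName, byte[].class, String.class);
            HypotheticalBlobTrigger trigger = handler.getAnnotation(HypotheticalBlobTrigger.class);
            // Skip the payload entirely when the handler opted out.
            byte[] payload = (trigger != null && !trigger.sendContent()) ? new byte[0] : blobBytes;
            return (Integer) handler.invoke(null, payload, blobName);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] blob = new byte[1024]; // stand-in for blob content
        System.out.println("notifyOnly saw " + dispatch("notifyOnly", blob, "output.zip") + " bytes");
        System.out.println("withContent saw " + dispatch("withContent", blob, "output.zip") + " bytes");
    }
}
```

In a real host the benefit would come from never reading the blob at all when sendContent is false; the toy dispatcher above already has the bytes in hand and merely withholds them.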

@JonathanGiles
Member Author

Thanks for the stream support link - I'll keep an eye on it, as it seems like a big gap we should fill across all Functions SDKs.

@pragnagopa
Member

Yes. Correct. I will leave this issue open for improving the error message. This value is configurable for dedicated apps. Here is the issue for not reading the data from the blob: Azure/azure-functions-host#2698

@JonathanGiles
Member Author

Thanks - subscribed to that issue too!

@asavaritayal

Related to Azure/azure-functions-host#1361

@AspenForester

I'm seeing similar results with Azure PowerShell Functions, Blob triggers, and files as small as 4MB. Like others have mentioned, the content of the blob is useless to me. All I need is the trigger metadata. After that, if I want the content of the blob, I can go get it.


5 participants