
FileNotFoundError when accessing same file from multiple processes #8411

Closed
prateekjjw001 opened this issue Nov 3, 2023 · 9 comments
Labels
needs mcve https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

Comments

@prateekjjw001

What is your issue?

I am trying to access the same file using xr.load_dataset() from multiple processes. They all try to read it at the same time (or within 0.1 s of each other), but only the first process succeeds. The other processes fail with a generic "FileNotFoundError" even though the file exists. The file is written about 2-3 s before the processes read it. Is this expected? I initially suspected xr.open_dataset() to be the culprit, but replacing it with load_dataset() did not solve the issue.

The issue is sporadic and cannot be reproduced easily but it happens in our production process. Any suggestions please?

@prateekjjw001 prateekjjw001 added the needs triage Issue that has not been reviewed by xarray team member label Nov 3, 2023

welcome bot commented Nov 3, 2023

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@keewis
Collaborator

keewis commented Nov 3, 2023

Note that as far as I'm aware open_dataset itself is not designed to be thread-safe, which would mean that this kind of issue is to be expected (see #4100 for more details and suggestions).

That said, we might be able to improve the error message.

@prateekjjw001
Author

Thanks for your reply keewis.
The page on HDF thread safety mentions that HDF5 supports concurrent access to a single dataset from multiple processes, which is what I am trying to do. I am reading from separate processes, so shouldn't I be unaffected by this issue?

@max-sixty max-sixty added needs mcve https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports and removed needs triage Issue that has not been reviewed by xarray team member labels Nov 3, 2023
@max-sixty
Collaborator

Assuming they're actually different processes, I can't see what would be causing this on the xarray side — there shouldn't be any shared resources between different processes reading files.

An MCVE is somewhat necessary here to understand exactly what's happening.

@prateekjjw001
Author

Does it matter that multiple processes are running on the same server? Once we moved the jobs to different servers, the issue disappeared.

@max-sixty
Collaborator

Are they in different processes or the same process in different threads?

MCVE please!

@prateekjjw001
Author

I use Airflow and created 5 separate jobs on the same server. They all run at the same time. I do not create threads anywhere, so they are in different processes, right?

@max-sixty
Collaborator

I'm not sure how airflow works there...

I think to make progress here, we'd need a minimal example that reproduces the problem outside of airflow...
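
For reference, a reproduction attempt outside of airflow could look roughly like the sketch below. It is not taken from the reporter's setup: the helper names (`run_concurrent_readers`, `reproduce`), the reader one-liner, and the tiny example dataset are all hypothetical. It starts several independent Python interpreters that each call `xr.load_dataset()` on the same file and collects their exit codes and error output. It assumes xarray, numpy, and a netCDF backend are installed, and it may well not trigger the sporadic failure described above.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical reader snippet; replace it with whatever the production job does.
XARRAY_READER = "import sys, xarray as xr; xr.load_dataset(sys.argv[1])"


def run_concurrent_readers(path, n_procs=5, reader_code=XARRAY_READER):
    """Start n_procs separate Python processes that all read `path` at once.

    Returns a list of (returncode, stderr) tuples, one per process.
    """
    # Launch every process before waiting on any of them, so the reads overlap.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", reader_code, str(path)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.PIPE,
            text=True,
        )
        for _ in range(n_procs)
    ]
    results = []
    for proc in procs:
        _, err = proc.communicate()
        results.append((proc.returncode, err))
    return results


def reproduce(n_procs=5):
    """Write a small netCDF file, then read it from n_procs processes."""
    # Imported lazily so the helper above has no hard dependency on xarray.
    import numpy as np
    import xarray as xr

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.nc")
        xr.Dataset({"a": ("x", np.arange(10))}).to_netcdf(path)
        for returncode, err in run_concurrent_readers(path, n_procs):
            # Show only the last traceback line of each failing reader.
            last_line = err.strip().splitlines()[-1] if err.strip() else "ok"
            print(returncode, last_line)
```

Calling `reproduce()` in a loop, ideally on the same server and filesystem as the production jobs, would show whether any reader exits with a non-zero code and a `FileNotFoundError` in its traceback.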

@max-sixty
Collaborator

Closing as no MCVE, feel free to reopen with one

@max-sixty max-sixty closed this as not planned Dec 12, 2023
3 participants