v5+ corrupts data when using MPI IO indexed view #11917
Comments
@chhu thank you for the report, but I am afraid I will need a bit more information to identify the problem. I re-ran our internal test suite as well as the hdf5-1.12.2 testsuite this morning, and all tests that were passing with 4.1.5 are also passing with the current main branch. (One test from the hdf5 testsuite is failing and I can try to look into that over the weekend, but I double-checked that the same test is also failing with 4.1.5, so I don't think it's a regression.) Is there any chance you could create a small reproducer to debug the issue?
Just for documentation purposes: the one test that is failing from the hdf5-1.12.2 testsuite with Open MPI 4.1.5 is also failing when using the romio321 component, so it doesn't seem to be ompio-specific. In addition, the same test from the hdf5-1.14.2 testsuite is passing, so it was most likely a bug in the test itself, which has since been fixed. So at this point, unless you can provide more information or a reproducer, I have nothing that I can look into.
I understand that without reproducing code this is tough. Since it already involves complex domain decomposition, it would be hard for me to strip it down, and the simulation is unfortunately not open source. In the meantime I tried to compile and reproduce everything on a different OS / kernel / environment (Debian, no CUDA, no InfiniBand) and unfortunately succeeded, with the same results. What I could do is share the code responsible for the reads with you; at least you would see the MPI calls involved and their order. I use the same MPI_Type for reading and writing, which I assume cannot cause problems? I suspect a stack corruption on the MPI side, probably even caused by ill-formed parameters on my side. Is it possible to enable an MPI build with libasan (gcc -fsanitize=address)?
@chhu can you please ping me by email so we can communicate directly? I am happy to keep things confidential. Note, however, that the parallel I/O part is not part of my current job description, so I am doing it on the side in the evenings and on weekends. (If you cannot find my non-professional email, please ping me on LinkedIn and I will send it to you in the chat.) Do I understand your statement in the first paragraph correctly, that if you compile Open MPI without GPU support, your test passes? That would be interesting, because this is one area of the parallel I/O code that has changed between 4.1.x and 5.0.x, so this could be a hint. Regarding the address sanitizer, I have not used that with Open MPI yet. I did use valgrind with memcheck a few times, but it only makes sense if the issue is easily reproduced with few processes and in a reasonable amount of time (e.g. a few seconds); otherwise the output is overwhelming.
Thanks, just connected via LinkedIn, I hope that works. As I mentioned before, the bug is quirky: it seems to fail only when the view is switched back from the vector field to the scalar field. Maybe I will find some time to look at the OMPI sources and hunt the bug myself; it should be in MPI_File_read_all or MPI_File_set_view.
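(For illustration only, here is a minimal sketch of the access pattern described above, assuming two pre-built indexed file types. The function name, datatype handles, counts, and the zero displacements are hypothetical and not taken from the actual simulation code.)

```c
#include <mpi.h>

/* Hypothetical sketch: read a vector field, then switch the view back to
 * the scalar type and read a scalar field from the same file. The failure
 * described in this issue was reported after switching the view back like
 * this. Displacements are simplified to 0 for brevity. */
void read_vector_then_scalar(MPI_File fh,
                             MPI_Datatype vector_view, MPI_Datatype scalar_view,
                             double *vec_buf, int vec_count,
                             double *scal_buf, int scal_count)
{
    MPI_Status status;

    /* select the vector layout and collectively read one vector field */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, vector_view, "native", MPI_INFO_NULL);
    MPI_File_read_all(fh, vec_buf, vec_count, MPI_DOUBLE, &status);

    /* switch the view back to the scalar layout and read a scalar field */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, scalar_view, "native", MPI_INFO_NULL);
    MPI_File_read_all(fh, scal_buf, scal_count, MPI_DOUBLE, &status);
}
```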
@chhu has provided me with a reproducer directly, and I was able to identify the problem. As part of introducing support for file atomicity in 615f4ef, we tried to make the data-sieving routines more robust as well, and this introduced the issue identified in this ticket. Since support for file atomicity is not available in the 4.1.x series, the issue doesn't exist there. A patch is coming shortly. Thank you!
As part of introducing atomicity support for ompi v5.0, we also tried to improve the robustness of some file I/O routines. Unfortunately, this also introduced a bug: the ret_code returned by a function does not necessarily contain the total number of bytes read or written, but may only contain the value returned by the last call (e.g. 0). That value was, however, used in a subsequent calculation, and we ended up not copying any data out of the temporary buffer used in the data sieving at all. This commit also simplifies some of the logic in the while loop; there is no need to retry reading past the end of the file multiple times.

Fixes issue open-mpi#11917

The code was tested with the reproducer provided as part of the issue, our internal testsuite, and the hdf5-1.14.2 testsuite; all tests pass.

Signed-off-by: Edgar Gabriel <[email protected]>
(cherry picked from commit fb3b68f)
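(To make the bug class concrete, here is a simplified, invented sketch of the pattern described in the commit message. It is not the actual ompio data-sieving code, and all names are made up.)

```c
#include <string.h>
#include <unistd.h>

/* Read 'want' bytes from fd into the sieving buffer, then copy them to the
 * user buffer. Illustrates why the return code of the *last* read must not
 * be confused with the total number of bytes read. */
ssize_t sieve_and_copy(int fd, char *sieve_buf, char *user_buf, size_t want)
{
    size_t  total    = 0;   /* bytes actually read so far           */
    ssize_t ret_code = 0;   /* return value of the last read() only */

    while (total < want) {
        ret_code = read(fd, sieve_buf + total, want - total);
        if (ret_code <= 0)          /* EOF or error: no point in retrying */
            break;
        total += (size_t) ret_code;
    }

    /* Buggy variant: ret_code may be 0 (EOF) even though data was read,
     * so this would copy nothing out of the sieving buffer:
     *     memcpy(user_buf, sieve_buf, ret_code);
     * Correct variant: use the accumulated count. */
    memcpy(user_buf, sieve_buf, total);
    return (ssize_t) total;
}
```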
fixed with #11933
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
Reproduced with v5.0.x branch and main. Works in v4.x and 3.x.
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Build:
Subs:
Please describe the system on which you are running
Details of the problem
Our simulation uses two big indexed file types, one for scalar fields and one for vector fields. There are write and read operations for multiple scalars/vectors in a single file. Here are some details:
Sorry, I cannot provide demo code. However, I can relatively easily test / recompile this scenario from a different branch / version.
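(Since no reproducer could be shared, the following is only a hedged sketch of what constructing two indexed file types like the ones described might look like. The block layout, the use of MPI_Type_indexed, and the assumption of 3 components per vector entry are illustrative guesses, not the simulation's actual code.)

```c
#include <mpi.h>
#include <stdlib.h>

/* Hypothetical construction of two indexed file types: one for scalar
 * fields (one double per entry) and one for vector fields (assumed here
 * to hold 3 doubles per entry). */
void build_file_types(int nblocks, const int *blocklens, const int *displs,
                      MPI_Datatype *scalar_view, MPI_Datatype *vector_view)
{
    int *vec_lens   = malloc(nblocks * sizeof(int));
    int *vec_displs = malloc(nblocks * sizeof(int));

    /* scalar field: use the block layout as-is */
    MPI_Type_indexed(nblocks, blocklens, displs, MPI_DOUBLE, scalar_view);
    MPI_Type_commit(scalar_view);

    /* vector field: scale lengths and displacements by the component count */
    for (int i = 0; i < nblocks; i++) {
        vec_lens[i]   = 3 * blocklens[i];
        vec_displs[i] = 3 * displs[i];
    }
    MPI_Type_indexed(nblocks, vec_lens, vec_displs, MPI_DOUBLE, vector_view);
    MPI_Type_commit(vector_view);

    free(vec_lens);
    free(vec_displs);
}
```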
Best,
Christian