fcoll/two_phase: fix coverity errors #963
@edgargabriel Please review. This fixes a large number of issues identified by Coverity. Others remain, but their fixes are not as straightforward.
Fixed typos.
Test FAILed.
Why did you remove all the NULL-pointer checks before freeing the memory for all variables? I do not see any benefit in that, and in fact I am pretty sure that, e.g., in the dynamic segmentation algorithm they are required for correctness inside the while loop.
free (NULL) is completely safe in C. free already checks for NULL and returns if the pointer is NULL. With the check before free you end up evaluating the conditional twice in the common case (non-NULL pointer).
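To illustrate the point about free (NULL), here is a minimal sketch. The helper `release_buffer` and the driver `demo_release` are hypothetical names invented for this example and are not part of the PR. The sketch relies only on the C standard's guarantee that `free` takes no action when passed a null pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the PR): free a buffer and reset the pointer. */
void release_buffer(int **buf)
{
    assert(buf != NULL);  /* the handle itself must be valid */
    free(*buf);           /* free(NULL) is defined to do nothing */
    *buf = NULL;          /* a repeated call now frees NULL, not a dangling pointer */
}

/* Exercises the helper on a never-allocated and an allocated pointer.
 * Returns 1 when both pointers end up NULL. */
int demo_release(void)
{
    int *never_allocated = NULL;
    int *allocated = malloc(4 * sizeof *allocated);

    release_buffer(&never_allocated);  /* nothing was allocated: no-op */
    release_buffer(&allocated);        /* frees the block (or NULL if malloc failed) */
    release_buffer(&allocated);        /* second call is also a no-op */

    return never_allocated == NULL && allocated == NULL;
}
```

Note that dropping the NULL check is only safe for pointers that are either NULL or valid. A pointer that was never initialised, or that was already freed and not reset, still leads to undefined behaviour.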
OK, grabbing a copy of your repository and testing whether our tests still work.
The changes to the dynamic module seem to be OK; the two_phase module, however, now generates a segfault in a free statement.
gabriel@salmon:~/ompi-tests/mpi2basic_tests/file> mpirun --mca fcoll two_phase -np 6 ./filetest
[salmon:07724] *** End of error message ***
mpirun noticed that process rank 0 with PID 0 on node salmon exited on signal 11 (Segmentation fault).
@edgargabriel OK, I will double-check. Thanks for testing. I will let you know when I have determined what I did wrong.
I can also check, but it might be next week before I get to it. Thanks!
Fixes CIDs 72300, 72344, 1196764-1196768, 72300: Resource leaks

Multiple allocated arrays are going out of scope at the end of mca_fcoll_two_phase_file_write_all. Free these arrays. Also removed the extraneous NULL checks since free (NULL) is safe in C. Change returns to goto exit where the allocated resources are freed.

Fixes CIDs 72285-72292, 72297, 72298: Resource leaks

Change all appropriate return statements to goto exit to ensure that all resources are freed. Also removed the NULL checks since free (NULL) is safe in C.

Fixes CIDs 72295, 72296: Resource leaks

Moved free of requests and recv_types to after exit label. This will ensure these are freed on error. Also added a loop and statement to free send_buf which is going out of scope at the end of the function.

Fixes CIDs 72336-72240, 735197, 735198: Resource leaks

Moved the exit label to before the point where the resources are released and changed all appropriate return statements to goto exit. Also removed extraneous NULL checks because free (NULL) is safe in C.

Fixes CIDs 72341, 72343, 1196805-1196809: Resource leaks

Free all resources after exit label and change return statements to goto exit to ensure all resources are freed on error.

Fixes CID 1269973: Unused value

Check return code of ompi_request_wait_all. If it fails jump to the exit.

Fixes CID 714119: Dereference before NULL check

Wrong value checked in conditional.

Signed-off-by: Nathan Hjelm <[email protected]>
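The "change return statements to goto exit" approach described in this commit message can be sketched roughly as follows. This is an illustration only, not the actual two_phase code: `process`, `do_step`, `offsets`, and `lengths` are hypothetical names, and `do_step` merely stands in for a call whose return code must be checked (such as the wait on requests mentioned above).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a call that can fail. */
int do_step(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Allocates two arrays, runs a step, and releases everything on every path. */
int process(size_t n, int should_fail)
{
    int ret = 0;
    int *offsets = NULL;   /* initialised so free() at the label is always valid */
    int *lengths = NULL;

    assert(n > 0);         /* malloc(0) may legitimately return NULL */

    offsets = malloc(n * sizeof *offsets);
    if (NULL == offsets) {
        ret = -1;
        goto exit;
    }

    lengths = malloc(n * sizeof *lengths);
    if (NULL == lengths) {
        ret = -1;
        goto exit;         /* offsets is still released below */
    }

    ret = do_step(should_fail);   /* return code is checked, not discarded */
    if (0 != ret) {
        goto exit;
    }

    /* ... real work would go here ... */

exit:
    free(offsets);         /* no NULL checks needed: free(NULL) is a no-op */
    free(lengths);
    return ret;
}
```

Placing every free after a single label means a new early-exit path cannot forget a resource, provided each pointer is initialised to NULL before the first possible jump.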
Fixes CID 72320: Explicit NULL dereferenced

On error it is possible that the blocklen_per_process array is NULL. Change the NULL check before the free to check for non-NULL on the array not the array element. Also clean up allocation of this array to use calloc instead of malloc + setting each element to NULL.

Signed-off-by: Nathan Hjelm <[email protected]>
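A small sketch of the two ideas in this commit message: checking the array itself rather than an element before cleanup, and using calloc in place of malloc plus a NULL-filling loop. The functions `alloc_rows`, `free_rows`, and `demo_rows` are hypothetical and only mimic the shape of an array of per-process pointers such as blocklen_per_process. The sketch assumes a platform where a zero-filled pointer is a null pointer, which holds on mainstream systems.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates `count` row pointers; calloc zero-fills them, so no explicit
 * loop setting each element to NULL is needed (assumption: all-bits-zero
 * is a null pointer on the target platform). */
int **alloc_rows(size_t count)
{
    return calloc(count, sizeof(int *));
}

/* Frees every row and then the array; safe when rows itself is NULL. */
void free_rows(int **rows, size_t count)
{
    if (NULL != rows) {             /* check the array, not rows[0] */
        for (size_t i = 0; i < count; i++) {
            free(rows[i]);          /* rows never filled in are NULL: no-op */
        }
        free(rows);
    }
}

/* Returns 1 if all rows started out NULL, 0 if not, -1 if calloc failed. */
int demo_rows(void)
{
    size_t count = 4;
    int all_null = 1;
    int **rows = alloc_rows(count);

    if (NULL == rows) {
        return -1;
    }
    for (size_t i = 0; i < count; i++) {
        if (NULL != rows[i]) {
            all_null = 0;
        }
    }

    rows[1] = malloc(3 * sizeof(int));  /* populate only one row */
    free_rows(rows, count);             /* normal cleanup */
    free_rows(NULL, count);             /* error path: array never allocated */
    return all_null;
}
```

Testing `rows[0]` instead of `rows` would dereference a NULL array on the error path, which is the kind of defect the commit message describes.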
Found a typo in one of the changes. Fixed that error. Now I see another error, but that also happens without any of my changes.
You mean the error in the append test? I see that too; not sure what is triggering it, will check. It used to pass.
Seeing the error in the write shared test on my Mac:
So it actually mostly works for me. You just changed the free to my_req_per_proc, correct? Which sharedfp component is selected in your case: sm or lockedfile? (I don't have a Mac to test on, unfortunately.)
The write_shared test passes for both components on Linux, so I am not sure where to start looking into that. Can you post a complete traceback?
Output of rank 0 with -mca sharedfp_base_verbose 100: https://gist.github.com/hjelmn/9ffb5b42299058b89701
I looked at the file that you attached, and I think I have an idea of what might be going wrong on your system, but I will need access to a Mac to test it :-( Anyway, it is not related to this PR; it has to do with creating the shared memory region through mmap inside the sm component. So I think we should be good to go with this PR. I can open a new issue for the sm sharedfp component. +1