-
Notifications
You must be signed in to change notification settings - Fork 900
ompi/info: introduce support for the mpi_memory_alloc_kinds info object (II) #13055
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ompi/info: introduce support for the mpi_memory_alloc_kinds info object (II) #13055
Conversation
e91f38b
to
9fe1afe
Compare
As a side note, I do have a bunch of tests that I developed as part of the project, and we might need to find a good spot for where to put them. The tests do require some human interaction and the output is platform dependent, but its a good starting point in my opinion. For the World model:
and for Sessions:
|
Why does HCOLL need to be excluded? |
hcoll needs to be excluded if you compile with ROCm support, they don't seem to like each other |
return num_unique; | ||
} | ||
|
||
#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to leave this debug code in the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to keep the code in there, I am 99.9% sure that we will have use for that in the near future. Furthermore, there are other components/code parts that also have dump functionality in the code base. The #if 0
is just to get rid of compiler warning about unused static function
b495610
to
3314f1d
Compare
I have updated the PR based on the feedback that I received on the hybrid WG mailing list.
The third item, namely clarification of the behavior of info objects passed to |
f1d9f85
to
3f7fef1
Compare
ompi/communicator/comm.c
Outdated
newcomp->super.s_info = OBJ_NEW(opal_info_t); | ||
ompi_info_memkind_copy_or_set (&comm->instance->super, &newcomp->super, info); | ||
if (info) { | ||
newcomp->super.s_info = OBJ_NEW(opal_info_t); | ||
opal_info_dup(info, &(newcomp->super.s_info)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we dup the provided info before handling the memkind? Aren't we risking to overwrite the memkind info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are correct, I reverted the order
static void mca_accelerator_cuda_ze_memkind (ompi_memkind_t *memkind) | ||
{ | ||
memkind->im_name = NULL; | ||
memkind->im_no_restrictors = false; | ||
memkind->im_num_restrictors = 0; | ||
|
||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static void mca_accelerator_cuda_ze_memkind (ompi_memkind_t *memkind) | |
{ | |
memkind->im_name = NULL; | |
memkind->im_no_restrictors = false; | |
memkind->im_num_restrictors = 0; | |
return; | |
} | |
static void mca_accelerator_null_memkind (ompi_memkind_t *memkind) | |
{ | |
memkind->im_name = NULL; | |
memkind->im_no_restrictors = false; | |
memkind->im_num_restrictors = 0; | |
return; | |
} |
We don't need to return system
here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't see where this callback is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argh, this (and your comment below for the ze component) were originally fixed in the version before I did the rebase on the wrong machine and the force-push lost that work. Anyway, it is fixed now again, thanks!
You are btw. correct that we register the get_memkind() method with the null component although it is not actually invoked from anyway, its just for symmetry/completeness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the 'system' and 'mpi' memkinds are set in the info_memkind.c, independent of the accelerator component used.
084c609
to
cdba5b2
Compare
@devreal thank you for your review, I think I incorporated all your requests/suggestions. Whenever you have a few minutes, can you please re-review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, some more things I didn't notice earlier. I'm ok with not having an MCA parameter for now but I find having the undocumented environment variable strange.
ompi/info/info_memkind.c
Outdated
const char *ompi_info_memkind_cb (opal_infosubscriber_t *obj, const char *key, const char *value) | ||
{ | ||
return value; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return NULL
here so signal that the info value stored in the object has not changed. Returning value
signals that we accepted the new value and that the subscriber mechanism will change the stored value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried changing it to return NULL instead of value, in that case however it stops working, it looks like the callback is also invoked in the initial setting of the value (which doesn't seem to occur when I return NULL instead of the value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh yes, I forgot about the first time a value is set. How about testing for whether the value is set and rejecting any subsequent attempts? We don't want users to be able to change the key AFAIR.
const char *ompi_info_memkind_cb (opal_infosubscriber_t *obj, const char *key, const char *value) | |
{ | |
return value; | |
} | |
/** | |
* Callback invoked by the info subscriber mechanism. | |
* Accepts only the first value set. | |
*/ | |
const char *ompi_info_memkind_cb (opal_infosubscriber_t *obj, const char *key, const char *value) | |
{ | |
int flag, valuelen; | |
opal_info_get_valuelen(obj->s_info, key, &valuelen, &flag); | |
return (flag && valuelen) ? NULL : value; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is correct either. If the user provided mpi-assert-memory-alloc-kinds
, we calculate a new string 'value' for mpi-memory-alloc-kinds
(e.g. at communicator creation) and want that value to be set, not what was provided (and would be available after the info_dup).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a chat with @devreal and I think we came up with the correct solution for this problem, change has been pushed and integrated into the PR
9052fda
to
c2c408d
Compare
add code to handle the memkind info objects defined in MPI 4.1 Signed-off-by: Edgar Gabriel <[email protected]>
c2c408d
to
61f4b55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
bot:retest |
/azp run |
add an API to the accelerator component to retrieve the memory_alloc_kind information that is supported by the component. The values stored/returned are based on the side document that is about to be ratified, see https://github.com/mpi-forum/mem-alloc/blob/main/mem_alloc.tex Signed-off-by: Edgar Gabriel <[email protected]>
in intercomm_create_from_group, we need to set the grp_instance value for the leader_group. This issue was exposed with the new mmemkind code and the mpi4py testsuite. Signed-off-by: Edgar Gabriel <[email protected]>
61f4b55
to
e901e6a
Compare
This PR introduces support for the
mpi_memory_alloc_kinds
andmpi_assert_memory_alloc_kinds
info objects as defined in the MPI 4.1 document and in the side document specifying the values for the three supported accelerator kinds.The logic is surprisingly twisted and I would not be surprised if some of the code here would have to be revised over time.
There are two ways on how the user can specify the
mpi_memory_alloc_kinds
info object: either as a runtime argument tompiexec
, namely-memory-alloc-kinds
, or as part of the info-object passed toMPI_Session_init
. There are two predefined memor-alloc-kinds, namelysystem
andmpi
, the latter having three potential restrictors. When user retrieves the info object e.g. usingMPI_Comm_get_info
onMPI_COMM_WORLD
, the MPI library is allowed to return more memory-kinds than requested by the user.This freedom has been applied in the following manner here:
system
and/ormpi
memory-alloc-kinds, we add them to the list of provided memkindsThis default provided memkind is applied to all new {Comm/File/Window} objects, unless the user sets
mpi_assert_memory_alloc_kinds
during object creation (e.g.MPI_Comm_dup_with_info
orMPI_File_open
). The user can restrict the memory kinds supported by the object withmpi_assert_memory_alloc_kinds
, i.e. setting this info object will influencempi_memory_alloc_kinds
on that object. (I am not sure whether there is another example in the MPI spec where providing one info object influences the value of another info object, but the MPI 4.1 specification is pretty clear in my understanding that this is what is expected to happen.)Another difference in the handling of
mpi_assert_memory_alloc_kinds
is that if a provided value is not recognized, the entirempi_assert_memory_alloc_kinds
is dropped/ignored, not just the unrecognized memkind itself. (This is my reading of the specification on what is expected to happen).Lastly, there could probably be some discussion whether the location chosen for the majority of the code is appropriate (i.e. ompi/info), a directory which so-far has contained the code to manage the
MPI_Info
object in general. After some discussion I thought this is the most appropriate locations, since