Description
Hi, I have been looking into the HAN collective component, and would like to suggest some usability improvements and some fixes. I was planning on implementing these improvements (or some/most of them) and submitting PRs myself. So, in this issue, I'm looking for the "green light" that these suggestions are desirable, or any ideas/comments regarding them, or to know if someone else is already working on them or something similar.
- Currently, module selection for the intra/inter-node communicators has a fixed range of selections, and the MCA parameters can influence this selection through numeric indices associated with components.
I suggest selecting the component by its name (a string) instead, and removing the fixed set of choices. This would make tuning easier (names instead of numeric IDs) and make it possible to use any component for each communicator without code modification. Example: --mca coll_han_bcast_up_module adapt --mca coll_han_bcast_low_module sm
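To illustrate the name-based selection suggested above, here is a minimal sketch in plain C. It is not HAN code: the `available_components` table and the `find_component_by_name()` helper are hypothetical, and a real implementation would query the MCA framework for the components that are actually available instead of using a hard-coded list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the collective components available at run
 * time; real code would obtain this list from the MCA framework. */
static const char *const available_components[] = {
    "tuned", "sm", "adapt", "libnbc", "basic"
};

/* Return the index of the component called `name` in the table above,
 * or -1 if no component with that name is available. */
static int find_component_by_name(const char *name)
{
    assert(name != NULL);

    size_t count = sizeof(available_components) / sizeof(available_components[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(available_components[i], name) == 0) {
            return (int) i;
        }
    }
    return -1; /* unknown name: the caller decides how to fall back */
}
```

With a lookup like this, the value of a string parameter such as coll_han_bcast_up_module could be matched against whatever components happen to be present, so supporting a new component would not require extending a fixed ID table.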
- Towards further ease-of-use improvements, add MCA param(s) to control the component choice for all primitives (and segsize, _use_simple?)
Currently, parameters have the form coll_han_<coll>_up_module, coll_han_<coll>_low_module, coll_han_<coll>_segsize, and coll_han_use_simple_<coll>. While keeping these, an example of the addition would be coll_han_up_module and coll_han_low_module. A primitive-specific parameter, if set, would override the new non-primitive-specific one.
In the context of the first two items, I would also seek to unify mca_coll_han_comm_create() and mca_coll_han_comm_create_new() (?).
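The override rule proposed in this item could look roughly like the following sketch. It is only an illustration under assumptions: `resolve_module()` is a hypothetical helper, and the three arguments stand for the value of a primitive-specific parameter (for example coll_han_bcast_up_module), the value of the proposed generic parameter (for example coll_han_up_module), and a built-in default. Unset parameters are modeled as NULL or an empty string.

```c
#include <assert.h>
#include <stddef.h>

/* Pick the component name to use for one sub-communicator level.
 * Precedence: primitive-specific value, then generic value, then the
 * built-in default. NULL or "" means "not set". */
static const char *resolve_module(const char *specific,
                                  const char *generic,
                                  const char *fallback)
{
    assert(fallback != NULL);

    if (specific != NULL && specific[0] != '\0') {
        return specific; /* e.g. the per-collective parameter was set */
    }
    if (generic != NULL && generic[0] != '\0') {
        return generic;  /* e.g. only the generic parameter was set */
    }
    return fallback;     /* neither parameter was set */
}
```

Keeping the precedence in one small function would mean that every collective resolves its modules the same way, which may also help when unifying the two communicator-creation paths.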
- Look into the dynamic functions available, and possibly fix them. I'm not entirely sure how these work, and it's possible that they are actually working the way they are supposed to. I have left some notes on this in a comment on #9883 (coll/han: dynamic selection does not work for simple algorithms).
FYI, for anyone working on HAN, I believe that #10335 also affects (?) the ompi_comm_coll_preference info key that is used to influence the component selection for each subcomm.