Skip to content

Improvements and fixes for coll/HAN #10438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
gkatev opened this issue Jun 1, 2022 · 7 comments
Open

Improvements and fixes for coll/HAN #10438

gkatev opened this issue Jun 1, 2022 · 7 comments

Comments

@gkatev
Copy link
Contributor

gkatev commented Jun 1, 2022

Hi, I have been looking into the HAN collective component, and would like to suggest some usability improvements and some fixes. I was planning on implementing these improvements (or some/most of them) and submitting PRs myself. So, in this issue, I'm looking for the "green light" that these suggestions are desirable, or any ideas/comments regarding them, or to know if someone else is already working on them or something similar.

  1. Currently, module selection for the intra/inter-node communicators has a fixed range of selections, and the MCA parameters can influence this selection through numeric indices associated with components.

I suggest adjusting the component choice to be based on the name (string) of the collective component to utilize, and remove the fixed selections. This will allow easier tuning (strings instead of IDs), and the possibility to use any component for each comm, without code modification. Example: --mca coll_han_bcast_up_module adapt --mca coll_han_bcast_low_module sm.

  1. Towards further ease-of-use improvements, add MCA param(s) to control the component choice for all primitives (and segsize, _use_simple?)

Currently, parameters are in the form of coll_han_<coll>_up_module, coll_han_<coll>_down_module, coll_han_<coll>_segsize, coll_han_use_simple_<coll>. While keeping these, example of addition: coll_han_up_module, coll_han_down_module. The primitive-specific parameters would override the new non-primitive-specific parameter, if set.

In the context of (1) and (2), I would also seek to unify mca_coll_han_comm_create() and mca_coll_han_comm_create_new() (?).

  1. Look into the dynamic functions available, and possibly fix them. I'm not entirely sure how these work, and it's possible that they are actually working they way they are supposed to. I have deposited some notes regarding these here: coll/han: dynamic selection does not work for simple algorithms #9883 (comment)

FYI, for anyone working on HAN, I believe that #10335 also affects (?) the ompi_comm_coll_preference info key that is used to influence the component selection for each subcomm.

@jsquyres
Copy link
Member

jsquyres commented Jun 6, 2022

@bosilca Any thoughts on this?

@FlorentGermain-Bull
Copy link
Contributor

FlorentGermain-Bull commented Jun 7, 2022

For now, we have 2 similar implementation in the han component

  1. The collectives using mca_coll_han_comm_create
    In this vue, there is one sub_communicator for each level for each component we want to use on it.
    For example, on inter_node level (up_comm), we have one communicator for libnbc and another one for adapt. These component are "forced" through ompi_comm_coll_preference .

  2. The collectives using mca_coll_han_comm_create_new
    Here, there is one sub_communicator for each topological level. In 2-level implementation, we have one intra-node (low) and one inter-node (up) communicator. There is where the "dynamic" part of han is usefull. Called on sub_communicators, dynamic functions will call specific components according to the configuration.

The implementation 1 is the historical one.
The implementation 2 is more flexible but have to be extended to support historical algorithms. For example, implementation 2 does not support non blocking collectives yet.

We extended han topology-aware capabilities one our side: we added the support n-levels with the implementation 2 and we consider pushing it to the github repo but we do not know when yet.
We also improve a by-name choice in configuration file. Now topological-levels, collectives, components and han algorithms can be chosen by name. This could be extended to MCA parameters too.

@gkatev
Copy link
Contributor Author

gkatev commented Jun 8, 2022

Thanks for the responses here and in the PR, I now better understand what goes on in HAN!

I presume that the idea in all our minds, is that at some point the dynamic path will supersede the historical one. The dynamic path is not currently a match for some of the historical implementations in HAN, i.e. those that use non-blocking collectives, and this is the reason that it has not yet replaced the old path.

From what I understand there's not a great technical obstacle behind making han-dynamic handle non-blocking collectives; one "speedbump" that @FlorentGermain-Bull mentioned in our call is that base's COLLTYPE_T enum that HAN uses does not include non-blocking functions. Florent, are you aware of another problem that would prevent extension of the dynamic path to cover all collective implementations in HAN?

My suggestions (1) and (2) from the original post still stand, and I suppose they should be integrated into the dynamic path rather than the old one. Regarding (1), you mention that you have implemented selection by name instead of ID in the configuration file; does that also mean that any component can be arbitrarily chosen only via its name, or does the fixed range of selections remain (enum COMPONENT_T)?

At the moment the dynamic path is effectively disabled, to overcome a bug that existed in it. I plan to take a look into re-enabling it and fixing the bug that results in #8248, along with some other small ones I have noticed, and submitting PRs. Or maybe it is preferable that it remains disabled until it completely replaces the historical one, to avoid confusing users with which path is used for each collective?

@FlorentGermain-Bull
Copy link
Contributor

FlorentGermain-Bull commented Jun 8, 2022

To extend the dynamic part of han on non-blocking collectives, extending COLLTYPE_T enum to non-blocking collective and write dynamic functions for these non-blocking collectives should be enough. To support them by name, mca_coll_base_colltype_to_str should alse be extended.

This fix we talked about yesterday for #8248, it probably was a copy-paste or a merging error, sorry for that one: eeb479e
(I do not know if permissions are set correctly, let me know whether you have access)

(1) I think we can manage components without COMPONENT_T enum: components are identified when stored in the module_list (c_coll->module_list in the communicator structure). For now, we scan this list during first han call to exctract module identifiers considering COMPONENT_T enum.
We could scan this list again when a not expected component is required by his name. If the overhed is low enough, this can become themain path and COMPONENT_T enum san be removed.
To support this functionnality, we need to switch from component_id to component_name in MCA parameters too.

(2) For now, we use MCA parameters default values as default behaviour. I do not know how to detect if the user has explicitely set an MCA parameter to its default value.
For example: if we use tuned by default on up level for all collectives, we will have coll_han_<coll>_up_module=tuned for all collectives.
If I want to have basic for all the collectives except for bcast, I would set coll_han_up_module=basic and explicitely set coll_han_bcast_up_module=tuned.
For now, do not know if it is possible to check if coll_han_bcast_up_module=tuned by default or if the user has explicitely set it. A way to do it would be to have by default all the coll_han_<coll>_up_module to a specific value like 'default' or 'unused' or 'global'.
To summarize, it is achiveable but not simple

@gkatev
Copy link
Contributor Author

gkatev commented Jun 8, 2022

I see the fix yes. Will you make a PR?

(1) Yes I believe that since the component names are available in module_list, we could completely remove the IDs as a middleman, both in the code and in the MCA parameters. I imagine that changes will be required here and there; quickly looking through the code, the main adjustment might be to remove mca_coll_han_get_all_coll_modules (?), and in get_module iterate the module list, and return the module whose name matches the MCA parameter or the dynamic rule.

(2) I think we can achieve this by having NULL by default for the primitive-specific parameters. So in the HAN component, set default values coll_han_up_module="libnbc", coll_han_low_module="tuned", coll_han_<coll>_<up/low>_module=NULL. Then, to make it more seamless, in mca_coll_han_init_query(), for each MCA that is still NULL, because the user didn't set it, copy the value of the generic parameter. At some point, I had written some code for this, for the non-dynamic scenario: coll_han_component.c.txt

@FlorentGermain-Bull
Copy link
Contributor

FlorentGermain-Bull commented Jun 8, 2022

PR #10456 created (I know a python script has rejected it, I'm checking why)

(1) yes exactly

(2) Another issue havind a global MCA parameter is that the only component which can be used for all the collectives must provide an implementation for all the collectives. In particular, if we extend han to non-blocking collectives, there is no component that provides an implementation for both blocking and non-blocking collectives.

@gkatev
Copy link
Contributor Author

gkatev commented Jun 8, 2022

Perfect (the signed-off-by thing is missing).

(2) Yes it could get a bit confusing. Perhaps another approach would be for all MCA parameters (including the non-coll-specific ones) to remain unset (NULL) by default. In this approach, the first non-han component that implements each collective, i.e. the one with the highest priority, would be chosen for the subcommunicator. And the parameters would be there to override and fine-tune, with the responsibility for correct choice lying with the tuner. (In any case (2) is a less important item..)

bosilca added a commit to bosilca/ompi that referenced this issue Jun 24, 2022
Improve the module selection for the up and low collective modules to
allow the, more user-friendly, use of the module name in addition to
module number.

This is a partial fix for open-mpi#10438

Signed-off-by: George Bosilca <[email protected]>
jsquyres pushed a commit to bosilca/ompi that referenced this issue Jul 18, 2022
Improve the module selection for the up and low collective modules to
allow the, more user-friendly, use of the module name in addition to
module number.

This is a partial fix for open-mpi#10438

Signed-off-by: George Bosilca <[email protected]>
jsquyres pushed a commit to jsquyres/ompi that referenced this issue Jul 19, 2022
Improve the module selection for the up and low collective modules to
allow the, more user-friendly, use of the module name in addition to
module number.

This is a partial fix for open-mpi#10438

Signed-off-by: George Bosilca <[email protected]>
(cherry picked from commit e572aee)
MamziB pushed a commit to MamziB/ompi that referenced this issue Jul 26, 2022
Improve the module selection for the up and low collective modules to
allow the, more user-friendly, use of the module name in addition to
module number.

This is a partial fix for open-mpi#10438

Signed-off-by: George Bosilca <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants