coll/han: fix coll preference selection in mca_coll_han_comm_create_new #8250
Exclude HAN, don't include it. Signed-off-by: Joseph Schuchart <[email protected]>
I'm not sure that this was actually a typo. It looks to me like the dynamic path is supposed to choose HAN for the subcomms and then manually call other components' modules on each level. It is for this reason that [...] Of course, there are other bugs with the dynamic path (#10438, #9883). It looks like this commit disables a buggy path that triggers the segfault seen in #8248. Honestly, the dynamic path itself, and the way it combines with the non-dynamic one, is all a bit confusing.
Both of you are correct. In an ideal situation we would have multi-level architecture awareness, and at each level HAN would choose how to break up the collective. However, we never choose a module based on the topologic_level, because we limit the scope of HAN to two-level hierarchies. Basically, the user-visible communicator needs to decide which module to call on the underlying communicators, but as we don't support multi-level architecture awareness, we need to prevent the underlying communicators from using HAN.
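To make the include/exclude distinction concrete, here is a minimal sketch of how a component preference string could be interpreted. It assumes the preference value follows the usual MCA component-list convention, where a leading `^` turns the list into an exclusion list (as in `--mca coll ^han`). The helper names `list_contains` and `component_allowed` are hypothetical and are not Open MPI functions; the actual selection logic in Open MPI is more involved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `name` appears as a whole entry in the
 * comma-separated `list` (no whitespace handling, for brevity). */
static bool list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;

    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t)(comma - p) : strlen(p);

        if (len == name_len && strncmp(p, name, len) == 0) {
            return true;
        }
        if (comma == NULL) {
            break;
        }
        p = comma + 1;
    }
    return false;
}

/* Hypothetical helper: may component `name` be selected under `pref`?
 *   "a,b"      include list: only a and b are allowed
 *   "^a,b"     exclude list: everything except a and b is allowed
 *   NULL, ""   no restriction */
bool component_allowed(const char *pref, const char *name)
{
    if (pref == NULL || pref[0] == '\0') {
        return true;
    }
    if (pref[0] == '^') {
        return !list_contains(pref + 1, name);
    }
    return list_contains(pref, name);
}
```

Under these assumptions, a sub-communicator created with the preference `"han"` could only select HAN again, whereas `"^han"` leaves every other collective component available and keeps HAN off the underlying communicators, which is the behaviour described above.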
So if I'm understanding correctly, the dynamic path is designed to support multiple levels, but because it is not yet fully implemented, it is manually bypassed in favour of the traditional up/low path? From reading the code, it's not fully apparent to me why the dynamic path is not used even for just 2 levels of hierarchy; it looks like it would work (with small fixes/adjustments). It generally feels like there is duplication between the two paths.
Yes, the original design called for multiple levels, but then it turned out we were lacking the architectural information (basically netloc) to take advantage of this, so we went ahead and worked under a two-level assumption. If I remember correctly, the only way to build a multi-level hierarchy is to use the decision files, and there you explicitly mark the modules you want to use, so we don't need to exclude/include anything.
Would it make sense to merge the dynamic and non-dynamic paths? So, for the most part, join [...]
That particular code path was added by the folks at Bull (@EmmanuelBRELLE, @bsergentm). I'm all for simplification as long as we retain similar capabilities, but maybe they want to chime in.
Hello. We are considering pushing it to the GitHub repo, but we do not know when yet. I think making HAN simpler is a good idea too. Merging [...]
Refs #8248