Skip to content

configury: fix ofi components dependencies #6313

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

ggouaillardet
Copy link
Contributor

use $(opal_common_ofi_*) variables since these are the only
one defined (by opal/mca/common/ofi/configure.m4)

Refs. #2519

Thanks Alastair McKinstry for the report and initial fix.
Thanks Rashika Kheria for the reminder.

Signed-off-by: Gilles Gouaillardet [email protected]

use $(opal_common_ofi_*) variables since these are the only
one defined (by opal/mca/common/ofi/configure.m4)

Refs. open-mpi#2519

Thanks Alastair McKinstry for the report and initial fix.
Thanks Rashika Kheria for the reminder.

Signed-off-by: Gilles Gouaillardet <[email protected]>
@ggouaillardet
Copy link
Contributor Author

@jsquyres can you please review this ?

on ubuntu, mca_mtl_ofi.so nor mca_rml_ofi.so depend on libmca_common_ofi.so (since they do not use any of its symbols) nor libfabric.so (since we use the incorrect variables in Makefile.am). Also, libmca_common_ofi.so does not depend on libfabric.so (since it does not use any of the symbols provided by libfabric.so).

From what I understand, the sole purpose of opal/mca/common/ofi is to provide the MCA_opal_common_ofi_CONFIG macro (from its configure.m4), and then ofi components rml/ofi and mtl/ofi can AC_REQUIRE([MCA_opal_common_ofi_CONFIG]).

This fix should be backported to the release branches.

As far as I am concerned, I'd rather get rid of common/ofi, and do things a la verbs and have the components call OPAL_CHECK_OFI() (revamped to add some caching). Not sure this future revamp should be backported to the release branches though.

@rhc54 @bwbarrett do you have any recollection on why we created common/ofi in the first place ?

@rhc54
Copy link
Contributor

rhc54 commented Jan 29, 2019

Please delete the rml/ofi component - it never worked very well and there is no interest in pursuing it further.

@bwbarrett
Copy link
Member

Oi, there's a trail of sadness in this one. Calling other component's configure macro works, but is super hinky from a "components should be independent" standpoint. At least it would be an error during autogen.

I'm pretty sure common:ofi was originally meant to hold all the MCA parameters that are currently set by the various OFI components. It looks like most of those parameters ended up being unique to the components (ie, most of the mtl:ofi parameters only make sense when dealing with the point-to-point interface), so the common parameter pool remains empty.

This is the minimal patch to remove the suck. Long term, we need to rethink our sharing strategy; this one doesn't match what other interfaces are doing.

@jsquyres
Copy link
Member

I agree with Ralph -- if rml/ofi is the problem and it is unmaintained / does not work, then perhaps we should just remove that component.

Additionally, as Brian mentioned, it looks like common/ofi was meant to hold common MCA vars. But it's empty. Meaning: this never happened (and if we remove rml/ofi, then we're down to a single OFI component, and perhaps common/ofi should go away as well).

I.e., perhaps the real fix should be removing rml/ofi and common/ofi, and then fixing the macros in mtl/ofi/Makefile.am (which also means moving the OPAL_CHECK_OFI from opal/mca/common/ofi/configure.m4 to ompi/mca/mtl/ofi/configure.m4)

@jsquyres
Copy link
Member

We talked about this on the webex today:

  1. opal/mca/common/ofi has no C code that anyone is using
  2. But opal/mca/common/ofi does call OPAL_CHECK_OFI, the output of which is used by multiple other components (contrary to what I said in a comment, above)
  3. @matcabral volunteered to look deeper into this

@rhc54
Copy link
Contributor

rhc54 commented Jan 30, 2019

My contribution is #6338

@jsquyres
Copy link
Member

jsquyres commented Feb 6, 2019

@ggouaillardet Can we close this one without merging in favor of #6363?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants