
openmpi 4.0.0 cannot compile with external pmix 3.1, but there is no mention of this in the docs or check during configuration #6456


Closed
georgemarselis opened this issue Mar 5, 2019 · 7 comments


@georgemarselis

Background information

I am trying to install openmpi by hand for one of my scientific users.

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

v4.0.0

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

tarball from open-mpi website

Please describe the system on which you are running

Operating system/version: Linux Centos 7.5
Computer hardware: Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz, 64 cores total
Network type: cisco gbit

Details of the problem

Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.

download && check checksum && untarball && cd openmpi-4.0.0

./configure --prefix=/lsc/openmpi/4.0.0 --enable-branch-probabilities --enable-pretty-print-stacktrace --enable-pty-support --enable-weak-symbols --enable-dlopen --enable-show-load-errors-by-default --enable-heterogeneous --enable-binaries --enable-script-wrapper-compilers --enable-per-user-config-files --enable-ipv6 --enable-orterun-prefix-by-default --enable-mpirun-prefix-by-default --enable-mpi-interface-warning --enable-sparse-groups --enable-peruse --enable-mpi-fortran --enable-mpi-cxx --enable-mpi-cxx-seek --enable-mpi1-compatibility --enable-grequest-extensions --enable-spc --enable-shared --enable-static --enable-wrapper-rpath --enable-wrapper-runpath --enable-cxx-exceptions --enable-builtin-atomics --enable-openib-udcm --enable-openib-rdmacm --enable-openib-rdmacm-ibaddr --enable-btl-portals4-flow-control --enable-opal-btl-usnic-unit-tests --enable-event-evport --enable-event-debug --enable-hwloc-pci --enable-visibility --enable-memchecker --enable-install-libpmix --enable-pmix-timing --enable-ft --enable-mpi-ext=affinity,cuda,pcollreq --enable-visibility --enable-fast-install --with-libnl --with-devel-headers --with-max-processor-name=256 --with-max-error-string=256 --with-max-object-name=64 --with-max-info-key=36 --with-max-info-val=256 --with-max-port-name=1024 --with-max-datarep-string=128 --with-zlib-libdir=/lsc/zlib/lib --with-cuda=/lsc/nvidia/cuda/8.0-GA1 --with-pmix=external --with-pmix-libdir=/usr/lib64 --with-mpi-param-check=always --with-oshmem-param-check=always --with-jdk-dir=/usr/lib/jvm/java-1.8.0 --with-cs-fs --with-ofi-libdir=/usr/lib64 --with-xpmem --with-valgrind --with-pmi=/lsc/pmix/3.1 --with-slurm=/lsc/slurm/18.08.3 --with-tm --with-sge --with-moab --with-singularity --with-pvfs2 --with-psm --with-psm2 --with-ompi-pmix-rte --with-orte --with-treematch=/lsc/treematch/1.3 LDFLAGS='-L/lsc/pmix/3.1/lib' CFLAG='-I/lsc/pmix/3.1/include'

All other software is already compiled. The external pmix is in /lsc/pmix/3.1, with both static and shared libraries built.

make

...

ext3x.c: In function 'ext3x_value_unload':
ext3x.c:1109:10: error: 'PMIX_MODEX' undeclared (first use in this function); did you mean 'PMIX_UNDEF'?
     case PMIX_MODEX:
          ^~~~~~~~~~
          PMIX_UNDEF
ext3x.c:1109:10: note: each undeclared identifier is reported only once for each function it appears in
ext3x.c:1221:10: error: 'PMIX_INFO_ARRAY' undeclared (first use in this function); did you mean 'PMIX_INFO_TRUE'?
     case PMIX_INFO_ARRAY:
          ^~~~~~~~~~~~~~~
          PMIX_INFO_TRUE
make[2]: *** [Makefile:1921: libmca_pmix_ext3x_la-ext3x.lo] Error 1
make[2]: Leaving directory '/lsc/sources/openmpi/4.0.0/build/opal/mca/pmix/ext3x'
make[1]: *** [Makefile:2368: all-recursive] Error 1
make[1]: Leaving directory '/lsc/sources/openmpi/4.0.0/build/opal'
make: *** [Makefile:1886: all-recursive] Error 1

Gist of the output: https://gist.github.com/georgemarselis/6a5ec92392aef6f2415bc81e78d4f771

There is a related error in #1660, but #1660 was fixed.

As a workaround, I patched the PMIX_MODEX and PMIX_INFO_ARRAY definitions back into /lsc/pmix/3.1/include/pmix_common.h by hand.

I think there should be a configure-time check for these two defines, and if they are not present, configure should fall on its knees 😅

Also, I will submit a modified header that throws a "DEPRECATED" warning over at the pmix repo.

@rhc54
Contributor

rhc54 commented Mar 5, 2019

We already fixed this by simply adding those definitions to the ext3x.h header if they aren't previously defined. They are not used by OMPI 4.0 and so there is no reason to have configure "die". The fix should be in OMPI v4.0.1
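
For readers hitting the same error, the approach described above (defining the names only when the external PMIx header no longer provides them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual ext3x.h change: the numeric values are placeholders rather than the real PMIx constants, the local PMIX_UNDEF define stands in for what pmix_common.h would supply, and is_legacy_type is a hypothetical helper.

```c
#include <assert.h>

/* Stand-in for the external PMIx header. In a real build this name
 * comes from pmix_common.h; the value here is only for the sketch. */
#define PMIX_UNDEF 0

/* Compatibility guards: define the removed names only if the included
 * PMIx header did not already define them.
 * ASSUMPTION: 1001 and 1002 are placeholders, not the real constants. */
#ifndef PMIX_MODEX
#define PMIX_MODEX 1001
#endif
#ifndef PMIX_INFO_ARRAY
#define PMIX_INFO_ARRAY 1002
#endif

/* Two identical values would produce duplicate case labels below. */
static_assert(PMIX_MODEX != PMIX_INFO_ARRAY,
              "placeholder values must differ");

/* Hypothetical helper: with the guards in place, a switch that names
 * the removed types compiles against either header generation. */
static int is_legacy_type(int type)
{
    switch (type) {
    case PMIX_MODEX:
    case PMIX_INFO_ARRAY:
        return 1;
    default:
        return 0;
    }
}
```

Because each define is wrapped in `#ifndef`, a header that still provides the names takes precedence and the fallback is never used.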

@georgemarselis
Author

Oh, ok! Thank you. Sorry to bother you. Should I close this?

@rhc54
Contributor

rhc54 commented Mar 5, 2019

No bother - I'll go ahead and close it since it was already resolved.

@rhc54 rhc54 closed this as completed Mar 5, 2019
@jsquyres
Member

jsquyres commented Mar 5, 2019

@georgemarselis You can download the latest Open MPI v4.0.1rc from:

https://www.open-mpi.org/software/ompi/v4.0/

And the latest v4.0.x nightly snapshot from:

https://www.open-mpi.org/nightly/v4.0.x/

@georgemarselis
Author

Thanks! I just think it should be backported, but that might be a bit of a herculean task, so I understand. Closing this again!

@jsquyres By the way, is there an email I can reach you at?

@steven-varga

steven-varga commented Jun 8, 2019

Can someone please post the git commits at which PMIx and OMPI are known to work together?
OMPI commit b8a8ae9 (HEAD, tag: v4.0.1) with the PMIx 3.1.2 release fails with much the same error as the one the OP posted above.
Configuration:
PMIx: ./configure --prefix=/usr/local/ --enable-pmix-binaries --with-platform=optimized --with-hwloc=/usr/local
OMPI: ./configure --prefix=/usr/local --with-slurm --with-pmix=/usr/local --enable-mpi1-compatibility --with-hwloc=internal --with-libevent=/usr

ext3x.c: In function ‘ext3x_value_unload’:
ext3x.c:1109:10: error: ‘PMIX_MODEX’ undeclared (first use in this function); did you mean ‘PMIX_UNDEF’?
     case PMIX_MODEX:
          ^~~~~~~~~~
          PMIX_UNDEF
ext3x.c:1109:10: note: each undeclared identifier is reported only once for each function it appears in
ext3x.c:1221:10: error: ‘PMIX_INFO_ARRAY’ undeclared (first use in this function); did you mean ‘PMIX_DATA_ARRAY’?
     case PMIX_INFO_ARRAY:
          ^~~~~~~~~~~~~~~

@jjhursey
Member

This particular fix for PMIX_MODEX was committed in the following PRs (Circa Jan 7, 2019)

In the commit that you reference, I see the change that should address this type of error.

I just did a build of the PMIx 3.1.2 and OMPI 4.0.1 releases and it built fine for me. I also tried the HEAD of the v4.0.x development branch and it built fine. Here is the last commit on that branch:

[mpiuser@d7ce52cbd054 openmpi-v4.0.x]$ git remote -v
origin	https://github.com/open-mpi/ompi.git (fetch)
origin	https://github.com/open-mpi/ompi.git (push)
[mpiuser@d7ce52cbd054 openmpi-v4.0.x]$ git branch
* v4.0.x
[mpiuser@d7ce52cbd054 openmpi-v4.0.x]$ git log -n 1
commit cb8dd569ff38ec999328dbd4949e1f323654173e
Merge: 0cd5a5a 900f0fa
Author: Howard Pritchard <[email protected]>
Date:   Thu Jun 13 18:55:53 2019 -0600

    Merge pull request #6747 from devreal/rdma-fetchop-local-v4.0.x
    
    OSC rdma: make sure accumulating in shared memory is safe

I configured Open MPI with:

  $ ./configure --prefix=/home/mpiuser/local/ompi --with-hwloc=/home/mpiuser/local/hwloc --with-libevent=/home/mpiuser/local/libevent --with-pmix=/home/mpiuser/local/pmix --enable-mpirun-prefix-by-default

I configured PMIx with:

  $ ./configure --prefix=/home/mpiuser/local/pmix --with-hwloc=/home/mpiuser/local/hwloc --with-libevent=/home/mpiuser/local/libevent

TrupeshKumarPatel added a commit to TrupeshKumarPatel/ompi that referenced this issue May 19, 2022