openmpi 2.1.0 fails to build on s390x #3443

Closed
opoplawski opened this issue May 3, 2017 · 9 comments
Comments

@opoplawski
Contributor

Thank you for taking the time to submit an issue!

Background information

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

2.1.0

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Building Fedora openmpi package

Please describe the system on which you are running

  • Operating system/version: Fedora rawhide

Details of the problem

libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I../../../../opal/include -I../../../../ompi/include -I../../../../oshmem/include -I../../../../opal/mca/hwloc/hwloc1112/hwloc/include/private/autogen -I../../../../opal/mca/hwloc/hwloc1112/hwloc/include/hwloc/autogen -I../../../../ompi/mpiext/cuda/c -I../../../.. -I../../../../orte/include -I/builddir/build/BUILD/openmpi-2.1.0/opal/mca/event/libevent2022/libevent -I/builddir/build/BUILD/openmpi-2.1.0/opal/mca/event/libevent2022/libevent/include -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -march=zEC12 -mtune=z13 -finline-functions -fno-strict-aliasing -pthread -MT mca_btl_sm_la-btl_sm_component.lo -MD -MP -MF .deps/mca_btl_sm_la-btl_sm_component.Tpo -c btl_sm_component.c  -fPIC -DPIC -o .libs/mca_btl_sm_la-btl_sm_component.o
btl_sm_component.c: In function 'create_rndv_file':
btl_sm_component.c:631:5: warning: ignoring return value of 'asprintf', declared with attribute warn_unused_result [-Wunused-result]
     asprintf(&tmpfname, "%s.tmp", fname);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from btl_sm.c:45:0:
../../../../opal/include/opal/sys/cma.h:89:2: error: #error "Unsupported architecture for process_vm_readv and process_vm_writev syscalls"
 #error "Unsupported architecture for process_vm_readv and process_vm_writev syscalls"
  ^~~~~
../../../../opal/include/opal/sys/cma.h: In function 'process_vm_readv':
../../../../opal/include/opal/sys/cma.h:101:18: error: '__NR_process_vm_readv' undeclared (first use in this function); did you mean 'process_vm_readv'?
   return syscall(__NR_process_vm_readv, pid, lvec, liovcnt, rvec, riovcnt, flags);
                  ^~~~~~~~~~~~~~~~~~~~~
                  process_vm_readv
../../../../opal/include/opal/sys/cma.h:101:18: note: each undeclared identifier is reported only once for each function it appears in
../../../../opal/include/opal/sys/cma.h: In function 'process_vm_writev':
../../../../opal/include/opal/sys/cma.h:112:18: error: '__NR_process_vm_writev' undeclared (first use in this function); did you mean 'process_vm_writev'?
   return syscall(__NR_process_vm_writev, pid, lvec, liovcnt, rvec, riovcnt, flags);
                  ^~~~~~~~~~~~~~~~~~~~~~
                  process_vm_writev
/usr/include/bits/uio.h: In function 'process_vm_readv':
../../../../opal/include/opal/sys/cma.h:102:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
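
The log above shows `cma.h` stopping at an `#error` because it has no syscall number for this architecture. As a rough illustration of a more forgiving pattern (not the actual Open MPI fix), the sketch below takes the number from `<sys/syscall.h>` when the kernel headers provide it and otherwise reports `ENOSYS` at run time instead of failing the build. The names `sketch_vm_readv` and `sketch_read_self` are hypothetical and exist only for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>  /* defines __NR_process_vm_readv where the kernel headers know it */
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical wrapper (not Open MPI's): use the kernel-provided syscall
 * number if there is one, otherwise fail at run time with ENOSYS. */
ssize_t sketch_vm_readv(pid_t pid,
                        const struct iovec *lvec, unsigned long liovcnt,
                        const struct iovec *rvec, unsigned long riovcnt,
                        unsigned long flags)
{
#ifdef __NR_process_vm_readv
    return (ssize_t) syscall(__NR_process_vm_readv, pid, lvec, liovcnt,
                             rvec, riovcnt, flags);
#else
    (void) pid; (void) lvec; (void) liovcnt;
    (void) rvec; (void) riovcnt; (void) flags;
    errno = ENOSYS;
    return -1;
#endif
}

/* Hypothetical helper: copy len bytes from src to dst inside the calling
 * process by pointing the "remote" iovec at our own address space. */
ssize_t sketch_read_self(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    struct iovec local  = { .iov_base = dst,          .iov_len = len };
    struct iovec remote = { .iov_base = (void *) src, .iov_len = len };
    return sketch_vm_readv(getpid(), &local, 1, &remote, 1, 0);
}
```

A caller would treat a return value of -1 as "CMA not usable here" and fall back to another transport. Sandboxes may also refuse the syscall with an error such as `EPERM`, so the error path matters even on architectures that do define the number.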

2.0.2 builds fine. With 2.0.2 I see:

checking if user requested CMA build... no

With 2.1.0 I see:

Transports
-----------------------
Cray uGNI (Gemini/Aries): no
Intel Omnipath (PSM2): no
Intel SCIF: no
Intel TrueScale (PSM): no
Mellanox MXM: no
Open UCX: no
OpenFabrics Libfabric: no
OpenFabrics Verbs: no
Portals4: no
Shared memory/copy in+copy out: yes
Shared memory/Linux CMA: yes
Shared memory/Linux KNEM: no
Shared memory/XPMEM: no
TCP: yes

I seem to be unable to disable CMA either with --without-cma or --with-cma=no.

@amckinstry

amckinstry commented May 4, 2017

I can confirm that this is also the case with the Debian openmpi package.
The configuration can be seen here: http://sources.debian.net/src/openmpi/2.1.0rc2-1/debian/rules/

(Ignore the 2.1.0rc2 version tag; it is 2.1.0)

@jsquyres
Member

jsquyres commented May 4, 2017

Does the same thing happen with 2.1.1rc1? https://www.open-mpi.org/software/ompi/v2.1/

When I use --without-cma with 2.1.1rc1, it seems to disable CMA properly for me.

@jsquyres
Member

jsquyres commented May 4, 2017

Actually, I'd also like to know if OMPI 2.1.1rc1 properly disables CMA on the platform on which you're building. E.g., if there are platforms where CMA simply does not work, configure should auto-disable building CMA on those platforms.

We're likely just running into this now because I seem to recall that CMA was not built by default in the v2.0.x series, but we now try to build it by default in the v2.1.x series.
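
To make the auto-disable idea concrete: if configure defined a feature macro only on platforms where its CMA check passes, the shared-memory code could compile the CMA path conditionally and otherwise use the copy-in/copy-out path. The sketch below shows that pattern only. The macro `SKETCH_HAVE_CMA` and every function name are hypothetical, and none of this is taken from Open MPI's source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* SKETCH_HAVE_CMA is a hypothetical macro that a configure check would
 * set to 1 only when the CMA syscalls are usable on the target. */
#ifndef SKETCH_HAVE_CMA
#define SKETCH_HAVE_CMA 0
#endif

enum sketch_path { SKETCH_PATH_COPY = 0, SKETCH_PATH_CMA = 1 };

/* Report which transfer path this build was compiled with. */
enum sketch_path sketch_selected_path(void)
{
#if SKETCH_HAVE_CMA
    return SKETCH_PATH_CMA;
#else
    return SKETCH_PATH_COPY;
#endif
}

/* Copy-in/copy-out fallback: stage the data through a bounce buffer in
 * chunks, the way two peers would through a shared-memory segment.
 * Returns the number of bytes copied (0 if the bounce buffer is empty). */
size_t sketch_copy_in_out(void *dst, const void *src, size_t len,
                          void *bounce, size_t bounce_len)
{
    assert(dst != NULL && src != NULL && bounce != NULL);
    if (bounce_len == 0) {
        return 0;
    }
    size_t done = 0;
    while (done < len) {
        size_t left  = len - done;
        size_t chunk = left < bounce_len ? left : bounce_len;
        memcpy(bounce, (const char *) src + done, chunk);  /* copy in  */
        memcpy((char *) dst + done, bounce, chunk);        /* copy out */
        done += chunk;
    }
    return done;
}
```

With the macro left at its default of 0, `sketch_selected_path()` reports the copy path, which corresponds to what a build configured with `--without-cma` would be expected to use.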

@jsquyres jsquyres added this to the v2.1.1 milestone May 4, 2017
@amckinstry

I hadn't noticed the 2.1.1rc1 release; thanks, I'll test it.

@jsquyres
Member

jsquyres commented May 4, 2017

I was actually all set to release v2.1.1 yesterday -- this issue and #3442 gave me pause. So if you could let me know ASAP, that would be great. Even if we have to have you temporarily use --without-cma to build v2.1.1 on problematic architectures, I'm comfortable adding a fix for auto-disabling CMA on unsupported platforms after v2.1.1.

@opoplawski
Contributor Author

I see the same compile problem with 2.1.1rc1, but --without-cma does appear to disable it now.

@jsquyres
Member

jsquyres commented May 4, 2017

Ok, good news, at least, that --without-cma works around the issue. I'll mark this as a 2.1.2 issue for now.

Can you send a link to a full build log and/or a config.log file so that we can look at why it's not automatically disabling itself on platforms that don't support CMA?

@jsquyres jsquyres modified the milestones: v2.1.2, v2.1.1 May 4, 2017
@opoplawski
Contributor Author

Full build log: https://kojipkgs.fedoraproject.org//work/tasks/8839/19398839/build.log (it will be around for a week or so). It prints all the config.log files, which is perhaps overkill.

@jsquyres
Member

jsquyres commented May 8, 2017

Closing since this is now merged on master, v2.x, and v3.x.

@jsquyres jsquyres closed this as completed May 8, 2017
3 participants