
libompitrace soversion is still 0.0.0 in release 2.0.0 #1906


Closed
amckinstry opened this issue Jul 27, 2016 · 11 comments

@amckinstry

Hi,
All the other libraries have upgraded from $random to 20.0.0 in OMPI release 2.0.0.

This doesn't have any operational impact on OMPI, but it does cause issues with packaging. With OMPI shipping multiple libraries, we typically put them all in a single package (libopenmpi1.10, now libopenmpi2.0). We aim to make it possible for both sets of libraries to be present simultaneously, to enable clean upgrades, but having both packages ship the same file libompitrace.so.0.0.0 breaks this.

(Alternatively we can ship 1 library file per package, but this creates large numbers of packages in the OMPI case).
I presume shipping libompitrace.so at 0.0.0 with everything else at 20.0.0 is a bug.

@jsquyres jsquyres added the bug label Jul 27, 2016
@jsquyres jsquyres added this to the v2.0.1 milestone Jul 27, 2016
@jsquyres jsquyres self-assigned this Jul 27, 2016
@jsquyres
Member

Oops -- yes, this is a bug / oversight.

jsquyres added a commit to jsquyres/ompi-release that referenced this issue Jul 27, 2016
On master, this value is explicitly set to 0:0:0.  On v2.x, set it to
20:0:0 (just so that it is different than master).

Fixes open-mpi/ompi#1906.

Signed-off-by: Jeff Squyres <[email protected]>

(cherry picked from commit open-mpi/ompi@2e0c3c7)
@jsquyres
Member

Note that this library hasn't changed in a long time. The PR I just created (open-mpi/ompi-release#1278) sets the .so version to 20:0:0, but it's unlikely that the content of this library will change, and therefore the version will stay at 20:0:0, even in successive 2.x releases.

@amckinstry
Author

Thanks, I'm applying this fix to our current release of openmpi (2.0.0-2 in debian experimental).
It would be good to come up with a general solution to "the versioning problem", though. It's problematic for us when individual libraries don't update their soversion, but others do.

The canonical solution in Debian is one package per library, with soname version, eg.
package libopenmpitrace20 contains /usr/lib/libopenmpitrace.so.20.0.0.
Then on an update, libopenmpitrace21 can be co-installed safely.
This allows codes (including user codes) linked to v20 to continue to work through an upgrade.

If multiple small libraries are shipped, we can put them all in one package as we currently do with openmpi, e.g. libopenmpi1.10.3 and libopenmpi2, but with the caveat that all libraries must update their soversions (or names) between the two releases, or they will conflict. (This is what's happening now.)
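The conflict described above comes down to two co-installable packages claiming the same file path. A minimal sketch of that check, assuming each package is modelled as a plain list of installed paths; the libmpi file names below are illustrative placeholders, and only libompitrace.so.0.0.0 is taken from this report:

```python
def conflicting_files(package_a, package_b):
    """Return the paths that both packages would install, sorted."""
    return sorted(set(package_a) & set(package_b))


# Illustrative file lists (assumed, not real package contents).
libopenmpi1_10 = [
    "/usr/lib/libmpi.so.1.0.0",        # placeholder soversion for the 1.10 series
    "/usr/lib/libompitrace.so.0.0.0",
]
libopenmpi2 = [
    "/usr/lib/libmpi.so.20.0.0",
    "/usr/lib/libompitrace.so.0.0.0",  # unchanged soversion: same path as above
]

print(conflicting_files(libopenmpi1_10, libopenmpi2))
```

Any non-empty result means the two packages cannot be installed side by side unless one of them renames or re-versions the shared file.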

A preferred solution is that there are fewer soversion changes, with symbols in the dynamic libraries using symbol versioning. This works well enough for e.g libc (at version 6 and holding for many years), with new soname changes only really needed when functionality gets dropped.
Would openmpi be ok with such a solution if a patch was provided?

Regards
Alastair

@jsquyres
Member

Hmm. That seems to contradict the official GNU Libtool guidance for shared library versioning (which is what we use): https://www.gnu.org/software/libtool/manual/libtool.html#Updating-version-info

This is how we apply that Libtool guidance to Open MPI: https://github.com/open-mpi/ompi/blob/master/README#L1514-L1575

Do you configure Open MPI with --disable-dlopen, or do you get creative with pkglibdir and/or includedir? I ask because we have (at least?) 3 issues that make it difficult to install two different versions of Open MPI in the same prefix:

  1. The prefix/include/mpi.h header file can change between releases.
    • New functionality can get added
    • Deprecated / deleted functionality can get removed (although this hasn't happened yet)
    • Structure definitions / sizes may change (this can happen at ABI breaks, such as v2.0.0)
  2. The plugins have unversioned filenames, and are installed in pkglibdir
  3. The support binaries have unversioned filenames, and are not guaranteed to work with libraries from a different Open MPI version

In general, we tell people who want multiple versions of Open MPI installed to install them in different prefixes (this is actually fairly common).

@amckinstry
Author

In Debian (and derivatives) we distinguish between the library packages (eg libopenmpi1.10) and the development packages - libopenmpi-dev. There may be multiple old versions of the library installed, providing eg. libopenmpi.so.1.10.3 (in package libopenmpi1.10.3) and libopenmpi.so.2.0.0 (in libopenmpi2 at the moment), but there can be only one development package libopenmpi-dev.
So there is only one set of headers, and one set of link-time symlinks (libopenmpi.so), which point to the latest library, i.e. libopenmpi.so.2.0.0.

Right now, libopenmpi2 is in experimental and collides with libopenmpi1.10 because we have the unversioned plugins, all ending in .so. I have to come up with a solution to that, which will probably entail a new versioned directory for the plugins.

Support binaries similarly: we would only provide one set, linked against the latest library version: it is purely the dynamic library we guarantee can stay over an upgrade.

On symbol versioning and Libtool: our semantics are compatible with the Libtool usage; we are only really concerned with the major soversion number. For a library named libname.so.x.y.z, it is presumed that the major version number x is incremented whenever any incompatible change is made, while y and z are free for the developers' own semantics. Pure additions (nothing deleted, no ABI or structure changes) are trivial and can be made without incrementing the major number; changes in ABI can be handled with symbol versioning, and only symbol deletion (functionality deletion) actually requires a new major version.
(An example of this is the recent HDF5 change that dropped POSIX interface functionality, which required a major version number increase.) Even changes in structures can be handled (with effort) using symbol versioning. See:
https://gcc.gnu.org/wiki/SymbolVersioning
for details.
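
As a rough illustration of the rule above (additions are compatible, deletions force a new major version), here is a minimal sketch that compares two sets of exported symbol names. The library and symbol names are hypothetical, and the check looks at names only: changed signatures or structure layouts are invisible to it, which is where symbol versioning would come in.

```python
def required_soname_change(old_symbols, new_symbols):
    """Classify the change between two exported-symbol sets by name only."""
    old, new = set(old_symbols), set(new_symbols)
    removed = sorted(old - new)
    added = sorted(new - old)
    if removed:
        # Programs linked against the old library may reference these.
        return "major bump", removed
    if added:
        # Old programs keep working; the major number can stay.
        return "compatible addition", added
    return "no change", []


# Hypothetical exports of a library "libfoo" across three releases.
v1 = {"foo_init", "foo_send", "foo_legacy_io"}
v2 = v1 | {"foo_send_async"}   # pure addition
v3 = v2 - {"foo_legacy_io"}    # functionality dropped

print(required_soname_change(v1, v2))
print(required_soname_change(v2, v3))
```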

Debian (and I personally) have experience adding symbol versioning. It can be added without breakage to any library. It won't necessarily work on all platforms (Linux, BSD, etc. are straightforward). My question here is: does this break any platforms OpenMPI supports, and would you accept it if offered?

@rhc54
Contributor

rhc54 commented Jul 27, 2016

How did you resolve this before? We haven't changed anything in this regard since the project began over 12 years ago.

@amckinstry
Author

I don't believe we did (I started as OpenMPI maintainer a few months ago), but it's increasingly becoming an issue, with more programs linked to MPI (and OpenMPI becoming the default for all architectures).
Looking further, as we ship the executables in a package openmpi-bin, and only have one version of this installed, the question becomes: can programs linked against libopenmpi1.10 (or libopenmpi1.6) use the binaries such as mpiexec from 2.0.0 and beyond? How stable have these interfaces been, and how stable are they expected to be in the future? I appreciate this cannot be an indefinite promise never to break the interfaces, but if the stability is there, we could use it.

@jsquyres
Member

(the issue closed because the PR applying the version to libompitrace was just committed, but we can keep this conversation going, even though the initially-reported issue is now closed)

The short answer is: the only symbols that we have provided versioning guarantees about have been the MPI API (and now the OpenSHMEM API). Specifically: we do not provide any guarantees about internal symbols in support binaries (e.g., mpirun), or any of the plugins.

We did make MPI API ABI level changes in libmpi from v1.10 to v2.x, so we did the Libtool thing of increasing current and setting age to 0 (I don't know offhand how this translates to the major so version number). We bumped current all the way up to 20 for v2.0.0 so that we'd still have some head room if we need to do some more v1.10.x releases. Plus, 20 is symbolically close to v2.0.0, which was just aesthetically nice. 😄
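
On how current:revision:age translates to the .so version number: a minimal sketch, assuming the Linux convention used by GNU Libtool, where the soname major is current minus age and the file suffix is major.age.revision. Other platforms map the triple differently, and the function name here is made up for illustration.

```python
from typing import Tuple


def libtool_to_linux_version(version_info: str) -> Tuple[int, str]:
    """Map Libtool's current:revision:age to (soname major, file suffix).

    Assumes the Linux mapping: major = current - age,
    suffix = "<major>.<age>.<revision>".
    """
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not be greater than current")
    major = current - age
    return major, f"{major}.{age}.{revision}"


print(libtool_to_linux_version("20:0:0"))  # (20, '20.0.0')
print(libtool_to_linux_version("0:0:0"))   # (0, '0.0.0')
```

Under that assumption, 20:0:0 yields libmpi.so.20.0.0 with soname major 20, and a later compatible addition such as 21:0:1 would still have major 20.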

That's an indirect way of saying: no, the mpirun from one version of Open MPI is not guaranteed to work with the back-end libraries from another version of Open MPI. We treat those back-end libraries (e.g., libopen-rte and libopen-pal) as internal and subject to change every release -- they're not intended for users to write apps against. Does that make sense?

From that perspective, I'm not sure that internal symbol versioning would make much sense for us -- since the only thing that we provide guarantees about is the user application's use of the MPI and OSHMEM APIs, we don't really need to version anything there because those two standards bodies are pretty careful themselves to only ever add or delete (and not modify).

Does that sound right to you?

Is there an "alternatives"-like system in Debian, where you can effectively install multiple versions of Open MPI into non-conflicting directories, and then set sym links for things that matter (e.g., mpirun and the other executables)?

@amckinstry
Author

For information, the "libtool" c:r:a scheme is basically a variant of the standard "semantic versioning" libfoo.$major.$minor.$revision; for our purposes only the major ("current") matters. It needs to change when the ABI changes incompatibly; with work (as referenced above), this really only needs to happen when functionality is dropped (not added or changed, including structure definition / size changes). I imagine it's unlikely that functionality would be dropped in the 1.x branch, so no new library versions would be needed.

Adding functionality is trivial. Supporting changed member sizes/definitions or function signatures is awkward but can be done, as GNU libc demonstrates. It's dropping functionality that is problematic for us, but as you almost never do that on the public MPI / OSHMEM interfaces, that's not a major problem.

I would like to have Debian 9.0 ("stretch", to be released Jan/Feb 2017 or so) contain openmpi2, with all the MPI-enabled software built against it. Stretch would have a 2+ year lifespan (slightly more with LTS support); we can add new versions of 2.x (via "backports") as long as the major version number of MPI, OSHMEM does not change.

We do have the "alternatives" scheme in Debian but I don't think that sounds like the solution to our problem. I think the answer here is: keep the major number of libmpi_*.so.* and liboshmem.* (the public interfaces) constant for OpenMPI 2.x (using symbol versioning and not dropping functionality in the 2.x timescale).

We use packages to divide up the problem of dependencies. So we would package as follows:

  • Packages like libhdf5-8 depend on libopenmpi-20 (containing libmpi.so.20, libmpi_*.so.20)
  • Another package libopenmpi-int-20 contains internal libraries (open-rte, etc)
  • openmpi-bin then depends on libopenmpi-int-20.

As 2.x evolves, the internal interfaces in open-rte etc. change, and their major version numbers can change along with mpiexec etc., but the major version number for MPI remains constant. So 2.1 in Debian could ship (e.g.) libopenmpi-20, libopenmpi-int-21 and an openmpi-bin package with Debian version number 2.1-1 depending on libopenmpi-int-21. libhdf5-8 still uses libmpi.so.20, and so the new libopenmpi-20 package (version 2.1-1) drops in transparently.
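
To make the proposed split concrete, here is a minimal sketch that models the two archives as dictionaries from package name to dependencies and checks for unmet dependencies. The package names come from the proposal above; the data structure and helper are an illustrative model, not real Debian control data or tooling.

```python
def unmet_dependencies(archive):
    """Return {package: [missing deps]} for packages whose deps are absent."""
    problems = {}
    for package, depends in archive.items():
        missing = [dep for dep in depends if dep not in archive]
        if missing:
            problems[package] = missing
    return problems


# Assumed model of the proposal at Open MPI 2.0 and 2.1.
archive_2_0 = {
    "libopenmpi-20": [],                    # public libs: libmpi.so.20 etc.
    "libopenmpi-int-20": [],                # internal libs: open-rte etc.
    "openmpi-bin": ["libopenmpi-int-20"],
    "libhdf5-8": ["libopenmpi-20"],
}
archive_2_1 = {
    "libopenmpi-20": [],                    # same major, new package version
    "libopenmpi-int-21": [],                # internal major bumped
    "openmpi-bin": ["libopenmpi-int-21"],
    "libhdf5-8": ["libopenmpi-20"],         # untouched by the upgrade
}

print(unmet_dependencies(archive_2_0))  # {}
print(unmet_dependencies(archive_2_1))  # {}

# A stale openmpi-bin from 2.0 would not be satisfied by the 2.1 archive.
stale = dict(archive_2_1, **{"openmpi-bin": ["libopenmpi-int-20"]})
print(unmet_dependencies(stale))
```

In this model libhdf5-8 never needs a rebuild, while openmpi-bin and the internal library package must move together.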

Does this sound ok to you? If so, I could prepare the symbol versioning patches.

@rhc54
Contributor

rhc54 commented Jul 30, 2016

I'm not sure how that would all work, to be honest, as I have yet to see a release that didn't involve change to all three libraries, and thus you'd have to update everything anyway. However, I confess I may not fully understand what you are trying to do.

Perhaps the next step is for you to provide a patch so we can better understand exactly what you are proposing. We need to be cautious here that we don't mess up the other upstream packagers who may not break things up the same way you propose to do.

@amckinstry
Author

Agreed. I'm preparing a detailed document and initial patch for review.

3 participants