
MPI needs a standard ABI #751


Closed
jeffhammond opened this issue Aug 31, 2023 · 11 comments
Labels: had reading (Completed the formal proposal reading); mpi-5.0 (For inclusion in the MPI 5.0 standard); passed final vote (Passed the final formal vote); passed first vote (Passed the first formal vote); wg-abi (ABI Working Group)

Comments

@jeffhammond
Member

jeffhammond commented Aug 31, 2023

Problem

This issue needs to exist so I can submit a pull request to solve it, because of our voting procedures.

The problem is described in some detail in https://arxiv.org/abs/2308.11214. I do not want to repeat it here.

Proposal

Define a standard ABI for MPI. This includes:

  • describe how calling conventions work
  • header, module and shared library names
  • status object
  • integer types
  • handle types
  • integer constants
  • handle constants
  • callback constants

We will need a way to detect the existence and versions of the ABI.
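As a rough illustration of what version detection could look like, the sketch below compares the ABI version an application was compiled against with the version a library reports at run time. The `HYP_` macro names, their values, and the compatibility rule (same major version, library minor version at least as new) are assumptions made for this example; nothing in this issue specifies them.

```c
/* Hypothetical stand-ins: a real ABI header would define its own
   version macros. The values here are placeholders. */
#define HYP_ABI_VERSION    1
#define HYP_ABI_SUBVERSION 1

/* Returns 1 if a library reporting (lib_major, lib_minor) can serve an
   application compiled against HYP_ABI_VERSION.HYP_ABI_SUBVERSION.
   Assumed rule: major versions must match exactly and the library's
   minor version must not be older than the application's. */
int hyp_abi_compatible(int lib_major, int lib_minor)
{
    if (lib_major != HYP_ABI_VERSION)
        return 0;
    return lib_minor >= HYP_ABI_SUBVERSION;
}
```

Existence of the ABI could be detected at compile time with `#ifdef` on whichever macro the header ends up defining, and at run time with a query function whose result is passed to a check like the one above.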

The primary impact is on C but there are Fortran aspects too.

Changes to the Text

Write a completely new chapter to define all of the above. Define relevant terms. State the constraints.

Add all the constants to tables.

Impact on Implementations

The implementation of this in an MPI library is not trivial but not profound, either.

There is a prototype in MPICH already: pmodels/mpich#6390.

Impact on Users

Some users are desperate for this, because they are tired of compiling all their MPI software two or more times.

Use cases include:

  • Python, Julia and Rust bindings.
  • Users of containers.

References and Pull Requests

Pull request:

https://github.com/mpi-forum/mpi-standard/pull/875

Related issues:

#744
#743
#735
#709
#704
#702
#654
#642
#159
#107

@jeffhammond self-assigned this Aug 31, 2023
@jeffhammond added the "wg-abi" label Aug 31, 2023
@github-project-automation bot moved this to To Do in MPI 5.0 Aug 31, 2023
@wesbland added the "mpi-5.0" label Sep 11, 2023
@wesbland moved this from To Do to In Progress in MPI 5.0 Sep 14, 2023
@wesbland added the "scheduled reading" label Nov 27, 2023
@wesbland added this to the December 2023 milestone Nov 27, 2023
@wesbland modified the milestones: December 2023, March 2024 Feb 5, 2024
@wesbland added the "scheduled no-no vote" label Mar 19, 2024
@Wee-Free-Scot

@jeffhammond -- Feedback from Intel (@alexander-sannikov, @ddurnov, @garzaran):

  1. We're looking at how extensible the ABI might be. A specific example would be: "what if a C integer type other than MPI_INT, MPI_LONG, etc. needed to be added to the ABI? There is no room in the 8:3 allocation for C integer types." I know provision has been made in some categories, but not all. How will awkward extensions to the standard ABI be handled? We are thinking about new categories in the 12:9 part, assigned to "extension_01" etc., which would give us an extra 8:3 space for new values.

  2. Separately, we are expecting that non-standard experimental features can be added into the ABI header file prior to standardisation without breaking compatibility, as long as the extension does not re-use a constant value already assigned to something else. For example, MPIX_ERR_INVALID_NOTIFICATION = 891 for the notified-RMA proposal. Of course, the specific values are subject to change during the subsequent standardisation effort.

  3. We would like to add advice to users along the lines of "The MPI handles might not be valid pointers. The user must never attempt to dereference any MPI handle, even when it is defined/declared using an incomplete pointer typedef." I think this is okay and doesn't change anything; it just clarifies what should be common sense.

Ultimately, the extensibility questions push in the direction of asking "how useful is the Huffman code going forward?"
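The advice in point 3 can be sketched in C. The example below assumes a handle declared through an incomplete struct pointer typedef whose predefined constant is a small integer cast to that pointer type; `HYP_Comm`, `hyp_comm_s`, and the value `0x101` are made up for illustration and are not taken from any MPI header.

```c
#include <stdint.h>

/* Hypothetical handle type: the struct is never defined, so the
   compiler rejects any attempt to dereference the handle. */
typedef struct hyp_comm_s *HYP_Comm;

/* Made-up predefined constant: a small integer, not a real address. */
#define HYP_COMM_EXAMPLE ((HYP_Comm)(uintptr_t)0x101)

/* Handles are only ever compared by value or passed back to the library. */
int hyp_is_example_comm(HYP_Comm c)
{
    return c == HYP_COMM_EXAMPLE;
}
```

Because the struct type is incomplete, `*c` or `c->field` fails to compile, which matches the intent of the proposed advice: the handle value identifies an object but need not point at one.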

@jeffhammond
Member Author

  1. I would add it in the # other C/C++ types branch of the Huffman code here:
 568 001000111000 MPI_C_BOOL
 569 001000111001 MPI_CXX_BOOL
 570 001000111010 reserved datatype
 571 001000111011 reserved datatype
 572 001000111100 MPI_WCHAR
 573 001000111101 reserved datatype
 574 001000111110 reserved datatype
 575 001000111111 reserved datatype

There is also plenty of space after the # 32 byte Fortran category.
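To make the table above easier to check, here is a small C sketch that prints a value as the 12-bit pattern used in the listing and tests whether it falls under a given branch prefix. The helper names are hypothetical, and treating the leading nine bits `001000111` as the branch shared by 568 through 575 is read off the table rather than taken from the proposal text.

```c
#include <string.h>

/* Write the 12-bit binary form of v (expected range 0..4095) into out. */
void to_bits12(unsigned v, char out[13])
{
    for (int i = 0; i < 12; i++)
        out[i] = ((v >> (11 - i)) & 1u) ? '1' : '0';
    out[12] = '\0';
}

/* Return 1 if the 12-bit pattern of v begins with the given prefix. */
int has_prefix(unsigned v, const char *prefix)
{
    char bits[13];
    to_bits12(v, bits);
    return strncmp(bits, prefix, strlen(prefix)) == 0;
}
```

With these helpers, `to_bits12(568, buf)` yields `001000111000`, matching the MPI_C_BOOL row, and `has_prefix(v, "001000111")` holds for exactly the eight values 568 through 575, five of which the table lists as reserved.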

We can also add another branch for datatypes but before we do that, we ought to think whether we should instead use the unnamed predefined datatype route, which is what we discussed in Boston (unfortunately, it was offline) for solving the AI float types problem. We have these for F90 kind but it can be generalized. I will write it up in detail if necessary.

@jeffhammond
Member Author

  2. Can you make the MPIX error codes larger than MPI_ERR_LASTCODE? That seems like the easy solution.
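A minimal sketch of that suggestion, assuming placeholder values: the experimental code is defined relative to the last standard code, so it can never collide with a value the standard assigns later within the standard range. `HYP_ERR_LASTCODE`, `HYPX_ERR_EXPERIMENTAL`, and the number 1000 are hypothetical and do not come from any MPI header.

```c
/* Placeholder for the last standard error code; the real value is
   whatever the MPI header defines. */
#define HYP_ERR_LASTCODE 1000

/* Hypothetical experimental code placed just above the standard range. */
#define HYPX_ERR_EXPERIMENTAL (HYP_ERR_LASTCODE + 1)

/* Compile-time guard: fails the build if the experimental code ever
   ends up inside the standard range. */
_Static_assert(HYPX_ERR_EXPERIMENTAL > HYP_ERR_LASTCODE,
               "experimental code must lie above the standard range");

/* Returns 1 for codes inside the assumed standard range. */
int hyp_is_standard_code(int code)
{
    return code >= 0 && code <= HYP_ERR_LASTCODE;
}
```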

@jeffhammond
Member Author

  3. We can add that, although I must say, I don't think anyone has ever dereferenced an Open MPI handle before, so I doubt it's necessary.

@jeffhammond
Member Author

"how useful is the Huffman code going forward?"

I think it's useful and I worked really hard to make sure it's future-proof. We are using less than 1/3 of the 1024 values I intended to use, and we can go all the way up to 4095 if necessary. I think if we are considering adding a lot more predefined handles, we ought to evaluate whether that is the right design in the first place.

@Wee-Free-Scot

Wee-Free-Scot commented Sep 25, 2024

  1. Thanks for the reminder about the "unnamed predefined datatype" idea. That would likely solve a lot of the datatype extensibility issues.
  2. Hmm, maybe a new error code wasn't the best specific example. Unless you're advocating that ALL experimental constants should have values above the 4096 reserved ones until they are standardised and a value in [0,4095] is assigned by the MPI Forum? That might work as a general answer.
  3. The advice should be common sense, but we're dealing with users here :)

Your work on this proposal is valuable and we're all grateful you've taken on another topic like this (large count, etc.). Apologies if my phraseology gave the wrong impression there. Perhaps we won't know the answer to this until we've all implemented the existing scheme and then extended it a bunch of times in the future.

@jeffhammond
Member Author

  2. For handles, as long as an implementation can guarantee that it will not create a user handle that conflicts with the predefined one, it can use any value. Since Intel MPI does not create user handles with malloc anyways, reserving the zero page is not relevant.

However, if you want, we can reserve the last 1024 values of the predefined space for experimentation. I think that's more than enough.
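One way to picture the resulting layout is the C sketch below. It assumes the predefined space is the range 0 to 4095 mentioned earlier in the thread and reads "the last 1024" as 3072 to 4095; that reading, along with every `HYP_`/`hyp_` name, is an assumption for illustration only.

```c
#include <stdint.h>

enum hyp_handle_kind { HYP_STANDARD, HYP_EXPERIMENTAL, HYP_USER };

#define HYP_PREDEFINED_LIMIT  4096u  /* predefined values are 0..4095 */
#define HYP_EXPERIMENTAL_SIZE 1024u  /* assumed reservation: 3072..4095 */

/* Classify a raw handle value under the assumed layout. */
enum hyp_handle_kind hyp_classify(uintptr_t v)
{
    if (v >= HYP_PREDEFINED_LIMIT)
        return HYP_USER;          /* outside the predefined space */
    if (v >= HYP_PREDEFINED_LIMIT - HYP_EXPERIMENTAL_SIZE)
        return HYP_EXPERIMENTAL;  /* reserved for experimentation */
    return HYP_STANDARD;          /* assigned or reserved by the standard */
}
```

Under this layout, an implementation only has to guarantee that the handles it creates for users never classify as anything other than `HYP_USER`.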

This is valuable feedback and I appreciate the care taken in making the proposal better.

@Wee-Free-Scot

I think a specific reservation for experimentation (and differentiation) would be very useful. Thanks @jeffhammond.
The advice to users is belt-and-braces, but it heads-off a potential misinterpretation problem in future and does no harm.
Both changes are no-no-vote size adjustments, IMHO.

@wesbland added the "had reading" label and removed the "scheduled reading" and "scheduled no-no vote" labels Oct 14, 2024
@wesbland moved this from In Progress to Had Reading in MPI 5.0 Oct 14, 2024
@wesbland removed this from the March 2024 milestone Oct 31, 2024
@wesbland added the "scheduled first vote" label Oct 31, 2024
@wesbland added this to the December 2024 milestone Nov 7, 2024
@wesbland
Member

This passed a 1st vote.

Yes No Abstain
26 0 3

@wesbland added the "passed first vote" label and removed the "scheduled first vote" label Dec 13, 2024
@wesbland moved this from Had Reading to Passed 1st Vote in MPI 5.0 Dec 13, 2024
@wesbland removed this from the December 2024 milestone Dec 16, 2024
@wesbland added the "scheduled second vote", "scheduled reading" and "scheduled no-no vote" labels Dec 16, 2024
@wesbland
Member

wesbland commented Jan 9, 2025

This passed a no-no vote.

Yes No Abstain
30 0 0

@wesbland removed the "scheduled no-no vote", "scheduled second vote" and "scheduled reading" labels Jan 9, 2025
@wesbland
Member

wesbland commented Jan 9, 2025

This passed a 2nd vote.

Yes No Abstain
30 0 0

@wesbland added the "passed final vote" label Jan 9, 2025
@wesbland moved this from Passed 1st Vote to Passed 2nd Vote in MPI 5.0 Jan 9, 2025
@wgropp closed this as completed Jan 9, 2025
@github-project-automation bot moved this from Passed 2nd Vote to Done in MPI 5.0 Jan 9, 2025