Skip to content

Added an initial implementation of partitioned communication over persistant MPI calls. #8641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 20, 2021

Conversation

mdosanjh
Copy link
Contributor

Signed-off-by: Matthew Dosanjh [email protected]

This is an initial implementation of partitioned communication using persistent sends and persistent receives. This should be compliant with the current release candidate of the MPI-4.0 specification.

@ibm-ompi
Copy link

The IBM CI (GNU/Scale) build failed! Please review the log, linked below.

Gist: https://gist.github.com/03dace16a79e5958d962c52e752aee8d

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/6e3e4e117f7c2aefd199166acefc6be9

@ibm-ompi
Copy link

The IBM CI (PGI) build failed! Please review the log, linked below.

Gist: https://gist.github.com/f6049cb406966176744f98a0685e5655

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/3fc96a30c7741cd755a49eafcc7be691

@ibm-ompi
Copy link

The IBM CI (PGI) build failed! Please review the log, linked below.

Gist: https://gist.github.com/eb81ff3fe01960c95f32f78d509140d3

@ibm-ompi
Copy link

The IBM CI (GNU/Scale) build failed! Please review the log, linked below.

Gist: https://gist.github.com/54c78e4f404b5a0637d79fc9793b2ae0

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/7a7ba36c4d7585b1384612d9500095b2

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/1a6ea556e35bbe709dd715057608cb1b

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/fec21300f6fdbde084d421c53690c9c3

@mdosanjh mdosanjh force-pushed the persist-part branch 3 times, most recently from 68512cc to 8324fea Compare March 18, 2021 22:09
@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/fbe93757a0df3d339c7e34f226c21ebd

@mdosanjh
Copy link
Contributor Author

I'm having trouble parsing these errors from the checks (one seems to be happening in OB1?) does anyone have insight on to what I need to fix?

@mdosanjh
Copy link
Contributor Author

bot:retest

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/08068d909e505586843f9b9497d51a1e

@jsquyres
Copy link
Member

@mdosanjh Not sure why you're asking about ob1, the error is later:

Making all in mpi/fortran/use-mpi-f08/bindings
make[2]: Entering directory '/tmp/ompi-src/ompi/mpi/fortran/use-mpi-f08/bindings'
  PPFC     ompi-mpifh-bindings.lo
"mpi-f-interfaces-bind.h", line 331.28: 1513-230 (S) Procedure ompi_parrived_f was declared with the BIND(C) attribute.  Dummy argument flag of ompi_parrived_f must have interoperable type and type parameters.
** ompi_mpifh_bindings   === End of Compilation 1 ===
1501-511  Compilation failed for file ompi-mpifh-bindings.F90.

@mdosanjh
Copy link
Contributor Author

mdosanjh commented Mar 19, 2021

@jsquyres Apologies, this is my first time interacting with the pull request check system and ob1 was where https://gist.github.com/08068d909e505586843f9b9497d51a1e stopped/truncated as far as I could tell, so I had assumed that was where the issue was. I have subsequently found a local system to replicate the error on and I believe I have fixed it. (I just noticed the truncation banner at the top of that page)

@mdosanjh Not sure why you're asking about ob1, the error is later:

Making all in mpi/fortran/use-mpi-f08/bindings
make[2]: Entering directory '/tmp/ompi-src/ompi/mpi/fortran/use-mpi-f08/bindings'
  PPFC     ompi-mpifh-bindings.lo
"mpi-f-interfaces-bind.h", line 331.28: 1513-230 (S) Procedure ompi_parrived_f was declared with the BIND(C) attribute.  Dummy argument flag of ompi_parrived_f must have interoperable type and type parameters.
** ompi_mpifh_bindings   === End of Compilation 1 ===
1501-511  Compilation failed for file ompi-mpifh-bindings.F90.

@mdosanjh mdosanjh requested a review from hjelmn March 19, 2021 17:32
@mdosanjh mdosanjh self-assigned this Mar 19, 2021
@mdosanjh mdosanjh force-pushed the persist-part branch 3 times, most recently from 2d9bd52 to fadd3be Compare March 22, 2021 15:50
@mdosanjh mdosanjh requested a review from hjelmn March 22, 2021 17:42
@mdosanjh
Copy link
Contributor Author

bot:retest

@jsquyres
Copy link
Member

@mdosanjh Please see https://github.com/open-mpi/ompi/wiki/PRJenkins for a listing of the bot commands. If possible, please be judicious with re-running CI, because our CI can get backed up with queued builds, etc.

@jsquyres jsquyres closed this Mar 22, 2021
@jsquyres jsquyres reopened this Mar 22, 2021
@jsquyres
Copy link
Member

Oops -- clicked the wrong button there; didn't mean to close this PR.

@mdosanjh
Copy link
Contributor Author

@mdosanjh Please see https://github.com/open-mpi/ompi/wiki/PRJenkins for a listing of the bot commands. If possible, please be judicious with re-running CI, because our CI can get backed up with queued builds, etc.

Apologies, the build kept failing on ARMV8 with a GitHub check out error (with a reference a commit (other than mine) not being a tree). I'll try to be more specific with the bot commands in the future.

@mdosanjh mdosanjh requested a review from hppritcha March 26, 2021 13:46
@@ -340,6 +340,12 @@ lib@OMPI_LIBMPI_NAME@_mpifh_la_SOURCES += \
iscatter_f.c \
iscatterv_f.c \
issend_f.c \
parrived_f.c \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please put in alphabetical order

@@ -254,6 +254,12 @@ linked_files = \
piscatterv_f.c \
pisend_f.c \
pissend_f.c \
pparrived_f.c \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these too should be in alphabetical order in the source file list

@@ -304,6 +304,12 @@ PN2(void, MPI_Irecv, mpi_irecv, MPI_IRECV, (char *buf, MPI_Fint *count, MPI_Fint
PN2(void, MPI_Irsend, mpi_irsend, MPI_IRSEND, (char *buf, MPI_Fint *count, MPI_Fint *datatype, MPI_Fint *dest, MPI_Fint *tag, MPI_Fint *comm, MPI_Fint *request, MPI_Fint *ierr));
PN2(void, MPI_Isend, mpi_isend, MPI_ISEND, (char *buf, MPI_Fint *count, MPI_Fint *datatype, MPI_Fint *dest, MPI_Fint *tag, MPI_Fint *comm, MPI_Fint *request, MPI_Fint *ierr));
PN2(void, MPI_Issend, mpi_issend, MPI_ISSEND, (char *buf, MPI_Fint *count, MPI_Fint *datatype, MPI_Fint *dest, MPI_Fint *tag, MPI_Fint *comm, MPI_Fint *request, MPI_Fint *ierr));
PN2(void, MPI_Parrived, mpi_parrived, MPI_PARRIVED, (MPI_Fint *request, MPI_Fint *partition, ompi_fortran_logical_t *flag, MPI_Fint *ierr));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphabetical order

@@ -320,6 +321,12 @@ mpi_api_files = \
iscatterv_f08.F90 \
isend_f08.F90 \
issend_f08.F90 \
psend_init_f08.F90 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphabetical order

@@ -165,6 +165,12 @@
#define MPI_Iscatterv PMPI_Iscatterv
#define MPI_Isend PMPI_Isend
#define MPI_Issend PMPI_Issend
#define MPI_Psend_init PMPI_Psend_init
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphabetical order

@@ -25,6 +25,7 @@ typedef enum {
OMPI_REQUEST_GEN, /**< MPI-2 generalized request */
OMPI_REQUEST_WIN, /**< MPI-2 one-sided request */
OMPI_REQUEST_COLL, /**< MPI-3 non-blocking collectives request */
OMPI_REQUEST_PART, /**< MPI-4 partitioned communication request */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this added here rather than after OMPI_REQUEST_COMM?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hppritcha Thank you for the review, I've fixed all of the alphabetical orderings and as per this comment, there was not a reason for the placement, so I have moved OMPI_REQUEST_PART to be after OMPI_REQUEST_COMM.

@@ -2310,6 +2321,17 @@ OMPI_DECLSPEC int PMPI_Isend(const void *buf, int count, MPI_Datatype datatype,
int tag, MPI_Comm comm, MPI_Request *request);
OMPI_DECLSPEC int PMPI_Issend(const void *buf, int count, MPI_Datatype datatype, int dest,
int tag, MPI_Comm comm, MPI_Request *request);
OMPI_DECLSPEC int PMPI_Precv_init(const void* buf, int partitions, MPI_Count count,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@@ -1645,6 +1645,17 @@ OMPI_DECLSPEC int MPI_Isend(const void *buf, int count, MPI_Datatype datatype,
int tag, MPI_Comm comm, MPI_Request *request);
OMPI_DECLSPEC int MPI_Issend(const void *buf, int count, MPI_Datatype datatype, int dest,
int tag, MPI_Comm comm, MPI_Request *request);
OMPI_DECLSPEC int MPI_Precv_init(const void* buf, int partitions, MPI_Count count,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please add these in alphabetical order?

@mdosanjh mdosanjh force-pushed the persist-part branch 3 times, most recently from 24b8bb9 to e0d646b Compare March 26, 2021 23:49
@ibm-ompi
Copy link

The IBM CI (GNU/Scale) build failed! Please review the log, linked below.

Gist: https://gist.github.com/757b00059d0d4eb755171f50e177e95e

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/7f9273003d2846772017c4643f28d9c4

@ibm-ompi
Copy link

The IBM CI (PGI) build failed! Please review the log, linked below.

Gist: https://gist.github.com/945f245ddfc09f44492840d286a255ba

@hppritcha
Copy link
Member

@mdosanjh could you rebase this PR? Looks like a simple conflict.

Signed-off-by: Matthew Dosanjh <[email protected]>

Addressed ordering issues as per review by @hppritcha

fixed ordering in ompi/mpi/c makefiles
@mdosanjh
Copy link
Contributor Author

@mdosanjh could you rebase this PR? Looks like a simple conflict.

@hppritcha It should be rebased now.

@mdosanjh
Copy link
Contributor Author

@hjelmn I just wanted check to see if you've gotten a chance to re-review this since I incorporated your requested changes.

@mdosanjh
Copy link
Contributor Author

@hjelmn I just wanted check to see if you've gotten a chance to re-review this since I incorporated your requested changes.

It was suggested that I put some sort of timeline here to say when I would 'dismiss' the review, specifically the 23rd. (If this is unreasonable, please let me know)

@hppritcha hppritcha merged commit 03ed56c into open-mpi:master Apr 20, 2021
@mdosanjh mdosanjh deleted the persist-part branch April 20, 2021 18:01
@hppritcha
Copy link
Member

@regrant does this PR include all partitioned comm related items that went in to MPI 4.0?

@regrant
Copy link
Contributor

regrant commented Jul 29, 2021

@hppritcha This has all of the MPI 4.0 related content, so we're 4.0 compliant for partitioned communication with this merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants