Skip to content

f08: do not BIND(C) to subroutines with LOGICAL parameters #1327

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ggouaillardet
Copy link
Contributor

Thanks Paul Romano for reporting this issue.

Thanks Paul Romano for reporting this issue.
@jsquyres
Copy link
Member

This kinda sucks; does it mean that mpi_f08 is busted in 1.10.2? If so, how did it make past testing? :-(

@ggouaillardet
Copy link
Contributor Author

@jsquyres can you have a look at this ?

i simply removed BIND(C) and it works
(i was able to link tests that use these functions)
/* i will push my tests from now */

that being said, i am not convinced this is very clean ...
link is successful because mpi_comm_dup_fn_ is a weak symbol, and there is no f08 definition for this function (from libmpi.so).
fwiw, mpich implements this function in f08, but if we do that, i am afraid we might end up with multiple definition of the same symbol.

yes, f08 is busted ...
the error only occurs if compiling the application with -std=f2008, without this flag, it happens to work.

@ggouaillardet
Copy link
Contributor Author

@jsquyres the issue also occurs when using gcc 5.2.0 and using the -std=f2008 flag.

it is currently not possible to build use-mpi-f08 with the -std=f2008 flag
(compiler complains about assumed type)

diff --git a/ompi/mpi/fortran/use-mpi-f08/Makefile.am b/ompi/mpi/fortran/use-mpi-f08/Makefile.am
index b5ec769..daaafb0 100644
--- a/ompi/mpi/fortran/use-mpi-f08/Makefile.am
+++ b/ompi/mpi/fortran/use-mpi-f08/Makefile.am
@@ -27,7 +27,7 @@ AM_FCFLAGS = -I$(top_builddir)/ompi/include \
              -I$(top_srcdir)/ompi/include \
              $(OMPI_FC_MODULE_FLAG)$(top_builddir)/ompi/$(OMPI_FORTRAN_USEMPI_DIR) \
              $(OMPI_FC_MODULE_FLAG). \
-             -I$(top_srcdir) $(FCFLAGS_f90)
+             -I$(top_srcdir) $(FCFLAGS_f90) -std=f2008

 MOSTLYCLEANFILES = *.mod
  PPFC     mpi-f08-interfaces.lo
../../../../../../src/ompi-master/ompi/mpi/fortran/use-mpi-f08/mpi-f08-interfaces.F90:32:10:

    OMPI_FORTRAN_IGNORE_TKR_TYPE, INTENT(IN) :: buf
          1
Error: TS 29113/TS 18508: Assumed type at (1)

at this stage, my best bet is to update the mtt trivial test suite, and use -std=f2008 when compiling a f08 file with gfortran.

makes sense ?

@jsquyres
Copy link
Member

Adding @paulromano.

@jsquyres
Copy link
Member

@ggouaillardet Please add "Fixes #1323" in the commit message.

@jsquyres
Copy link
Member

@ggouaillardet If you remind the BIND(C) to bind these mpi_f08 interfaces to Fortran functions, don't we then have to provide implementations of these interfaces? Put differently: since, for example, the MPI_COMM_NULL_COPY_FN in mpi_f08 won't be bound to ompi_comm_null_copy_fn_f any more, we need to actually provide a definition of this interface. Right?

@ggouaillardet
Copy link
Contributor Author

if we do that, the mpif08 library will define a mpi_comm_null_copy_fn symbol (modulo the compiler mangling). will this be an issue since this symbol is also defined in an other MPI library ?

@jsquyres
Copy link
Member

@ggouaillardet Gah -- I missed your above comment (#1327 (comment)) where you said the same thing that I just did.

So, yes, without BIND(C), it works -- kinda by accident. Technically, it's a hack -- the MPI_COMM_NULL_COPY_FN, for example, takes different types between mpif.h/mpi and mpi_f08 (int vs. Type(MPI_Comm) handle). But in reality, they're both the same size under the covers (i.e., a single INTEGER).

So... it should work out. But it's sketchy. And even if it works today, I don't know if we want to rely on that hack.

There's therefore 2 problems here:

  1. How to do this correctly.
  2. How to preserve the mpi_f08 ABI with 1.10.0/1.10.1.

I think that we can implement this the Right way by using an INTERFACE block to make the compiler names be the MPI-mandated names (e.g., MPI_COMM_NULL_COPY_FN), but make the back-end symbol be something else (e.g., ompi_comm_null_copy_fn_f08).

I see that there's an extraneous INTERFACE / END INTERFACE in attr-fn-f08-callback-interfaces.h -- that will likely need to be removed / replaced with individual INTERFACE blocks and names.

The problem then becomes how to maintain the ABI for the v1.10.x series.

Perhaps the hack described above (i.e., just remove BIND(C)'s and let the compiler, linker, and God sort it all out). Since the v1.10.x series has a limited life left in it, that might be Good Enough (although we should put a good comment in the v1.10 fix such that Future Gilles and Future Jeff can remember why we did this, and what the pitfalls are). And then on master / v2.x, we fix this properly (with INTERFACE blocks).

How does that sound?

@ggouaillardet
Copy link
Contributor Author

will it work when mpi_comm_null_copy_fn is passed as a parameter of mpi_comm_create_keyval ?

if yes, then that sounds good to me.
if an app is linked with libmpi_usempif08.so and has a mix of legacy and f08 mpi, then i can imagine legacy MPI will use the f08 implementation of mpi_comm_null_copy_fn.
since this function is pretty dumb, that will hopefully not be an issue

@jsquyres
Copy link
Member

Yes. I.e., the name that mpi_f08 exports to the Fortran application will be MPI_COMM_NULL_COPY_FN, but the symbol used under the covers will be ompi_comm_null_copy_fn_f08.

If an application has both mpif.h/mpi module and mpi_f08 module usage, then there should be 2 different symbols for old-style-MPI_COMM_NULL_COPY_FN (and friends) and new-style-MPI_COMM_NULL_COPY_FN. So there should be no problem.

@ggouaillardet
Copy link
Contributor Author

@jsquyres
here is a simple fortran program i wrote as a proof of concept

module callback
  ABSTRACT INTERFACE
     SUBROUTINE cb(v)
       INTEGER :: v
     END SUBROUTINE cb
  END INTERFACE
end module callback

module f90
contains
  subroutine xxx(v)
    use callback
    implicit none
    INTEGER :: v
    v = 90
  end subroutine xxx
end module f90

module f08
  interface xxx
     subroutine xxx_f08(v)
       INTEGER :: v
     end subroutine xxx_f08
  end interface xxx
end module f08

subroutine xxx_f08(v)
  INTEGER :: v
  v = 2008
end subroutine xxx_f08

subroutine bar(p)
  use callback
  implicit none
  integer :: v
  procedure(cb) :: p
  call p(v)
  write (*,*) v
end subroutine bar

subroutine test_f90
  use callback
  use f90
  implicit none
  integer :: v
  call xxx(v)
  write (*,*) 'test_f90 returns ', v
  call bar(xxx)
end subroutine test_f90

subroutine test_f08
  use callback
  use f08
  implicit none
  integer :: v
  call xxx(v)
  write (*,*) 'test_f08 returns ', v
  !call bar(xxx)
end subroutine test_f08

program test
  implicit none
  call test_f90
  call test_f08
end program test

in f90 module, the symbol of subroutine xxx is xxx_ and in f08, the symbol of subroutine xxx is xxx_f08_. so far so good ...

with fortran 5.3, it compiles and works, but if i uncomment the call to bar in test_f08, then compilation fails

 call bar(xxx)
         1
Error: GENERIC procedure 'xxx' is not allowed as an actual argument at (1)

to me, that makes some kind of sense.
in f08 module, xxx is an interface, so it can be "implemented" by several subroutines but with different signatures (see MPI_Sizeof). consequently, xxx cannot be passed as a parameter of a subroutine, because it is not possible to find at compile time which subroutine should be passed.
(well, in that case, there is only one subroutine and it is xxx_f08, so the compiler could/should know ...
i do not know enough Fortran to tell whether this is a bug or a feature of gfortran)

the only option i know, so the symbol for xxx in f08 is not xxx_ is via BIND(C)
that would work in that case, but it would fail if an argument of xxx was a LOGICAL

there might be an other way, but i am not aware of it

any thoughts ?

and by the way, is there a way to tell Fortran that xxx "implements" cb (e.g. xxx must have the same type that cb) ?

@jsquyres
Copy link
Member

jsquyres commented Feb 1, 2016

Ok, this is a pickle, then.
I think we should take this PR, and let it work "by accident".
We should start up another thread somewhere to figure out if this is actually ok, or if it's a problem in the MPI specification itself...

👍

jsquyres added a commit that referenced this pull request Feb 1, 2016
f08: do not BIND(C) to subroutines with LOGICAL parameters
@jsquyres jsquyres merged commit 910eca7 into open-mpi:master Feb 1, 2016
@jsquyres
Copy link
Member

jsquyres commented Feb 1, 2016

Actually, it looks like MPI-3.1 addresses this issue directly -- see MPI-3.1 p270 5-19:

Advice to implementors. The predefined Fortran functions MPI_COMM_NULL_COPY_FN, MPI_COMM_DUP_FN, and MPI_COMM_NULL_DELETE_FN are defined in the mpi module (and mpif.h) and the mpi_f08 module with the same name, but with different interfaces. Each function can coexist twice with the same name in the same MPI library, one routine as an implicit interface outside of the mpi module, i.e., declared as EXTERNAL, and the other routine within mpi_f08 declared with CONTAINS. These routines have different link names, which are also different to the link names used for the routines used in C. (End of advice to implementors.)

Advice to users. Callbacks, including the predefined Fortran functions MPI_COMM_NULL_COPY_FN, MPI_COMM_DUP_FN, and MPI_COMM_NULL_DELETE_FN should not be passed from one application routine that uses the mpi_f08 module to another application routine that uses the mpi module or mpif.h, and vice versa; see also the advice to users on page 660. (End of advice to users.)

@ggouaillardet Do you want to take a whack at this?

@ggouaillardet
Copy link
Contributor Author

i will have a look at this

@ggouaillardet
Copy link
Contributor Author

@jsquyres i made PR #1337
i complety missed that if a subroutine is defined after CONTAINS the generated symbol is prefixed with the module name ... that made everything almost trivial

jsquyres pushed a commit to jsquyres/ompi that referenced this pull request Sep 19, 2016
coll/hcoll: Fixes predifined types mapping
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants