
Add FP16 datatypes #6205


Merged: 9 commits, Feb 8, 2019

kawashima-fj
Member

@kawashima-fj kawashima-fj commented Dec 19, 2018

This PR adds the following datatypes for FP16 (half-precision floating point) if a compiler supports the corresponding types.

  • MPIX_SHORT_FLOAT for C/C++ short float and _Float16
  • MPIX_C_SHORT_FLOAT_COMPLEX for C short float _Complex
  • MPIX_CXX_SHORT_FLOAT_COMPLEX for C++ std::complex<short float>
  • MPI_REAL2 for Fortran REAL*2 (and REAL(kind=2))
  • MPI_COMPLEX4 for Fortran COMPLEX*4 (and COMPLEX(kind=2))

Datatypes with MPIX_ prefix are available through the MPI extension.

short float is proposed in the C WG and the C++ WG in ISO/IEC.

The background is described in an issue and a slide in the MPI Forum.

MPICH has MPIX_C_FLOAT16. @artpol84 and I are talking with the MPI guys about using the same name.

This PR is still WIP, and I would welcome comments. The following features will be implemented soon.

  • MPI_SIZEOF
  • MPI_MATCH_SIZE
  • MPIX_SHORT_FLOAT in the mpi_f08_ext module

@bosilca Do you have any comments?

@artpol84 @Sergei-Lebedev Could you add your HCOLL support commit to this PR (with your Signed-off-by)?

If there are no problems, I want to merge this PR next month.

@kawashima-fj
Member Author

@bosilca @jsquyres Thanks for the review, but I believe this PR does not break the ABI for MPI programs (mpi.h, mpif.h, mpi.mod, mpi_f08.mod, ...).

The OMPI_DATATYPE_MPI_* macros, whose values I changed, are used as values of ompi_datatype_t::id and as indices into the following arrays.

The value of ompi_datatype_t::id is not exposed to MPI programs.

ompi_datatype_t::d_f_to_c_index, which you are concerned about, is set in the MOOG macro and the ompi/include/mpif-values.pl file. I didn't change any existing values; I only added new ones.

I mentioned the ABI compatibility issue in my commit messages.

On the other hand, this PR breaks the ABI for MCA components (configure --devel-headers). Do we care about that?

I removed the ABI break label. If I missed something, please let me know and re-add the label.

@bosilca
Member

bosilca commented Dec 26, 2018

@kawashima-fj I thought that the OMPI_DATATYPE_MPI_ values must be in sync with the handles in mpif-values.pl. I might be wrong, but I wanted a second pair of eyes on this before we break the Fortran layer.

@kawashima-fj
Member Author

@bosilca OMPI_DATATYPE_MPI_* and mpif-values.pl don't have the same values in master, though mpif-values.pl and MOOG do.

In any case, I'll revert the changes to the values of OMPI_DATATYPE_MPI_* if you and/or the community so desire.

@kawashima-fj
Member Author

The remaining features are now implemented, except for the mpi-f08-ext bindings, which require #6210.

If there are no problems, I want to merge this PR in mid-Jan.

@kawashima-fj
Member Author

bot:ompi:retest

@kawashima-fj
Member Author

I have completed my tests. I'll merge this PR this week unless someone has a negative comment.

Member

@jsquyres jsquyres left a comment


This is a really nice piece of work. Excellent sequence of commits; thank you for breaking it down!

That being said, I echo the concerns discussed in the MPI Forum in Dec 2018: we're basing this off C types that do not yet exist (and may never exist). That's probably ok from the "MPIX" point of view, but this is a ton of code that may get ripped out someday if short float (and friends) and/or MPI_SHORT_FLOAT (and friends) ultimately do not come to fruition.

It would be one thing if this was entirely an MPI extension, but the vast majority of the code is outside of ompi/mpiext because it has to integrate with the datatype and op infrastructure. That gives me a little pause.

I don't have a strong objection to this, especially since some vendors obviously see some benefit from this (and I assume have customers who want it?). But it does... give me pause.

@kawashima-fj
Member Author

@jsquyres Thanks a lot for your review! I replied to your non-trivial review comments. For the trivial ones, I agree with you and I'll update the code.

I also attended the MPI Forum meeting in Dec 2018 from Japan via WebEx. I have the same concern, but at least Fujitsu and Mellanox need FP16 support in Open MPI. I'll delay the merge because I want to hear the OMPI developers' opinions.

@jsquyres
Member

I think the only meaningful thing left from my review was the configure test update. Easily fixed.

Let's put a timeout on getting comments back from other OMPI developers (e.g., about whether we want to add all this code for a type that is not yet standardized) -- this PR has waited quite a long time, mainly because devs [like me] took forever to look at the details. If there's no deadline, people get caught up in other work and miss PRs like this.

@kawashima-fj
Member Author

I propose Feb. 7th as the deadline. OK?

All, could you comment if you have opinions? I am about to merge FP16 (half-precision floating point) datatype support. The corresponding C/C++ types are not yet standardized, but they have been proposed in the ISO/IEC WGs. The background is described in an issue and a slide in the MPI Forum. Links to related pages are listed on my page.

@jsquyres
Member

I forwarded your note to the devel mailing list.

One thing I forgot to ask: what is MPICH doing in terms of MPIX_ for half precision? If possible, it would be nice if our MPIX_ names/meanings could be the same as theirs.

@kawashima-fj
Member Author

kawashima-fj commented Jan 31, 2019

@jsquyres Thanks. I should have mailed the devel list, not only posted on GitHub.
MPICH has MPIX_C_FLOAT16 for C _Float16. It is compatible with this PR.

`MPIX_C_FLOAT16` is defined as a synonym for `MPIX_SHORT_FLOAT`
if the C compiler supports `_Float16`, which is defined in
ISO/IEC JTC 1/SC 22/WG 14 N1945 (ISO/IEC TS 18661-3:2015).
This name and meaning are the same as those of MPICH. This may be
a transitional datatype until the MPI Forum decides on a proper
name for the type.

Signed-off-by: KAWASHIMA Takahiro <[email protected]>
`short float` support of the Intel C++ Compiler (group of C and C++
compilers), at least versions 18.0 and 19.0, is half-baked. It can
compile declarations of `short float` variables and expressions of
`sizeof(short float)` but cannot compile operations of `short float`
variables. In this situation, `AC_CHECK_TYPES(short float)` defines
`HAVE_SHORT_FLOAT` as 1 and compilation errors occur in
`ompi/mca/op/base/op_base_functions.c`. To avoid this error
tentatively, we disable `short float` support when using the Intel
C++ Compiler.

Signed-off-by: KAWASHIMA Takahiro <[email protected]>
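
The failure mode described in the commit message above (declarations and `sizeof` compile, arithmetic does not) suggests that a type-existence check alone is too weak. As a rough sketch only, and not the fix this PR applied (the PR simply disables `short float` for the Intel compiler), a stricter probe could exercise arithmetic and comparison as well. `PROBE_TYPE` and `probe_fp_type` are hypothetical names; the macro defaults to `float` so the file compiles anywhere, and a configure-style test is assumed to override it with the candidate type.

```c
/* Hypothetical knob: the candidate floating-point type under test.
 * Defaults to float so that this sketch compiles on any C compiler. */
#ifndef PROBE_TYPE
#define PROBE_TYPE float
#endif

/* Returns 0 if declaration, arithmetic, and comparison on PROBE_TYPE
 * all compile and give the expected results; nonzero otherwise.
 * The operands are chosen to be exactly representable in FP16. */
int probe_fp_type(void)
{
    /* volatile keeps the compiler from folding the operations away */
    volatile PROBE_TYPE a = (PROBE_TYPE)1.5f;
    volatile PROBE_TYPE b = (PROBE_TYPE)2.5f;

    PROBE_TYPE sum  = a + b;            /* addition */
    PROBE_TYPE prod = a * b;            /* multiplication */
    PROBE_TYPE mx   = (a > b) ? a : b;  /* comparison, as a MAX reduction would use */

    if ((float)sum != 4.0f)   return 1;
    if ((float)prod != 3.75f) return 2;
    if ((float)mx != 2.5f)    return 3;
    return 0;
}
```

Building this with, for example, `-DPROBE_TYPE=_Float16` would be expected to fail at compile time on a compiler that accepts only the declaration, which is the signal a configure test needs.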
@kawashima-fj
Member Author

I updated the PR to reflect @jsquyres's review.

Member

@jsquyres jsquyres left a comment


As mentioned above, I have minor reservations about basing a big chunk of infrastructure on C/C++ datatypes that are not yet standardized. That being said, I'm still overall in favor of this PR.

@kawashima-fj
Member Author

OK, nobody else has commented. I take it that the other OMPI developers either have no negative comments beyond @jsquyres's or don't care. Two developers approved the PR, so I'll merge it.

@kawashima-fj kawashima-fj merged commit 8bbd201 into open-mpi:master Feb 8, 2019
@artpol84
Contributor

@kawashima-fj Thanks for the great work!

@artpol84
Contributor

@raffenet FYI

@ggouaillardet
Contributor

@kawashima-fj this PR broke Open MPI compilation on OS-X

the root cause is that there is no object file in ompi/mpiext/shortfloat/* and OSX refuses to create an empty archive (linux has no issue with that)

$ make V=1
Making all in c
/bin/sh ../../../../libtool  --tag=CC   --mode=link gcc-8  -g -Wall -Wundef -Wno-long-long -Wsign-compare -Wmissing-prototypes -Wstrict-prototypes -Wcomment -pedantic -Werror-implicit-function-declaration -finline-functions -fno-strict-aliasing -mcx16  -module -avoid-version -Wl,-flat_namespace  -o libmpiext_shortfloat_c.la    -lz 
libtool: link: ar cru .libs/libmpiext_shortfloat_c.a 
ar: no archive members specified
usage:  ar -d [-TLsv] archive file ...
	ar -m [-TLsv] archive file ...
	ar -m [-abiTLsv] position archive file ...
	ar -p [-TLsv] archive [file ...]
	ar -q [-cTLsv] archive file ...
	ar -r [-cuTLsv] archive file ...
	ar -r [-abciuTLsv] position archive file ...
	ar -t [-TLsv] archive [file ...]
	ar -x [-ouTLsv] archive [file ...]
make[1]: *** [libmpiext_shortfloat_c.la] Error 1
make: *** [all-recursive] Error 1

@jsquyres do you know an elegant way to fix this?

@jsquyres
Member

I'm looking at https://github.com/open-mpi/ompi/blob/master/ompi/mpiext/shortfloat/c/Makefile.am and I don't see any .c files listed. Is that correct?

Is that .la file there solely because the mpiext system requires a .la file?

@kawashima-fj
Member Author

kawashima-fj commented Feb 25, 2019

@jsquyres Yes. The extension is required only for header and module files but the mpiext system requires .la files.

The OMPI_EXT_MAKE_LISTS macro in config/ompi_ext.m4 adds ompi/mpiext/COMPONENT/BINDING/libmpiext_COMPONENT_BINDING.la to the list of the OMPI_MPIEXT_C_LIBS output variable and ompi/Makefile.am uses the output variable. Removing .la from ompi/mpiext/shortfloat/c/Makefile.am causes the following make error.

make[2]: Entering directory '/home/tkawa/src/openmpi-master/build/ompi'
make[2]: *** No rule to make target '../ompi/mpiext/shortfloat/c/libmpiext_shortfloat_c.la', needed by 'libmpi.la'.  Stop.

I could not find an elegant way. The pcollreq extension (for MPIX_-prefixed persistent collectives) has a dummy function. I can take the same approach in this extension to work around the error.

@ggouaillardet This extension is not built and the error does not occur unless a C FP16 type (_Float16) is usable or it is explicitly enabled by --enable-alt-short-float=.... Did you enable it by --enable-alt-short-float=...?

@ggouaillardet
Contributor

@kawashima-fj I was using OS X Mojave (x86 arch) with the default clang compiler LLVM version 10.0.0 (clang-1000.11.45.5)

to my surprise, this compiler does support _Float16 out of the box (fwiw, it does not support short float)

a simple workaround is to add some C files with a dummy global subroutine or global variable.
The right fix is likely not to generate a library in such a case, but since this extension is aimed at landing in the main codebase, we might simply want to take the above shortcut for now. Please let me know if you want me to issue a PR for that.
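
A minimal sketch of the dummy-symbol workaround mentioned above, assuming a placeholder source file is added to the extension's Makefile.am so that the archive contains at least one object file. The function name `mpiext_shortfloat_dummy` is hypothetical and is not taken from the actual fix.

```c
/* Placeholder translation unit (hypothetical): its only purpose is to
 * give the otherwise header-only extension one object file, so that
 * "ar" is never asked to create an empty archive. */

/* A separate prototype keeps -Wmissing-prototypes quiet; that warning
 * is enabled in the build log above. */
int mpiext_shortfloat_dummy(void);

int mpiext_shortfloat_dummy(void)
{
    return 0;
}
```

A dummy global variable would serve the same purpose; a function is used here only because the thread mentions that the pcollreq extension already takes that route.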

kawashima-fj added a commit to kawashima-fj/ompi that referenced this pull request Feb 25, 2019
These dummy functions are required for the following reason.

- The `libmpiext_shortfloat_{c,mpifh,usempif08}.la` files must
  be built because the `OMPI_EXT_MAKE_LISTS` macro in the
  `config/ompi_ext.m4` file adds the files to the lists of the
  `OMPI_MPIEXT_{C,MPIFH,USEMPIF08}_LIBS` output variables and the
  following files use the output variable.
    * `ompi/Makefile.am`
    * `ompi/mpi/fortran/mpif-h/Makefile.am`
    * `ompi/mpi/fortran/use-mpi-f08/Makefile.am`
- The ar command of OS X refuses to create an archive file which
  does not contain any object files.

The `usempi` binding is not affected because `OMPI_MPIEXT_USEMPIF_LIBS`
is not used anywhere by nature. Generally it only includes `mpifh`.

See open-mpi#6205 (comment)

Signed-off-by: KAWASHIMA Takahiro <[email protected]>
@kawashima-fj
Member Author

kawashima-fj commented Feb 25, 2019

@ggouaillardet OK, I see. LLVM (Clang) 6 and 7 support _Float16 even on no-FP16 CPUs. This will be amended in the upcoming LLVM 8.

I've created the shortcut in #6429.

ggouaillardet added a commit to ggouaillardet/ompi that referenced this pull request Feb 25, 2019
if NOLIB_<component> or NOLIB_<component>_<suffix> is set, do not require
ompi/mpiext/<component>/<lang>/libmpiext_<component>_<suffix>.la

Allow some extensions to be built on OS X since the creation of
archives with no files is not permitted.

Refs. open-mpi#6205

Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet added a commit to ggouaillardet/ompi that referenced this pull request Feb 25, 2019
the shortfloat extension is only made of header files,
and hence do not require a library to be built.

Refs. open-mpi#6205

Signed-off-by: Gilles Gouaillardet <[email protected]>
ggouaillardet added a commit to ggouaillardet/ompi that referenced this pull request Feb 26, 2019
Do not require an archive when the OMPI_MPIEXT_<ext>_HAVE_OBJECT
macro is defined to 0.
See `ompi/mpiext/example/configure.m4`.

Allow some extensions to be built on OS X since the creation of
archives with no files is not permitted.

Refs. open-mpi#6205

Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: KAWASHIMA Takahiro <[email protected]>
ggouaillardet added a commit to ggouaillardet/ompi that referenced this pull request Feb 26, 2019
the shortfloat extension is only made of header files,
and hence do not require a library to be built.

Refs. open-mpi#6205

Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: KAWASHIMA Takahiro <[email protected]>
jsquyres pushed a commit to ggouaillardet/ompi that referenced this pull request Apr 18, 2019
Do not require an archive when the OMPI_MPIEXT_<ext>_HAVE_OBJECT
macro is defined to 0.
See `ompi/mpiext/example/configure.m4`.

Allow some extensions to be built on OS X since the creation of
archives with no files is not permitted.

Refs. open-mpi#6205

Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: KAWASHIMA Takahiro <[email protected]>
Signed-off-by: Jeff Squyres <[email protected]>
jsquyres pushed a commit to ggouaillardet/ompi that referenced this pull request Apr 18, 2019
the shortfloat extension is only made of header files,
and hence do not require a library to be built.

Refs. open-mpi#6205

Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: KAWASHIMA Takahiro <[email protected]>
@jladd-mlnx
Member

@kawashima-fj Congrats on your HPL-AI score 🥇 !! Out of curiosity, did you use this code in your exaflop busting run?

@kawashima-fj
Member Author

@jladd-mlnx Thank you. We are proud of the awards achieved with Open MPI-based Fujitsu MPI. HPL-AI for Fugaku was developed by RIKEN, and I don't know the details. I asked some people at Fujitsu and RIKEN, but nobody has the answer. My colleague will contact a developer of Fugaku HPL-AI. When I find out, I'll share it.

@kawashima-fj
Member Author

@jladd-mlnx - @Shinji-Sumimoto had contact with developers of Fugaku HPL-AI. They used Fujitsu MPI which is based on this code but did not use this FP16 MPI datatype.

They said they first tried to communicate FP16 data as unsigned short using MPI because they wanted to compile the same code on no-FP16 machines. Later they rewrote the code to use a low-level communication API (uTofu) for communication performance tuning.
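
To illustrate the "FP16 as unsigned short" approach in general terms, the sketch below stores half-precision values as raw IEEE 754 binary16 bit patterns in `uint16_t`, so the code compiles on machines without any FP16 type. This is an assumption-laden illustration, not the Fugaku HPL-AI code; the function names `float_to_half_bits` and `half_bits_to_float` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "IEEE 754 binary32 float assumed");

/* Encode a float as a binary16 bit pattern (round to nearest, ties to even). */
uint16_t float_to_half_bits(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);
    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t mag = x & 0x7FFFFFFFu;

    if (mag > 0x7F800000u)                /* NaN: return a quiet NaN */
        return (uint16_t)(sign | 0x7E00u);
    if (mag >= 0x47800000u)               /* >= 2^16 or infinity: overflow */
        return (uint16_t)(sign | 0x7C00u);
    if (mag >= 0x38800000u) {             /* normal half range (>= 2^-14) */
        uint32_t v = mag - 0x38000000u;   /* rebias exponent from 127 to 15 */
        v += 0x0FFFu + ((v >> 13) & 1u);  /* round to nearest, ties to even */
        return (uint16_t)(sign | (v >> 13));
    }
    if (mag < 0x33000000u)                /* below 2^-25: rounds to zero */
        return sign;

    /* Subnormal half: express the value in units of 2^-24. */
    uint32_t e = mag >> 23;               /* 102..112 on this path */
    uint32_t m = (mag & 0x007FFFFFu) | 0x00800000u;
    uint32_t shift = 126u - e;            /* 14..24 */
    uint32_t q = m >> shift;
    uint32_t rem = m & ((1u << shift) - 1u);
    uint32_t halfway = 1u << (shift - 1u);
    if (rem > halfway || (rem == halfway && (q & 1u)))
        q++;
    return (uint16_t)(sign | q);
}

/* Decode a binary16 bit pattern into a float (always exact). */
float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {                   /* infinity or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {                /* normal */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {                /* signed zero */
        bits = sign;
    } else {                              /* subnormal: renormalize */
        uint32_t e = 113u;
        while ((man & 0x400u) == 0) {
            man <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

A buffer of such values can be moved with the standard `MPI_UNSIGNED_SHORT` datatype. Reductions such as `MPI_SUM` on that datatype would perform integer arithmetic on the bit patterns, so this approach suits data movement only, which is one reason a dedicated FP16 datatype is useful.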

@jladd-mlnx
Member

@kawashima-fj , @Shinji-Sumimoto - Thank you very much for your detailed response; it makes perfect sense. Again, congratulations on your HPL and HPL-AI scores.
