I've just set up a new machine with Debian 13 Trixie, which includes OpenMPI 5.0.7-1 and gfortran 14.2.0. The hardware is an AMD Ryzen 7840U.
I'm developing the NASA GISS ModelE GCM, and the MPI_INIT call (call MPI_INIT(rc)) is causing a floating-point exception. The backtrace is below; frame 18, at model/MPI_Support/dist_grid_mod.F90:277, is that call.
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
#0 0x1555550232ba in ???
#1 0x155555022375 in ???
#2 0x155554d59def in ???
#3 0x155554408b43 in ???
#4 0x1555543c1892 in ???
#5 0x15555439cec4 in ???
#6 0x1555554fcc61 in ???
#7 0x1555542bd03b in ???
#8 0x1555542af328 in ???
#9 0x155553850789 in ???
#10 0x155553851163 in ???
#11 0x15555385b2d9 in ???
#12 0x15555469741b in ???
#13 0x15555469b429 in ???
#14 0x15555469c187 in ???
#15 0x15555469371f in ???
#16 0x1555546c449e in ???
#17 0x15555542bbc9 in ???
#18 0x5555564fb94d in __dist_grid_mod_MOD_init_app
at model/MPI_Support/dist_grid_mod.F90:277
#19 0x5555557cd185 in initializemodele
at model/MODELE.f:588
#20 0x5555557ca97c in giss_modele_
at model/MODELE.f:234
#21 0x5555557c4971 in modele_maindriver_
at model/MODELE_DRV.f:27
#22 0x555555560b46 in MAIN__
at model/main.F90:2
#23 0x555555560b96 in main
at model/main.F90:3
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 500573 on node fw13 exited on
signal 8 (Floating point exception).
--------------------------------------------------------------------------
[Thread 0x1555542ff6c0 (LWP 500572) exited]
[Thread 0x1555545006c0 (LWP 500571) exited]
[Inferior 1 (process 500568) exited with code 0210]
(gdb)
If I turn off FPE checking around that line, the model runs:
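One way such a guard might look is sketched below. This is not the ModelE code, only a minimal illustration assuming the standard ieee_exceptions intrinsic module is used to switch trapping off and back on. The subroutine guarded_init is a hypothetical stand-in for call MPI_INIT(rc), so the sketch builds without MPI.

```fortran
program fpe_guard_sketch
  use, intrinsic :: ieee_exceptions
  implicit none

  ! One saved halting mode per exception flag in ieee_all.
  logical :: saved_halting(size(ieee_all))
  integer :: rc

  ! Remember which exceptions currently trap, then disable trapping.
  call ieee_get_halting_mode(ieee_all, saved_halting)
  call ieee_set_halting_mode(ieee_all, .false.)

  ! Hypothetical stand-in for: call MPI_INIT(rc)
  call guarded_init(rc)

  ! Clear flags raised inside the call, then restore the saved modes.
  call ieee_set_flag(ieee_all, .false.)
  call ieee_set_halting_mode(ieee_all, saved_halting)

  print *, 'init returned rc =', rc

contains

  ! Assumption: mimics a library routine that performs an operation
  ! which would raise SIGFPE if divide-by-zero trapping were enabled.
  subroutine guarded_init(ierr)
    integer, intent(out) :: ierr
    real, volatile :: zero, x
    zero = 0.0
    x = 1.0 / zero
    ierr = 0
  end subroutine guarded_init

end program fpe_guard_sketch
```

Saving and restoring the halting modes, rather than unconditionally re-enabling them, keeps whatever trapping the build flags (for example gfortran's -ffpe-trap) had set up for the rest of the program.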
This system has new hardware, a new OS, and updated gfortran and OpenMPI. To isolate the issue, I've done the following:
Tested on this new hardware with the old dev environment in Docker: Debian 12 Bookworm, gfortran 12, OpenMPI 4.x. No issue -> not hardware.
Tested with gfortran-12 on this OS, installed with apt install gfortran-12, after adjusting the Makefile. I assume this uses the same system-installed OpenMPI 5.x. Issue exists -> not gfortran.
Disabling FPE checking just for the MPI_INIT line above suggests this may be related to the new OpenMPI.
I am sorry, but I am unable to create an MWE. If someone wants to test this, I can help set up a dev environment. The latest GCM code is the last link at https://simplex.giss.nasa.gov/snapshots/ and the system can run in Docker (see https://github.com/nasa-giss/docker).