Hi, I've been unable to start MPI jobs under Slurm reservations with the latest main.
I'm under salloc -N 1 -n 48, and the message is:
$ mpirun -n 1 hostname
--------------------------------------------------------------------------
Your job failed to map. Either no mapper was available, or none
of the available mappers was able to perform the requested
mapping operation.
Mapper result: Out of resource
Application: hostname
#procs to be mapped: 1
Mapping policy: BYSLOT
Binding policy: CORE
--------------------------------------------------------------------------
This happens with 1 node as well as with 2 nodes in the reservation. It also fails with 5.0.x, but with 5.0.0rc8 everything works. It does not happen when running outside of Slurm.
I tried to chase it down a bit:
$ mpirun -n 1 --prtemca rmaps_base_verbose 10 --display alloc --output tag hostname
[deepv:02593] mca: base: component_find: searching NULL for rmaps components
[deepv:02593] mca: base: find_dyn_components: checking NULL for rmaps components
[deepv:02593] pmix:mca: base: components_register: registering framework rmaps components
[deepv:02593] pmix:mca: base: components_register: found loaded component ppr
[deepv:02593] pmix:mca: base: components_register: component ppr register function successful
[deepv:02593] pmix:mca: base: components_register: found loaded component rank_file
[deepv:02593] pmix:mca: base: components_register: component rank_file has no register or open function
[deepv:02593] pmix:mca: base: components_register: found loaded component round_robin
[deepv:02593] pmix:mca: base: components_register: component round_robin register function successful
[deepv:02593] pmix:mca: base: components_register: found loaded component seq
[deepv:02593] pmix:mca: base: components_register: component seq register function successful
[deepv:02593] [prterun-deepv-2593@0,0] rmaps:base set policy with slot
[deepv:02593] mca: base: components_open: opening rmaps components
[deepv:02593] mca: base: components_open: found loaded component ppr
[deepv:02593] mca: base: components_open: component ppr open function successful
[deepv:02593] mca: base: components_open: found loaded component rank_file
[deepv:02593] mca: base: components_open: found loaded component round_robin
[deepv:02593] mca: base: components_open: component round_robin open function successful
[deepv:02593] mca: base: components_open: found loaded component seq
[deepv:02593] mca: base: components_open: component seq open function successful
[deepv:02593] mca:rmaps:select: checking available component ppr
[deepv:02593] mca:rmaps:select: Querying component [ppr]
[deepv:02593] mca:rmaps:select: checking available component rank_file
[deepv:02593] mca:rmaps:select: Querying component [rank_file]
[deepv:02593] mca:rmaps:select: checking available component round_robin
[deepv:02593] mca:rmaps:select: Querying component [round_robin]
[deepv:02593] mca:rmaps:select: checking available component seq
[deepv:02593] mca:rmaps:select: Querying component [seq]
[deepv:02593] [prterun-deepv-2593@0,0]: Final mapper priorities
[deepv:02593] Mapper: rank_file Priority: 100
[deepv:02593] Mapper: ppr Priority: 90
[deepv:02593] Mapper: seq Priority: 60
[deepv:02593] Mapper: round_robin Priority: 10
====================== ALLOCATED NODES ======================
dp-dam01: slots=48 max_slots=0 slots_inuse=0 state=UP
Flags: DAEMON_LAUNCHED:SLOTS_GIVEN
aliases: 10.2.10.41,10.2.17.81
=================================================================
[deepv:02593] mca:rmaps: mapping job prterun-deepv-2593@1
[deepv:02593] mca:rmaps: setting mapping policies for job prterun-deepv-2593@1 inherit TRUE hwtcpus FALSE
[deepv:02593] mca:rmaps mapping given by MCA param
[deepv:02593] mca:rmaps[540] default binding policy given
[deepv:02593] mca:rmaps:rf: job prterun-deepv-2593@1 not using rankfile policy
[deepv:02593] mca:rmaps:ppr: job prterun-deepv-2593@1 not using ppr mapper PPR NULL policy PPR NOTSET
[deepv:02593] [prterun-deepv-2593@0,0] rmaps:seq called on job prterun-deepv-2593@1
[deepv:02593] mca:rmaps:seq: job prterun-deepv-2593@1 not using seq mapper
[deepv:02593] mca:rmaps:rr: mapping job prterun-deepv-2593@1
[deepv:02593] [prterun-deepv-2593@0,0] Starting with 1 nodes in list
[deepv:02593] [prterun-deepv-2593@0,0] Filtering thru apps
[deepv:02593] [prterun-deepv-2593@0,0] Retained 1 nodes in list
[deepv:02593] [prterun-deepv-2593@0,0] node dp-dam01 has 48 slots available
[deepv:02593] AVAILABLE NODES FOR MAPPING:
[deepv:02593] node: dp-dam01 daemon: 1 slots_available: 48
[deepv:02593] mca:rmaps:rr: mapping by slot for job prterun-deepv-2593@1 slots 48 num_procs 1
[deepv:02593] mca:rmaps:rr:slot working node dp-dam01
[deepv:02593] [prterun-deepv-2593@0,0] get_avail_ncpus: node dp-dam01 has 0 procs on it
[deepv:02593] mca:rmaps:rr:slot job prterun-deepv-2593@1 is oversubscribed - performing second pass
[deepv:02593] mca:rmaps:rr:slot working node dp-dam01
[deepv:02593] [prterun-deepv-2593@0,0] get_avail_ncpus: node dp-dam01 has 0 procs on it
--------------------------------------------------------------------------
Your job failed to map. Either no mapper was available, or none
of the available mappers was able to perform the requested
mapping operation.
Mapper result: Out of resource
Application: hostname
#procs to be mapped: 1
Mapping policy: BYSLOT
Binding policy: CORE
--------------------------------------------------------------------------
It looks to me like the failure starts because prte_rmaps_base_get_ncpus()
returns 0. These debug prints:
diff --git a/src/mca/rmaps/base/rmaps_base_support_fns.c b/src/mca/rmaps/base/rmaps_base_support_fns.c
index 8a2974a90f..c345c2e727 100644
--- a/src/mca/rmaps/base/rmaps_base_support_fns.c
+++ b/src/mca/rmaps/base/rmaps_base_support_fns.c
@@ -668,6 +668,7 @@ int prte_rmaps_base_get_ncpus(prte_node_t *node,
int ncpus;
#if HWLOC_API_VERSION < 0x20000
+ printf("HWLOC_API_VERSION < 0x20000\n");
hwloc_obj_t root;
root = hwloc_get_root_obj(node->topology->topo);
if (NULL == options->job_cpuset) {
@@ -679,6 +680,7 @@ int prte_rmaps_base_get_ncpus(prte_node_t *node,
hwloc_bitmap_and(prte_rmaps_base.available, prte_rmaps_base.available, obj->allowed_cpuset);
}
#else
+ printf("HWLOC_API_VERSION >= 0x20000\n");
if (NULL == options->job_cpuset) {
hwloc_bitmap_copy(prte_rmaps_base.available, hwloc_topology_get_allowed_cpuset(node->topology->topo));
} else {
diff --git a/src/mca/rmaps/round_robin/rmaps_rr_mappers.c b/src/mca/rmaps/round_robin/rmaps_rr_mappers.c
index 484449ce7a..b3e631fea6 100644
--- a/src/mca/rmaps/round_robin/rmaps_rr_mappers.c
+++ b/src/mca/rmaps/round_robin/rmaps_rr_mappers.c
@@ -123,6 +123,7 @@ pass:
* the user didn't specify a required binding, then we set
* the binding policy to do-not-bind for this node */
ncpus = prte_rmaps_base_get_ncpus(node, NULL, options);
+ printf("prte_rmaps_base_get_ncpus() = %d\n", ncpus);
if (options->nprocs > ncpus &&
options->nprocs <= node->slots_available &&
!PRTE_BINDING_POLICY_IS_SET(jdata->map->binding)) {
Produce:
prte_rmaps_base_get_ncpus() = 0
HWLOC_API_VERSION >= 0x20000
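For reference, one way to see what hwloc itself reports as usable inside the allocation is a small standalone probe like the sketch below. This is not part of PRRTE; the file name, build line, and output format are made up for illustration, and it assumes hwloc is installed on the compute node. It simply loads the local topology and compares the machine's complete cpuset against the allowed cpuset, which is the set the hwloc >= 2.0 branch of prte_rmaps_base_get_ncpus() starts from in the diff above. If the allowed cpuset comes back empty (weight 0) when run on the compute node inside the salloc/srun step, that would point at the Slurm-imposed cgroup/cpuset restriction rather than at the mapper itself.

/* cpuset_check.c - hypothetical standalone probe, not part of PRRTE.
 * Build (assuming hwloc is available):  cc cpuset_check.c -o cpuset_check -lhwloc
 * Run on the compute node inside the allocation, e.g.:  srun -n 1 ./cpuset_check
 */
#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_obj_t root;
    char *complete_str = NULL, *allowed_str = NULL;

    if (hwloc_topology_init(&topo) < 0 || hwloc_topology_load(topo) < 0) {
        fprintf(stderr, "failed to load hwloc topology\n");
        return EXIT_FAILURE;
    }

    root = hwloc_get_root_obj(topo);

    /* complete_cpuset: all PUs physically present in the machine */
    hwloc_bitmap_asprintf(&complete_str, root->complete_cpuset);
    /* allowed cpuset: what the OS/cgroup lets this process use; this is the
     * set prte_rmaps_base_get_ncpus() copies when no job cpuset is given */
    hwloc_bitmap_asprintf(&allowed_str, hwloc_topology_get_allowed_cpuset(topo));

    printf("complete cpuset: %s (weight %d)\n", complete_str,
           hwloc_bitmap_weight(root->complete_cpuset));
    printf("allowed  cpuset: %s (weight %d)\n", allowed_str,
           hwloc_bitmap_weight(hwloc_topology_get_allowed_cpuset(topo)));

    free(complete_str);
    free(allowed_str);
    hwloc_topology_destroy(topo);
    return EXIT_SUCCESS;
}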