coll/han: dynamic selection does not work for simple algorithms #9883

Open
@mkurnosov

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

9704f0f (master)

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

4d07260d9f79bb7f328b1fc9107b45e683cf2c4e ../../../../3rd-party/openpmix (v1.1.3-3319-g4d07260d)
9ac0b7ecee2c97c357bf6751fdaab7a10e62df14 ../../../../3rd-party/prrte (psrvr-v2.0.0rc1-4133-g9ac0b7ec)

Please describe the system on which you are running

  • Operating system/version: Linux 4.16.3-301.fc28.x86_64
  • Computer hardware:
  • Network type: InfiniBand

Details of the problem

Dynamic selection via MCA parameters does not work for the simple algorithms. A simple algorithm (coll_han_use_simple_<op>) splits the global communicator into intra-node and inter-node sub-communicators on which the HAN component is disabled (see mca_coll_han_comm_create()):

opal_info_set(&comm_info, "ompi_comm_coll_preference", "tuned,^han");

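As a result, on the sub-communicators the simple algorithm uses the collective operation from whichever component has the highest priority, rather than from the component requested through the dynamic selection parameters.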

In the following example we want to select Bcast from the tuned component for both intra-node and inter-node communication. Instead, the simple algorithm calls Bcast from the basic component, which has the highest priority among the components left once HAN is disabled.

mpiexec --host cn2:8,cn3:8,cn4:8,cn5:8,cn6:8 --n 40 \
        --map-by core --bind-to core --mca pml ucx \
        --mca coll_basic_priority 90 \
        --mca coll_libnbc_priority 10 \
        --mca coll_adapt_priority 0 \
        --mca coll_sm_priority 0 \
        --mca coll_han_priority 100 \
        --mca coll_han_bcast_dynamic_intra_node_module 4 \
        --mca coll_han_bcast_dynamic_inter_node_module 4 \
        --mca coll_han_use_simple_bcast 1 \
        ./bcast_test
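
To illustrate why basic ends up being used with the command above, here is a toy model of "highest priority wins once HAN is excluded". It is only a sketch and not Open MPI's selection code: `toy_component`, `toy_select`, and `toy_components` are hypothetical names, and the priority 30 for tuned is an assumed placeholder, since the command line does not set it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a collective component: a name and its priority. */
struct toy_component {
    const char *name;
    int priority;
};

/*
 * Return the highest-priority component whose name differs from `excluded`
 * (pass NULL to exclude nothing), or NULL if no component qualifies.
 * This only models "the highest-priority remaining component wins";
 * it does not look at any dynamic selection parameter.
 */
const struct toy_component *
toy_select(const struct toy_component *comps, size_t n, const char *excluded)
{
    const struct toy_component *best = NULL;

    assert(comps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (excluded != NULL && strcmp(comps[i].name, excluded) == 0) {
            continue; /* component disabled on this communicator */
        }
        if (best == NULL || comps[i].priority > best->priority) {
            best = &comps[i];
        }
    }
    return best;
}

/* Priorities taken from the mpiexec command in this report. */
const struct toy_component toy_components[] = {
    { "han",    100 },
    { "basic",   90 },
    { "tuned",   30 }, /* assumption: not set on the command line */
    { "libnbc",  10 },
    { "adapt",    0 },
    { "sm",       0 },
};
```

In this model, `toy_select(toy_components, 6, NULL)` picks han, while `toy_select(toy_components, 6, "han")` picks basic. That matches the behaviour described above: on the sub-communicators, where HAN is disabled, priority alone decides and tuned is never reached.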
