Skip to content

Atomics broken on ARMv5 #1162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jsquyres opened this issue Nov 30, 2015 · 21 comments
Closed

Atomics broken on ARMv5 #1162

jsquyres opened this issue Nov 30, 2015 · 21 comments
Assignees
Labels
Milestone

Comments

@jsquyres
Copy link
Member

Per @pterjan note on fe09355:

This breaks openmpi builds in other cases as it is used in various places:

ompi/mca/osc/portals4/osc_portals4_active_target.c: OPAL_THREAD_ADD64(&module->opcount, 1);
ompi/mca/osc/portals4/osc_portals4_active_target.c: OPAL_THREAD_ADD64(&module->opcount, 1);
ompi/mca/osc/pt2pt/osc_pt2pt_passive_target.c: lock->serial_number = OPAL_THREAD_ADD64((int64_t ) &module->lock_serial_number, 1);
ompi/mca/btl/portals4/btl_portals4.c: frag->segments[0].key = OPAL_THREAD_ADD64(&(portals4_btl->portals_rdma_key), 1);
ompi/mca/btl/portals4/btl_portals4.c: frag->segments[0].key = OPAL_THREAD_ADD64(&(portals4_btl->portals_rdma_key), 1);
ompi/mca/mtl/portals4/mtl_portals4_recv.c: ptl_request->opcount = OPAL_THREAD_ADD64((int64_t) &ompi_mtl_portals4.recv_opcount, 1);
ompi/mca/mtl/portals4/mtl_portals4_send.c: ptl_request->opcount = OPAL_THREAD_ADD64((int64_t*)&ompi_mtl_portals4.opcount, 1);

When compiling on armv5 I get:

/home/iurt/rpmbuild/BUILD/openmpi-1.10.0/ompi/.libs/libmpi.so: undefined reference to `OPAL_THREAD_ADD64'
@jsquyres jsquyres added the bug label Nov 30, 2015
@jsquyres jsquyres added this to the v2.0.0 milestone Nov 30, 2015
@hjelmn hjelmn self-assigned this Dec 2, 2015
@hjelmn
Copy link
Member

hjelmn commented Dec 2, 2015

My guess is osc/pt2pt is to blame. I will update it in a bit.

@hppritcha hppritcha modified the milestones: v2.1.0, v2.0.0 Dec 10, 2015
@pterjan
Copy link

pterjan commented Dec 15, 2015

This is unfortunately only true on arm > v6.
I guess I should disable threads on armv5 as there is no smp support
anyway, a patch would still be needed to allow using the simple version
when OMPI_ENABLE_THREAD_MULTIPLE is not enabled:

--- opal/threads/thread_usage.h 2015-12-15 00:10:58.412181433 +0000
+++ opal/threads/thread_usage.h.thr 2015-12-15 00:10:41.521798534 +0000
@@ -126,8 +126,8 @@
}
#endif

-#if OPAL_HAVE_ATOMIC_MATH_64
#if OMPI_ENABLE_THREAD_MULTIPLE
+#if OPAL_HAVE_ATOMIC_MATH_64
static inline int64_t
OPAL_THREAD_ADD64(volatile int64_t *addr, int delta)
{
@@ -141,6 +141,7 @@

 return ret;

}
+#endif
#else
static inline int64_t
OPAL_THREAD_ADD64(volatile int64_t *addr, int delta)
@@ -150,7 +151,6 @@
return ret;
}
#endif
-#endif

#if OMPI_ENABLE_THREAD_MULTIPLE
static inline size_t

On 3 December 2015 at 19:38, Todd Kordenbrock [email protected]
wrote:

I don't have an armv5 to test on, but opal_atomic_cmpset_64 is defined in
ARM.asm which means the fallback implementation of opal_atomic_add_64 in
atomic_impl.h should work. And the definition of OPAL_THREAD_ADD64 in
mutex.h really only needs to be protected by OPAL_HAVE_ATOMIC_ADD_64. So I
think this patch should fix it.

diff --git a/opal/threads/mutex.h b/opal/threads/mutex.h
index d10eb93..b7f761b 100644
--- a/opal/threads/mutex.h
+++ b/opal/threads/mutex.h
@@ -280,7 +280,7 @@ OPAL_THREAD_ADD32(volatile int32_t *addr, int delta)
return ret;
}

-#if OPAL_HAVE_ATOMIC_MATH_64
+#if OPAL_HAVE_ATOMIC_ADD_64
static inline int64_t
OPAL_THREAD_ADD64(volatile int64_t *addr, int delta)
{


Reply to this email directly or view it on GitHub
#1162 (comment).

@jsquyres
Copy link
Member Author

Per discussion 24 Feb 2016, @hjelmn will look at this.

@amckinstry
Copy link

Is there a pointer to the discussion? I've been debugging this in Debian
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=816303) where it breaks mpi4py on armel, mips, mipsel, powerpc (ie. basically any 32-bit platform).

@hjelmn
Copy link
Member

hjelmn commented Mar 8, 2016

It was in a conference call. Planning to fix this today.

@amckinstry
Copy link

I've been testing a patch (attached).
Can you review it? appears to work where I've tested it.

regards
Alastair

On 08/03/2016 16:20, Nathan Hjelm wrote:

It was in a conference call. Planning to fix this today.


Reply to this email directly or view it on GitHub
#1162 (comment).

Alastair McKinstry, [email protected], [email protected], https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.

@amckinstry
Copy link

Apologies for the inline patch; the Debian repo is in subversion :-(

--- openmpi-1.10.2.orig/ompi/mca/btl/portals4/btl_portals4.c
+++ openmpi-1.10.2/ompi/mca/btl/portals4/btl_portals4.c
@@ -319,7 +319,11 @@ mca_btl_portals4_prepare_src(struct mca_

         frag->segments[0].base.seg_len = max_data;
         frag->segments[0].base.seg_addr.pval = iov.iov_base;
+#ifdef OPAL_HAVE_ATOMIC_MATH_64
         frag->segments[0].key = OPAL_THREAD_ADD64(&(portals4_btl->portals_rdma_key), 1);
+#else
+        frag->segments[0].key = OPAL_THREAD_ADD32(&(portals4_btl->portals_rdma_key), 1);
+#endif
         frag->base.des_src_cnt = 1;

         /* either a put or get.  figure out which later */
@@ -405,7 +409,11 @@ mca_btl_portals4_prepare_dst(struct mca_

     frag->segments[0].base.seg_len = *size;
     opal_convertor_get_current_pointer( convertor, (void**)&(frag->segments[0].base.seg_addr.pval) );
+#ifdef OPAL_HAVE_ATOMIC_MATH_64
     frag->segments[0].key = OPAL_THREAD_ADD64(&(portals4_btl->portals_rdma_key), 1);
+#else
+    frag->segments[0].key = OPAL_THREAD_ADD32(&(portals4_btl->portals_rdma_key), 1);
+#endif
     frag->base.des_src = NULL;
     frag->base.des_src_cnt = 0;
     frag->base.des_dst = &frag->segments[0].base;
--- openmpi-1.10.2.orig/ompi/mca/mtl/portals4/mtl_portals4_send.c
+++ openmpi-1.10.2/ompi/mca/mtl/portals4/mtl_portals4_send.c
@@ -411,7 +411,11 @@ ompi_mtl_portals4_send_start(struct mca_
     ret = ompi_mtl_datatype_pack(convertor, &start, &length, &free_after);
     if (OMPI_SUCCESS != ret) return ret;

+#ifdef OPAL_HAVE_ATOMIC_MATH_64
     ptl_request->opcount = OPAL_THREAD_ADD64((int64_t*)&ompi_mtl_portals4.opcount, 1);
+#else
+    ptl_request->opcount = OPAL_THREAD_ADD32((int64_t*)&ompi_mtl_portals4.opcount, 1);
+#endif
     ptl_request->buffer_ptr = (free_after) ? start : NULL;
     ptl_request->event_count = 0;
--- openmpi-1.10.2.orig/ompi/mca/osc/portals4/osc_portals4_active_target.c
+++ openmpi-1.10.2/ompi/mca/osc/portals4/osc_portals4_active_target.c
@@ -102,7 +102,11 @@ ompi_osc_portals4_complete(struct ompi_w
                             PTL_SUM,
                             PTL_INT32_T);
             if (ret != OMPI_SUCCESS) return ret;
+#ifdef OPAL_HAVE_ATOMIC_MATH_64
             OPAL_THREAD_ADD64(&module->opcount, 1);
+#else
+            OPAL_THREAD_ADD32(&module->opcount, 1);
+#endif
         }

         ret = ompi_osc_portals4_complete_all(module);
@@ -151,7 +155,11 @@ ompi_osc_portals4_post(struct ompi_group
                             PTL_SUM,
                             PTL_INT32_T);
             if (ret != OMPI_SUCCESS) return ret;
+#ifdef OPAL_HAVE_ATOMIC_MATH_64
             OPAL_THREAD_ADD64(&module->opcount, 1);
+#else
+            OPAL_THREAD_ADD64(&module->opcount, 1);
+#endif
         }
     } else {
         module->post_group = NULL;
--- openmpi-1.10.2.orig/ompi/mca/osc/pt2pt/osc_pt2pt_passive_target.c
+++ openmpi-1.10.2/ompi/mca/osc/pt2pt/osc_pt2pt_passive_target.c
@@ -331,7 +331,11 @@ static int ompi_osc_pt2pt_lock_internal
     lock->target = target;
     lock->lock_acks_expected = 0;
     lock->unlock_acks_expected = 0;
+#ifdef OPAL_HAVE_ATOMIC_MATH_64
     lock->serial_number = OPAL_THREAD_ADD64((int64_t *) &module->lock_serial_number, 1);
+#else
+    lock->serial_number = OPAL_THREAD_ADD32((int64_t *) &module->lock_serial_number, 1);
+#endif
     lock->type = lock_type;
     lock->assert = assert;

@hjelmn
Copy link
Member

hjelmn commented Mar 8, 2016

Not quite correct, the types have to be changed as well.

@hjelmn
Copy link
Member

hjelmn commented Mar 8, 2016

Will post a PR to master in a bit.

@hjelmn
Copy link
Member

hjelmn commented Mar 8, 2016

@regrant Does portals4 support 32-bit platforms? If not we can leave the portals4 component as is.

@regrant
Copy link
Contributor

regrant commented Mar 8, 2016

@hjelmn You can leave the Portals4 component as is, there is no stated support for 32-bit and there are no plans for 32-bit system support in the future.

@hjelmn
Copy link
Member

hjelmn commented Mar 8, 2016

@regrant Thats what I figured. That leaves just osc/pt2pt.

hjelmn added a commit to hjelmn/ompi-release that referenced this issue Mar 8, 2016
Some platforms do no support 64-bit atomic math. osc/pt2pt was using
OPAL_THREAD_ADD64 unconditionally which was causing compilation to
fail on these platforms. The function in question was alreay grabbing
a lock on the osc/pt2pt module so moved the increment to after the
lock to protect the module member (instead of using an atomic).

Fixes open-mpi/ompi#1162

Signed-off-by: Nathan Hjelm <[email protected]>
@hjelmn
Copy link
Member

hjelmn commented Mar 8, 2016

@pterjan
Copy link

pterjan commented Mar 8, 2016

That seems wrong, the atomic add is still there in addition to t asecond increment

@pterjan
Copy link

pterjan commented Mar 8, 2016

Ah hjelmn/ompi-release@f8c72a1 is correct but not the .patch...

@pterjan
Copy link

pterjan commented Mar 8, 2016

Hmm strange it is correct now, sorry...

@hjelmn hjelmn modified the milestones: v1.10.3, v2.1.0 Mar 8, 2016
@amckinstry
Copy link

@hjelmn Thanks, this works for me.

jsquyres pushed a commit to jsquyres/ompi that referenced this issue Aug 23, 2016
…dproc

bug fix: race condition when multiple threads create the same endpoint simultaneously
@eatdust
Copy link

eatdust commented May 23, 2017

Hi there,
we are encountering similar issues again on ARM5 with current openmpi-2.1.1, compilation fails with missing symbols:

../../../ompi/.libs/libmpi.so: undefined reference to opal_atomic_add_64' ../../../ompi/.libs/libmpi.so: undefined reference to opal_atomic_cmpset_64'

Cheers.

@amckinstry
Copy link

There is one fragment from an old arm_detection patch at Debian that was not included in 2.1.1:

Index: openmpi-2.1.0rc2/config/opal_config_asm.m4
===================================================================
--- openmpi-2.1.0rc2.orig/config/opal_config_asm.m4
+++ openmpi-2.1.0rc2/config/opal_config_asm.m4
@@ -1027,7 +1027,7 @@ AC_DEFUN([OPAL_CONFIG_ASM],[
             OPAL_GCC_INLINE_ASSIGN='"mov %0, #0" : "=&r"(ret)'
             ;;

-        armv7*)
+        armv7*|arm-*-linux-gnueabihf)
             opal_cv_asm_arch="ARM"
             OPAL_ASM_SUPPORT_64BIT=1
             OPAL_ASM_ARM_VERSION=7
@@ -1046,7 +1046,7 @@ AC_DEFUN([OPAL_CONFIG_ASM],[
             OPAL_GCC_INLINE_ASSIGN='"mov %0, #0" : "=&r"(ret)'
             ;;

-        armv5*linux*|armv4*linux*)
+        armv5*linux*|armv4*linux*|arm-*-linux-gnueabi)
             # uses Linux kernel helpers for some atomic operations
             opal_cv_asm_arch="ARM"
             OPAL_ASM_SUPPORT_64BIT=0

Now the armv7 is possibly unneeded, but the second fragment looks relevant. We currently have no armv5 hardware in the build systems to test, though

@eatdust
Copy link

eatdust commented May 23, 2017

Thanks for the feedback. It seems not to be enough, even with the correct detection of armv5-*-gnueabi, it seems that some 64 instructions remain in mca/osc:

make[2]: Entering directory 'openmpi-2.1.1/ompi/mca/osc/sm'
  CC       osc_sm_active_target.lo
osc_sm_active_target.c: In function 'ompi_osc_sm_start':
osc_sm_active_target.c:165:23: warning: implicit declaration of function 'opal_atomic_cmpset_64' [-Wimplicit-function-declaration]
             } while (!opal_atomic_cmpset_64 ((int64_t *) module->posts[my_rank] + rank_byte, old, old ^ rank_bit));
                       ^
osc_sm_active_target.c: In function 'ompi_osc_sm_post':
osc_sm_active_target.c:247:20: warning: implicit declaration of function 'opal_atomic_add_64' [-Wimplicit-function-declaration]
             (void) opal_atomic_add_64 ((int64_t *) module->posts[ranks[i]] + my_byte, my_bit);

@jsquyres
Copy link
Member Author

@eatdust GitHub pro tip: surround verbatim blocks with a line of 3 tick marks, and GitHub will render it in fixed width font, etc. See https://help.github.com/articles/creating-and-highlighting-code-blocks/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants