Add ARM tests on Travis #1864


Merged
merged 1 commit into OpenMathLib:develop from aytekinar:patch-1 on Nov 12, 2018

Conversation

aytekinar
Contributor

I have updated Travis' YAML file to add emulated tests for the ARMV6 and ARMV8 architectures using Alpine Docker images.

The idea came from @brada4 (see the comment in #1861). Because the emulated builds take a lot of time, I have allowed those jobs to fail.

This is just an initial attempt at providing such support in Travis. Please feel free to comment, improve/modify, or even close the PR as you wish.
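
For context, a job of this kind looks roughly like the sketch below. This is only an illustration: the image name, the env strings and the make flags are placeholders, not the exact contents of this PR's .travis.yml.

matrix:
  include:
    - os: linux
      services: docker
      env: TARGET=ARMV8 IMAGE=example/openblas-alpine:arm64v8   # placeholder image name
      script:
        # run the build inside the (qemu-emulated) ARM container
        - docker run --rm -v "$TRAVIS_BUILD_DIR":/openblas -w /openblas "$IMAGE" make TARGET="$TARGET"
  allow_failures:
    - env: TARGET=ARMV8 IMAGE=example/openblas-alpine:arm64v8   # must repeat the job's env exactly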

@martin-frbg
Collaborator

Looks like it almost made it in time; a pity qemu is so slow when it has to emulate a CPU architecture.
Do you see the same arm_neon.h failure in local builds of ARMV8? I wonder whether this failure is real, as there was some recent controversy over how much of the thunderx2 optimizations are generic enough to allow using them for the ARMV8 target.

@brada4
Contributor

brada4 commented Nov 9, 2018

Maybe it is possible to cut the slow tests short by not installing gfortran, so the builds finish in time? Judging by an x86_64 build, roughly 10% of the time is still left to do at the point where it fails.
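
As a sketch of what that suggestion amounts to (these are standard OpenBLAS Makefile options, assumed here rather than copied from this PR's configuration), the script could build only the BLAS part so that no Fortran compiler is needed:

script:
  # Assumed OpenBLAS flags: skip Fortran, LAPACK and the CBLAS interface.
  - make TARGET=ARMV6 NOFORTRAN=1 NO_LAPACK=1 NO_CBLAS=1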

@martin-frbg
Collaborator

martin-frbg commented Nov 9, 2018

Doubt it - he already dropped LAPACK and CBLAS from the build if I read that correctly.

@aytekinar
Contributor Author

aytekinar commented Nov 9, 2018

Do you see the same arm_neon.h failure in local builds of ARMV8? I wonder whether this failure is real, as there was some recent controversy over how much of the thunderx2 optimizations are generic enough to allow using them for the ARMV8 target.

Uhm... I can try, when I have time. However, I will do an emulated build locally, as I do not have an ARM64 device at hand. I do use my RPi2s for ARM32v7 builds when needed for my project. As for the other thing, i.e., the controversy, I do not have much information (and I am not knowledgeable at all when it comes to architectures). But what I can see there is that it might be a clang-related thing (the build starts and proceeds with gcc). It might be worth trying with some debian:stretch-slim image --- I have had some problems with musl libc and its transition packages and headers in Alpine. Maybe I am missing some package there, too.

Maybe it is possible to cut the slow tests short by not installing gfortran, so the builds finish in time? Judging by an x86_64 build, roughly 10% of the time is still left to do at the point where it fails.

I had thought about the same thing. I have created an openblas organization at Docker Hub, whose ownership I can happily hand over to any of the contributors here, and I will create and push Alpine images that have the dependencies preinstalled. Then, changing the Dockerfile to have as few layers as possible could at least make the builds finish.

EDIT. Well, it indeed helped --- now it times out at 99% 🤣 Anyways... It was a nice try, IMO.

Doubt it - he already dropped LAPACK and CBLAS from the build if I read that correctly.

Correct. That was intentional, as I had thought that the BLAS part would be more than enough to test whether the builds proceed.

@brada4
Contributor

brada4 commented Nov 9, 2018

musl is already covered by the other Alpine Linux tests, but it does not hurt to test it some more. Debian (and glibc) might be a bit too big to stuff into the 50 minutes we have... I think the first thing that does not time out will already be a revolution...

@aytekinar
Contributor Author

Well, the base image for Alpine is ~5 MB whereas Debian's slim images are around 17 MB. But you are probably right that once we bundle them together with glibc and friends, they will eventually add up and exceed the size of Alpine by far.

Anyways... It is a pity that the builds time out at 99% :(

@martin-frbg
Collaborator

Maybe restricting the build to the static libopenblas would help (though this would drop the completeness check with gensymbol/linktest.c)

@brada4
Contributor

brada4 commented Nov 9, 2018

Either way, it is worth keeping as an open PR as a reminder, and for retrying when the ARM code changes a lot.

@martin-frbg
Collaborator

BUILD_SHARED=NO does not get us past 98/99 percent either. Trying with make -j 3 now in my fork although I doubt it is i/o-bound.

@martin-frbg
Collaborator

According to pkgs.alpinelinux.org, arm_neon.h should be available in /usr/lib/clang/5.0.1/include as part of the clang-dev package on aarch64.

@aytekinar
Contributor Author

According to pkgs.alpinelinux.org, arm_neon.h should be available in /usr/lib/clang/5.0.1/include as part of the clang-dev package on aarch64.

I have updated the images --- now they have clang-dev. Please feel free to make PRs to aytekinar/openblas-alpine to modify the Docker image as needed. In fact, I have also invited you, @martin-frbg, as a collaborator to the repo.

@martin-frbg
Collaborator

Thank you. The clang-aarch64 build no longer errors, though obviously it still times out like the others. I am still experimenting with running more than 2 concurrent make jobs in my fork - make -j 3 looked like it made the two gcc-based builds run to completion, but I am no longer sure if it was actually using 3 threads or if the setting was overruled by getarch and I was just lucky.

@aytekinar
Contributor Author

How about -j$(nproc)? Do you think overloading is better?

@martin-frbg
Collaborator

It will choose -j 2 automatically for the two cores provided by Travis; I am still testing whether -j3 or -j4 would really help.
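
For illustration, the job count can be pinned in the Travis script regardless of what the two-core VM reports; whether OpenBLAS's own Makefiles then honour or override it is what is being tested here:

script:
  # Sketch only: force three make jobs; -j"$(nproc)" would instead match the detected core count.
  - make -j3 TARGET=ARMV8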

@brada4
Contributor

brada4 commented Nov 10, 2018

Given the way the emulator works, JIT-ing chains of ARM code, doubling ncpus will keep the compiler code in RAM and JIT-ed for longer, while the tests are highly undesirable because they are fresh code that has to be JIT-ed all over again.

@martin-frbg
Collaborator

Seems 3 concurrent make jobs do not help. With the latest changes to the docker environment the ARM32 gcc job manages to finish in a bit over 47 minutes. ARM64 gcc sometimes barely makes it as well while the two clang-based jobs consistently time out at around 90 percent.

@martin-frbg reopened this Nov 11, 2018
@brada4
Contributor

brada4 commented Nov 11, 2018

In a year it will fit within the timeout...

@martin-frbg
Collaborator

We could look into moving the ARM tests to CircleCI, which appears to have a per-month limit on CPU time rather than a per-task one. (Though with the cap at 1000 minutes we would only get 4 or 5 commits checked per month, so it is probably not worth the trouble.)
Perhaps we could just live with the incomplete builds for now, but it looks to me as if the allow_failures is not parsed as intended right now - I do not see it mentioned in the log, and the entire CI run gets flagged as failed which is probably not what we want.

@aytekinar
Contributor Author

Perhaps we could just live with the incomplete builds for now, but it looks to me as if the allow_failures is not parsed as intended right now - I do not see it mentioned in the log, and the entire CI run gets flagged as failed which is probably not what we want.

I have tried adding matrix: in front of allow_failures since, per the documentation, Travis seems to check for matrix.allow_failures, which we did not have before (the existing naming convention has jobs instead of matrix in the top part). Let's see if it is parsed properly this time. Then we can merge this PR?
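
The shape being tried is roughly the following; the env string is a placeholder, and Travis only treats a job as allowed to fail when the listed attributes match the job's definition exactly:

matrix:
  allow_failures:
    - env: TARGET=ARMV8 IMAGE=example/openblas-alpine:arm64v8   # placeholder, must match the job's env verbatim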

@martin-frbg
Collaborator

The current setup uses something called "build stages", but as I understand allow_failures should work there as well.

@brada4
Contributor

brada4 commented Nov 11, 2018

I think it depends on which way the wind blows...
Dropping gfortran and the dynamic library can probably get the gcc version through most of the time; I think this should be rested until Travis gets speedier computers...

@martin-frbg
Collaborator

martin-frbg commented Nov 11, 2018

@aytekinar looks like the "env" is not matched for some reason, but matching on "name" and/or "services" appears to work. (I tried both at the same time, so I cannot say for sure which one it is.)

@aytekinar
Contributor Author

Trying either does not work. I do not get it :(

Normally, it should separate as in this, right? Why do I not see this in this build? Is this a bug?

@martin-frbg
Collaborator

martin-frbg commented Nov 11, 2018

The lack of separation seems to be a Travis bug (or missing feature) for "build stages" (I saw a bug open for it but did not keep the number or link). I haven't tried with "matrix", but it looks like I got it to work with stages (no separation, but a line acknowledging that the jobs were allowed to fail, and the overall status set to "passed"). Now to see if it is "-name" or "-services" that actually did the trick.

@aytekinar
Contributor Author

aytekinar commented Nov 11, 2018

I have removed stage: test from the YAML file, which seems to have fixed the problem. Finally, it is working. I am not sure whether you definitely need the stage: test tag, as the YAML file seems to contain only one stage anyway. Then, we might try appending

stages:
  - test

to specify exactly which stages there are.

EDIT. I have removed services: docker, and listed env:s under matrix.allow_failures. This way, some other build/test can still use services: docker and will not be allowed to fail.
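
In outline, the arrangement described in that edit is something like this, with one env entry per emulated job that is tolerated to fail and no services key on the matching entries (the env strings are placeholders):

matrix:
  allow_failures:
    - env: IMAGE=arm32v6 COMPILER=clang   # placeholder env strings
    - env: IMAGE=arm64v8 COMPILER=clang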

Updated `.travis.yml` file to add emulated tests for `ARMV6` and `ARMV8`
architectures with `gcc` and `clang`.  Created prebuilt images with
required dependencies. Squashed layers into one.
@martin-frbg
Collaborator

The alternative would appear to be to keep using "jobs" where you switched to "matrix" at some point

@aytekinar
Contributor Author

The alternative would appear to be to keep using "jobs" where you switched to "matrix" at some point

... while keeping stage: test everywhere?

@martin-frbg
Collaborator

The alternative would appear to be to keep using "jobs" where you switched to "matrix" at some point

... while keeping stage: test everywhere?

Yes, at least that works for me - I copied your original PR to my fork for experimenting, and what I now have is just your original additions with "-name" in place of "-env" in the allow_failures. (The job is still running, so I am not sure whether it was the name or the services keyword that did the trick - though we might as well use both.)
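
That variant would look roughly like this, keying the allowed failures on the jobs' name (and optionally services) instead of their env strings; the job name is a placeholder:

matrix:
  allow_failures:
    - name: "Emulated ARMV8 clang build"   # placeholder, must match the job's name
      services: docker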

@aytekinar
Contributor Author

Weird. I could not get them working without removing stage: test from everywhere. Apparently, jobs and matrix are simply aliases, so it should not change much. However, I could only get the separation between build jobs and allowed failures by removing the stage tag.

It's your call. You can simply add your version here from your branch.

@martin-frbg merged commit 2c5725c into OpenMathLib:develop on Nov 12, 2018
@martin-frbg
Collaborator

Misunderstanding actually - "my" version does not produce the nice separation, it just adds a comment naming the allowed failures. As the "stage: test" does not appear to serve any purpose (now, at least), removing it to get a cleaner look for the results panel is alright, I think.

@martin-frbg added this to the 0.3.4 milestone on Nov 12, 2018
@aytekinar deleted the patch-1 branch on November 12, 2018 13:55