LLVM 9 #3260

Merged
merged 40 commits into from
Sep 20, 2019

Conversation

andrewrk
Member

@andrewrk andrewrk commented Sep 20, 2019

andrewrk added 30 commits July 16, 2019 22:23
Upstream commits:
 * 8eb49e0485fc547eead9e47200bbee6d81f391c1
 * 2dcbeabd917e404a0dde0195388da401b849b9a4
 * f0eb2e77b2132a88e2f00d8e06ffa7638c40b4bc

These will be in the next version of musl, so no harm carrying them
here.
llvm is giving me `error: couldn't allocate output register for
constraint '{a0}'` which is a bug that needs to be fixed upstream.
upstream commit 1931d3cb20a00da732c5210b123656632982fde0
upstream commit 1931d3cb20a00da732c5210b123656632982fde0
upstream commit 1931d3cb20a00da732c5210b123656632982fde0
upstream commit 1931d3cb20a00da732c5210b123656632982fde0
This reapplies 182cd0e
to the embedded LLD.
upstream commit 67a4a12d61bfb10b2410b53c5a43ef9b4a03de7d
upstream commit 67a4a12d61bfb10b2410b53c5a43ef9b4a03de7d
upstream commit 67a4a12d61bfb10b2410b53c5a43ef9b4a03de7d
This reapplies 5ce1a96
to the embedded LLD.
This applies 91864f82c7d7bd1a151fdfd076a3a67a2893b868 from LLVM trunk to
the embedded LLD.

Once Zig upgrades to LLD 10, there will be no difference between Zig's
fork and upstream, and Zig's fork can be dropped.
```
Assertion `!isa<DIType>(Scope) && "shouldn't
make a namespace scope for a type"
```

We've had this problem and solved it before; see #579.
@andrewrk andrewrk merged commit e81156b into master Sep 20, 2019
@andrewrk andrewrk deleted the llvm9 branch September 20, 2019 01:13
const lower_mask = (~u32(0)) >> bits_in_word_2;

var r: dwords = undefined;
r.s.low = (a & lower_mask) *% (b & lower_mask);

Contributor

For some reason LLVM lowers this operation into a __muldi3 call, leading to a wonderful stack overflow whenever it's used.
The same goes for __umoddi3, udivmod, and probably others; I got bored halfway through.
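
To illustrate the failure mode described here: when a target has no hardware multiply enabled, a backend generally expands a multiplication into a runtime-library call, and if the routine being compiled is that very library function, the call lands back in itself. The following is a minimal C sketch of a multiply written with only shifts and adds, so there is no multiply operation left to expand. `mul_shift_add` is a hypothetical name, and this is not what the PR or compiler-rt does here; nothing guarantees that an optimizer will never recognize the loop.

```c
#include <stdint.h>

/* Multiply two 64-bit values using only shifts, adds, and bit tests.
 * Unsigned arithmetic wraps modulo 2^64, which gives the same bit
 * pattern a two's-complement 64-bit multiply would produce.
 * Hypothetical helper for illustration only. */
uint64_t mul_shift_add(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    while (b != 0) {
        if (b & 1) {
            result += a;   /* add the current shifted copy of a */
        }
        a <<= 1;           /* next power-of-two multiple of a */
        b >>= 1;           /* consume one bit of b */
    }
    return result;
}
```

The loop runs at most 64 iterations, which is slower than partial products but does not depend on any multiply being available.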

Member Author

I wonder why this isn't happening in LLVM's own compiler-rt. They have the same code:

https://github.com/llvm/llvm-project/blob/llvmorg-9.0.0/compiler-rt/lib/builtins/muldi3.c#L21
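
For readers without the link at hand, the half-word scheme under discussion can be sketched in C roughly as follows. This was written for illustration and is not a copy of the upstream file; `mul32_by_halves` and `mul64_by_words` are hypothetical names. It builds a 32x32 product from 16-bit halves, then a 64x64 product from 32-bit words.

```c
#include <stdint.h>

/* Full 32x32 -> 64-bit unsigned product assembled from 16x16 partial
 * products, so every intermediate value fits in 32 bits. */
uint64_t mul32_by_halves(uint32_t a, uint32_t b) {
    const uint32_t half_bits = 16;
    const uint32_t lower_mask = 0xFFFFu;

    uint32_t a_lo = a & lower_mask, a_hi = a >> half_bits;
    uint32_t b_lo = b & lower_mask, b_hi = b >> half_bits;

    uint32_t low = a_lo * b_lo;              /* low x low */
    uint32_t t = low >> half_bits;
    low &= lower_mask;
    t += a_hi * b_lo;                        /* first cross term */
    low += (t & lower_mask) << half_bits;
    uint32_t high = t >> half_bits;
    t = low >> half_bits;
    low &= lower_mask;
    t += b_hi * a_lo;                        /* second cross term */
    low += (t & lower_mask) << half_bits;
    high += t >> half_bits;
    high += a_hi * b_hi;                     /* high x high */
    return ((uint64_t)high << 32) | low;
}

/* 64x64 -> low 64 bits: one full low-word product plus the two cross
 * terms, which only affect the upper word (arithmetic wraps mod 2^64). */
uint64_t mul64_by_words(uint64_t a, uint64_t b) {
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
    uint32_t cross = a_hi * b_lo + a_lo * b_hi;
    return mul32_by_halves(a_lo, b_lo) + ((uint64_t)cross << 32);
}
```

Note that the sketch still uses the `*` operator on 32-bit values. Those are the operations the report above says end up as library calls, so the sketch shows the algorithm rather than a way around the problem.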

We already put "nobuiltin" on all functions:

addLLVMFnAttr(llvm_fn, "nobuiltin");

<andrewrk> LLVM is optimizing part of my implementation of __muldi3 with a call to __muldi3. Is there a way to prevent this? I already have the "nobuiltin" attribute on the function
<andrewrk> what does compiler_rt do to solve this problem?
<nbjoerg> nothing special
<nbjoerg> but it shouldn't be doing that

I guess the next step is a bug report.

Member Author

What did you do to get this behavior? This function is covered by compiler-rt tests, and they passed. Help me reproduce it?

Contributor

> What did you do to get this behavior? This function is covered by compiler-rt tests, and they passed. Help me reproduce it?

It passes because you're only testing on x64. I saw this problem when targeting riscv64-linux-none (you have to adjust the stdlib a bit; there are some missing pieces at the moment).

Member Author

Alright I'm going to learn to do this qemu userspace emulation thing today.

Member Author

Wait, so this is not a regression then? It's just riscv64?

Also I was unable to duplicate it with this test:

const builtin = @import("builtin");

// Ported from
// https://github.com/llvm/llvm-project/blob/llvmorg-9.0.0/compiler-rt/lib/builtins/muldi3.c

const dwords = extern union {
    all: i64,
    s: switch (builtin.endian) {
        .Little => extern struct {
            low: u32,
            high: u32,
        },
        .Big => extern struct {
            high: u32,
            low: u32,
        },
    },
};

fn __muldsi3(a: u32, b: u32) i64 {
    @setRuntimeSafety(builtin.is_test);

    const bits_in_word_2 = @sizeOf(i32) * 8 / 2;
    const lower_mask = (~u32(0)) >> bits_in_word_2;

    var r: dwords = undefined;
    r.s.low = (a & lower_mask) *% (b & lower_mask);
    var t: u32 = r.s.low >> bits_in_word_2;
    r.s.low &= lower_mask;
    t += (a >> bits_in_word_2) *% (b & lower_mask);
    r.s.low +%= (t & lower_mask) << bits_in_word_2;
    r.s.high = t >> bits_in_word_2;
    t = r.s.low >> bits_in_word_2;
    r.s.low &= lower_mask;
    t +%= (b >> bits_in_word_2) *% (a & lower_mask);
    r.s.low +%= (t & lower_mask) << bits_in_word_2;
    r.s.high +%= t >> bits_in_word_2;
    r.s.high +%= (a >> bits_in_word_2) *% (b >> bits_in_word_2);
    return r.all;
}

export fn __muldi3(a: i64, b: i64) i64 {
    @setRuntimeSafety(builtin.is_test);

    const x = dwords{ .all = a };
    const y = dwords{ .all = b };
    var r = dwords{ .all = __muldsi3(x.s.low, y.s.low) };
    r.s.high +%= x.s.high *% y.s.low +% x.s.low *% y.s.high;
    return r.all;
}
0000000000000000 __muldi3:
       0: 13 01 01 fb                  	addi	sp, sp, -80
       4: 23 34 11 04                  	sd	ra, 72(sp)
       8: 23 30 81 04                  	sd	s0, 64(sp)
       c: 23 3c 91 02                  	sd	s1, 56(sp)
      10: 23 38 21 03                  	sd	s2, 48(sp)
      14: 23 34 31 03                  	sd	s3, 40(sp)
      18: 23 30 41 03                  	sd	s4, 32(sp)
      1c: 23 3c 51 01                  	sd	s5, 24(sp)
      20: 23 38 61 01                  	sd	s6, 16(sp)
      24: 23 34 71 01                  	sd	s7, 8(sp)
      28: 23 30 81 01                  	sd	s8, 0(sp)
      2c: 93 8a 05 00                  	mv	s5, a1
      30: 13 09 05 00                  	mv	s2, a0
      34: 37 05 01 00                  	lui	a0, 16
      38: 1b 04 f5 ff                  	addiw	s0, a0, -1
      3c: b3 f4 85 00                  	and	s1, a1, s0
      40: b3 79 89 00                  	and	s3, s2, s0
      44: 13 85 04 00                  	mv	a0, s1
      48: 93 85 09 00                  	mv	a1, s3
      4c: 97 00 00 00                  	auipc	ra, 0
      50: e7 80 40 fb                  	jalr	-76(ra)
      54: 13 0a 05 00                  	mv	s4, a0
      58: 13 55 09 01                  	srli	a0, s2, 16
      5c: 33 7b 85 00                  	and	s6, a0, s0
      60: 93 5b 0a 01                  	srli	s7, s4, 16
      64: 13 85 04 00                  	mv	a0, s1
      68: 93 05 0b 00                  	mv	a1, s6
      6c: 97 00 00 00                  	auipc	ra, 0
      70: e7 80 40 f9                  	jalr	-108(ra)
      74: b3 8b ab 00                  	add	s7, s7, a0
      78: 13 d5 0a 01                  	srli	a0, s5, 16
      7c: b3 74 85 00                  	and	s1, a0, s0
      80: 1b dc 0b 01                  	srliw	s8, s7, 16
      84: 13 85 04 00                  	mv	a0, s1
      88: 93 05 0b 00                  	mv	a1, s6
      8c: 97 00 00 00                  	auipc	ra, 0
      90: e7 80 40 f7                  	jalr	-140(ra)
      94: 33 0b ac 00                  	add	s6, s8, a0
      98: b3 fb 8b 00                  	and	s7, s7, s0
      9c: 13 85 04 00                  	mv	a0, s1
      a0: 93 85 09 00                  	mv	a1, s3
      a4: 97 00 00 00                  	auipc	ra, 0
      a8: e7 80 c0 f5                  	jalr	-164(ra)
      ac: 33 85 ab 00                  	add	a0, s7, a0
      b0: b3 75 8a 00                  	and	a1, s4, s0
      b4: 1b 56 05 01                  	srliw	a2, a0, 16
      b8: b3 09 cb 00                  	add	s3, s6, a2
      bc: 13 15 05 01                  	slli	a0, a0, 16
      c0: b3 64 b5 00                  	or	s1, a0, a1
      c4: 13 55 09 02                  	srli	a0, s2, 32
      c8: 93 85 0a 00                  	mv	a1, s5
      cc: 97 00 00 00                  	auipc	ra, 0
      d0: e7 80 40 f3                  	jalr	-204(ra)
      d4: 13 04 05 00                  	mv	s0, a0
      d8: 13 d5 0a 02                  	srli	a0, s5, 32
      dc: 93 05 09 00                  	mv	a1, s2
      e0: 97 00 00 00                  	auipc	ra, 0
      e4: e7 80 00 f2                  	jalr	-224(ra)
      e8: 33 05 85 00                  	add	a0, a0, s0
      ec: 93 95 09 02                  	slli	a1, s3, 32
      f0: 13 96 04 02                  	slli	a2, s1, 32
      f4: 13 56 06 02                  	srli	a2, a2, 32
      f8: b3 e5 c5 00                  	or	a1, a1, a2
      fc: 13 15 05 02                  	slli	a0, a0, 32
     100: 33 85 a5 00                  	add	a0, a1, a0
     104: 03 3c 01 00                  	ld	s8, 0(sp)
     108: 83 3b 81 00                  	ld	s7, 8(sp)
     10c: 03 3b 01 01                  	ld	s6, 16(sp)
     110: 83 3a 81 01                  	ld	s5, 24(sp)
     114: 03 3a 01 02                  	ld	s4, 32(sp)
     118: 83 39 81 02                  	ld	s3, 40(sp)
     11c: 03 39 01 03                  	ld	s2, 48(sp)
     120: 83 34 81 03                  	ld	s1, 56(sp)
     124: 03 34 01 04                  	ld	s0, 64(sp)
     128: 83 30 81 04                  	ld	ra, 72(sp)
     12c: 13 01 01 05                  	addi	sp, sp, 80
     130: 67 80 00 00                  	ret
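
Read at face value, each `auipc ra, 0` / `jalr offset(ra)` pair in the listing above is a pc-relative call: `auipc` sets `ra` to its own address plus the upper immediate, and `jalr` jumps to `ra` plus the signed offset. The target arithmetic can be sketched in C as below. `call_target` is a hypothetical helper, and the sketch assumes no relocation rewrites these call sites later.

```c
#include <stdint.h>

/* Destination of an `auipc ra, imm20` followed by `jalr simm12(ra)`.
 * auipc_pc is the address of the auipc instruction itself.
 * Hypothetical helper for illustration only. */
int64_t call_target(int64_t auipc_pc, int32_t imm20, int32_t simm12) {
    int64_t ra = auipc_pc + (int64_t)imm20 * 4096;  /* auipc: pc + (imm20 << 12) */
    return (ra + simm12) & ~(int64_t)1;             /* jalr clears bit 0 of the target */
}
```

For example, the pair at `4c`/`50` gives `call_target(0x4c, 0, -76)`, which is offset 0, the start of `__muldi3`. The other five pairs in the listing (`6c`, `8c`, `a4`, `cc`, `e0`) work out to offset 0 as well under the same assumption.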

Can you open a new issue and describe the steps you're doing to observe the behavior?

Contributor

> Can you open a new issue and describe the steps you're doing to observe the behavior?

(Once you fill in all the missing pieces in bits/linux/riscv64.zig)

zig0 test -target riscv64-linux-none --override-std-dir ../std ../std/special/compiler_rt/muldi3.zig
qemu-riscv64 <bin-path>

A small excerpt from gdb:

#12202 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12203 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12204 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12205 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12206 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12207 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12208 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12209 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12210 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12211 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12212 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12213 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12214 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12215 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12216 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12217 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12218 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12219 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12220 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12221 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12222 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47
#12223 0x00000000004e0004 in compiler_rt.muldi3.__muldsi3 (a=0x0, b=0x0) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:27
#12224 0x00000000004c2f64 in compiler_rt.muldi3.__muldi3 (a=0x0, b=0x38) at /home/lemonboy/code/zig/std/special/compiler_rt/muldi3.zig:47

Member Author

Thanks. Confirmed, I can reproduce this now. I didn't realize running stuff with qemu was so easy. I was able to get the gdb integration working too. Incredible.
