Skip to content

[MIPS] __builtin_bswap16 miscompilation #103035

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
markus-oberhumer opened this issue Aug 13, 2024 · 14 comments · Fixed by #103398
Closed

[MIPS] __builtin_bswap16 miscompilation #103035

markus-oberhumer opened this issue Aug 13, 2024 · 14 comments · Fixed by #103398

Comments

@markus-oberhumer
Copy link

markus-oberhumer commented Aug 13, 2024

(EDIT: I've added a new Godbolt link and updated the example to make it clear that the bug needs -march=mips32 to trigger)

Godbolt link: https://gcc.godbolt.org/z/vMGc86deh

clang 18.1.8 MIPS miscompilation in Debug mode

// clang -target mips-linux-musl -march=mips32 bug.c -o bug.out
// qemu-mips ./bug.out

unsigned bswap16_fails(unsigned v) {
    // FAILS - cast seems ignored ??
    return __builtin_bswap16((unsigned short) v);
}

unsigned bswap16_ok(unsigned v) {
    return __builtin_bswap16((unsigned short) (v & 0xffff)); // OK
}

int main(void) {
    if (bswap16_ok(0x04030201) != 0x0102)
        return 1;
    if (bswap16_fails(0x04030201) != 0x0102)
        return 2; // ERROR HERE
    return 0;
}
@llvmbot
Copy link
Member

llvmbot commented Aug 13, 2024

@llvm/issue-subscribers-backend-mips

Author: Markus F.X.J. Oberhumer (markus-oberhumer)

clang 18.1.8 MIPS miscompilation in Debug mode
// clang -target mips-linux-musl bug.c -o bug.out
// qemu-mips ./bug.out

unsigned bswap16_fails(unsigned v) {
    // FAILS - cast seems ignored ??
    return __builtin_bswap16((unsigned short) v);
}

unsigned bswap16_ok(unsigned v) {
    return __builtin_bswap16((unsigned short) (v & 0xffff)); // OK
}

int main() {
    if (bswap16_ok(0x04030201) != 0x0102)
        return 1;
    if (bswap16_fails(0x04030201) != 0x0102)
        return 2; // ERROR HERE
}

@dtcxzyw
Copy link
Member

dtcxzyw commented Aug 13, 2024

https://godbolt.org/z/MGWGPz1aE

bswap16_fails:                          # @bswap16_fails
        addiu   $sp, $sp, -8
        sw      $4, 4($sp)
        sll     $1, $4, 8
        andi    $2, $4, 65280
        srl     $2, $2, 8
        or      $1, $1, $2
        andi    $2, $1, 65535
        jr      $ra
        addiu   $sp, $sp, 8
bswap16_ok:                             # @bswap16_ok
        addiu   $sp, $sp, -8
        sw      $4, 4($sp)
        sll     $1, $4, 8
        andi    $2, $4, 65280
        srl     $2, $2, 8
        or      $1, $1, $2
        andi    $2, $1, 65535
        jr      $ra
        addiu   $sp, $sp, 8
main:                                   # @main
        addiu   $sp, $sp, -32
        sw      $ra, 28($sp)                    # 4-byte Folded Spill
        sw      $17, 24($sp)                    # 4-byte Folded Spill
        sw      $16, 20($sp)                    # 4-byte Folded Spill
        sw      $zero, 16($sp)
        lui     $1, 1027
        ori     $16, $1, 513
        jal     bswap16_ok
        move    $4, $16
        addiu   $17, $zero, 258
        beq     $2, $17, $BB2_2
        nop
        addiu   $1, $zero, 1
        j       $BB2_4
        sw      $1, 16($sp)
$BB2_2:                                 # %if.end
        jal     bswap16_fails
        move    $4, $16
        beq     $2, $17, $BB2_4
        nop
        addiu   $1, $zero, 2
        sw      $1, 16($sp)
$BB2_4:                                 # %if.end4
        lw      $2, 16($sp)
        lw      $16, 20($sp)                    # 4-byte Folded Reload
        lw      $17, 24($sp)                    # 4-byte Folded Reload
        lw      $ra, 28($sp)                    # 4-byte Folded Reload
        jr      $ra
        addiu   $sp, $sp, 32

I cannot reproduce this issue.

@markus-oberhumer
Copy link
Author

markus-oberhumer commented Aug 13, 2024

@dtcxzyw Please try this link https://gcc.godbolt.org/z/cseeoKe35

This bug might be a clang/frontend issue.

(Edit: note that the bug only happens when using NO optimizations)

@dtcxzyw
Copy link
Member

dtcxzyw commented Aug 13, 2024

bswap16_fails(unsigned int):
        addiu   $sp, $sp, -16
        sw      $ra, 12($sp)
        sw      $fp, 8($sp)
        move    $fp, $sp
        sw      $4, 4($fp)
        lw      $1, 4($fp)
        wsbh    $1, $1
        andi    $2, $1, 65535
        move    $sp, $fp
        lw      $fp, 8($sp)
        lw      $ra, 12($sp)
        addiu   $sp, $sp, 16
        jr      $ra
        nop

bswap16_ok(unsigned int):
        addiu   $sp, $sp, -16
        sw      $ra, 12($sp)
        sw      $fp, 8($sp)
        move    $fp, $sp
        sw      $4, 4($fp)
        lw      $1, 4($fp)
        ori     $2, $zero, 65535
        and     $1, $1, $2
        wsbh    $1, $1
        andi    $2, $1, 65535
        move    $sp, $fp
        lw      $fp, 8($sp)
        lw      $ra, 12($sp)
        addiu   $sp, $sp, 16
        jr      $ra
        nop

These two pieces of assembly code look equivalent.
Can you print out the result of bswap16_fails(0x04030201)?

cc @topperc

@markus-oberhumer
Copy link
Author

markus-oberhumer commented Aug 13, 2024

I get

bswap16_fails(0x04030201) == 0x00000302 WRONG

bswap16_ok(0x04030201) == 0x00000102 OK

(Edit: using qemu-mips)

@topperc
Copy link
Collaborator

topperc commented Aug 13, 2024

Best guess is it's a problem with fast-isel. Can you try with -mllvm -fast-isel=false

@markus-oberhumer
Copy link
Author

@topperc Indeed, -mllvm -fast-isel=false fixes the problem!

@topperc topperc self-assigned this Aug 13, 2024
topperc added a commit to topperc/llvm-project that referenced this issue Aug 13, 2024
We need to mask the SRL result to 8 bits before ORing in the
SLL. This is needed in case bits 23:16 of the input aren't zero.
They will have been shifted into bits 15:8.

We don't need to AND the result with 0xffff. It's ok if the
upper 16 bits of the register is garbage.

Fixes llvm#103035.
@topperc
Copy link
Collaborator

topperc commented Aug 13, 2024

Can you check what you get for bswap16_fails(0x14233241)?

@markus-oberhumer
Copy link
Author

bswap16_fails(0x14233241) gives 0x00006332 for me.

Are there any command line flags I could testing enabling/disabling hasMips32r2 ?

@topperc
Copy link
Collaborator

topperc commented Aug 14, 2024

bswap16_fails(0x14233241) gives 0x00006332 for me.

So it does appear that byte 0 and byte 2 are ORed together to form byte 1 or the result. But the godbolt link doesn't show any OR instructions https://gcc.godbolt.org/z/cseeoKe35 Can you confirm with objdump if you are getting the same assembly as godbolt?

@markus-oberhumer
Copy link
Author

Ok, here is what I get - it looks like zig cc [1] defaults differ from Godbolt clang defaults. Maybe some -march setting - how can I enable/disable hasMips32r2?

$ zig cc -target mips-linux-musl -c bug.c
$ llvm-objdump -d bug.o
bug.o:	file format elf32-mips

Disassembly of section .text:

00000000 <bswap16_fails>:
       0: 27 bd ff f0  	addiu	$sp, $sp, -0x10 <bswap16_ok+0xffffffffffffffac>
       4: af bf 00 0c  	sw	$ra, 0xc($sp)
       8: af be 00 08  	sw	$fp, 0x8($sp)
       c: 03 a0 f0 25  	move	$fp, $sp
      10: af c4 00 04  	sw	$4, 0x4($fp)
      14: 8f c2 00 04  	lw	$2, 0x4($fp)
      18: 00 02 0a 00  	sll	$1, $2, 0x8 <bswap16_fails+0x8>
      1c: 00 02 12 02  	srl	$2, $2, 0x8 <bswap16_fails+0x8>
      20: 00 22 08 25  	or	$1, $1, $2
      24: 30 21 ff ff  	andi	$1, $1, 0xffff <bswap16_ok+0xffbb>
      28: 30 22 ff ff  	andi	$2, $1, 0xffff <bswap16_ok+0xffbb>
      2c: 03 c0 e8 25  	move	$sp, $fp
      30: 8f be 00 08  	lw	$fp, 0x8($sp)
      34: 8f bf 00 0c  	lw	$ra, 0xc($sp)
      38: 27 bd 00 10  	addiu	$sp, $sp, 0x10 <bswap16_fails+0x10>
      3c: 03 e0 00 08  	jr	$ra
      40: 00 00 00 00  	nop <bswap16_fails>

00000044 <bswap16_ok>:
      44: 27 bd ff f0  	addiu	$sp, $sp, -0x10 <bswap16_ok+0xffffffffffffffac>
      48: af bf 00 0c  	sw	$ra, 0xc($sp)
      4c: af be 00 08  	sw	$fp, 0x8($sp)
      50: 03 a0 f0 25  	move	$fp, $sp
      54: af c4 00 04  	sw	$4, 0x4($fp)
      58: 8f c1 00 04  	lw	$1, 0x4($fp)
      5c: 34 02 ff ff  	ori	$2, $zero, 0xffff <bswap16_ok+0xffbb>
      60: 00 22 10 24  	and	$2, $1, $2
      64: 00 02 0a 00  	sll	$1, $2, 0x8 <bswap16_fails+0x8>
      68: 00 02 12 02  	srl	$2, $2, 0x8 <bswap16_fails+0x8>
      6c: 00 22 08 25  	or	$1, $1, $2
      70: 30 21 ff ff  	andi	$1, $1, 0xffff <bswap16_ok+0xffbb>
      74: 30 22 ff ff  	andi	$2, $1, 0xffff <bswap16_ok+0xffbb>
      78: 03 c0 e8 25  	move	$sp, $fp
      7c: 8f be 00 08  	lw	$fp, 0x8($sp)
      80: 8f bf 00 0c  	lw	$ra, 0xc($sp)
      84: 27 bd 00 10  	addiu	$sp, $sp, 0x10 <bswap16_fails+0x10>
      88: 03 e0 00 08  	jr	$ra
      8c: 00 00 00 00  	nop <bswap16_fails>

[1] https://ziglang.org/download/

@markus-oberhumer
Copy link
Author

Update: just found the obvious -march=mips32r2 option, and testing seems fine with that switch.

So your patch at #103398 looks good.

P.S: Sorry for the confusion, I'm no MIPS expert.

topperc added a commit that referenced this issue Aug 16, 2024
We need to mask the SRL result to 8 bits before ORing in the SLL. This
is needed in case bits 23:16 of the input aren't zero. They will have
been shifted into bits 15:8.

We don't need to AND the result with 0xffff. It's ok if the upper 16
bits of the register are garbage.

Fixes #103035.
@nikic nikic added this to the LLVM 19.X Release milestone Aug 19, 2024
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Aug 19, 2024
@nikic
Copy link
Contributor

nikic commented Aug 19, 2024

/cherry-pick ebe7265

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Aug 19, 2024
We need to mask the SRL result to 8 bits before ORing in the SLL. This
is needed in case bits 23:16 of the input aren't zero. They will have
been shifted into bits 15:8.

We don't need to AND the result with 0xffff. It's ok if the upper 16
bits of the register are garbage.

Fixes llvm#103035.

(cherry picked from commit ebe7265)
@llvmbot
Copy link
Member

llvmbot commented Aug 19, 2024

/pull-request #104745

@nikic nikic moved this from Needs Triage to Done in LLVM Release Status Aug 19, 2024
tru pushed a commit to llvmbot/llvm-project that referenced this issue Aug 20, 2024
We need to mask the SRL result to 8 bits before ORing in the SLL. This
is needed in case bits 23:16 of the input aren't zero. They will have
been shifted into bits 15:8.

We don't need to AND the result with 0xffff. It's ok if the upper 16
bits of the register are garbage.

Fixes llvm#103035.

(cherry picked from commit ebe7265)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

7 participants