[AMDGPU] Wrong O0 codegen for workgroup id x + control flow #61083

Open
@jdoerfert

I have been trying to narrow down the OpenMP AMDGPU bugs that do not make sense, the ones where unrelated changes flip a test between passing and failing. Along the way I stumbled upon something that I think is broken. I have a runnable reproducer, but it is a little tricky (I use the JIT to splice in the IR). I hope what follows is enough for someone who actually understands AMDGCN to see the problem.

The attached zip file contains good.ll and broken.ll. I generated the respective .s files with llc -O0.
In my experiments, good.ll does not run into the trap, but broken.ll does.
The trap should never execute, assuming I didn't break anything during the manual reduction.
The initial code asserted that workgroup.id.x < workgroup.size.x.
However, when I store workgroup.size.x to memory, I get 0 in the broken version and 256 in the good version.

I think the underlying problem is some value propagation along control-flow edges.
If I store %i15 (workgroup.size.x) in %bb, I get 256; if I store it in %bb194, I get 0, the same value that triggers my trap in the broken case.
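
For readers without the attachment, the shape of the check described above can be sketched roughly as follows. This is a hypothetical illustration, not the attached good.ll or broken.ll, and it is not known to reproduce the bug. The kernel name `@sketch`, the debug buffer `%dbg`, and the block names other than `%bb` are made up, and the workgroup size is assumed to arrive as a plain kernel argument, which may not match how the real reproducer obtains `%i15`.

```llvm
; Hypothetical sketch only; NOT the attached reproducer.
; Assumption: the workgroup size is passed in as a kernel argument (%size_x).
target triple = "amdgcn-amd-amdhsa"

declare i32 @llvm.amdgcn.workgroup.id.x()
declare void @llvm.trap()

define amdgpu_kernel void @sketch(i32 %size_x, ptr addrspace(1) %dbg) {
bb:
  %id_x = call i32 @llvm.amdgcn.workgroup.id.x()
  ; Plays the role of the store in %bb from the report (entry block).
  store i32 %size_x, ptr addrspace(1) %dbg, align 4
  %nonzero = icmp ne i32 %id_x, 0
  br i1 %nonzero, label %later, label %check

later:
  ; Plays the role of the store in %bb194 (reached via a control-flow edge).
  %slot = getelementptr i32, ptr addrspace(1) %dbg, i64 1
  store i32 %size_x, ptr addrspace(1) %slot, align 4
  br label %check

check:
  ; The original assertion: workgroup.id.x < workgroup.size.x.
  %in_range = icmp ult i32 %id_x, %size_x
  br i1 %in_range, label %done, label %fail

fail:
  call void @llvm.trap()
  unreachable

done:
  ret void
}
```

Both stores write the same SSA value, so a correct lowering should make both slots of `%dbg` hold the same number regardless of which block the store sits in.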

@arsenm @nhaehnle Help appreciated.

amdgpu_backend_bug.zip
