[AMDGPU] Wrong O0 codegen for workgroup id x + control flow #61083
@llvm/issue-subscribers-backend-amdgpu

I checked; the LLVM IR looks good until the very end. I was unable to debug this one :(
The broken version has a spill and reload, and the "good" version has the spill but no matching reload.
This is because the second (VGPR) run of the fast register allocator picks the wrong point to insert the VGPR reload at the top of the block. The first (SGPR) regalloc run does find the correct prolog insertion point, but since it has live-in SGPRs used in the prolog, it inserts restores there. In the second VGPR run, the prolog-identifying code then does not skip over these newly inserted instructions, so the VGPR spill gets inserted before exec is written.
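Schematically, the misplacement described above looks roughly like this (a hypothetical AMDGPU assembly sketch, not taken from the attached reproducer; registers and offsets are made up):

```asm
; broken: fast regalloc places the whole-wave VGPR spill before the
; block prolog's exec manipulation, so it runs under a partial exec mask
  buffer_store_dword v0, off, s[0:3], 0 offset:4 ; spill of inactive lanes is lost
  s_or_saveexec_b64 s[4:5], -1                   ; exec written only *after* the spill

; expected: exec is forced to all-ones first, so the spill covers every lane
  s_or_saveexec_b64 s[4:5], -1
  buffer_store_dword v0, off, s[0:3], 0 offset:4
  s_mov_b64 exec, s[4:5]
```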
Changing this to true should fix/work around this |
The lit test update for making this always split was not as bad as I thought. The only uncovered issue I see is that one WQM GlobalISel test fails with a LiveVariables update verifier error.
https://reviews.llvm.org/D145323 fixes the uncovered LiveVariables error |
The patch still breaks Blender, even after https://reviews.llvm.org/D153859 and https://reviews.llvm.org/D153877 fix the machine verifier failures in it.
Can you check if the following commit fixes the wrong codegen at -O0? |
This is most likely the issue that #86012 should fix |
I tried to narrow down what the OpenMP AMDGPU bugs are all about, the ones that do not make sense: unrelated changes cause them to pass or fail. In the process, I stumbled upon something interesting that I think is broken. I have a runnable reproducer, but it's a little tricky (I use the JIT to splice in the IR). Here is what I hope should suffice to see the problem, at least for someone who actually understands AMDGCN.
In the attached zip file are a good.ll and a broken.ll. I got the respective .s files with `llc -O0`. In my experiments, good.ll will not run into the trap, while broken.ll will. The trap should not execute, assuming I didn't break anything during the manual reduction.
The initial code asserted that `workgroup.id.x < workgroup.size.x`. However, when I store away the latter in the broken version, I get 0; in the good version I get 256. I think the underlying problem is some value propagation along the control edges.
If I store `%i15` (workgroup.size.x) in `%bb`, I get 256; if I do it in `%bb194`, I get 0, the same value that triggers my trap in the broken case.

@arsenm @nhaehnle Help appreciated.
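The assertion pattern described above can be sketched in LLVM IR roughly as follows (a hypothetical minimal sketch, not the attached reduction; reading workgroup_size_x from the HSA dispatch packet is one common way to obtain workgroup.size.x):

```llvm
; Hypothetical reduction: trap if workgroup.id.x >= workgroup.size.x.
define amdgpu_kernel void @check() {
bb:
  %id = call i32 @llvm.amdgcn.workgroup.id.x()
  %dispatch = call ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
  ; workgroup_size_x lives at byte offset 4 of the HSA kernel dispatch packet
  %gep = getelementptr i8, ptr addrspace(4) %dispatch, i64 4
  %size16 = load i16, ptr addrspace(4) %gep, align 4
  %size = zext i16 %size16 to i32
  %ok = icmp ult i32 %id, %size
  br i1 %ok, label %cont, label %trap

trap:               ; should be unreachable: id.x < size.x always holds
  call void @llvm.trap()
  unreachable

cont:
  ret void
}

declare i32 @llvm.amdgcn.workgroup.id.x()
declare ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
declare void @llvm.trap()
```

Compiled with `llc -O0` as in the report, the trap block should never execute; the symptom described is that the stored size reads back as 0 in the broken version instead of the expected 256.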
amdgpu_backend_bug.zip