codegen uses undefined identifier #1502

@jjsjann123

🐛 Describe the bug

Codegen emits a kernel that references an undefined identifier for a simple view + native_layer_norm_backward graph.

Repro script:

import torch

inps = [
    (torch.Size([768]), torch.float32),
    (torch.Size([768]), torch.float32),
    (torch.Size([4, 512, 768]), torch.float32),
    (torch.Size([4, 512, 1]), torch.float32),
    (torch.Size([4, 512, 1]), torch.float32),
    (torch.Size([2048, 768]), torch.float32),
]
inps = [torch.ones(shape, dtype=dtype, device='cuda') for (shape, dtype) in inps]

def forward(primals_5, primals_6, primals_15, getitem_2, getitem_1, mm_6):
    view_19 = torch.ops.aten.view(mm_6, [4, 512, 768])
    native_layer_norm_backward_1 = torch.ops.aten.native_layer_norm_backward(view_19, primals_15, [768], getitem_1, getitem_2, primals_6, primals_5, [True, True, True])
    return (native_layer_norm_backward_1,)

f = torch.jit.script(forward)
with torch.jit.fuser("fuser2"):
    for _ in range(5):
        f(*inps)

Run the script with PYTORCH_NVFUSER_DISABLE_FALLBACK=1 set in the environment.
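One way to launch the repro with that variable set is a small Python wrapper, sketched below. The helper names (nvfuser_env, run_repro) and the script path repro.py are hypothetical and only for illustration; exporting the variable in the shell before running the script should work just as well.

```python
import os
import subprocess
import sys


def nvfuser_env(base=None):
    """Return a copy of the environment with nvFuser fallback disabled."""
    env = dict(os.environ if base is None else base)
    env["PYTORCH_NVFUSER_DISABLE_FALLBACK"] = "1"
    return env


def run_repro(script_path):
    """Run the repro script in a child process and capture its output."""
    return subprocess.run(
        [sys.executable, script_path],
        env=nvfuser_env(),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # "repro.py" is a placeholder for wherever the repro script was saved.
    result = run_repro("repro.py")
    print(result.stderr)
```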

Generated error log:

CUDA NVRTC compile error: default_program(3358): error: identifier "i11" is undefined

i11 has no definition in the generated kernel, yet it is used earlier in this predicate:

if (((i221 < T8.size[2]) && (((((((((nvfuser_index_t)blockIdx.y) * (ceilDiv((ceilDiv((ceilDiv((ceilDiv((T8.size[0] * T8.size[1]), ((nvfuser_index_t)blockDim.y))), 4)), 1)), ((nvfuser_index_t)gridDim.y)))) + i174) * 4) + (i179 + nvfuser_zero)) * ((nvfuser_index_t)blockDim.y)) + ((nvfuser_index_t)threadIdx.y)) < (4 * (ceilDiv(i11, 4)))))) {
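For readers decoding the predicate: its right-hand side, 4 * (ceilDiv(i11, 4)), rounds i11 up to the next multiple of 4. Below is a minimal Python sketch of that arithmetic only. The names ceil_div and padded_extent are illustrative, and the sample extents are made up rather than taken from the log; the sketch makes no claim about which extent i11 is supposed to represent.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return (a + b - 1) // b


def padded_extent(extent, factor=4):
    """Round extent up to the next multiple of factor."""
    return factor * ceil_div(extent, factor)


if __name__ == "__main__":
    # Sample extents chosen for illustration only.
    for extent in (2047, 2048, 2049):
        print(extent, padded_extent(extent))
```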

Full log attached:

repro.txt

Versions

Reproduced on ToT (top-of-tree) devel. Looks like a real issue.
