forked from pytorch/pytorch
Closed
Description
🐛 Describe the bug
To reproduce:
PYTORCH_TEST_WITH_SLOW=1 python test/test_jit_cuda_fuser.py -v -k test_nvfuser_extremal_values_masked_amin_cuda_float32
Fusion math is:
Inputs:
T0_g[ iS0{i1}, iS1{i2}, bS2{1}, iS3{i4} ], float
T1_g[ 0 ], float
T2_g[ iS4{i5}, iS5{i6}, bS6{1}, iS7{i7} ], bool
Outputs:
T9_g[ 0 ], float
%kernel_math {
T3_l[ iS8{i5}, iS9{i6}, bS10{1}, iS11{i7} ]
= T2_g[ iS4{i5}, iS5{i6}, bS6{1}, iS7{i7} ];
T4_l[ iS12{i5}, iS13{i6}, bS14{1}, iS15{i7} ]
= T3_l[ iS8{i5}, iS9{i6}, bS10{1}, iS11{i7} ];
T5_l[ bS16{1}, bS17{1}, bS18{1}, bS19{1} ]
= broadcast( T1_g[ 0 ] )
T6_l[ iS20{i5}, iS21{i6}, bS22{1}, iS23{i7} ]
= where(T4_l[ iS12{i5}, iS13{i6}, bS14{1}, iS15{i7} ]
, T0_g[ iS0{i1}, iS1{i2}, bS2{1}, iS3{i4} ]
, T5_l[ bS16{1}, bS17{1}, bS18{1}, bS19{1} ]);
T7_l[ iS24{i5}, iS25{i6}, iS26{i7} ]
= squeeze( T6_l[ iS20{i5}, iS21{i6}, bS22{1}, iS23{i7} ] )
T8_l[ rS27{i5}, rS28{i6}, rS29{i7} ]
= reduction( T7_l[ iS24{i5}, iS25{i6}, iS26{i7} ], op = fmin, initial value = double(inf), allreduce = false )
T9_g[ 0 ]
= T8_l[ rS27{i5}, rS28{i6}, rS29{i7} ];
}
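However, the given input T2 has shape (3, 2, 1, 1). Its last dimension has extent 1 but appears in the fusion as the iteration domain iS7{i7} rather than as a broadcast domain. Because that dimension was not correctly marked as broadcast, I think that during codegen our system assumes i4 == i7 and generates code based on that assumption.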
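The shape mismatch described above can be illustrated with a small, framework-free sketch. This is not nvfuser code: the helper names `broadcast_flags` and `mismatched_dims` are hypothetical, and the T0 shape is an assumption made only for illustration, since the issue reports only the shape of T2.

```python
def broadcast_flags(shape):
    """Flag every dimension whose extent is 1 (a broadcast candidate)."""
    return [extent == 1 for extent in shape]


def mismatched_dims(shape_a, shape_b):
    """Return the dimensions where the two shapes have different extents."""
    return [d for d, (a, b) in enumerate(zip(shape_a, shape_b)) if a != b]


t2_shape = (3, 2, 1, 1)  # mask shape reported in the issue
t0_shape = (3, 2, 1, 4)  # ASSUMPTION: the issue does not state T0's shape

# Both trailing dims of T2 have extent 1, but the fusion only marks
# dim 2 as broadcast (bS6{1}); dim 3 is printed as iS7{i7}.
print(broadcast_flags(t2_shape))

# Any dim listed here is one where treating the extents as equal
# (for example i4 == i7 on the last dim) would be wrong.
print(mismatched_dims(t0_shape, t2_shape))
```

Under these assumed shapes, the last dimension is the only one where the extents differ, and it is the dimension that the fusion printout treats as a regular iteration domain on T2.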
Versions
Top of tree (TOT) of the devel branch