🐛 Bug
Non-static memory allocation used to be rejected at lowering time by the check in the constructor of Allocate. It isn't now. E.g.,
Fusion fusion;
FusionGuard fg(&fusion);
TensorView* tv0 = makeDummyTensor(1);
fusion.addInput(tv0);
TensorView* tv1 = add(tv0, new Float(0));
TensorView* tv2 = add(tv1, new Float(0));
fusion.addOutput(tv2);
torch::jit::fuser::cuda::FusionExecutor fe;
fe.compileFusion(&fusion);
This is invalid as we can't allocate a buffer for tv1, and that was detected before. Now it passes the lowering phase and fails when the generated code is compiled.
__global__ void kernel1(Tensor<float, 1> T0, Tensor<float, 1> T2) {
  float T1[T2.size[0]];
  for (size_t i12 = 0; i12 < T2.size[0]; ++i12) {
    T1[i12]
      = T0[(i12 * T0.stride[0])]
      + float(0);
  }
  for (size_t i13 = 0; i13 < T2.size[0]; ++i13) {
    T2[(i13 * T2.stride[0])]
      = T1[i13]
      + float(0);
  }
}