Closed
🚀 Feature
It was noted that combining Fusion IR creation with CudaKernel object creation is awkward: they are two separate events, and a CudaKernel does not necessarily have to be generated.
A common pattern in tests is:
torch::jit::fuser::cuda::CudaKernel prog;
prog.setFusionPtr(std::make_unique<Fusion>());
Fusion* fusion = prog.fusion();
FusionGuard fg(fusion);
It would be better to have:
auto fusion = std::make_unique<Fusion>();
FusionGuard fg(fusion.get());
<sometime later>
// At kernel generation time.
torch::jit::fuser::cuda::CudaKernel prog(std::move(fusion));
This requires modifying the CudaKernel object declaration in kernel_cache.h to include an explicit constructor that takes a unique_ptr.
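A minimal sketch of what the proposed constructor could look like. The Fusion, FusionGuard, and CudaKernel classes below are simplified stand-ins written only for this illustration, not the actual declarations in kernel_cache.h, and the helper buildThenWrap is a hypothetical name. The sketch assumes CudaKernel owns its Fusion through a std::unique_ptr member.

```cpp
#include <memory>
#include <utility>

// Stand-in for the real Fusion IR container (assumption: contents irrelevant here).
class Fusion {};

// Stand-in for FusionGuard: marks a Fusion as active for the guard's scope
// and restores the previously active one on destruction.
class FusionGuard {
 public:
  explicit FusionGuard(Fusion* fusion) : prev_(active_) { active_ = fusion; }
  ~FusionGuard() { active_ = prev_; }
  FusionGuard(const FusionGuard&) = delete;
  FusionGuard& operator=(const FusionGuard&) = delete;
  static Fusion* current() { return active_; }

 private:
  Fusion* prev_;
  static Fusion* active_;
};
Fusion* FusionGuard::active_ = nullptr;

// Stand-in for CudaKernel with the proposed explicit constructor.
class CudaKernel {
 public:
  CudaKernel() = default;

  // Proposed addition: take ownership of an already-built Fusion.
  explicit CudaKernel(std::unique_ptr<Fusion> fusion)
      : fusion_(std::move(fusion)) {}

  // Existing pattern from the tests, kept for comparison.
  void setFusionPtr(std::unique_ptr<Fusion> fusion) {
    fusion_ = std::move(fusion);
  }

  Fusion* fusion() const { return fusion_.get(); }

 private:
  std::unique_ptr<Fusion> fusion_;
};

// Hypothetical helper: build the IR first, create the kernel only afterwards.
// Returns true if the kernel ends up owning the same Fusion that was built.
bool buildThenWrap() {
  auto fusion = std::make_unique<Fusion>();
  Fusion* raw = fusion.get();
  {
    FusionGuard fg(raw);
    // ... IR construction would happen here, with no CudaKernel involved ...
  }
  // At kernel generation time, ownership moves into the kernel.
  CudaKernel prog(std::move(fusion));
  return prog.fusion() == raw && fusion == nullptr;
}
```

Marking the constructor explicit prevents a std::unique_ptr<Fusion> from being silently converted into a CudaKernel, and taking the pointer by value makes the ownership transfer visible at the call site through std::move.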