forked from pytorch/pytorch
Un-prioritized list of things that generally should be done:
- Support inter-block reductions (similar template function approach)
- Reduction to a scalar does not work, as we no longer have a tensor axis. Need to figure out how to fix this; likely we want to implement a zero-dim tensor that is just a scalar (this is how PyTorch does it).
- Fusion printer that only prints math exprs from outputs. Rework the ir_printer class.
- SetRandom on fusion is likely unnecessary; let's see if we can pull it out of so much of the logic in the codebase.
- Remove TensorView Reorder code, use tensor domain.
- Cross-thread reduction: predicate blocks of code that do not use all threads (i.e. code downstream of the reduction).
- Remove active view from lower2device
- Move logic out of lower2device, so lower2device is just a wrapper around lowering passes
- A reduction op can only be run on TensorViews; we should restrict the IR node to TensorView inputs/outputs.
- Rework predicates per loop to include thread guards: if a thread dim doesn't participate, predicate it out on threads (e.g. guard against threadIdx.y > 0). This can be done at the highest for-loop that doesn't use that thread dim.
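A sketch of the generated-code shape this describes (hypothetical kernel and tensor names, the reduction body elided): the value downstream of a reduction parallelized on threadIdx.y is only valid at threadIdx.y == 0, so the guard is hoisted to the outermost loop that does not use that thread dim.

```cuda
// Hypothetical sketch, not actual generated code: T1 is produced by a
// cross-thread reduction over threadIdx.y, so code downstream of the
// reduction is predicated to threadIdx.y == 0.
__global__ void kernel(float* T0, float* T2, int n) {
  __shared__ float T1;
  // ... cross-thread reduction over threadIdx.y writes T1 ...
  if (threadIdx.y == 0) {          // predicate out threads with threadIdx.y > 0
    for (int i = 0; i < n; ++i) {  // loop does not use threadIdx.y
      T2[i] = T0[i] + T1;
    }
  }
}
```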
- Remove predicate logic (besides that required for unrolling) out of unrolling.
- Get an external compilation working with torchlib like we do with test_gpu. I'd like to be able to create tutorials that can be individually compiled and run.
- Clean up decltype in loops in favor of size_t.
- Work through graph viz to update usage for TensorDomain::rootDomain(), TensorDomain::rfactorDomain(), and TensorDomain::domain().
- Explicitly deleting just the copy operations is sufficient (#1396). See https://abseil.io/tips/143 and Transform replay refactor #53 (comment).
- Work back through graph viz and maybe add labels to the Domains we have/want.
- Change split semantics to directly take a parallel type. Maybe call it splitOnThreadDim.
- Make threadDim and threadIndices special values we don't keep re-generating. Maybe attach them to the Fusion?