* Simplify a few test cases
Replace custom exception checks with ASSERT_THROW macros.
* ExpressionEvaluator
* Stricter EvaluationContext binding rules
1. Don't allow overwriting concrete values
2. Don't allow binding values to expression results
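The two binding rules above can be illustrated with a minimal sketch. This is a hypothetical Python illustration only; the actual EvaluationContext is part of the C++ expression evaluator, and the class and method names here are assumptions, not the real API.

```python
# Hypothetical sketch of the two stricter binding rules; the real
# EvaluationContext lives in the C++ expression evaluator.
class EvaluationContext:
    def __init__(self):
        self._concrete = {}         # values bound by the caller
        self._expr_results = set()  # values produced by evaluation

    def bind(self, value, concrete):
        # Rule 1: don't allow overwriting concrete values.
        if value in self._concrete:
            raise RuntimeError(f"{value!r} is already bound")
        # Rule 2: don't allow binding values to expression results.
        if value in self._expr_results:
            raise RuntimeError(f"{value!r} is an expression result")
        self._concrete[value] = concrete

    def record_result(self, value):
        self._expr_results.add(value)

    def lookup(self, value):
        return self._concrete.get(value)
```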
* Fix clang-format errors
* Switch to Int::ScalarType
The expression evaluator now uses Int::ScalarType instead of plain int.
* Avoid a fight with clang-tidy
* Check the number of kernel input and output parameters
* Add an optional arc from TensorView to its root domain
This is generated for detail_level >= DetailLevel::Explicit
* Check kernel arguments
* Prefer pointers over references
* Bug fix
* Fix accidental construction of IValue
* Use noReduction
* Add const to const pointer
* Make integer tensors an error, as they are not yet supported
* clang-tidy
* Incorporate review feedback
* Add lerp support in parser
* Add missing addcmul parser and tests
* clang-format
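For reference, the element-wise semantics of the two newly parsed ops can be sketched on scalars. This is a plain-Python sketch only; the real `torch.addcmul` and `torch.lerp` operate element-wise on tensors.

```python
# Scalar reference semantics of the newly parsed ops (a sketch; the
# real torch.addcmul / torch.lerp are element-wise tensor ops).
def addcmul(inp, t1, t2, value=1.0):
    # addcmul: inp + value * t1 * t2
    return inp + value * t1 * t2

def lerp(start, end, weight):
    # lerp: linear interpolation, start + weight * (end - start)
    return start + weight * (end - start)
```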
* Return TensorView* from binary/compound/ternary ops
* clang-format
* Use TensorView* param in reductionOp and sum
* Prefer as over static_cast
* Transform replay refactor (#53)
The goal of this work is to make the transformation history specific to IterDomains instead of TensorDomains. This should make it much easier to match up IterDomains during replay, which can be complicated when taking reduction axes, rfactors, and broadcast axes into consideration.
Co-authored-by: Jie <[email protected]>
Co-authored-by: Kevin Stephano <[email protected]>
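The per-IterDomain history idea can be sketched minimally: each IterDomain records the transform that produced it, rather than the history being stored on the enclosing TensorDomain. This is an illustrative sketch only; the class and field names here are assumptions, not the actual nvFuser API.

```python
# Illustrative sketch: each IterDomain carries its own "definition",
# the transform that produced it, so replay can match domains
# individually. Names are hypothetical, not the real C++ API.
class IterDomain:
    def __init__(self, extent, definition=None):
        self.extent = extent
        self.definition = definition  # transform that produced this domain

def split(domain, factor):
    # A split produces an outer and an inner IterDomain, each pointing
    # back at the split (and source domain) that created it.
    outer = IterDomain(domain.extent // factor, ("split_outer", domain, factor))
    inner = IterDomain(factor, ("split_inner", domain, factor))
    return outer, inner
```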
* python test fixes (#52)
Fix python test failures:
1. Put Fusion inside CudaKernel to facilitate runtime argument checks.
2. Relax rank check for broadcast support in integration.
3. Add shape propagation for newly added operations: [addcmul, lerp].
4. Add a utility function to create a FusionGuard from a CudaKernel directly.
* [nvFuser] add torch.jit.fuser context manager (pytorch#38993) (#54)
Summary:
1. The `torch.jit.fuser(str)` context manager facilitates switching between backend fusers:
str - 'fuser0' enables only the legacy fuser;
str - 'fuser1' enables only NNC;
str - 'fuser2' enables only nvFuser;
2. Clean up updated python tests.
Pull Request resolved: pytorch#38993
Reviewed By: nairbv, pbelevich
Differential Revision: D21800620
Pulled By: soumith
fbshipit-source-id: 7fe855f5a5b97368e5e84c98c28d04b2e1276c85
* Add another reduction example, change fusion printMath.
* Small test fix.
* Change Reduction4 test to use TIDx.x
* Minor cleanup.
* Clean up some noexcepts.
* More cleanup.
* Refactor computeAt, get first broadcast example working.
* Validate first non-trivial broadcast kernel.
* Fix replay when broadcast is merged with non-broadcast dim.
* Add constness in replay and index compute.
* Add another broadcast test. Rework index computation for producers, based on consumer-computed indices.
* Val isConst fix.
* Add dot product gemm example.
* Clang.
* Minor bug fixes.
* Format and add comments to GEMM test.
* WIP: Fix for enabling broadcast after reduction plus a Softmax test. (#66)
* Fix for enabling broadcast after reduction plus a Softmax test.
* Cleaner way of fixing checks for matching non-broadcast dims to non-reduction dims.
* Clang.
Co-authored-by: Kevin Stephano <[email protected]>
Co-authored-by: Christian Sarofeen <[email protected]>
* Back out bad merge conflict resolutions.
* More post-rebase cleanup.
* Re-fix a few tests, some broken by a bad rebase.
* Address comments.
* Missed some review comments.
* tmp
Co-authored-by: Lemo <[email protected]>
Co-authored-by: Naoya Maruyama <[email protected]>
Co-authored-by: Jie <[email protected]>
Co-authored-by: Kevin Stephano <[email protected]>
Co-authored-by: Kevin Stephano <[email protected]>