Introduction
The TVM community has worked since the last release to deliver the following new exciting improvements!
The main tags are below; the areas with the most progress are Relax (especially the PyTorch frontend) and FFI.
Please visit the full listing of commits for a complete view: v0.21.dev0...v0.21.0.rc0.
Community
None.
RFCs
None.
Arith
- #18067 - Add IsBound method to ConstIntBoundAnalyzer
- #18031 - Canonicalize mul-coefficient to rhs
- #18025 - Fix canonical simplify for LE with incorrect range assumptions
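These changes touch the bound-tracking and canonical-simplification machinery behind `tvm.arith.Analyzer`. A minimal sketch of bound-aware simplification from Python (the variable name and range are illustrative):

```python
import tvm
from tvm import tir

ana = tvm.arith.Analyzer()
i = tir.Var("i", "int32")
# Bind a range so bound-dependent rewrites (cf. #18025) have
# correct information to work with: i is known to lie in [0, 16).
ana.bind(i, tvm.ir.Range(0, 16))
print(ana.simplify(i < 16))  # folds to True under the bound
```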
BugFix
- #18115 - [Fix][Serialization] Add support for NaN value serialization
- #18103 - [Fix] Replace dmlc::Error with std::exception in VerifyGPUCode
- #18092 - [Fix] Fix ExecBuilderDeclareFunction method name in exec_builder.py
- #18087 - Fix exception when TVM is not built with LLVM support
- #18035 - [CUDA] Increase FloatImm precision when printing 64 bit values in CUDA codegen
- #17968 - [Relax][Pytorch] Bugfix of conv_transpose1d and conv_transpose2d
- #17950 - [Fix][Relax] Fix dangling reference in GetTargetFunctions()
- #17902 - Fix off-by-one error in the type index range check within Object::IsInstance()
- #17882 - [Relax][Pytorch] Fix incorrect behaviour of % (mod) operator in TVM frontend
- #17875 - [Relax][Pytorch] Incorrect Handling of In-Place Ops in FX-Based TVM Frontend
- #17838 - [TIR] Schedule support reverse-inline with reduction blocks
CI
- #18071 - Update windows to 2025
- #18058 - [TEST] Move temp files into tempdir
- #18037 - Further robustify is_last_build check
- #17981 - Update images to 20250513-063354-70aa3797
- #17891 - Update images to 20250428-080833-03eadc65
- #17905 - Install PyTorch 2.7 compatible with CUDA 11.8
- #17887 - Upgrade pytorch to 2.7.0, torchvision to 0.22.0, and vulkan sdk to 1.4.309
- #17846 - Upgrade ubuntu runner image for GitHub CI
Docker
- #17955 - [CI] Reintroduce NNEF to CI images
Docs
- #18056 - Update installation instructions based on the FFI refactor
Frontend
- #18090 - [Relax][ONNX] Update Reduce ops to support axes as input
- #18072 - [Relax][ONNX] Update ReduceL1 to opset 18
- #18016 - [Relax][ONNX] Replace deprecated `mapping.TENSOR_TYPE_TO_NP_TYPE` usage
- #18001 - [Relax][ONNX] Fix: bitwise_not misclassified as binary (is …
- #17990 - [Relax] Fix: Output tensor with zero dimension after torch.u…
- #17925 - [Relax][PyTorch] Re-enable test_subgraph_capture in dynamo test
- #17980 - [ONNX] Make bias input optional in LayerNormalization
- #17918 - [Relax][PyTorch] Add ReLU6 Op Support for Exported Program and FX graph
- #17930 - [Relax][PyTorch] Add torch.outer Op Support for Exported Program and FX graph
- #17932 - [Relax][PyTorch] Add UpSample Bicubic Op Support for Exported Program and FX graph
- #17921 - [Relax][PyTorch] Add AvgPool 1D and 3D Op Support for Exported Program and FX graph
- #17922 - [Relax][PyTorch] Add Adaptive AvgPool 1D and 3D Op Support for Exported Program and FX graph
- #17863 - [Relax][PyTorch] CrossEntropyLoss
- #17919 - [Relax][PyTorch] Add MaxPool 1D and 3D Op Support for Exported Program and FX graph
- #17926 - [Relax][PyTorch] Add tests for all the dtypes supported in the PyTorch frontend
- #17924 - [Relax][PyTorch] Add div.Tensor_mode and trunc Op Support for Exported Program and FX graph
- #17904 - [Relax][PyTorch] Add Meshgrid Op Support for Exported Program and FX graph
- #17915 - [Relax][PyTorch] Add support for linspace op in fx graph
- #17886 - [Relax][PyTorch] Add Pixel Shuffle Op Support for Exported Program and FX graph
- #17908 - [Relax][PyTorch] Add support for eye op in fx graph
- #17893 - [Relax][Pytorch] Add fmod support
- #17894 - [Relax][PyTorch] Support torch.bfloat16 dtype in pytorch frontend
- #17878 - [Relax][PyTorch] Add torch.isin Op Support for Exported Program and FX graph
- #17889 - [Relax][PyTorch] Support linspace op for ExportedProgram importer
- #17868 - [Relax][Pytorch] Add support for ones_like, zero_, zeros, type_as, item ops
- #17857 - [Relax][PyTorch] Refactor norm op for ExportedProgram importer
- #17852 - [Relax][PyTorch] Sort.default
- #17871 - [Relax][Pytorch] Add support for bitwise_or op support
- #17836 - [Relax][PyTorch] support for index.Tensor
- #17864 - [Relax][PyTorch] Support eye op for ExportedProgram importer
- #17858 - [Relax][PyTorch] Add copy_ op support in fxGraph
- #17851 - [Relax][PyTorch] Support `leaky_relu_.default` and `reshape_as.default` in ExportedProgram frontend
- #17843 - [Relax][PyTorch] Add mul_.Tensor, max.default, min.default and pow.Scalar Op Support into Exported Program Frontend
- #17821 - [Relax][PyTorch] Add Pad Op Support for Exported Program and FX graph
- #17819 - [Relax][PyTorch] Add Stack Op Support for Exported Program
- #17849 - [Relax][PyTorch] Add RSub Op Support for Exported Program and FX graph
- #17850 - [Relax][Pytorch] Add masked_fill op support in ExportedProgram
- #17816 - [Relax][PyTorch] Add PReLU Op Support for Exported Program and FX graph
- #17803 - [Relax][PyTorch] Add Logaddexp op support for exported program
- #17841 - [Relax][PyTorch] Add support for norm op
- #17832 - [Relax][PyTorch] full.default, full_like.default, ones.default
- #17830 - [Relax][PyTorch] Support narrow and broadcast_to ops for ExportedProgram importer
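Most of the entries above extend the ExportedProgram and FX importers under `tvm.relax.frontend.torch`. A minimal sketch of the ExportedProgram flow, exercising the newly supported ReLU6 op (model, shapes, and names are illustrative; assumes PyTorch 2.x):

```python
import torch
import tvm
from tvm.relax.frontend.torch import from_exported_program

# A tiny model using one of the newly supported ops (ReLU6).
class Net(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu6(x)

# Export with torch.export, then import into Relax.
example_inputs = (torch.randn(1, 8),)
exported = torch.export.export(Net(), example_inputs)
mod = from_exported_program(exported)
mod.show()  # print the imported Relax IRModule
```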
LLVM
- #17859 - [Codegen] Enable SVE/VLA for RISCV targets
- #17958 - Fix JIT unknown reloc issue for case of RISCV
- #17954 - [FFI] Fix compilation errors with clang20
Metal
- #18034 - Fix `GetFunction` of metal runtime
ROCm
- #18029 - Fix ROCm build after FFI refactor
Relax
- #18102 - Fix rotary embedding buffer size calculation
- #17928 - [KVCache] Per Layer Sliding Window
- #17840 - Refactor missing op check into shared utility for Torch frontends
- #17826 - Fix Torch frontends to report all the missing ops
Runtime
- #18097 - CutensorMap support
TIR
- #18068 - Extend address_of to support Buffer objects
- #18069 - Fix block access region detection for nested let bindings
- #18057 - Phase out ProducerStore, ProducerRealize and Prefetch
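#18068 above extends `address_of` so it can take Buffer objects in addition to buffer loads. A minimal TVMScript sketch of the buffer-load form (the extern name is a placeholder):

```python
from tvm.script import tir as T

@T.prim_func
def call_with_pointer(a: T.handle) -> None:
    A = T.match_buffer(a, (16,), "float32")
    # Pass the address of A[0] to an external C routine
    # ("my_extern_kernel" is a hypothetical name).
    T.evaluate(
        T.call_extern("int32", "my_extern_kernel", T.address_of(A[0]))
    )
```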
TOPI
- #18039 - [Relax] Support InstanceNorm & Bugfix of InstanceNorm
- #18063 - [NN][Layer_Norm] Fix layer_norm error with reduce-only axes
- #18006 - Fix index handling in expand_like operator for axis expansion
- #18015 - Support integer type input for log10
- #17942 - Add shape validation to prevent negative dimensions in conv operations
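As one concrete example, #18015 above lets `topi.log10` accept integer input. A minimal sketch (shape and names are illustrative):

```python
import tvm
from tvm import te, topi

# Integer input is now accepted by topi.log10 (#18015); the value
# is presumably promoted to a float dtype internally.
A = te.placeholder((16,), dtype="int32", name="A")
B = topi.log10(A)

# Inspect the generated TIR.
te.create_prim_func([A, B]).show()
```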
Vulkan
- #18005 - Add TIR unary trigonometric/hyperbolic intrinsic definitions
cuda & cutlass & tensorrt
- #18064 - [CUTLASS] Fix CUTLASS kernel build on Hopper
- #18033 - [CUTLASS] Add GeMM kernels for Blackwell GPUs
- #18024 - [CUDA] Fix thrust with latest FFI refactor
- #18118 - bump cutlass_fpA_intB_gemm
- #18113 - [CMake] Refine C++/CUDA standard settings in CMakeLists.txt
FFI
- #18076 - [FFI][REFACTOR] Stabilize container ABI and implementation
- #18091 - [FFI] Provide Field Visit bridge so we can do gradual transition
- #18095 - [FFI][REFACTOR] Migrate attrs to use new reflection
- #18083 - [FFI] Update typeinfo to speedup parent reflection
- #18077 - [FFI] Optimize atomic decref in Object
- #18065 - [FFI] Introduce FFI reflection support in python
- #18062 - [FFI][REFACTOR] Update registry to have complete meta-data
- #18059 - [FFI][REFACTOR] Enhance reflection
- #18050 - [FFI] Enhance FFI Object exception safety during init
- #18121 - Revert "[FFI] Replace `Arg2Str` with a more powerful `for_each`"
- #18117 - [FFI] Replace `Arg2Str` with a more powerful `for_each`
- #18116 - [FFI] Use fold expression to simplify for_each
- #18114 - [FFI] Replace `__attribute__` with C++ standard attributes
- #18112 - [FFI] Cleanup visit_attrs attribute after refactor
- #18111 - [FFI] Introduce GlobalDef for function registration
- #18106 - [REFACTOR][FFI] Phase out old VisitAttrs mechanism
- #18042 - [REFACTOR][FFI] Update symbol name for library module
- #18023 - [FFI] More strict tuple constructor checking
- #18022 - [REFACTOR][FFI] Cleanup PackedFunc redirections
- #18020 - [REFACTOR][PYTHON] Phase out tvm._ffi and Limited API support
- #17979 - [FFI][REFACTOR] Update to distinguish as and cast
- #17983 - [FFI][JVM] Upgrade tvm4j to latest FFI
- #18010 - [REFACTOR][FFI] Phase out legacy C API
- #17943 - [FFI] Variant specialize for all ObjectRef
- #17939 - [REFACTOR] Phase out legacy rust ffi
- #17940 - [REFACTOR] Phase out legacy go ffi
- #17931 - [REFACTOR][FFI][RPC] Migrate RPC to use the latest FFI ABI
- #17929 - [REFACTOR][FFI] Cleanup container redirections
- #17927 - [FFI][FEAT] AutoDLPack for taking external tensor objects
- #17923 - [REFACTOR][FFI] Cleanup PackedFunc related redirection
- #17920 - [REFACTOR] Introduce and modernize ffi system
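The refactored FFI keeps the registry-based calling convention at its core. A minimal Python sketch of registering and retrieving a global function, assuming the long-standing `tvm.register_func`/`tvm.get_global_func` entry points survive the refactor (the registry key is a placeholder):

```python
import tvm

# Register a Python callback into the global function registry
# ("demo.add" is a hypothetical key).
@tvm.register_func("demo.add")
def add(a, b):
    return a + b

# Retrieve it through the FFI and call it like any packed function.
f = tvm.get_global_func("demo.add")
assert f(1, 2) == 3
```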
web
- #17946 - [REFACTOR][FFI] Upgrade Web Runtime to new FFI
- #17917 - [WebGPU][CodeGen] Override PrintVecElemLoad and Store for WebGPU
Misc
- #18104 - Add LLVM Legalization for tir.erf
- #18107 - fix: guard tensormap with cuda version check
- #18101 - [REFACTOR] Formalize namespace for all objects
- #18040 - Add support for bucketize
- #18098 - [REFACTOR] Transition VisitAttrs to new reflection mechanism
- #18096 - [REFACTOR] Transition VisitAttrs to new reflection mechanism in tir/ir_builder/meta_schedule
- #18093 - [NVSHMEM] Extend CUDA backend to compile and link TIR modules with NVSHMEM
- #18088 - [Script] Enhance alloc buffer handling in nested frames
- #18086 - [SCRIPT] Bump Python minimum version to 3.9 and update AST compatibility
- #18075 - add support for softsign op
- #18079 - [Script] Add support for merging block annotations
- #18080 - [REFACTOR] Phase out LegacyReprPrinter and improve CommonSubExprElim
- #18078 - [REFACTOR] Phase out the RelaxExpr.checked_type in favor of struct_info
- #18073 - [NVSHMEM] Update NDArray allocation
- #18066 - [Script] Remove deprecated attributes from Constant AST node
- #18060 - Add Python functor support for TIR expressions and statements
- #18054 - [Pytest] Remove obsolete test suite entries
- #18036 - Add support for hamming_window op
- #18049 - [Refactor] Rename `relax_vm` to `vm`
- #18046 - [3rdparty] Phasing out FlashInfer AOT from 3rdparty
- #18047 - [Backend] JIT compile FlashInfer kernel with FFI header
- #18041 - [DTYPE] Fix dtype functions after dtype refactor
- #18043 - [REFACTOR] Phase out the relax tuning_api
- #18038 - Resolving inconsistency between attention/attention_bias
- #18027 - [Dtype] Low-precision Blackwell Datatype Support
- #17985 - [Codegen] Resolve issue #17965 where the same model produces different outputs on the LLVM (CPU) and CUDA (GPU) backends
- #17978 - Fix IR generation conflict in topi.nn.simplify by separating Tensor and PrimExpr handling
- #18026 - [Python] Fix library lookup path for pip installed packages
- #18019 - Add op support for slice_scatter
- #17974 - Fix FLOP estimation for EvaluateNode by implementing VisitStmt_ handler
- #18013 - Fix RuntimeError: parallel_for_dynamic
- #18014 - Fix division truncation in window size calculation for small dtypes in average_pool
- #17995 - Fix zero-extent loops in PerStoreFeature to prevent crashes
- #17969 - Add registration for the operators asinh, acosh, atanh in LLVM
- #17972 - Fix g.costs
- #17953 - Fix sqrt/rsqrt Compatibility with Integer Data Types
- #17961 - Fix basic FLOP estimation for WhileNode
- #17945 - Add registration for the operators asin and acos in LLVM
- #17951 - [NODE] Fix structural equality for Array specialization
- #17913 - [Triton] Support latest `triton.compile` interface
- #17911 - Add op support for new_zeros op in Exported Program and fx graph frontend
- #17909 - Add masked_fill_.scalar, logical_not.default in Exported Program frontend
- #17910 - [RPC] Fix bug that changed a dict while iterating over its keys
- #17896 - Add op support for zeros_like and fill_
- #17900 - Fix onnx expand op
- #17865 - Add support for index_put_ op
- #17839 - Add op support for roll op
- #17844 - Fix incorrect docstring in topi softmax
- #17831 - [3rdparty] Bump DLPack to v1.1 for float8/6/4 dtype supports
- #17848 - Fix docstring in batch_to_space_nd and bitpack
- #17845 - Fix incorrect docstring in upsampling.py
- #17808 - [Install] Fix error during python/tvm installation
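Among the entries above, #18060 adds Python functor support for TIR expressions and statements. A minimal sketch of walking a TIR body with the long-standing `post_order_visit` helper, which the new class-based functors complement:

```python
import tvm
from tvm.script import tir as T

@T.prim_func
def init(a: T.handle):
    A = T.match_buffer(a, (4,), "float32")
    for i in range(4):
        A[i] = 1.0

# post_order_visit predates #18060; the new Python functor classes
# offer an overridable visitor on top of the same traversal.
kinds = []
tvm.tir.stmt_functor.post_order_visit(
    init.body, lambda node: kinds.append(type(node).__name__)
)
print(kinds)  # node type names in post order
```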