forked from apache/tvm
kurisu add assume attr patch 1 #8
Closed: kurisu6912 wants to merge 801 commits into tile-ai:tilelang_codebase from kurisu6912:kurisu-add-assume-attr-patch-1
Conversation
* fix reduce buffer allocation position * fix test_tir_analysis_detect_buffer_access_lca.py::test_buffer_load_store
* fix test_flatten * re-enable test_split * fix test_to_copy * re-enable test_batchnorm2d
* Update fx_translator.py * Update base_fx_graph_translator.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * fix lint
…er (apache#17812) * Update exported_program_translator.py * Update test_frontend_from_exported_program.py * Update test_frontend_from_exported_program.py * Update test_frontend_from_exported_program.py
In this PR I have added JIT support for the FlashInfer sampling kernel, along with a unit test for the JIT-compiled kernel.
…ls (apache#17796) In this PR I have made changes to support CUBLAS dispatch operations for the bfloat16 data type.
…atten.int`, `hardtanh_.default`, `dropout_.default`, `silu_.default`, `add_.Tensor` and `relu_.default` (apache#17813) * support `relu_.default` * support `add_.Tensor` * support `silu_.default` * support `dropout_.default` * support `hardswish_.default` * support `hardtanh_.default` * support `unflatten.int` * fix lint error
…pache#17817) support dynamic shape
…graph (apache#17806) * add softplus op into exported program and fx graph frontend * fixing trailing whitespace issue * fixing lint issues * fix lint issue on docs * modify description to avoid cpplint issue * update softplus function with threshold attr * remove trailing spaces in softplus func * fix lint issues in legalize func * fixing cpp lint issue * test script for both exported and fx graph * trim trailing spaces in test script * fix lint issues in test script * unit test script is added in test frontend op files * fixing lint issues in test_op_nn file * fixing attribute error in test script * fixing lint issues in test script functions * adding softplus wrapper function in op file --------- Co-authored-by: deivanayakisankaralingam <deiva@Deivanayaki>
…7822) * move gelu, relu, selu, sigmoid, silu tests to test_basic_unary_ops * remove unused torchversion * we don't need to manually call `test_*` functions * remove unused variable
* Update fx_translator.py * Update base_fx_graph_translator.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py
…ms translator (apache#17814) * stack correct * sum correct in side script * all pass
This PR fixes a bug where running `pip install -e /path-to-tvm/python` fails if installation files remain in python/tvm. The fix includes: - Preventing libraries under `python/tvm` from being appended to the library list, resolving the shutil.SameFileError raised by shutil.copy() - Running the cleanup logic earlier, in case it was not executed due to a previous pip installation failure, resolving the FileExistsError raised by shutil.copytree()
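A minimal sketch of the two guards described above, assuming hypothetical helper names (`safe_copy_lib` and `safe_copytree` are illustrative, not the actual setup-script functions):

```python
import shutil
from pathlib import Path

def safe_copy_lib(src: Path, dst_dir: Path) -> None:
    # Skip libraries that already live at the destination; copying a
    # file onto itself is what raises shutil.SameFileError.
    dst = dst_dir / src.name
    if src.resolve() == dst.resolve():
        return
    shutil.copy(src, dst)

def safe_copytree(src: Path, dst: Path) -> None:
    # Clean up leftovers from a previously failed install first; an
    # existing destination makes shutil.copytree() raise FileExistsError.
    if dst.exists():
        shutil.rmtree(dst)
    shutil.copytree(src, dst)
```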
* enhance detection of missing func types in exported program and fx graph frontend * fix trailing space issue * fix lint issues by formatting the code * fix name error in fx frontend --------- Co-authored-by: deivanayakisankaralingam <deiva@Deivanayaki>
…ram importer (apache#17830) * Update exported_program_translator.py * Update test_frontend_from_exported_program.py * Update test_frontend_from_exported_program.py * Update test_frontend_from_exported_program.py
…e#17832) * unit test * full.default * linting * ones ok * tests for ones, full, and full like work
…pache#17838) This PR fixes a bug in reverse-compute-inline of tir Schedule, which generates incorrect TIR after inlining a transpose block into a reduction block.
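For context, here is a minimal sketch of what reverse-compute-inline does in general, using a simple elementwise producer/consumer pair rather than the transpose-into-reduction pattern from the bug (assumes a recent TVM with TVMScript):

```python
import tvm
from tvm.script import tir as T

@T.prim_func
def before(A: T.Buffer((16,), "float32"), C: T.Buffer((16,), "float32")):
    B = T.alloc_buffer((16,), "float32")
    for i in range(16):
        with T.block("B"):
            vi = T.axis.spatial(16, i)
            B[vi] = A[vi] * 2.0
    for i in range(16):
        with T.block("C"):
            vi = T.axis.spatial(16, i)
            C[vi] = B[vi] + 1.0

sch = tvm.tir.Schedule(before)
# Inline consumer block "C" back into its producer "B": after the
# transform, block "B" computes C[vi] = A[vi] * 2.0 + 1.0 directly.
sch.reverse_compute_inline(sch.get_block("C"))
print(sch.mod)
```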
…ends (apache#17840) * combine missing op logic of export and fx graph into common utilities * move func call above builder and fix lint issue * add type hint for nodes in helper function --------- Co-authored-by: deivanayakisankaralingam <deiva@Deivanayaki>
* Update fx_translator.py * Update base_fx_graph_translator.py * Update test_frontend_from_fx.py * Update base_fx_graph_translator.py * Update test_frontend_from_fx.py * Update base_fx_graph_translator.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py * Update test_frontend_from_fx.py
…e#17803) * Add support for logaddexp core operator * Add test script for logaddexp * Add fix for lint issues * Adjust trailing spaces * Adjust leading whitespace * Add fix for lint issues * Add fix for logaddexp test script * Fix lint issues * decomposition at op level * unity check --------- Co-authored-by: Pratheesh <[email protected]>
Update upsampling.py: fix the incorrect docstring
* use `ubuntu-latest` for github ci * use `ubuntu-22.04` for android build
…ph (apache#17816) * prelu op support and test script added * end-of-file issue fixed * trailing whitespace issue fixed * fixing lint issues * fix assertion error in test_op_nn.py file * add test script in test_frontend_nn_op.py * include wrapper function for prelu in op.py * fixing unity check issue by modifying test func * conflicts resolved * add doc for prelu op axis arg * fixed failing checks issue --------- Co-authored-by: deivanayakisankaralingam <deiva@Deivanayaki>
…e#17850) * Add masked_fill support in ExportedProgram * Fix lint issues
apache#17849) * add rsub op support into exported and fx graph frontend * fix trailing whitespace issue * fix lint issues in test scripts --------- Co-authored-by: deivanayakisankaralingam <deiva@Deivanayaki>
* Update batch_to_space_nd.py * Update bitserial_util.py
This PR modularizes the reflection part of the FFI into registry.h and accessor.h; the dependent items are updated accordingly.
…ef (apache#18148) This PR migrates the remaining global def registrations to use the new mechanism. It also phases out the TVM_FFI_REGISTER_GLOBAL macro in favor of the GlobalDef mechanism.
This PR changes duplicate global function registration to log and throw, so we get a clear error message about the function duplication.
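Viewed from the Python side, the intended behavior looks roughly like this (a hedged sketch; the exact error type and message come from the C++ registry):

```python
import tvm

tvm.register_func("demo.add_one", lambda x: x + 1)

# A second registration under the same name should now fail loudly
# instead of silently shadowing the first one.
try:
    tvm.register_func("demo.add_one", lambda x: x + 2)
except Exception as err:
    print(err)  # the message names the duplicated global function

# Replacing a function remains possible, but only explicitly.
tvm.register_func("demo.add_one", lambda x: x + 2, override=True)
```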
…command for tvm_cython target
* [COMMUNITY] Add new key for release signing * [Misc] Update test_release_package.sh Fix release script according to Tianqi's advice (apache#17861 (comment)).
This PR decouples deep equal from the structural equal implementation by providing a more direct implementation through a functor. DeepEqual is used at the heart of arith simplification as a subroutine; for efficiency it performs more direct nested checking without the var remapping that structural equal does. It also does not need to trace the mismatched comparison, since the failure path is expected to happen often. This step will likely improve deep equal's efficiency thanks to the more direct approach, and gives us the opportunity to simplify the future refactor of structural equal to focus on struct path tracing.
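The distinction is visible from the Python API: deep equal never remaps variables, while structural equal can (a small sketch, assuming a recent TVM where tvm.tir.analysis.expr_deep_equal is exposed):

```python
import tvm

x = tvm.tir.Var("x", "int32")
y = tvm.tir.Var("y", "int32")

# Structural equal may remap free vars, so x + 1 and y + 1 match.
assert tvm.ir.structural_equal(x + 1, y + 1, map_free_vars=True)

# Deep equal does direct nested checking with no var remapping.
assert not tvm.tir.analysis.expr_deep_equal(x + 1, y + 1)
assert tvm.tir.analysis.expr_deep_equal(x + 1, x + 1)
```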
This PR adds initial support for structural equal and hash via the new reflection mechanism. It will help us streamline structural equality/hash with broader support and clean error reports via AccessPath. It also gives us the ability to unify all struct equal/hash registration into the extra metadata of the reflection registration.
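A small sketch of the user-facing contract this preserves (hashes consistent with equality, and a path-based error report on mismatch), using the existing public Python API:

```python
import tvm

x = tvm.tir.Var("x", "int32")

# Structurally equal expressions must hash identically.
assert tvm.ir.structural_equal(x + 1, x + 1)
assert tvm.ir.structural_hash(x + 1) == tvm.ir.structural_hash(x + 1)

# On mismatch, assert_structural_equal reports the differing path.
try:
    tvm.ir.assert_structural_equal(x + 1, x + 2)
except ValueError as err:
    print(err)  # points at the differing IntImm leaf
```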
apache#18120) * Root cause * Update
This PR fixes a build failure in nccl.cc due to the recent switch of global function registration.
This PR introduces TypeAttr to reflection to bring extra optional attribute registration that can be used to extend behaviors such as structural equality. Also renames TypeExtraInfo to TypeMetadata for better clarity.
…of BoolOp nodes, improving code clarity.
… details in the error message for better debugging context.
…ity and maintainability.
…ctRef> to Map<String, Any> for improved flexibility.
… compatibility." This reverts commit 9574805.
…ing variable naming consistency.
… Map<String, Any> for enhanced flexibility in handling annotations.
…ith various attributes for enhanced GPU compatibility (tile-ai#7) Co-authored-by: xinyxiao <[email protected]>
* [CI] Use LLVM17 for tests on `ci_cpu` (apache/tvm#16931)
* [TOPI] Revert unification of conv2d NHWC hybrid scheduling for `arm_cpu` targets (apache/tvm#16951)
* [SVE] Add codegen support for `vscale_range()` function attribute (apache/tvm#16962)
* [TOPI] Remove `blockIdx.z` in topi sort (apache/tvm#16977)
* [Disco] Implement `num_workers` property for `disco.Session` (apache/tvm#16978)
* [TOPI][Testing] Enable conv2d NHWC fp16 topi testing for `arm_cpu` (apache/tvm#17007)
* [WebGPU] Translate `int8x4` into `u32` (apache/tvm#17071)
* [Relax] Support `input_axis_separator` to allow 2D to 1D conversion (apache/tvm#17115)
* [WebGPU] Add `tir.dp4a` (apache/tvm#17124)
* [WebGPU] Implement `tir.dp4a` with WGSL built-in function `dot4I8Packed` (apache/tvm#16976)
* [TIR][Schedule] Remove `@type_check` for `set_axis_separator` (apache/tvm#17134)
* [CI] Remove lint step from `unity/pr-head` step (apache/tvm#17155)
* Use `packaging.version.parse` instead of `distutils.version.LooseVersion` (apache/tvm#17173)
* Add `packaging` to `python/gen_requirements.py` (apache/tvm#17188)
* [MetaSchedule] Replace `xgboost.rabit` with `xgboost.collective` because it's deprecated (apache/tvm#17166)
* Remove and replace deprecated `distutils.util.strtobool()` (apache/tvm#17185)
* [TIR][Analyzer] Simplify `x==x` expressions for all dtypes (apache/tvm#17158)
* [Relax][PyTorch] Add support for `torch.nn.functional.max_pool2d` (apache/tvm#17189)
* [Transform][Relax] Handle `is_group` argument in IPC AllReduce (apache/tvm#17201)
* [Relay][Pytorch] Add support for `aten::tile` (apache/tvm#17277)
* [Codegen] Emit `tir::Let` as var assignment explicitly (apache/tvm#17278)
* [Cleanup] Remove `using namespace tvm::runtime` from headers (apache/tvm#17246)
* [Fix] Remove `tvm.` prefix from image name when `./docker/build.sh` (apache/tvm#17324)
* [Relax][PyTorch] Add support for `torch.nn.functional.conv*` (apache/tvm#17325)
* [Relax][PyTorch][Bugfix] Update `layer_norm` converter to support `immutable_list` for `normalized_shape` (apache/tvm#17330)
* [Relax][PyTorch][Fix] use `_convert_torch_tensor_to_relax()` where possible (apache/tvm#17335)
* [Relax][PyTorch] Add support for `torch.ops.aten.sym_size.int` (apache/tvm#17342)
* [Relax] Require correct input/output shapes `R.call_tir` (apache/tvm#17285)
* [Relax][KV Cache] Refactor `_attention_sequence_prefill` function to … (apache/tvm#17362)
* [Relax][PyTorch] Fix output shape of `torch.nn.functional.scaled_dot_product_attention` (apache/tvm#17379)
* [CI] Upgrade unity image tag to `20240917-153130-9f281758` (apache/tvm#17410)
* [Relax][PyTorch] Add support for `torch.export.ExportedProgram` in Relax PyTorch Frontend (apache/tvm#17396)
* [TIR] Add `is_vector` Method to DataType class and update usages across Codebase (apache/tvm#17443)
* [Relax][PyTorch][Docs] Use `torch.export` instead of `fx.symbolic_trace` for tutorial (apache/tvm#17436)
* Replace `np.int` with `np.int32` (apache/tvm#17484)
* [CI] Upgrade `oneflow==0.9.0` (apache/tvm#17503)
* [3rdparty] Update Picojson with const `operator[]` function (#327) (apache/tvm#17532)
* [CI] Upgrade CI image to `20241105-030952-3e386fd3` (apache/tvm#17451)
* [Refactor] Phase out python dependency `decorator` (apache/tvm#17661)
* [Fix] Include `<chrono>` for `std::chrono` (apache/tvm#17697)
* [Refactor] Introduce base Executable class and `tvm.compile` interface (apache/tvm#17710)
* [Refactor] Migrate build API to `tvm.compile` (apache/tvm#17718)
* …`_to` (#17809)
* …`explict` typo (#17811)
* …`unflatten.int`, `hardtanh_.default`, `dropout_.default`, `silu_.default`, `add_.Tensor` and `relu_.default` (#17813)
* …`leaky_relu_.default` and `reshape_as.default` in ExportedProgram frontend (#17851)
* …`triton.compile` interface (#17913)
* …`20250513-063354-70aa3797` (#17981)
* …`mapping.TENSOR_TYPE_TO_NP_TYPE` usage (#18016)
* …`GetFunction` of metal runtime (#18034)
* …`relax_vm` to `vm` (#18049)
* …`__attribute__` with C++ standard attributes (#18114)
* …`Arg2Str` with a more powerful `for_each` (#18117)
* Revert "…`Arg2Str` with a more powerful `for_each`" (#18121)
* …`insertDeclare` (#18123)
* …`std::move()` calls (#18130)
* …`T.thread_return()` for early thread exit in CUDA kernels (#18134)
* …`TupleRewriterNode` (#18120)