XuehaiPan (Contributor) commented Oct 10, 2025

This PR is a follow-up PR to:

Please review and merge that first if you are interested in this one.


This PR merges the three workflow files for supported platforms (currently CUDA, ROCm, and Metal) into a single file with matrix strategies.

This brings several advantages:

  • Easier maintenance: Only one workflow file to maintain instead of three separate files.
  • Scalability: Easier to add support for new platforms and Python versions in the future by simply extending the matrix.
  • Consistent testing: All platforms are tested with the same steps and configurations, ensuring consistency across different environments.
  • Reduced duplication: Common steps are defined once and reused for all platforms.
  • Better caching: Reworked caching logic avoids redundant installations and speeds up the workflow.
  • Improved readability: A single workflow file provides a clearer overview of the CI process for the project.

Summary by CodeRabbit

  • Chores

    • Major CI/workflow overhaul: unified lint & cross-toolkit test pipelines, multi-runner matrices (CUDA/ROCm/Metal/macOS), conditional triggers, caching/ccache, per-target wheel builds, artifact staging, and repository-owner gating; legacy ROCm CI removed.
  • Refactor

    • Examples updated to accept optional argv for programmatic invocation; tests updated to call mains with explicit args.
  • Tests

    • Deterministic test seeding added; separate test requirement manifests for CUDA/ROCm/Metal introduced; one numeric tolerance relaxed.
  • Style

    • Added a multiprocessing-safe formatting pre-commit hook.
  • Config

    • Packaging, build tooling, and dev/test requirements reorganized; project metadata and build configs updated.


👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run bash format.sh in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work!

🚀

coderabbitai bot (Contributor) commented Oct 10, 2025

Walkthrough

Adds a matrixed GitHub Actions CI (lint + multi-toolkit tests), refactors several workflows, updates pyproject/build tooling and requirement manifests, makes tests deterministic, adapts example CLIs to accept explicit argv, tweaks a numeric tolerance, and adds a yapf pre-commit hook.

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| New CI workflow: .github/workflows/ci.yml | Add a matrixed GitHub Actions workflow for lint and cross-toolkit tests (CUDA/ROCm/Metal) across github-hosted and self-hosted runners with toolkit setup, caching (ccache/UV), format checks, wheel install, logging, masking, and cleanup. |
| Workflow edits & removal: .github/workflows/pr-perfbench-bot.yml, .../publish-docs.yml, .../dist.yml, .../pr-reminder-bot.yml, .../rocm-ci.yml | Add repository-owner guards and job-level if-conditions, rename step labels to "Setup Python", refactor dist.yml to cibuildwheel multi-target matrix with per-target uploads and artifact listing; remove legacy rocm-ci.yml. |
| Example CLIs & tests: examples/topk/example_topk.py, examples/topk/test_topk_tilelang.py, examples/minference/test_vs_sparse_attn.py | Change example main to accept argv=None and update tests to invoke main(argv=[]); adjust __main__ test runner invocation. |
| Deterministic test seeding: examples/conftest.py, testing/conftest.py | Add deterministic seeding: set PYTHONHASHSEED="0", seed Python random, and conditionally seed torch and numpy under import guards. |
| Numerical tolerance tweak: examples/dequantize_gemm/...hopper.py | Increase final similarity-check tolerance from eps=1e-5 to eps=2e-5. |
| Project metadata & build tooling: pyproject.toml | Update project metadata (readme, requires-python, authors/maintainers), add [tool.pytest.ini_options], change build-system ordering (cython/setuptools), and modify cibuildwheel Linux/CUDA before-all setup and PATH handling. |
| Pre-commit config: .pre-commit-config.yaml | Add a yapf hook entry (yapf-multiproc-bugfix) targeting docs/conf.py with always_run: true and pass_filenames: false. |
| Requirements & test manifests: requirements.txt, requirements-test.txt, requirements-rocm.txt, requirements-lint.txt, requirements-dev.txt, requirements-test-*.txt, requirements-test-cuda.txt, requirements-test-metal.txt, requirements-test-rocm.txt | Rename ml_dtypes → ml-dtypes, add/remove/reorder dependencies, split toolkit-specific test requirement files (CUDA/ROCm/Metal), remove many entries from requirements-rocm.txt, and add requirements-test-cuda.txt pinning flash-attn. |

Sequence Diagram(s)

sequenceDiagram
  participant Dev as Developer
  participant GH as GitHub Actions
  participant Runner as Runner (github-hosted / self-hosted)
  participant Setup as Toolkit Setup
  participant Build as Build & Install
  participant Tests as Test Suites
  note right of GH #f0f7ff: CI matrix triggers (lint + toolkit tests)

  Dev->>GH: push / open PR / dispatch
  GH->>Runner: schedule matrix jobs
  Runner->>Setup: Setup Python, caches (ccache/UV), env segregation
  Setup->>Build: configure toolkit (CUDA/ROCm/Metal), adjust indices
  Build->>Build: build wheel / install project
  Build->>Tests: run pytest (toolkit-specific + generic)
  Tests-->>GH: upload results & artifacts
  GH->>Runner: cleanup
sequenceDiagram
  participant Test as Test runner
  participant Example as Example main(argv)

  Test->>Example: call main(argv=[])
  Example->>Example: parser.parse_args(argv)
  Example-->>Test: execute with explicit empty args

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • LeiWang1999

Poem

I hop through CI with nimble paws,
Empty argv clears noisy flaws,
Matrix runners hum while tests compile,
Seeds set steady, results reconcile,
A yapf nibble, whiskers twitch — hooray! 🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly and concisely describes the key change of consolidating the CI workflow files for tests into a single refactored workflow, matching the PR’s main objective and avoiding vagueness or extraneous detail. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

XuehaiPan force-pushed the merge-ci-files branch 2 times, most recently from 1245ffd to be474c9, October 10, 2025 12:24
XuehaiPan marked this pull request as draft, October 10, 2025 12:26
XuehaiPan force-pushed the merge-ci-files branch 2 times, most recently from efdf2a9 to f44e922, October 10, 2025 12:30
XuehaiPan marked this pull request as ready for review, October 10, 2025 12:31
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (5)
.github/workflows/ci.yml (5)

2-11: Consider running on push to main as well

Currently only PRs and manual dispatch trigger. If you want tests on merges to main, add push.

 on:
   pull_request:
     types:
       - labeled
       - unlabeled
       - opened
       - synchronize
       - reopened
   # Allow to trigger the workflow manually
   workflow_dispatch:
+  push:
+    branches:
+      - main

136-139: Remove unused variable

ROCM_VERSION_MAJMIN_NODOT is computed but never used.

-          ROCM_VERSION_MAJMIN_NODOT="${ROCM_VERSION_MAJMIN//./}"

160-179: Ensure PyTorch extra index is honored by uv pip

Relying on env vars may be brittle across uv versions. Pass the index explicitly to guarantee usage, and mirror it for the CUDA/ROCm conditionals.

-          uv pip install -r requirements-build.txt -r requirements-test.txt
+          uv pip install --extra-index-url "${PIP_EXTRA_INDEX_URL:-}" -r requirements-build.txt -r requirements-test.txt
           if [[ "${{ matrix.runner.toolkit }}" == *"CUDA"* ]]; then
-            uv pip install flash-attn==2.5.8
+            uv pip install --extra-index-url "${PIP_EXTRA_INDEX_URL:-}" flash-attn==2.5.8
           elif [[ "${{ matrix.runner.toolkit }}" == *"ROCm"* ]]; then
-            uv pip install -r requirements-rocm.txt
+            uv pip install --extra-index-url "${PIP_EXTRA_INDEX_URL:-}" -r requirements-rocm.txt

If uv already respects PIP_EXTRA_INDEX_URL in your runners, feel free to skip; otherwise this removes ambiguity.
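
One caveat with the suggested diff: when PIP_EXTRA_INDEX_URL is unset, "${PIP_EXTRA_INDEX_URL:-}" expands to an empty string, which would then be passed as the value of --extra-index-url. A minimal sketch of adding the flag only when a URL is present follows. The helper name build_index_args is hypothetical, and the snippet prints the command instead of invoking uv.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fill INDEX_ARGS with --extra-index-url only when a
# non-empty URL is given, so an empty flag value is never passed along.
build_index_args() {
  local url="${1:-}"
  INDEX_ARGS=()
  if [[ -n "${url}" ]]; then
    INDEX_ARGS+=(--extra-index-url "${url}")
  fi
}

build_index_args "${PIP_EXTRA_INDEX_URL:-}"

# Print the command that would run; replace `echo` with the real call in CI.
echo uv pip install "${INDEX_ARGS[@]}" -r requirements-build.txt -r requirements-test.txt
```

Keeping the flag in an array avoids word-splitting problems and keeps the toolkit-specific branches identical apart from the requirements file they install.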


54-60: Prefer pre-commit/action for speed and reliability over pipx run

pipx may not be present or cached; the official action restores hook caches and is faster.

-      - name: Pre-commit Lint
-        run: |
-          if ! pipx run pre-commit run --all-files --show-diff-on-failure; then
-            echo "::error::Pre-commit checks failed. Please run 'pre-commit install' and 'pre-commit run --all-files' locally to see the issues."
-            exit 1
-          fi
+      - name: Pre-commit Lint
+        uses: pre-commit/[email protected]

If you keep pipx, please confirm pipx is available on ubuntu-latest.


188-206: Optional: run format checks in the lint job to fail fast

Formatting is independent of toolchain; moving this to the lint job reduces GPU runner load and speeds feedback.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7913fb1 and 55506a8.

📒 Files selected for processing (3)
  • .github/workflows/amd_ci.yml (0 hunks)
  • .github/workflows/ci.yml (1 hunks)
  • .github/workflows/metal_ci.yml (0 hunks)
💤 Files with no reviewable changes (2)
  • .github/workflows/metal_ci.yml
  • .github/workflows/amd_ci.yml

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/ci.yml (2)

189-206: Consider moving format check to the lint job.

The format check currently runs on every platform (CUDA, ROCm, Metal), but code formatting is platform-independent. Moving it to the lint job would reduce redundancy and save CI time.

However, if the format check validates platform-specific CMake configuration, the current approach is appropriate.

If you decide to move it, the format check could run in the lint job after the pre-commit check:

- name: Run format check
  run: |
    python -m pip install -r requirements-build.txt
    mkdir -p build
    pushd build
    cmake .. -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
    popd
    if ! output=$(./format.sh 2>&1); then
      echo "::error::Format check failed."
      echo "$output"
      exit 1
    fi
    rm -rf build

232-239: Consider adding a timeout to ROCm tests.

CUDA and Metal tests specify --timeout=3600, but ROCm tests (line 239) don't. Adding a timeout ensures consistency and prevents hanging tests from blocking CI indefinitely.

Apply this diff to add a timeout:

-          python -m pytest -v --cache-clear test_tilelang_test_amd.py
+          python -m pytest -v --cache-clear --timeout=3600 test_tilelang_test_amd.py
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7913fb1 and 2a3e3a1.

📒 Files selected for processing (3)
  • .github/workflows/amd_ci.yml (0 hunks)
  • .github/workflows/ci.yml (1 hunks)
  • .github/workflows/metal_ci.yml (0 hunks)
💤 Files with no reviewable changes (2)
  • .github/workflows/metal_ci.yml
  • .github/workflows/amd_ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
🔇 Additional comments (11)
.github/workflows/ci.yml (11)

2-14: LGTM! Well-structured workflow triggers and permissions.

The PR trigger types and workflow_dispatch enable flexible CI execution, and read-only permissions follow security best practices.


34-60: LGTM! Well-structured lint job.

The lint job is properly configured with caching, full git history for pre-commit, and a helpful error message to guide developers.


62-82: LGTM! Effective test matrix consolidation.

The matrix successfully consolidates the three separate workflows. Using fail-fast: false ensures all platform failures are visible, and the draft PR check prevents unnecessary resource usage.


101-111: LGTM! Comprehensive venv caching strategy.

The cache key properly includes all relevant factors (toolkit version, requirements hash, etc.) to ensure cache validity while maximizing reuse. This aligns with the stated goal of avoiding redundant installations.


113-131: LGTM! Proper CUDA environment configuration.

The CUDA setup correctly extracts version information, configures PyTorch indexes, and validates the toolkit installation. The version validation at line 131 appropriately fails the job if nvcc is unavailable.


133-151: LGTM! Consistent ROCm environment configuration.

The ROCm setup follows the same pattern as CUDA and correctly configures the ROCm-specific PyTorch index and build options.


153-159: LGTM! Appropriate Metal environment configuration.

Metal configuration is correctly simplified compared to CUDA/ROCm since it doesn't require external package indexes.


161-179: LGTM! Well-structured venv setup with proper conditional logic.

The venv setup correctly runs only on cache miss and includes appropriate toolkit-specific dependencies. The error handling for unknown toolkits at lines 177-178 is a good safety measure.


208-211: LGTM! Efficient wheel installation.

Using --no-build-isolation is appropriate since build dependencies are already installed in the venv, and the verbose flag aids troubleshooting.


27-27: Confirm UV_INDEX_STRATEGY="unsafe-best-match" use.
This strategy merges candidates from all configured indexes and selects the highest-versioned release—mirroring pip’s default multi-index behavior but reintroducing dependency-confusion risk. Ensure this behavior is intentional when installing PyTorch wheels from custom indexes.

Likely an incorrect or invalid review comment.


213-219: Examples CI runs on CUDA only
The examples folder is entirely CUDA-specific—there are no ROCm or Metal targets in tests or code—so gating this step on matrix.runner.toolkit containing “CUDA” is correct. Add ROCm/Metal jobs when support is available.

oraluben (Contributor) commented Oct 10, 2025

I think this increases the maintenance effort in some ways. The self-hosted runners seem to be set up at Peking University and have a much slower connection to GitHub and PyPI/uv, so restoring the GitHub Actions cache takes more time than it saves.
On the other hand, those runners have more than enough CPUs to compile tilelang, which the Metal runner doesn't, so ccache is necessary for Metal. For the CUDA and ROCm runners, it seems better to just use a pip mirror.

Considering the different situations of those runners, it might be better to keep them separate (or to use multiple jobs with dependencies rather than merging them into one single job)?

Some of my previous tests can be found around 76e9df7. I used uv locally on the self-hosted runner to speed up package installation, but did not upload the packages as a cache (df9ae25); the uv cache is preserved on the host.
You can find the full experiment in #939.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
.github/workflows/ci.yml (3)

13-14: Enable actions: write to allow cache saves (setup-python, cache v4)

With an explicit permissions block, unspecified scopes (actions) default to none; caches will restore but won’t save. Add actions: write at workflow or job level if you want cache saves.

Based on learnings

Example (workflow-level):

 permissions:
   contents: read
+  actions: write

If you intentionally prefer restore-only, ignore. Otherwise, please confirm whether saves are currently occurring in logs.


121-139: Fix CUDA → PyTorch cu tag mapping (avoid non-existent wheel indexes)

Deriving cuNNN by stripping dots (e.g., 12.8→cu128) may not exist. Use an explicit map with a sane default.

-          CUDA_VERSION="${TOOLKIT##*-}"
-          CUDA_VERSION_MAJMIN="$(echo ${CUDA_VERSION} | cut -d '.' -f-2)"
-          CUDA_VERSION_MAJMIN_NODOT="${CUDA_VERSION_MAJMIN//./}"
-          export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu${CUDA_VERSION_MAJMIN_NODOT}"
+          CUDA_VERSION="${TOOLKIT##*-}"
+          CUDA_VERSION_MAJMIN="$(echo "${CUDA_VERSION}" | cut -d '.' -f-2)"
+          # Map CUDA to known PyTorch cu tags (adjust as your supported set evolves)
+          case "${CUDA_VERSION_MAJMIN}" in
+            12.4|12.5|12.6) CU_TAG="cu124" ;;
+            12.1|12.2|12.3) CU_TAG="cu121" ;;
+            11.8)           CU_TAG="cu118" ;;
+            *)              CU_TAG="cu124" ;;  # default/fallback
+          esac
+          export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/${CU_TAG}"
           export UV_INDEX="${PIP_EXTRA_INDEX_URL}"
           export CMAKE_CONFIGURE_OPTIONS="${CMAKE_CONFIGURE_OPTIONS} -DUSE_CUDA=ON"
@@
-          echo "CUDA_VERSION_MAJMIN_NODOT=${CUDA_VERSION_MAJMIN_NODOT}" | tee -a "${GITHUB_ENV}"
+          echo "CU_TAG=${CU_TAG}" | tee -a "${GITHUB_ENV}"
           echo "PIP_EXTRA_INDEX_URL=${PIP_EXTRA_INDEX_URL}" | tee -a "${GITHUB_ENV}"

219-226: Fix pytest file selection; globstar isn’t enabled by default

Without globstar, ** behaves like *, so the pattern **/test*.py matches only one directory level instead of recursing. Let pytest discover tests.

-          python -m pytest -n 4 **/test*.py -v -r fE --durations=0 --cache-clear
+          python -m pytest -n 4 -v -r fE --durations=0 --cache-clear

Alternatively, run shopt -s globstar before using **.

🧹 Nitpick comments (3)
.github/workflows/ci.yml (3)

141-160: Minor: remove unused ROCm “_NODOT” variable to reduce confusion

ROCM_VERSION_MAJMIN_NODOT is computed but not used (only echoed). Trim it.

-          ROCM_VERSION_MAJMIN_NODOT="${ROCM_VERSION_MAJMIN//./}"
@@
-          echo "ROCM_VERSION_MAJMIN_NODOT=${ROCM_VERSION_MAJMIN_NODOT}" | tee -a "${GITHUB_ENV}"

20-31: Self-hosted runners: prefer local mirrors and host-persistent caches

Given slower network on self-hosted runners, consider:

  • Using a PyPI mirror (and matching UV_INDEX) when on self-hosted.
  • Persisting uv/ccache on the host (not GH cache) to avoid slow uploads.
  • Optionally disabling GH cache for venv/ccache on self-hosted.

Example snippet to set mirrors (add as a step after “Set up Python”):

- name: Use PyPI mirror on self-hosted (optional)
  if: contains(join(matrix.runner.tags, ','), 'self-hosted') && env.PYPI_MIRROR_URL != ''
  env:
    PYPI_MIRROR_URL: ${{ secrets.PYPI_MIRROR_URL }}
  run: |
    echo "PIP_INDEX_URL=${PYPI_MIRROR_URL}" >> "$GITHUB_ENV"
    echo "UV_INDEX=${PYPI_MIRROR_URL}" >> "$GITHUB_ENV"

For ccache, on self‑hosted set CCACHE_DIR to a persistent path (e.g., /var/cache/ccache) and rely on local persistence instead of the GH cache action. Optionally add max-size to cap usage.


189-194: Scope ccache and avoid double‑caching; set a size limit

  • Ensure CCACHE_DIR is not under .cache when also using the ccache action to avoid duplicative caching.
  • On macOS (Metal), keep ccache; on self‑hosted, consider using a persistent local CCACHE_DIR and skipping the GH cache action.
  • Set a max-size to cap cache growth.

Example:

-      - name: Setup ccache
+      - name: Setup ccache
         uses: hendrikmuhs/ccache-action@v1
         with:
           create-symlink: true
           key: ccache-${{ runner.os }}-${{ runner.arch }}-${{ matrix.python-version }}-${{ matrix.runner.name }}-${{ matrix.runner.toolkit }}
+          max-size: 2G

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a3e3a167ae0140b99f8ae34a1e086a36684eb44 and 4f796393fd1fa67f9257498a85803f9ee1904607.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-10T13:29:29.326Z
Learnt from: XuehaiPan
PR: #973
File: .github/workflows/ci.yml:13-15
Timestamp: 2025-10-10T13:29:29.326Z
Learning: In .github/workflows/ci.yml for tilelang (GitHub Actions), actions/cache@v4 and setup-python’s cache feature require GITHUB_TOKEN with actions: write to save caches; with a permissions block that only sets contents: read, unspecified actions permission becomes none, so caches will restore but not save.

Applied to files:

  • .github/workflows/ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)

XuehaiPan force-pushed the merge-ci-files branch 2 times, most recently from 7ecda00 to de88d35, October 10, 2025 15:31
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

♻️ Duplicate comments (4)
.github/workflows/ci.yml (4)

13-14: Enable Actions write so caches can be saved (not just restored)

With a permissions block present, unspecified scopes are none. actions/cache@v4 and setup-python caching won’t save without actions: write. Add:

 permissions:
   contents: read
+  actions: write

Based on learnings. Please confirm in recent runs whether “Saving cache” logs appear for GH-hosted jobs. If not, this change is needed.


144-162: Map CUDA → PyTorch cu tag explicitly; avoid non-existent indices

Deriving cuNNN by stripping dots (e.g., 12.8 → cu128) may not exist. Use a map with fallback.

           TOOLKIT="${{ matrix.runner.toolkit }}"
           CUDA_VERSION="${TOOLKIT##*-}"
-          CUDA_VERSION_MAJMIN="$(echo ${CUDA_VERSION} | cut -d '.' -f-2)"
-          CUDA_VERSION_MAJMIN_NODOT="${CUDA_VERSION_MAJMIN//./}"
-          export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu${CUDA_VERSION_MAJMIN_NODOT}"
+          CUDA_VERSION_MAJMIN="$(echo "${CUDA_VERSION}" | cut -d '.' -f-2)"
+          case "${CUDA_VERSION_MAJMIN}" in
+            12.4|12.5|12.6) CU_TAG="cu124" ;;
+            12.1|12.2|12.3) CU_TAG="cu121" ;;
+            11.8)           CU_TAG="cu118" ;;
+            *)              CU_TAG="cu124" ;;
+          esac
+          export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/${CU_TAG}"
           export UV_INDEX="${PIP_EXTRA_INDEX_URL}"
           export CMAKE_CONFIGURE_OPTIONS="${CMAKE_CONFIGURE_OPTIONS} -DUSE_CUDA=ON"
@@
-          echo "CUDA_VERSION_MAJMIN_NODOT=${CUDA_VERSION_MAJMIN_NODOT}" | tee -a "${GITHUB_ENV}"
+          echo "CU_TAG=${CU_TAG}" | tee -a "${GITHUB_ENV}"
           echo "PIP_EXTRA_INDEX_URL=${PIP_EXTRA_INDEX_URL}" | tee -a "${GITHUB_ENV}"
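
The explicit mapping suggested above can be sketched as a small shell function. This is a hedged illustration: the helper name cu_tag_for and the example TOOLKIT value are hypothetical, and the version-to-tag table is copied from the suggestion rather than checked against the PyTorch wheel index, whose published tags change between releases.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the suggested case statement. The table is
# illustrative: confirm each tag exists on download.pytorch.org before use.
cu_tag_for() {
  local majmin
  majmin="$(echo "$1" | cut -d '.' -f-2)"   # keep major.minor, e.g. 12.6.3 -> 12.6
  case "${majmin}" in
    12.4|12.5|12.6) echo "cu124" ;;
    12.1|12.2|12.3) echo "cu121" ;;
    11.8)           echo "cu118" ;;
    *)              echo "cu124" ;;          # fallback taken from the suggestion
  esac
}

TOOLKIT="CUDA-12.6.3"                        # assumed format of matrix.runner.toolkit
CUDA_VERSION="${TOOLKIT##*-}"                # drop everything up to the last '-'
echo "https://download.pytorch.org/whl/$(cu_tag_for "${CUDA_VERSION}")"
```

A lookup like this trades automatic support for new CUDA versions against the guarantee that the index URL points at a tag someone has verified, so the table needs maintenance whenever the supported set changes.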

200-206: Avoid stale venvs: move CUDA-only deps into a tracked requirements file

Inline installing flash-attn bypasses your venv cache key (which hashes only requirements*.txt). Commit a requirements-cuda.txt and install from it.

-          if [[ "${{ matrix.runner.toolkit }}" == "CUDA"* ]]; then
-            uv pip install flash-attn==2.5.8
+          if [[ "${{ matrix.runner.toolkit }}" == "CUDA"* ]]; then
+            uv pip install -r requirements-cuda.txt
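
The stale-venv concern can be illustrated with a simplified stand-in for the cache key. The sketch below assumes the key is derived only from the contents of requirements*.txt; the helper venv_cache_key is hypothetical, it is not how GitHub's hashFiles is implemented, and it assumes sha256sum from GNU coreutils is available.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a hashFiles-style key: hash the concatenated
# contents of the tracked requirements files in a directory.
venv_cache_key() {
  cat "$1"/requirements*.txt | sha256sum | cut -d ' ' -f1
}

work="$(mktemp -d)"
echo "pytest" > "${work}/requirements-test.txt"
key_before="$(venv_cache_key "${work}")"

# An inline `uv pip install flash-attn==...` changes no tracked file,
# so the key (and therefore the restored venv) stays the same.
key_inline="$(venv_cache_key "${work}")"

# Tracking the pin in a requirements file changes the key.
echo "flash-attn==2.5.8" > "${work}/requirements-cuda.txt"
key_tracked="$(venv_cache_key "${work}")"

echo "before=${key_before}"
echo "inline=${key_inline}"
echo "tracked=${key_tracked}"
rm -rf "${work}"
```

With the pin in a tracked file, bumping the flash-attn version produces a new key and forces a fresh venv instead of silently reusing one built with the old version.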

258-265: Fix pytest file selection; remove **/test*.py (globstar not enabled)

Bash on runners doesn’t enable globstar, so ** behaves like * and the pattern matches only one directory level instead of recursing. Let pytest discover tests.

-          python -m pytest -n 4 **/test*.py -v -r fE --durations=0 --cache-clear
+          python -m pytest -n 4 -v -r fE --durations=0 --cache-clear
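
The globstar behavior behind this comment can be checked locally. The sketch below builds a throwaway directory tree and counts what **/test*.py matches with the option off and on. It assumes bash 4 or newer (globstar does not exist in bash 3.2), and the file names are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Build a throwaway tree: one test file one level down, one two levels down.
tmp="$(mktemp -d)"
mkdir -p "${tmp}/a/b"
touch "${tmp}/a/test_one.py" "${tmp}/a/b/test_two.py"
cd "${tmp}" || exit 1

count_matches() {
  # Expand the unquoted glob under the current shell options and count results.
  local files=( **/test*.py )
  echo "${#files[@]}"
}

shopt -u globstar
without_globstar="$(count_matches)"   # ** acts like *: only a/test_one.py

shopt -s globstar
with_globstar="$(count_matches)"      # ** recurses: both files match

echo "without globstar: ${without_globstar}, with globstar: ${with_globstar}"
cd - >/dev/null
rm -rf "${tmp}"
```

Either enabling globstar explicitly or dropping the pattern and relying on pytest's own discovery avoids silently skipping tests in nested directories.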
🧹 Nitpick comments (2)
.github/workflows/ci.yml (2)

136-143: Skip actions/cache on self-hosted to save time

This step runs even on self-hosted (where you use your own tar cache). Guard it:

-      - name: Setup venv cache (GitHub-hosted runners)
+      - name: Setup venv cache (GitHub-hosted runners)
+        if: ${{ !startsWith(matrix.runner.name, 'self-hosted') }}

106-114: Mirror indexes for self-hosted runners (slow network)

For self-hosted, set PIP_INDEX_URL/UV_INDEX from a mirror or secret to mitigate slow external bandwidth:

       - name: Setup venv cache (self-hosted runners)
         if: startsWith(matrix.runner.name, 'self-hosted')
         id: venv-cache-self-hosted
         run: |
+          # Optional: speed up installs on self-hosted with a local/mirror index
+          if [[ -n "${PIP_MIRROR_URL:-}" ]]; then
+            echo "Using pip mirror: ${PIP_MIRROR_URL}"
+            export PIP_INDEX_URL="${PIP_MIRROR_URL}"
+            export UV_INDEX="${PIP_MIRROR_URL}"
+          fi

Set PIP_MIRROR_URL via repo/org secrets or environments per runner.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 53aef5c and 7ecda00.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-10T13:29:29.326Z
Learnt from: XuehaiPan
PR: tile-ai/tilelang#973
File: .github/workflows/ci.yml:13-15
Timestamp: 2025-10-10T13:29:29.326Z
Learning: In .github/workflows/ci.yml for tilelang (GitHub Actions), actions/cache@v4 and setup-python’s cache feature require GITHUB_TOKEN with actions: write to save caches; with a permissions block that only sets contents: read, unspecified actions permission becomes none, so caches will restore but not save.

Applied to files:

  • .github/workflows/ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)

XuehaiPan (Contributor, Author) commented Oct 10, 2025

That makes restoring GA cache takes more time than it saves.

Hi @oraluben, if I understand correctly, you mean downloading/uploading the venv caches from GitHub? The actions/cache action does support self-hosted runners for Windows. But the support status for other OSs remains unknown.

I have implemented a manual caching mechanism in commit de88d35 with some code comments. I believe it will resolve your concerns.

The new venv cache on the self-hosted runners can be shared between multiple CI runs if they are run concurrently. Previously, we used a hardcoded path for the shared venv on self-hosted runners (maybe with a concurrency limit of 1 for self-hosted runners).

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
.github/workflows/ci.yml (2)

13-14: Reinstate actions: write so caches can save

With an explicit permissions block that omits actions, GitHub zeros that scope; actions/cache@v4 and setup-python’s pip cache can restore but cannot save (403 on upload). We need writable Actions perms for caching to pay off. Please add actions: write back.


148-160: Don’t fabricate cu${CUDA_VERSION} tags; map to real PyTorch wheels

Stripping the dot turns 12.8 into cu128, which PyTorch doesn’t publish. Installs will miss the correct wheel and fall back (or fail). Please map supported CUDA versions to actual cu tags with a case/lookup and fall back sanely. Example:

-          CUDA_VERSION_MAJMIN_NODOT="${CUDA_VERSION_MAJMIN//./}"
-          export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu${CUDA_VERSION_MAJMIN_NODOT}"
+          case "${CUDA_VERSION_MAJMIN}" in
+            12.4|12.5|12.6|12.7|12.8) CU_TAG="cu124" ;;
+            12.1|12.2|12.3)          CU_TAG="cu121" ;;
+            11.8)                    CU_TAG="cu118" ;;
+            *)                       CU_TAG="cpu" ;;
+          esac
+          export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/${CU_TAG}"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 53aef5c and de88d35.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-10T13:29:29.326Z
Learnt from: XuehaiPan
PR: tile-ai/tilelang#973
File: .github/workflows/ci.yml:13-15
Timestamp: 2025-10-10T13:29:29.326Z
Learning: In .github/workflows/ci.yml for tilelang (GitHub Actions), actions/cache@v4 and setup-python’s cache feature require GITHUB_TOKEN with actions: write to save caches; with a permissions block that only sets contents: read, unspecified actions permission becomes none, so caches will restore but not save.

Applied to files:

  • .github/workflows/ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
🔇 Additional comments (1)
.github/workflows/ci.yml (1)

202-211: Fix toolkit detection: quoted glob makes every branch miss

In [[ … == "CUDA"* ]] the quotes make * literal, so "CUDA-12.8" never matches — we fall through to the final else and abort on every CUDA run (same for ROCm/Metal). Drop the quotes or switch to case/regex. For example:

-          if [[ "${{ matrix.runner.toolkit }}" == "CUDA"* ]]; then
+          if [[ "${{ matrix.runner.toolkit }}" == CUDA* ]]; then
@@
-          elif [[ "${{ matrix.runner.toolkit }}" == "ROCm"* ]]; then
+          elif [[ "${{ matrix.runner.toolkit }}" == ROCm* ]]; then
@@
-          elif [[ "${{ matrix.runner.toolkit }}" == "Metal"* ]]; then
+          elif [[ "${{ matrix.runner.toolkit }}" == Metal* ]]; then

Likely an incorrect or invalid review comment.
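
The "likely incorrect" flag is consistent with how bash treats patterns inside [[ ]]: only the quoted portion of the right-hand side is matched literally, and an unquoted * after a quoted prefix is still a wildcard. A short check follows; the toolkit value is an assumed example of what the matrix might supply.

```bash
#!/usr/bin/env bash
toolkit="CUDA-12.8"   # assumed example value of matrix.runner.toolkit

# Quoted prefix, unquoted *: the * is still a wildcard.
if [[ "${toolkit}" == "CUDA"* ]]; then prefix_quoted="match"; else prefix_quoted="no match"; fi

# Fully unquoted pattern: equivalent to the form above for this input.
if [[ "${toolkit}" == CUDA* ]]; then unquoted="match"; else unquoted="no match"; fi

# Entire pattern quoted: now * is literal and the comparison is exact.
if [[ "${toolkit}" == "CUDA*" ]]; then all_quoted="match"; else all_quoted="no match"; fi

echo "prefix quoted: ${prefix_quoted}; unquoted: ${unquoted}; all quoted: ${all_quoted}"
```

Only the third form, with the * inside the quotes, would make every branch miss in the way the comment describes.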

XuehaiPan force-pushed the merge-ci-files branch 2 times, most recently from 1226fa2 to 0badbdd, October 10, 2025 17:08
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
.github/workflows/ci.yml (2)

204-211: Pattern match still treats self-hosted as GitHub-hosted

Quoting "self-hosted"* disables wildcard matching, so the condition stays true and self-hosted runners wrongly drop into the ccache branch (exact issue previously reported). Please remove the quotes or switch to a case so self-hosted-* hits the else path as intended.

-          if [[ "${{ matrix.runner.name }}" != "self-hosted"* ]]; then
+          if [[ "${{ matrix.runner.name }}" != self-hosted* ]]; then

13-14: Add actions: write so caches actually persist

Specifying only permissions: contents: read zeros the actions scope, so neither actions/cache@v4 nor the setup-python pip cache can save artifacts—every run will reinstall from scratch. Please add actions: write (workflow or job level) alongside contents: read to restore cache saves. Based on learnings

 permissions:
   contents: read
+  actions: write
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b81c76a and 0badbdd.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)

@XuehaiPan XuehaiPan force-pushed the merge-ci-files branch 2 times, most recently from 2c10759 to 0c054e3 Compare October 10, 2025 17:28
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
.github/workflows/ci.yml (1)

13-14: Allow cache saves by granting actions: write.

With permissions: explicitly set to only contents: read, every other scope drops to none, so actions/cache@v4 (and setup-python’s pip cache) can restore but cannot save. That knocks out the speedup this PR aims for on GitHub-hosted runners. Please add actions: write (workflow- or job-level) so caches persist between runs. Based on learnings.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0badbdd and 0c054e3.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)

@XuehaiPan XuehaiPan force-pushed the merge-ci-files branch 3 times, most recently from c9160bd to 98ffe4e Compare October 10, 2025 18:05
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
.github/workflows/ci.yml (3)

13-14: Restore cache saves by granting actions: write.

With an explicit permissions: block, anything you omit (including actions) is forced to none. That means actions/cache@v4 (used by setup-uv/setup-python) can restore but can’t save caches, defeating the caching strategy for GitHub-hosted runners. Add actions: write here (or at job level) so cache uploads resume. Based on learnings


129-145: Map CUDA versions to real PyTorch CU tags.

Deriving cu${CUDA_VERSION_MAJMIN_NODOT} turns CUDA 12.8 into cu128, which PyTorch doesn’t publish, so wheel installs fail. Use an explicit version→cu tag map (with a safe fallback) instead of string stripping to ensure a valid index URL for each supported toolkit.


251-278: Correct uv run invocations (-m needs the module name).

uv run --no-project -m -- pytest … passes an empty module (--) to -m, so the command exits before pytest starts. Move the module name immediately after -m (e.g., uv run --no-project -m pytest -- …) or call uv run --no-project python -m pytest … in every pytest step here.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c755c3 and 98ffe4e.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5655c1 and 0bbfa1e.

📒 Files selected for processing (1)
  • requirements.txt (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.0 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-latest with CUDA-12.1
  • GitHub Check: Build wheels for Python 3.8 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-24.04-arm with CUDA-12.8

@LeiWang1999
Member

cc @Alex4210987 for the rocm ci issues.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (2)
.github/workflows/ci.yml (2)

274-374: Fix uv run -m -- pytest (module missing, tests never run)

Every invocation uv run --no-project -m -- … passes an empty module, so uv exits before running pytest/collect_env. Put the module right after -m (e.g., uv run --no-project python -m pytest … and uv run --no-project python -m torch.utils.collect_env). Update all blocks here so tests actually execute.


13-15: Grant actions: write so caches can save

With only contents: read specified, the GITHUB_TOKEN’s actions scope defaults to none, so setup-uv and actions/cache@v4 restore caches but can’t save them. Add actions: write (or write-all) to persist Python/venv caches. (Based on learnings)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6caf72b and 7b7280f.

📒 Files selected for processing (2)
  • .github/workflows/ci.yml (1 hunks)
  • .github/workflows/rocm-ci.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.0 (on self-hosted-amd)
  • GitHub Check: Build wheels for Python 3.8 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-latest with CUDA-12.1

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
.github/workflows/ci.yml (2)

13-15: Allow caches to save (actions: write required).

With an explicit permissions block, every omitted scope becomes none. That leaves the Actions permission unset, so actions/cache@v4, setup-python’s pip cache, and setup-uv’s cache can restore but never save. Re-enable the write scope so GitHub-hosted runs keep their caches.

 permissions:
   contents: read
+  actions: write

Based on learnings


145-167: Fix CUDA→PyTorch index (cu128 doesn’t exist).

CUDA-12.8 currently yields cu128, but PyTorch only ships cu118/cu121/cu124… This 404s and leaves CUDA wheels unset, breaking the CUDA job. Map CUDA MAJ.MIN to a real cu tag (with an error for unknown versions) before exporting the index.

           CUDA_VERSION="${TOOLKIT##*-}"
           CUDA_VERSION_MAJMIN="$(echo ${CUDA_VERSION} | cut -d '.' -f-2)"
-          CUDA_VERSION_MAJMIN_NODOT="${CUDA_VERSION_MAJMIN//./}"
-          if [[ "${TOOLKIT}" == "Nightly-"* ]]; then
-            # Use torch nightly builds
-            export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/nightly/cu${CUDA_VERSION_MAJMIN_NODOT}"
-          else
-            export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cu${CUDA_VERSION_MAJMIN_NODOT}"
-          fi
+          case "${CUDA_VERSION_MAJMIN}" in
+            12.4|12.5|12.6|12.7|12.8) CU_TAG="cu124" ;;
+            12.1|12.2|12.3)           CU_TAG="cu121" ;;
+            11.8)                     CU_TAG="cu118" ;;
+            *)
+              echo "::error::Unsupported CUDA version ${CUDA_VERSION}. Update the cu tag map."
+              exit 1
+              ;;
+          esac
+          if [[ "${TOOLKIT}" == Nightly-* ]]; then
+            export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/nightly/${CU_TAG}"
+          else
+            export PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/${CU_TAG}"
+          fi
           export UV_INDEX="${PIP_EXTRA_INDEX_URL}"
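
The string derivation in the removed lines can be traced in isolation. This is a minimal sketch, assuming bash and a hypothetical helper name `derive_cu_tag`; it only reproduces the text manipulation and does not check whether the resulting tag is actually published on the PyTorch wheel index.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reproduce the toolkit -> cu tag derivation
# from the removed ("-") lines of the diff above.
derive_cu_tag() {
  local toolkit="$1"
  # Strip everything up to the last "-": "Nightly-CUDA-12.8" -> "12.8"
  local cuda_version="${toolkit##*-}"
  # Keep only major.minor: "12.8.1" -> "12.8"
  local majmin
  majmin="$(echo "${cuda_version}" | cut -d '.' -f-2)"
  # Drop the dot and add the prefix: "12.8" -> "cu128"
  echo "cu${majmin//./}"
}

derive_cu_tag "CUDA-12.8"
derive_cu_tag "Nightly-CUDA-12.1"
```

An explicit map, as proposed in the diff, trades this automatic derivation for a list that has to be updated whenever a new toolkit version is added to the matrix.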
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b7280f and 1758297.

📒 Files selected for processing (2)
  • .github/workflows/ci.yml (1 hunks)
  • .github/workflows/rocm-ci.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/rocm-ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-latest with CUDA-12.1
  • GitHub Check: Build wheels for Python 3.8 on macos-latest with Metal
  • GitHub Check: build-test-amd

@XuehaiPan XuehaiPan force-pushed the merge-ci-files branch 2 times, most recently from 5bd3a21 to 6b22d3a Compare October 15, 2025 11:24
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (2)
.github/workflows/rocm-ci.yml (1)

113-115: Still need to guard torch.utils.collect_env.

This reintroduces the earlier failure path: on a cold venv (or when the requirements set doesn’t install torch), python -m torch.utils.collect_env raises ModuleNotFoundError, and with set -e the whole job exits early. Please gate the call on torch being importable so the job survives cache misses.

-        echo "::group::torch.utils.collect_env"
-        python -m torch.utils.collect_env
-        echo "::endgroup::"
+        if python - <<'PYCODE'
+import importlib.util, sys
+sys.exit(0 if importlib.util.find_spec("torch") else 1)
+PYCODE
+        then
+          echo "::group::torch.utils.collect_env"
+          python -m torch.utils.collect_env
+          echo "::endgroup::"
+        else
+          echo "torch not installed yet; skip collect_env for this run"
+        fi
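
The guard suggested above can also be factored into a small reusable function. This is a sketch only, assuming bash and a `python3` interpreter on PATH (overridable through a `PYTHON` variable); the names `module_available` and `maybe_collect_env` are hypothetical and do not appear in the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the named top-level module can be
# located by the interpreter, without importing it.
module_available() {
  "${PYTHON:-python3}" -c \
    'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' \
    "$1"
}

# Hypothetical wrapper: run collect_env only when torch is present.
# Because the check sits in an `if` condition, a missing module does not
# trip `set -e`.
maybe_collect_env() {
  if module_available torch; then
    echo "::group::torch.utils.collect_env"
    "${PYTHON:-python3}" -m torch.utils.collect_env
    echo "::endgroup::"
  else
    echo "torch not installed yet; skip collect_env for this run"
  fi
}
```

A step would then call `maybe_collect_env` in place of the unguarded `python -m torch.utils.collect_env`.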
.github/workflows/ci.yml (1)

13-15: Enable Actions write permission for caches

With permissions explicitly set, every unspecified scope (including actions) becomes none. actions/cache@v4, setup-python’s pip cache, and setup-uv’s cache uploads therefore fail to save, so every run misses cache hits. Grant actions: write (workflow- or job-level) to restore upload capability.

 permissions:
   contents: read
+  actions: write
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b22d3a and ab98648.

📒 Files selected for processing (2)
  • .github/workflows/ci.yml (1 hunks)
  • .github/workflows/rocm-ci.yml (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.0 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-latest with CUDA-12.1
  • GitHub Check: Build wheels for Python 3.8 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-24.04-arm with CUDA-12.8

@Alex4210987
Collaborator

@XuehaiPan Hi, you should set a different environment variable for each toolkit version.
For example, USE_ROCM=True for the ROCm CI.
Otherwise, the tests will fail.

Comment on lines +199 to +204
echo "USE_ROCM=ON" | tee -a "${GITHUB_ENV}"
echo "ROCM_VERSION=${ROCM_VERSION}" | tee -a "${GITHUB_ENV}"
echo "ROCM_VERSION_MAJMIN=${ROCM_VERSION_MAJMIN}" | tee -a "${GITHUB_ENV}"
echo "ROCM_VERSION_MAJMIN_NODOT=${ROCM_VERSION_MAJMIN_NODOT}" | tee -a "${GITHUB_ENV}"
echo "PIP_EXTRA_INDEX_URL=${PIP_EXTRA_INDEX_URL}" | tee -a "${GITHUB_ENV}"
echo "UV_INDEX=${UV_INDEX}" | tee -a "${GITHUB_ENV}"
Contributor Author

For example, USE_ROCM=True for the ROCm CI.

@Alex4210987 It is defined here.
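
The quoted lines rely on the GitHub Actions convention that `NAME=value` lines appended to the file named by `GITHUB_ENV` become environment variables for the later steps of the same job. The sketch below illustrates that pattern under stated assumptions: bash, a hypothetical helper name `persist_env`, an illustrative `ROCM_VERSION` value, and a temporary file standing in for `GITHUB_ENV` when run outside of Actions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append NAME=VALUE to the GITHUB_ENV file and echo it
# to the log, mirroring the `echo ... | tee -a "${GITHUB_ENV}"` lines above.
persist_env() {
  local name="$1" value="$2"
  echo "${name}=${value}" | tee -a "${GITHUB_ENV}"
}

# Outside of Actions GITHUB_ENV is unset; fall back to a temp file for a dry run.
: "${GITHUB_ENV:=$(mktemp)}"

persist_env USE_ROCM ON
persist_env ROCM_VERSION 7.0   # illustrative value, not taken from the workflow
```

Variables written this way are not visible in the step that writes them, only in subsequent steps, which is why the workflow also exports them in the current shell where needed.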

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
.github/workflows/ci.yml (1)

13-15: Grant actions: write so caches can save

With an explicit permissions: block, omitted scopes drop to none. actions/cache@v4 and setup-python’s cache need actions: write to upload; today the workflow restores but never saves, defeating the cache. Add the missing scope at workflow or job level. Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01b7551 and 01eaba2.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.0 (on self-hosted-amd)
  • GitHub Check: Build wheels for Python 3.8 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.8 on ubuntu-latest with CUDA-12.1

@LeiWang1999
Member

LGTM, thanks for the great work, @XuehaiPan! And thanks to @oraluben for the review.

@LeiWang1999 LeiWang1999 merged commit 8ce2778 into tile-ai:main Oct 15, 2025
10 checks passed
@XuehaiPan XuehaiPan deleted the merge-ci-files branch October 15, 2025 14:13