[Backport] Add nvvm bindings #442

Merged
merged 9 commits into from
Feb 11, 2025

Conversation

Member

@leofang leofang commented Feb 8, 2025

Backport of #421 + #443.

* Add nvvm to setup.py

* Add test_nvvm.py

* test_nvvm.py version(), ir_version()

* Snapshot of generated files.

* Add in `nvvm.create_program()`

* Add in `nvvm.destroy_program()`

* Add in `nvvm.compile_program()`

* Add in add_module_to_program()

* Add in verify_program()

* Add in lazy_add_module_to_program()

* Add in get_compiled_result_size(), get_program_log_size()

* Add in get_compiled_result(), get_program_log()

* Change Copyright dates to 2025

* Use cybind results "automatically generated across versions from 12.0.1 to 12.8.0."

* update to use NVKS runners

* Add tests/run_simple.py

* update fetch_ctk to find nvvm shared lib

* fix wheel rel path

* add nvcc wheel to [all]

* Fix cybind bindings for add_module_to_program(), lazy_add_module_to_program()

* Add test_with_minimal_nnvm_ir()

* Remove tests/run_simple.py

* Update cuda_bindings/cuda/bindings/_internal/nvvm_windows.pyx

Co-authored-by: Leo Fang <[email protected]>

* Update cuda_bindings/cuda/bindings/_internal/nvvm_windows.pyx

Co-authored-by: Leo Fang <[email protected]>

* Remove stray `f` (it is now a plain string, not an f-string anymore)

* Add bootstrap_local_dev.sh script.

* Fix nvvm.compile_program() failure for CUDA version 12.0

The original datalayout lacked explicit alignment and size definitions for i1, i8, i16, f32, f64, v64, and v128.

The missing types are crucial for LLVM-based compilation in CUDA 12.0.

Later CUDA versions are more forgiving, but 12.0 requires these entries to be explicit. The more explicit datalayout should resolve the issue for CUDA 12.0 without breaking compatibility with later versions.

* Add test_verify_program_with_minimal_nnvm_ir() and rename some tests for clarity.

* Complete test coverage.

* Introduce noregex() to reduce backslash clutter.

* Use a contextmanager to replace repeated try-finally.

* Rename noregex to match_exact

* Introduce get_program_log() helper.

* Improve nvvm_program() Context Manager

* Remove redundant "utf-8"

* Also test with NVVM Bitcode (using a new pytest fixture).

* Introduce compile_or_verify fixture.

* Remove bootstrap_local_dev.sh, to be moved to a separate PR.

* Update from codegen after config fix.

* Update from codegen after config fix.

* Update from codegen after adding CTK 11.x nvvm.h headers. Functional NO-OP.

* Fix get_nvvm_dso_version_suffix() to match actual version numbers:

./11.0.3_450.51.06/cuda_nvcc/nvvm/lib64/libnvvm.so.3.3.0
./11.1.1_455.32.00/cuda_nvcc/nvvm/lib64/libnvvm.so.3.3.0
./11.2.2_460.32.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.3.1_465.19.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.4.4_470.82.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.5.1_495.29.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.6.2_510.47.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.7.1_515.65.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.8.0_520.61.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.0.1_525.85.12/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.1.1_530.30.02/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.2.2_535.104.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.3.2_545.23.08/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.4.1_550.54.15/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.5.1_555.42.06/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.6.2_560.35.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.8.0_570.86.10/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0

For completeness, since the nvjitlink code is touched in this commit, these are the libnvJitLink version numbers:

./12.0.1_525.85.12/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.0.140
./12.1.1_530.30.02/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.1.105
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.2.140
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.3.101
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.4.127
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.5.82
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.6.77
./12.8.0_570.86.10/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.8.61

* find_libnvvm_so_via_proc_self_maps() Proof Of Concept

* Revert "find_libnvvm_so_via_proc_self_maps() Proof Of Concept"

This reverts commit b45bac2.

* Add another rpath for finding libnvvm.so

---------

(cherry picked from commit 2981bfd)
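
Note on the libnvvm.so listing above: it implies a simple mapping from CTK release to the shared library's major version, namely 3 for CTK 11.0 and 11.1, and 4 for CTK 11.2 through 12.8. A minimal sketch of that mapping follows. The helper name `nvvm_so_major_for_ctk` and its `(major, minor)` arguments are hypothetical; this is not the signature or implementation of `get_nvvm_dso_version_suffix()`, which is only known from this PR to be called with a driver version.

```python
def nvvm_so_major_for_ctk(ctk_major: int, ctk_minor: int) -> str:
    """Return the libnvvm.so major version observed for a CTK release.

    Hypothetical helper derived only from the listing above, which covers
    CTK 11.0.3 through 12.8.0. Releases outside that range are rejected
    rather than guessed.
    """
    release = (ctk_major, ctk_minor)
    if release < (11, 0) or release > (12, 8):
        raise ValueError(f"no libnvvm.so data for CTK {ctk_major}.{ctk_minor}")
    # CTK 11.0 and 11.1 ship libnvvm.so.3.3.0; 11.2 and later ship libnvvm.so.4.0.0.
    if release < (11, 2):
        return "3"
    return "4"
```

Tuple comparison keeps the range checks readable and avoids parsing version strings.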
@leofang leofang added the P0 (High priority - Must do!), feature (New feature or request), and cuda.bindings (Everything related to the cuda.bindings module) labels Feb 8, 2025
@leofang leofang added this to the cuda-python 12-next, 11-next milestone Feb 8, 2025
@leofang leofang requested a review from rwgk February 8, 2025 03:33
@leofang leofang self-assigned this Feb 8, 2025
Contributor

copy-pr-bot bot commented Feb 8, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

rwgk previously approved these changes Feb 8, 2025
Collaborator

@rwgk rwgk left a comment

Thanks Leo :-)

@leofang
Member Author

leofang commented Feb 8, 2025

/ok to test

@leofang leofang force-pushed the backport-421-to-11.8.x branch from c2ceee7 to 9bcd12c Compare February 8, 2025 03:43
@leofang
Member Author

leofang commented Feb 8, 2025

/ok to test

@leofang leofang force-pushed the backport-421-to-11.8.x branch from 9bcd12c to d584e67 Compare February 8, 2025 04:07
@leofang
Member Author

leofang commented Feb 8, 2025

/ok to test

@leofang leofang force-pushed the backport-421-to-11.8.x branch from d584e67 to 639319e Compare February 8, 2025 04:19
@leofang
Member Author

leofang commented Feb 8, 2025

/ok to test

@leofang
Member Author

leofang commented Feb 8, 2025

Seems like we need to update the IR versions etc.

    for suffix in get_nvvm_dso_version_suffix(driver_ver):
        if len(suffix) == 0:
            continue
        dll_name = "nvvm64_40_0"
Member Author

TODO: check this on Windows

Collaborator

I downloaded

  • nvidia_cuda_nvcc_cu11-11.8.89-py3-none-win_amd64.whl
  • nvidia_cuda_nvcc_cu12-12.0.76-py3-none-win_amd64.whl
  • nvidia_cuda_nvcc_cu12-12.8.61-py3-none-win_amd64.whl

from

Then:

$ for whl in *.whl; do unzip -l $whl | grep dll; done
 18079744  09-21-2022 18:38   nvidia/cuda_nvcc/nvvm/bin/nvvm64_40_0.dll
 18210304  10-25-2022 03:43   nvidia/cuda_nvcc/nvvm/bin/nvvm64_40_0.dll
 52869120  01-16-2025 05:07   nvidia/cuda_nvcc/nvvm/bin/nvvm64_40_0.dll

I.e. the nvvm dll_name for 11.8 is the same as for the 12.x series.
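
The same check can be scripted without `unzip`, which may be handy on Windows. A minimal sketch using only the standard-library `zipfile` module; the helper name `list_dlls_in_wheels` is hypothetical, and it assumes the downloaded wheels sit in the given directory. Unlike `grep dll`, it matches only members whose names end in `.dll`.

```python
import zipfile
from pathlib import Path


def list_dlls_in_wheels(wheel_dir="."):
    """Map each wheel file name in wheel_dir to the .dll members it contains."""
    result = {}
    for whl in sorted(Path(wheel_dir).glob("*.whl")):
        # A wheel is a plain zip archive, so zipfile can read it directly.
        with zipfile.ZipFile(whl) as zf:
            result[whl.name] = [
                name for name in zf.namelist() if name.lower().endswith(".dll")
            ]
    return result


if __name__ == "__main__":
    for wheel, dlls in list_dlls_in_wheels().items():
        for dll in dlls:
            print(wheel, dll)
```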

@rwgk
Collaborator

rwgk commented Feb 8, 2025

/ok to test

@rwgk
Collaborator

rwgk commented Feb 8, 2025

/ok to test

rwgk added 3 commits February 10, 2025 23:18

…VMIR_BITCODE_STATIC dict.
@rwgk
Collaborator

rwgk commented Feb 11, 2025

/ok to test

rwgk previously approved these changes Feb 11, 2025
Collaborator

@rwgk rwgk left a comment

This is ready for merging, too. It's in sync with #443.

@leofang
Member Author

leofang commented Feb 11, 2025

/ok to test

@leofang leofang enabled auto-merge February 11, 2025 21:26
@leofang
Member Author

leofang commented Feb 11, 2025

Thanks, Ralf! Turns out I can't approve PRs authored by me, but I set auto-merge.

@leofang leofang merged commit 9650c4b into NVIDIA:11.8.x Feb 11, 2025
38 checks passed
Labels
cuda.bindings (Everything related to the cuda.bindings module), feature (New feature or request), P0 (High priority - Must do!)

2 participants