Add nvvm bindings #421

Merged
merged 52 commits into from
Feb 8, 2025

Conversation

Collaborator

@rwgk rwgk commented Jan 27, 2025

Closes #99

Contributor

copy-pr-bot bot commented Jan 27, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Collaborator Author

rwgk commented Jan 27, 2025

/ok to test

@leofang leofang added this to the cuda-bindings parking lot milestone Jan 27, 2025
@leofang leofang added P0 High priority - Must do! feature New feature or request cuda.bindings Everything related to the cuda.bindings module labels Jan 30, 2025
Member

leofang commented Jan 30, 2025

I'll fix the CI failures. There are some changes needed in both the code and the CI.

Contributor

copy-pr-bot bot commented Jan 30, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Member

leofang commented Jan 30, 2025

I think nvvm_windows.pyx is not quite right, but let's test Linux first

Collaborator Author

rwgk commented Feb 5, 2025

/ok to test

Collaborator Author

rwgk commented Feb 5, 2025

Tests are (very) complete now.

I only have one more TODO item for this PR: Try adding the 11.x headers.

Collaborator Author

rwgk commented Feb 5, 2025

For completeness, this is how I determined that ERROR_INVALID_INPUT (4) is triggered by passing buffer NULL to nvvmAddModuleToProgram():

$ git diff
diff --git a/cuda_bindings/cuda/bindings/nvvm.pyx b/cuda_bindings/cuda/bindings/nvvm.pyx
index cf2a1d5..f747e81 100644
--- a/cuda_bindings/cuda/bindings/nvvm.pyx
+++ b/cuda_bindings/cuda/bindings/nvvm.pyx
@@ -153,6 +153,11 @@ cpdef add_module_to_program(intptr_t prog, buffer, size_t size, name):
         raise TypeError("name must be a Python str")
     cdef bytes _temp_name_ = (<str>name).encode()
     cdef char* _name_ = _temp_name_
+
+    cdef char* data_ptr = <char*> _buffer_.ptrs.data()
+    hex_representation = bytearray(<char[:size]> data_ptr).hex()
+    print(f"Buffer hex: {hex_representation}")
+
     with nogil:
         status = nvvmAddModuleToProgram(<Program>prog, <const char*>(_buffer_.ptrs.data()), size, <const char*>_name_)
     check_status(status)
$ pytest -s -v tests/test_nvvm.py
============================================================== test session starts ===============================================================
platform linux -- Python 3.12.8, pytest-8.3.4, pluggy-1.5.0 -- /home/rgrossekunst/forked/cuda-python/cuda_bindings/cbcv12_8Venv/bin/python
cachedir: .pytest_cache
benchmark: 5.1.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/rgrossekunst/forked/cuda-python/cuda_bindings/tests
configfile: pytest.ini
plugins: benchmark-5.1.0
collecting ... 
collected 9 items                                                                                                                                

tests/test_nvvm.py::test_nvvm_version PASSED
tests/test_nvvm.py::test_nvvm_ir_version PASSED
tests/test_nvvm.py::test_create_and_destroy PASSED
tests/test_nvvm.py::test_add_module_to_program FAILED
tests/test_nvvm.py::test_lazy_add_module_to_program PASSED
tests/test_nvvm.py::test_compile_program PASSED
tests/test_nvvm.py::test_verify_program PASSED
tests/test_nvvm.py::test_get_compiled_result PASSED
tests/test_nvvm.py::test_get_program_log PASSED

==================================================================== FAILURES ====================================================================
___________________________________________________________ test_add_module_to_program ___________________________________________________________

    def test_add_module_to_program():
        prog = nvvm.create_program()
        try:
            with pytest.raises(nvvm.nvvmError, match=r"^ERROR_INVALID_INPUT \(4\)$"):
>               nvvm.add_module_to_program(prog, [], 0, "SomeName")

tests/test_nvvm.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
cuda/bindings/nvvm.pyx:133: in cuda.bindings.nvvm.add_module_to_program
    cpdef add_module_to_program(intptr_t prog, buffer, size_t size, name):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   hex_representation = bytearray(<char[:size]> data_ptr).hex()
E   ValueError: Cannot create cython.array from NULL pointer

cuda/bindings/nvvm.pyx:158: ValueError
============================================================ short test summary info =============================================================
FAILED tests/test_nvvm.py::test_add_module_to_program - ValueError: Cannot create cython.array from NULL pointer
========================================================== 1 failed, 8 passed in 0.02s ===========================================================
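
The hand-escaped pattern in the test above (`\(4\)`) can be generated rather than typed. A minimal sketch, assuming a helper along the lines of the `match_exact` named in this PR's commit list; the body below is a guess at its intent, not the code from `tests/test_nvvm.py`:

```python
import re


def match_exact(s):
    """Return a regex that matches the string `s` exactly (hypothetical helper)."""
    # re.escape() backslash-escapes regex metacharacters such as "(" and ")".
    return "^" + re.escape(s) + "$"


pattern = match_exact("ERROR_INVALID_INPUT (4)")
# pytest.raises(..., match=pattern) applies re.search() to str(exception).
assert re.search(pattern, "ERROR_INVALID_INPUT (4)")
assert not re.search(pattern, "ERROR_INVALID_INPUT (40)")
```

Anchoring with `^` and `$` keeps a longer message that merely contains the expected text from passing.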

rwgk added 8 commits February 5, 2025 10:50

Fix get_nvvm_dso_version_suffix() to match actual version numbers:

./11.0.3_450.51.06/cuda_nvcc/nvvm/lib64/libnvvm.so.3.3.0
./11.1.1_455.32.00/cuda_nvcc/nvvm/lib64/libnvvm.so.3.3.0
./11.2.2_460.32.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.3.1_465.19.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.4.4_470.82.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.5.1_495.29.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.6.2_510.47.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.7.1_515.65.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.8.0_520.61.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.0.1_525.85.12/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.1.1_530.30.02/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.2.2_535.104.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.3.2_545.23.08/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.4.1_550.54.15/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.5.1_555.42.06/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.6.2_560.35.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.8.0_570.86.10/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0

For completeness, since the nvjitlink code is touched in this commit, these are the libnvJitLink version numbers:

./12.0.1_525.85.12/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.0.140
./12.1.1_530.30.02/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.1.105
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.2.140
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.3.101
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.4.127
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.5.82
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.6.77
./12.8.0_570.86.10/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.8.61
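
The libnvvm listing above implies a simple rule: the SO major version is 3 for CTK 11.0 and 11.1, and 4 from CTK 11.2 through 12.8. A sketch of that mapping follows. The function name and signature are hypothetical; the real `get_nvvm_dso_version_suffix()` is not shown in this conversation and may differ.

```python
def nvvm_so_major_for_ctk(ctk_major, ctk_minor):
    """Map a CUDA Toolkit version to the libnvvm SO major version.

    Hypothetical helper; it only covers CTK 11.0 through 12.8, the range
    spanned by the listing (12.7 is not listed and is assumed to follow suit).
    """
    version = (ctk_major, ctk_minor)
    if not ((11, 0) <= version <= (12, 8)):
        raise ValueError(f"CTK {ctk_major}.{ctk_minor} is outside the listed range")
    # 11.0.x and 11.1.x ship libnvvm.so.3.3.0; 11.2 through 12.8 ship libnvvm.so.4.0.0.
    return 3 if version < (11, 2) else 4


assert nvvm_so_major_for_ctk(11, 1) == 3
assert nvvm_so_major_for_ctk(11, 2) == 4
```

Raising on versions outside the listed range avoids silently guessing a suffix for toolkits that were not inspected.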
Collaborator Author

rwgk commented Feb 7, 2025

@leofang I backtracked for the moment and added be55676 based on our discussion yesterday. That extra rpath

  • helps if I pip install cuda-bindings into the conda env
  • but does not help if cuda-bindings is pip-installed into a separate venv

What I had under b45bac2 (reverted at the moment) fixes that.

I want to enumerate the situations that we may encounter, to then 1. decide what we want to accommodate, and 2. find corresponding solutions.
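
For reference, the commit list for this PR names b45bac2 as a `find_libnvvm_so_via_proc_self_maps()` proof of concept. The sketch below illustrates only the general Linux `/proc/self/maps` technique of asking the running process which shared objects it has mapped. It is not the reverted code, the function names are hypothetical, and how the proof of concept used the result is not shown in this conversation.

```python
import os


def find_mapped_libs(maps_text, needle):
    """Return unique file paths from /proc/<pid>/maps text whose basename contains `needle`."""
    found = []
    for line in maps_text.splitlines():
        # Line format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if needle in os.path.basename(path) and path not in found:
            found.append(path)
    return found


def find_loaded_libnvvm():
    """Linux only: list libnvvm shared objects mapped into the current process."""
    with open("/proc/self/maps") as f:
        return find_mapped_libs(f.read(), "libnvvm.so")
```

Keeping the parser separate from the file read makes it testable on any platform with canned input.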

Collaborator Author

rwgk commented Feb 7, 2025

/ok to test

Member

@leofang leofang left a comment

Looks great!

@rwgk rwgk merged commit 2981bfd into NVIDIA:main Feb 8, 2025
69 checks passed

github-actions bot commented Feb 8, 2025

Backport failed for 11.8.x, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 11.8.x
git worktree add -d .worktree/backport-421-to-11.8.x origin/11.8.x
cd .worktree/backport-421-to-11.8.x
git switch --create backport-421-to-11.8.x
git cherry-pick -x 2981bfd875a0576283fb54130d7b52f29071531c

@rwgk rwgk deleted the nvvm_bindings branch February 8, 2025 03:30
leofang pushed a commit to leofang/cuda-python that referenced this pull request Feb 8, 2025
* Add nvvm to setup.py

* Add test_nvvm.py

* test_nvvm.py version(), ir_version()

* Snapshot of generated files.

* Add in `nvvm.create_program()`

* Add in `nvvm.destroy_program()`

* Add in `nvvm.compile_program()`

* Add in add_module_to_program()

* Add in verify_program()

* Add in lazy_add_module_to_program()

* Add in get_compiled_result_size(), get_program_log_size()

* Add in get_compiled_result(), get_program_log()

* Change Copyright dates to 2025

* Use cybind results "automatically generated across versions from 12.0.1 to 12.8.0."

* update to use NVKS runners

* Add tests/run_simple.py

* update fetch_ctk to find nvvm shared lib

* fix wheel rel path

* add nvcc wheel to [all]

* Fix cybind bindings for add_module_to_program(), lazy_add_module_to_program()

* Add test_with_minimal_nnvm_ir()

* Remove tests/run_simple.py

* Update cuda_bindings/cuda/bindings/_internal/nvvm_windows.pyx

Co-authored-by: Leo Fang <[email protected]>

* Update cuda_bindings/cuda/bindings/_internal/nvvm_windows.pyx

Co-authored-by: Leo Fang <[email protected]>

* Remove stray `f` (it is now a plain string, not an f-string anymore)

* Add bootstrap_local_dev.sh script.

* Fix nvvm.compile_program() failure for CUDA version 12.0

The original datalayout lacked explicit alignment and size definitions for i1, i8, i16, f32, f64, v64, and v128.

The missing types are crucial for LLVM-based compilation in CUDA 12.0.

Later CUDA versions are more forgiving, but 12.0 enforces a stricter layout. The stricter layout should resolve the issue for CUDA 12.0 without breaking compatibility with later versions.

* Add test_verify_program_with_minimal_nnvm_ir() and rename some tests for clarity.

* Complete test coverage.

* Introduce noregex() to reduce backslash clutter.

* Use a contextmanager to replace repeated try-finally.

* Rename noregex to match_exact

* Introduce get_program_log() helper.

* Improve nvvm_program() Context Manager

* Remove redundant "utf-8"

* Also test with NVVM Bitcode (using a new pytest fixture).

* Introduce compile_or_verify fixture.

* Remove bootstrap_local_dev.sh, to be moved to a separate PR.

* Update from codegen after config fix.

* Update from codegen after config fix.

* Update from codegen after adding CTK 11.x nvvm.h headers. Functional NO-OP.

* Fix get_nvvm_dso_version_suffix() to match actual version numbers:

./11.0.3_450.51.06/cuda_nvcc/nvvm/lib64/libnvvm.so.3.3.0
./11.1.1_455.32.00/cuda_nvcc/nvvm/lib64/libnvvm.so.3.3.0
./11.2.2_460.32.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.3.1_465.19.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.4.4_470.82.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.5.1_495.29.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.6.2_510.47.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.7.1_515.65.01/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./11.8.0_520.61.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.0.1_525.85.12/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.1.1_530.30.02/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.2.2_535.104.05/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.3.2_545.23.08/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.4.1_550.54.15/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.5.1_555.42.06/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.6.2_560.35.03/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0
./12.8.0_570.86.10/cuda_nvcc/nvvm/lib64/libnvvm.so.4.0.0

For completeness, since the nvjitlink code is touched in this commit, these are the libnvJitLink version numbers:

./12.0.1_525.85.12/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.0.140
./12.1.1_530.30.02/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.1.105
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.2.140
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.3.101
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.4.127
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.5.82
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.6.77
./12.8.0_570.86.10/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.8.61

* find_libnvvm_so_via_proc_self_maps() Proof Of Concept

* Revert "find_libnvvm_so_via_proc_self_maps() Proof Of Concept"

This reverts commit b45bac2.

* Add another rpath for finding libnvvm.so

---------

(cherry picked from commit 2981bfd)
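
Several of the commits above concern test ergonomics, notably replacing repeated try/finally blocks with a context manager. A minimal sketch of that pattern follows, assuming only that `nvvm.create_program()` and `nvvm.destroy_program()` pair up as the commit titles suggest. The `create`/`destroy` parameters and the stand-in backend are illustrative, not the test suite's actual `nvvm_program()`.

```python
from contextlib import contextmanager


@contextmanager
def managed_program(create, destroy):
    """Yield a program handle and always destroy it afterwards (illustrative)."""
    prog = create()
    try:
        yield prog
    finally:
        # Runs on normal exit and when the with-body raises.
        destroy(prog)


# Stand-in backend so the sketch runs without libnvvm installed.
events = []
with managed_program(lambda: events.append("create") or 1,
                     lambda p: events.append(f"destroy {p}")) as prog:
    events.append(f"use {prog}")
assert events == ["create", "use 1", "destroy 1"]
```

With the real bindings the call would presumably be `managed_program(nvvm.create_program, nvvm.destroy_program)`; whether `destroy_program` accepts the handle in exactly that form is an assumption here.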

github-actions bot commented Feb 8, 2025

Doc Preview CI
Preview removed because the pull request was closed or merged.

Member

leofang commented Feb 8, 2025

@rwgk we forgot about adding docs, would you take care of it?

Labels
cuda.bindings Everything related to the cuda.bindings module feature New feature or request P0 High priority - Must do! to-be-backported Trigger the bot to raise a backport PR upon merge
Development

Successfully merging this pull request may close these issues.

Add libnvvm bindings
3 participants