-
Notifications
You must be signed in to change notification settings - Fork 1.8k
fix: bindings unit tests for nanobind #6221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Linda-Stadter <[email protected]>
WalkthroughThis update refactors nanobind and pybind11 Python bindings, focusing on improved type handling, error reporting, and state management. It introduces custom pickling for nested vectors, enhances dtype support for tensors, removes deprecated utilities, and updates test suites to enable more comprehensive coverage for nanobind bindings. Changes
Sequence Diagram(s)sequenceDiagram
participant Python
participant NanobindBindings
participant CppClass
participant CustomCaster
Python->>NanobindBindings: Call constructor or unpickle object
NanobindBindings->>CppClass: Use placement new for in-place construction
Python->>NanobindBindings: Pass numpy/torch object as managed weight
NanobindBindings->>CustomCaster: Convert object to tensor (supports more dtypes)
CustomCaster-->>NanobindBindings: Return tensor
NanobindBindings-->>CppClass: Pass tensor to executor
Estimated code review effort4 (~90 minutes) Possibly related PRs
Suggested labels
Suggested reviewers
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
/bot run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/unittest/bindings/test_executor_bindings.py (1)
70-102
: Simplify the boolean assertion.The test properly validates executor construction with various dtypes. However, the assertion can be simplified.
- assert executor.can_enqueue_requests() == True + assert executor.can_enqueue_requests()
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
cpp/tensorrt_llm/nanobind/batch_manager/bindings.cpp
(1 hunks)cpp/tensorrt_llm/nanobind/batch_manager/kvCacheManager.cpp
(2 hunks)cpp/tensorrt_llm/nanobind/bindings.cpp
(2 hunks)cpp/tensorrt_llm/nanobind/common/bindTypes.h
(2 hunks)cpp/tensorrt_llm/nanobind/common/customCasters.h
(2 hunks)cpp/tensorrt_llm/nanobind/executor/executor.cpp
(3 hunks)cpp/tensorrt_llm/nanobind/executor/request.cpp
(7 hunks)cpp/tensorrt_llm/pybind/bindings.cpp
(1 hunks)tests/unittest/bindings/test_bindings_ut.py
(0 hunks)tests/unittest/bindings/test_executor_bindings.py
(4 hunks)
💤 Files with no reviewable changes (1)
- tests/unittest/bindings/test_bindings_ut.py
🧰 Additional context used
🪛 Ruff (0.12.2)
tests/unittest/bindings/test_executor_bindings.py
100-100: Avoid equality comparisons to True
; use executor.can_enqueue_requests():
for truth checks
Replace with executor.can_enqueue_requests()
(E712)
🔇 Additional comments (22)
cpp/tensorrt_llm/nanobind/batch_manager/bindings.cpp (1)
82-82
: LGTM! Namespace update aligns with nanobind refactoring.The change from
PybindUtils
toNanobindUtils
is consistent with the broader refactoring effort to standardize nanobind utility functions across the codebase.cpp/tensorrt_llm/pybind/bindings.cpp (1)
358-361
: Improved error handling in pybind11 bindings.The explicit runtime check with exception throwing provides clearer error feedback compared to assertions. This change aligns with similar improvements in nanobind bindings and ensures consistent error handling across binding implementations.
cpp/tensorrt_llm/nanobind/bindings.cpp (2)
362-367
: Improved error handling with explicit runtime check.The explicit runtime check with exception throwing provides better error feedback than assertions, consistent with the pybind11 version.
390-390
: Proper use of placement new for nanobind object reconstruction.Using placement new to reconstruct the
SamplingConfig
in place is the correct nanobind pattern for__setstate__
methods, ensuring proper object state restoration without creating new instances.cpp/tensorrt_llm/nanobind/common/bindTypes.h (2)
24-24
: Consistent namespace rename across nanobind utilities.The change from
PybindUtils
toNanobindUtils
properly reflects the nanobind-specific nature of these utilities and maintains consistency across the codebase.
63-63
: Proper nanobind pattern for in-place object reconstruction.Using placement new
new (&v) T(s)
in the__setstate__
method is the correct nanobind approach for reconstructing objects in place during deserialization.cpp/tensorrt_llm/nanobind/batch_manager/kvCacheManager.cpp (2)
51-53
: Clean approach using type alias and opaque marking.Creating a
CacheBlockIds
type alias and marking it opaque is a clean approach that allows for custom nanobind implementation while avoiding direct binding of complex nested vector types.
430-438
: Proper pickling implementation with consistent error handling.The custom
__getstate__
and__setstate__
methods follow established nanobind patterns, with proper runtime error checking and placement new for in-place reconstruction. This adds explicit serialization support for the nested vector type.tests/unittest/bindings/test_executor_bindings.py (2)
17-19
: LGTM!The import changes align with the removal of nanobind-specific test skips and add the necessary utility for tensor conversion.
1200-1221
: LGTM!The additions properly test the serialization/deserialization of the new
ContextPhaseParams
class, ensuring all fields are preserved through the pickle/unpickle cycle.cpp/tensorrt_llm/nanobind/executor/executor.cpp (2)
55-109
: Well-designed refactoring for enhanced dtype support.The refactoring from
nb::ndarray<nb::numpy>
to genericnb::object
with__array_interface__
protocol is a good design choice that:
- Supports a wider range of array-like objects
- Enables FP8 and BF16 support through metadata inspection
- Follows standard Python array protocol for better interoperability
140-141
: LGTM!The casting to
nb::object
properly adapts to the refactorednumpyToTensor
function signature.cpp/tensorrt_llm/nanobind/common/customCasters.h (2)
244-263
: Good simplification of tensor conversion logic.The removal of DLPack conversion in favor of direct PyTorch tensor handling reduces complexity and improves code clarity.
265-290
: LGTM!The one-way type caster for vectors of const references is well-implemented for its intended use case. The approach of copying to a regular vector for Python conversion is appropriate for const references.
cpp/tensorrt_llm/nanobind/executor/request.cpp (8)
448-453
: LGTM: Improved conditional construction logic.The explicit
else
block ensures proper object construction in both conditional paths, preventing potential issues with multiple construction calls or uninitialized objects.
457-476
: LGTM: Proper nanobind constructor pattern.The change from returning
std::unique_ptr
to using placement new onself
follows nanobind best practices for in-place object construction. The explicit argument names and default values also improve readability.
499-499
: LGTM: Consistent parameter naming.Renaming the parameter from
eagleConfig
toself
improves consistency across all__setstate__
lambdas in the file and follows common nanobind conventions.Also applies to: 505-506
535-535
: LGTM: Consistent parameter naming.Same consistency improvement as other
__setstate__
lambdas - renaming parameter toself
for better uniformity across the codebase.Also applies to: 541-541
566-566
: LGTM: Consistent parameter naming.Continuing the pattern of renaming
__setstate__
lambda parameters toself
for consistency across the binding definitions.Also applies to: 572-572
810-810
: LGTM: Consistent parameter naming.Final parameter renaming to complete the consistency pattern across all
__setstate__
lambdas in the file.Also applies to: 816-816
837-837
: LGTM: Simplified constructor binding.Removing the lambda wrapper and using direct
nb::init
call is a good simplification that reduces unnecessary complexity.
841-841
: LGTM: Alternative construction pattern for complex object.The Result
__setstate__
uses a different pattern - creating a local object, assigning fields, then using placement new. This approach may be necessary for the Result class's specific construction requirements and ensures all fields are properly initialized.Also applies to: 847-847, 861-861
PR_Github #12454 [ run ] triggered by Bot |
PR_Github #12454 [ run ] completed with state |
Signed-off-by: Linda-Stadter <[email protected]> Signed-off-by: Yang Li <[email protected]>
Signed-off-by: Linda-Stadter <[email protected]> Signed-off-by: Shreyas Misra <[email protected]>
Signed-off-by: Linda-Stadter <[email protected]> Signed-off-by: Ransiki Zhang <[email protected]>
Fix of failing bindings unit tests for nanobind
This change fixes errors in the nanobind bindings which caused unit tests in
tests/unittest/bindings
to fail. These tests were previously skipped with@pytest.mark.skipif(_tb.binding_type == "nanobind")
and all of them are enabled in this PR.Executing
pytest tests/unittest/bindings
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...
Provide a user friendly way for developers to interact with a Jenkins server.
Run
/bot [-h|--help]
to print this help message.See details below for each supported subcommand.
run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]
Launch build/test pipelines. All previously running jobs will be killed.
--reuse-test (optional)pipeline-id
(OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.--disable-reuse-test
(OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.--disable-fail-fast
(OPTIONAL) : Disable fail fast on build/tests/infra failures.--skip-test
(OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.--stage-list "A10-PyTorch-1, xxx"
(OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.--gpu-type "A30, H100_PCIe"
(OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.--test-backend "pytorch, cpp"
(OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.--only-multi-gpu-test
(OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.--disable-multi-gpu-test
(OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.--add-multi-gpu-test
(OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.--post-merge
(OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx"
(OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".--detailed-log
(OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.--debug
(OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in thestage-list
parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.For guidance on mapping tests to stage names, see
docs/source/reference/ci-overview.md
and the
scripts/test_to_stage_mapping.py
helper.kill
kill
Kill all running builds associated with pull request.
skip
skip --comment COMMENT
Skip testing for latest commit on pull request.
--comment "Reason for skipping build/test"
is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.reuse-pipeline
reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests
Chores