
Conversation

@litaotju (Collaborator) commented on Sep 17, 2025

Reverts #7768

Summary by CodeRabbit

  • Refactor

    • Unified distributed all-reduce behavior across hardware by removing strategy selection and related configuration paths.
    • Initialization now relies solely on topology mapping, providing consistent behavior across environments.
    • May affect setups that previously depended on architecture-specific strategies.
  • Chores

    • Cleaned up legacy comments and removed unused references related to the old strategy path.

@litaotju requested a review from a team as a code owner on September 17, 2025 at 14:45

coderabbitai bot commented Sep 17, 2025

Caution

Review failed: the pull request is closed.

📝 Walkthrough

Removed dynamic AllReduce strategy selection and the related imports. AllReduce is now initialized solely with mapping=model_config.mapping in the Llama model code. The SM-version-based conditional path was deleted, along with the references to AllReduceStrategy.

Changes

Cohort / File(s): AllReduce strategy simplification (tensorrt_llm/_torch/models/modeling_llama.py)

Summary of changes:
  - Removed the import of AllReduceStrategy.
  - Deleted the SM-version-based strategy selection logic and the pre-Blackwell comment.
  - Simplified AllReduce construction to AllReduce(mapping=...), dropping the strategy= argument.
  - Kept AllReduceParams, MoEAllReduce, and fusion op imports/usage otherwise intact.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor App as Llama Decoder Layer
  participant Util as get_sm_version()
  participant Dist as AllReduce

  rect rgba(230,230,255,0.4)
  note over App,Dist: Previous flow (removed)
  App->>Util: Query SM version
  Util-->>App: SM version
  App->>App: Choose AllReduce strategy (e.g., NCCL vs others)
  App->>Dist: AllReduce(strategy=..., mapping=...)
  Dist-->>App: Instance
  end

sequenceDiagram
  autonumber
  actor App as Llama Decoder Layer
  participant Dist as AllReduce

  rect rgba(230,255,230,0.4)
  note over App,Dist: New flow
  App->>Dist: AllReduce(mapping=...)
  Dist-->>App: Instance
  end
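
For readers skimming the diff, here is a minimal, self-contained sketch of the before/after shape of this change. All names below (Mapping, AllReduce, AllReduceStrategy, get_sm_version, the SM threshold, and the strategy chosen on pre-Blackwell parts) are illustrative stand-ins, not the actual TensorRT-LLM code in modeling_llama.py:

```python
# Minimal sketch only: stand-in types mimic the shape of the change; the real
# AllReduce, Mapping, and get_sm_version live inside TensorRT-LLM and differ.
from dataclasses import dataclass
from enum import Enum, auto


class AllReduceStrategy(Enum):        # stand-in for the removed import
    AUTO = auto()
    NCCL = auto()


@dataclass
class Mapping:                        # stand-in for model_config.mapping (topology info)
    tp_size: int = 1


@dataclass
class AllReduce:                      # stand-in for the distributed all-reduce op
    mapping: Mapping
    strategy: AllReduceStrategy = AllReduceStrategy.AUTO


def get_sm_version() -> int:          # stand-in utility; pretend a pre-Blackwell GPU
    return 90


mapping = Mapping(tp_size=8)

# Previous flow (removed): choose the strategy from the SM version.
# The exact threshold and the NCCL choice are assumptions for illustration.
strategy = AllReduceStrategy.NCCL if get_sm_version() < 100 else AllReduceStrategy.AUTO
old_style = AllReduce(mapping=mapping, strategy=strategy)

# New flow: rely solely on the topology mapping; no strategy argument.
new_style = AllReduce(mapping=mapping)

print(old_style.strategy, new_style.strategy)
```

The practical consequence, called out in the summary above, is that setups which previously received an architecture-specific strategy (for example, a pre-Blackwell override) now fall back to whatever AllReduce does by default for the given mapping.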

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


📜 Recent review details

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22c120e and 59c6190.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/models/modeling_llama.py (2 hunks)

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Up to 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@litaotju closed this on Sep 17, 2025
@chzblych deleted the revert-7768-fix/llama3_allreduce_strategy branch on September 18, 2025 at 01:02