
Conversation

Contributor

@avivhalfon avivhalfon commented Jul 22, 2025

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Important

Added vendor detection for LangChain models to enhance tracing spans with vendor-specific information.

  • Behavior:
    • Added detect_vendor_from_class() in vendor_detection.py to identify model vendors (e.g., OpenAI, Azure) from class names (see the sketch after this list).
    • Updated _create_llm_span() in callback_handler.py to set SpanAttributes.LLM_SYSTEM to the detected vendor.
  • Metrics:
    • Updated token usage and duration metrics in callback_handler.py and span_utils.py to use dynamic vendor names.
  • Tests:
    • Modified tests in test_langchain_metrics.py and test_llms.py to check for correct vendor labeling in spans and metrics.
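
The following is a minimal sketch of that detection flow, pieced together from the snippets quoted in the reviews below; only the Azure and OpenAI rules are shown, and the real vendor_detection.py covers more vendors, so treat the details as assumptions rather than the exact implementation.

from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class VendorRule:
    exact_matches: Set[str]
    patterns: List[str]
    vendor_name: str

    def matches(self, class_name: str) -> bool:
        # Exact class-name hit first, then case-insensitive substring patterns.
        if class_name in self.exact_matches:
            return True
        class_lower = class_name.lower()
        return any(pattern in class_lower for pattern in self.patterns)


# Order matters: Azure must precede OpenAI because Azure class names contain "openai".
_RULES = [
    VendorRule({"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"}, ["azure"], "Azure"),
    VendorRule({"ChatOpenAI", "OpenAI", "OpenAIEmbeddings"}, ["openai"], "openai"),
]


def detect_vendor_from_class(class_name: str) -> str:
    if not class_name:
        return "Langchain"  # fallback when no class name is available
    for rule in _RULES:
        if rule.matches(class_name):
            return rule.vendor_name
    return "Langchain"

For example, detect_vendor_from_class("AzureChatOpenAI") returns "Azure", while an unknown class name falls back to "Langchain".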

This description was created by Ellipsis for ccbbf67. You can customize this summary. It will automatically update as commits are pushed.


Summary by CodeRabbit

  • New Features

    • Enhanced vendor detection for LangChain models, enabling automatic identification and labeling of model vendors (e.g., OpenAI, Azure, AWS, Google) in tracing spans.
  • Improvements

    • Tracing spans now dynamically reflect the detected vendor name, offering clearer insights during monitoring and debugging of model interactions.
    • Added verification of vendor labeling in telemetry data across multiple model integrations to ensure accurate tracing information.
    • Metrics and token usage reporting updated to use dynamic vendor names instead of a fixed label, improving accuracy in monitoring data.


coderabbitai bot commented Jul 22, 2025

Walkthrough

Two helper functions were added to extract a class name from serialized model data and detect the vendor from that class name. The span creation method was updated to accept serialized data and set the vendor dynamically instead of a fixed string. Calls to span creation and span attribute usage were modified accordingly. Tests were enhanced to assert the correct vendor attribute in spans. A new module was introduced to encapsulate vendor detection logic based on class name patterns.

Changes

File(s) and change summary:
  • .../opentelemetry/instrumentation/langchain/callback_handler.py: Added _extract_class_name_from_serialized; updated _create_llm_span signature and logic to accept serialized data; modified calls in on_chat_model_start and on_llm_start to pass serialized data; replaced hardcoded vendor strings with dynamic vendor retrieval in on_llm_end.
  • .../opentelemetry/instrumentation/langchain/vendor_detection.py: Added VendorRule dataclass and detect_vendor_from_class function implementing vendor detection logic based on class name exact matches and substring patterns.
  • .../opentelemetry/instrumentation/langchain/span_utils.py: Modified set_chat_response_usage to retrieve vendor from span attributes instead of hardcoding "Langchain".
  • .../opentelemetry-instrumentation-langchain/tests/test_llms.py: Added assertions verifying SpanAttributes.LLM_SYSTEM matches expected vendor strings in spans for multiple LLM test cases including HuggingFace, openai, Anthropic, and AWS.
  • .../opentelemetry-instrumentation-langchain/tests/metrics/test_langchain_metrics.py: Updated test assertions to expect SpanAttributes.LLM_SYSTEM attribute value "openai" instead of "Langchain" in metrics tests.
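
To make the callback_handler.py row concrete, here is a hedged sketch of how the extraction helper and span creation could fit together; the helper names come from the summary above, but the wrapper function and import paths are illustrative assumptions, not the PR's exact code.

from typing import Any, Optional

from opentelemetry.semconv_ai import SpanAttributes  # assumed import path
from opentelemetry.trace import Span

from opentelemetry.instrumentation.langchain.vendor_detection import detect_vendor_from_class


def _extract_class_name_from_serialized(serialized: Optional[dict]) -> str:
    """Return the model class name, i.e. the last element of the serialized "id" list."""
    class_id = (serialized or {}).get("id", [])
    if isinstance(class_id, list) and class_id:
        return class_id[-1]
    if class_id:
        return str(class_id)
    return ""


def _set_llm_system(span: Span, serialized: Optional[dict[str, Any]]) -> str:
    """Hypothetical helper showing the step _create_llm_span performs: detect and record the vendor."""
    vendor = detect_vendor_from_class(_extract_class_name_from_serialized(serialized))
    span.set_attribute(SpanAttributes.LLM_SYSTEM, vendor)
    return vendor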

Estimated code review effort

2 (~20 minutes)


Poem

In the garden of code, a new sprout appears,
Detecting the vendor with clever new gears.
Spans now wear badges—OpenAI, Azure, or more—
No longer just Langchain, as they were before.
🐇 With a twitch of my nose, I hop in delight,
For code grows more clever in the soft moonlight!


Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed everything up to acd97da in 2 minutes and 8 seconds. Click for details.
  • Reviewed 109 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:336
  • Draft comment:
    Good use of the vendor extraction in _create_llm_span via 'serialized or {}'. As an enhancement, consider logging or alerting when the fallback vendor ('Langchain') is used, to help diagnose cases where the vendor info might be missing.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While the suggestion could help with debugging, it violates several review rules: 1) It's not pointing out a clear code change requirement 2) It's more of an enhancement suggestion than a critical issue 3) It's speculative - assuming that fallback cases need monitoring without evidence of actual problems. The current code handles the fallback gracefully and consistently. The suggestion could genuinely help identify misconfigured or unexpected LLM implementations. Logging fallbacks could provide valuable operational insights. While logging could be helpful, the comment doesn't identify an actual problem with the current implementation. The fallback is an intentional design choice that works correctly. The comment should be deleted as it suggests an optional enhancement rather than identifying a clear issue requiring change. The current fallback behavior is working as designed.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:79
  • Draft comment:
    Typographical note: The fallback vendor string 'Langchain' has inconsistent capitalization compared to 'LangChain' in the docstring. Consider standardizing the casing for consistency.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_nT4rjSX8wlqnqX0x

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)

549-549: Fix inconsistent vendor attribution in metrics.

The metrics recording still uses hardcoded "Langchain" while spans now use dynamic vendor detection. This creates inconsistency between traces and metrics.

Consider storing the detected vendor in the SpanHolder during span creation and using it for metrics:

# In _create_llm_span method, after line 338:
+        # Store vendor for metrics
+        self.spans[run_id].vendor = vendor

# In on_llm_end method:
-                        SpanAttributes.LLM_SYSTEM: "Langchain",
+                        SpanAttributes.LLM_SYSTEM: getattr(self.spans[run_id], 'vendor', 'Langchain'),

You'll also need to update the SpanHolder class to include a vendor field.

Also applies to: 559-559, 576-576
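
A hypothetical illustration of that suggestion; the real SpanHolder has more fields than shown here, and everything except the added vendor field is a placeholder.

from dataclasses import dataclass

from opentelemetry.trace import Span


@dataclass
class SpanHolder:
    span: Span
    # ...existing fields elided...
    vendor: str = "Langchain"  # new field: set in _create_llm_span, reused for metrics in on_llm_end


# In on_llm_end, the metric attributes could then read:
#     SpanAttributes.LLM_SYSTEM: self.spans[run_id].vendor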

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)

66-124: LGTM with minor suggestions for robustness.

The vendor detection logic is comprehensive and well-structured. Consider these improvements:

  1. Handle edge case: Add type checking for class_id to prevent potential TypeError if it's neither a list nor convertible to string.
  2. Performance: Consider using a dictionary lookup for exact matches before iterating through patterns.
 def _detect_vendor_from_class(serialized: dict[str, Any]) -> str:
     """
     Detect vendor from LangChain model class name.
     Uses unified detection rules combining exact matches and patterns.
 
     Args:
         serialized: Serialized model information from LangChain callback
     
     Returns:
         Vendor string (e.g., "openai", "Azure", "AWS", "Google", etc.)
     """
     class_id = serialized.get("id", [])
     if not class_id:
         return "Langchain"  # Fallback
     
     # Get the class name (last element in id array)
-    class_name = class_id[-1] if isinstance(class_id, list) else str(class_id)
+    try:
+        class_name = class_id[-1] if isinstance(class_id, list) else str(class_id)
+    except (TypeError, IndexError):
+        return "Langchain"  # Fallback for invalid class_id structure
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6704f15 and acd97da.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (5 hunks)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)

314-341: Excellent implementation of dynamic vendor detection.

The method signature update and vendor detection integration are well implemented:

  • Maintains backward compatibility with optional serialized parameter
  • Proper fallback to empty dict when serialized is None
  • Clean integration with the new _detect_vendor_from_class function

451-451: Correct integration of vendor detection in method calls.

Both method calls properly pass the serialized parameter to enable dynamic vendor detection during span creation.

Also applies to: 477-477

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 3139236 in 2 minutes and 31 seconds. Click for details.
  • Reviewed 95 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:116
  • Draft comment:
    Consider precomputing class_name.lower() once before the loop instead of inside each iteration to improve performance.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% This is a vendor detection function that likely runs infrequently and with a small number of iterations. The .lower() operation is only performed if the exact match fails. Moving it outside would save a few string operations but could make the code less readable by separating the transformation from where it's used. The optimization impact would be negligible in practice. The comment does identify a valid micro-optimization opportunity. In some contexts, reducing redundant string operations could be important for performance. While technically correct, this optimization would have negligible real-world impact given the function's usage pattern and small loop size. The current code organization keeps related operations together which aids readability. The comment should be deleted as it suggests a micro-optimization that would have negligible benefit while potentially making the code less readable.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:84
  • Draft comment:
    Vendor strings have mixed casing (e.g., 'Azure' vs 'openai'). Ensure this inconsistency is intentional for downstream processing.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_jWVovKRxKumv52FD

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed 2c973e2 in 1 minute and 45 seconds. Click for details.
  • Reviewed 51 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:66
  • Draft comment:
    Refactored _detect_vendor_from_class now takes a class name (string) instead of a serialized dict. Consider trimming or validating the class_name (e.g., removing extra whitespace) to ensure robust vendor matching.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment has two parts: 1) Noting the refactoring change - this is just informative and not actionable. 2) Suggesting input validation - while this could be actionable, the function already handles empty/None inputs on line 77 with "if not class_name:", so additional validation isn't really needed. The suggestion to trim whitespace could be useful but isn't critical since the class names appear to come from internal code rather than user input. The function already has basic input validation. Could there be other edge cases I'm missing? What if class_name contains invalid characters or is malformed in some way? The class names come from LangChain's internal serialization, so they should be well-formed. The existing None/empty check is sufficient for the error cases we care about. Over-validating internal data would add complexity without much benefit. The comment should be deleted since it's partly informational (not actionable) and partly suggesting validation that isn't really needed given the existing checks and internal nature of the data.

Workflow ID: wflow_Xy0PlI91W37scpcE

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)

317-344: Fix potential null pointer exception when serialized is None.

The method doesn't handle the case where serialized is None, which will cause a runtime error when calling serialized.get("id", []).

Apply this diff to add null safety:

-        class_id = serialized.get("id", [])
+        class_id = serialized.get("id", []) if serialized else []

Additionally, consider adding more defensive handling:

         # Get the class name (last element in id array)
         if isinstance(class_id, list) and len(class_id) > 0:
             class_name = class_id[-1]
-        elif class_id:
+        elif isinstance(class_id, str) and class_id:
             class_name = str(class_id)
         else:
             class_name = ""

555-585: Consider using detected vendor in metrics for consistency.

The metrics recording still uses hardcoded "Langchain" values while the span attributes now use the detected vendor. This inconsistency could impact observability.

Consider storing the detected vendor in the span holder or retrieving it from the span attributes to use in metrics:

                 self.token_histogram.record(
                     prompt_tokens,
                     attributes={
-                        SpanAttributes.LLM_SYSTEM: "Langchain",
+                        SpanAttributes.LLM_SYSTEM: span.get_attribute(SpanAttributes.LLM_SYSTEM) or "Langchain",
                         SpanAttributes.LLM_TOKEN_TYPE: "input",
                         SpanAttributes.LLM_RESPONSE_MODEL: model_name or "unknown",
                     },
                 )

Note: This assumes a way to retrieve span attributes. If not available, consider storing the vendor in the SpanHolder class or passing it through the method chain.
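
If the span is an SDK span, its recorded attributes are readable via the attributes mapping on ReadableSpan, so one hedged way to realize the suggestion (and roughly what the change summary says span_utils.py ended up doing) looks like this; token_histogram, prompt_tokens, and model_name are taken from the snippet above.

# Sketch only: assumes `span` is an opentelemetry.sdk.trace.ReadableSpan with populated attributes.
vendor = "Langchain"
if getattr(span, "attributes", None):
    vendor = span.attributes.get(SpanAttributes.LLM_SYSTEM, "Langchain")

self.token_histogram.record(
    prompt_tokens,
    attributes={
        SpanAttributes.LLM_SYSTEM: vendor,
        SpanAttributes.LLM_TOKEN_TYPE: "input",
        SpanAttributes.LLM_RESPONSE_MODEL: model_name or "unknown",
    },
)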

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)

66-120: Solid vendor detection implementation with comprehensive coverage.

The function correctly handles edge cases and provides good fallback behavior. The fix for case-sensitivity ordering (exact matches before lowercasing) addresses the previous review comment appropriately.

Consider moving vendor_rules to module level as a constant for better performance since it's recreated on every function call:

+# Vendor detection rules (order matters - most specific first)
+_VENDOR_RULES = [
+    ({"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"}, ["azure"], "Azure"),
+    # ... rest of the rules
+]
+
 def _detect_vendor_from_class(class_name: str) -> str:
     """
     Detect vendor from LangChain model class name.
     Uses unified detection rules combining exact matches and patterns.
 
     Args:
         class_name: The class name extracted from serialized model information
 
     Returns:
         Vendor string
     """
     if not class_name:  # Handles None, empty string, or other falsy values
         return "Langchain"  # Fallback
 
-    # Vendor detection rules (order matters - most specific first)
-    vendor_rules = [
-        ({"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"}, ["azure"], "Azure"),
-        # ... rest of the rules
-    ]
-
-    for exact_matches, patterns, vendor in vendor_rules:
+    for exact_matches, patterns, vendor in _VENDOR_RULES:
         if class_name in exact_matches:
             return vendor
         class_lower = class_name.lower()
         if any(pattern in class_lower for pattern in patterns):
             return vendor
 
     return "Langchain"  # Fallback
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3139236 and 2c973e2.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Test Packages (3.9)
  • GitHub Check: Lint
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Build Packages (3.11)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)

456-458: Correct integration of vendor detection for chat models.

The modification properly passes the serialized parameter to enable vendor detection while maintaining all existing functionality.


482-484: Correct integration of vendor detection for LLM completions.

The modification properly passes the serialized parameter to enable vendor detection for completion requests.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed e844351 in 2 minutes and 6 seconds. Click for details.
  • Reviewed 50 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:146
  • Draft comment:
    Ensure the vendor attribute for HuggingFace is set as intended. The test asserts 'HuggingFace' (with capital H and F); please confirm this casing is consistent with your conventions.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:280
  • Draft comment:
    Verify that the vendor for OpenAI is consistently set as 'openai' (all lowercase). Confirm that this matches your vendor naming conventions.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 40% <= threshold 50% The comment is asking the PR author to confirm their naming conventions, which violates the rule against asking for confirmation of intentions. However, it does suggest a specific naming convention ('openai' in lowercase), which could be useful if the code is inconsistent. The comment could be rephrased to suggest checking for consistency without asking for confirmation.
3. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:662
  • Draft comment:
    Confirm the vendor attribute for Anthropic is set to 'Anthropic'. The casing differs from the OpenAI vendor; ensure this is intended.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:873
  • Draft comment:
    Check that the vendor attribute for Bedrock is correctly set to 'AWS'. Ensure that this naming aligns with your overall vendor strategy.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to ensure that the vendor attribute aligns with their overall strategy, which is a form of asking for confirmation or double-checking. This violates the rule against asking the author to confirm their intention or ensure behavior is intended.
5. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:1089
  • Draft comment:
    The test_trace_propagation now uses an expected_vendors dictionary for OpenAI variants. Ensure that this mapping is updated when adding further vendors and matches the intended vendor naming.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 20% <= threshold 50% The comment is asking the author to ensure that a mapping is updated when adding further vendors. This is a general request to ensure future updates, which is not specific to the current PR changes. It doesn't provide a specific suggestion or point out a specific issue with the current code.
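
For reference, the assertions these drafts discuss have roughly this shape (a hypothetical snippet: the exporter fixture name and span selection are assumptions, and only the vendor strings come from the actual tests):

spans = exporter.get_finished_spans()  # hypothetical in-memory span exporter fixture
llm_span = spans[0]                    # span selection simplified for illustration

# Expected value depends on the model under test: "HuggingFace", "openai", "Anthropic", or "AWS".
assert llm_span.attributes[SpanAttributes.LLM_SYSTEM] == "openai"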

Workflow ID: wflow_OIklbcfrvb7LSLTv

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 4c32120 in 37 seconds. Click for details.
  • Reviewed 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:332
  • Draft comment:
    Using '(serialized or {})' safely handles cases where 'serialized' is None. This prevents a potential AttributeError when trying to call .get() on None.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_ftrF3HaJBxrDsgl2

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 17ca083 in 41 seconds. Click for details.
  • Reviewed 49 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:65
  • Draft comment:
    Nice extraction: _extract_class_name_from_serialized encapsulates the logic to retrieve the model class name from the 'id' field, improving clarity and maintainability.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:349
  • Draft comment:
    Good refactor: Replacing inline extraction of 'id' with _extract_class_name_from_serialized() in _create_llm_span cleans up the code and reuses logic.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_2mAe6yzn448FT94S

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@doronkopit5 doronkopit5 changed the title from "fix: added vendors to langchain" to "fix(langchain): added vendors to llm calls" Jul 23, 2025
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed c8922f4 in 1 minute and 44 seconds. Click for details.
  • Reviewed 29 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:62
  • Draft comment:
    Unnecessary blank line before _extract_class_name_from_serialized. Consider removing if not aligned with style.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:81
  • Draft comment:
    Extra blank line added before _detect_vendor_from_class. Remove if not needed for consistency.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_D0YhwrcTRGDKmrEH

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 0914b67 in 1 minute and 47 seconds. Click for details.
  • Reviewed 40 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:560
  • Draft comment:
    Good change replacing hardcoded 'Langchain' with a vendor fetched from span attributes. Consider computing the vendor once at the beginning of on_llm_end to avoid duplicate lookups.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:589
  • Draft comment:
    Vendor is re-fetched for the duration histogram. Reusing the earlier computed vendor (if unchanged) would reduce redundancy.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% The comment is technically correct - there is some duplication in fetching the vendor value. However, this is a very minor optimization that would save just one attribute lookup. The code is still readable and maintainable as is. The benefit of reusing the value would be minimal and could make the code slightly more complex by requiring an additional variable to be tracked. I may be undervaluing the importance of consistent code patterns and DRY principles, even for small things. Repeated code, even if minor, could indicate a lack of attention to detail. While consistency is important, not every minor duplication needs to be optimized away. The current code is clear and the duplication is so minimal that fixing it could be considered premature optimization. This comment, while technically correct, suggests an optimization that would provide minimal benefit while potentially making the code slightly more complex. The current implementation is clear and maintainable.

Workflow ID: wflow_fHV0yAQL760TZHnj

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed 00ff79d in 2 minutes and 0 seconds. Click for details.
  • Reviewed 40 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_7RSSOPcrnLuYjPwM

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed cf9a33f in 1 minute and 33 seconds. Click for details.
  • Reviewed 49 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py:339
  • Draft comment:
    Using a fallback of 'Langchain' when reading the vendor from span attributes might cause inconsistencies if the expected vendor is 'openai'. Ensure that the vendor is set explicitly before this point or adjust the fallback to match expected values.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment makes a speculative claim about inconsistencies without providing evidence. The code is in a Langchain instrumentation package, so using "Langchain" as the default system makes logical sense. There's no clear indication that "openai" would be a better default. The comment is making assumptions about the expected vendor without understanding the codebase context. I could be wrong if there's a broader context where OpenAI is the primary vendor being tracked, or if there are conventions established elsewhere in the codebase about vendor naming. Given this is explicitly Langchain instrumentation code (from the package name and path), and without clear evidence that OpenAI should be the default, the original default of "Langchain" appears correct. Delete the comment. It makes speculative claims about inconsistencies without evidence, and the current default makes sense given the context of Langchain instrumentation.
2. packages/opentelemetry-instrumentation-langchain/tests/metrics/test_langchain_metrics.py:47
  • Draft comment:
    Test assertions updated vendor from 'Langchain' to 'openai'. Confirm that instrumentation reliably sets the vendor attribute as expected.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_RetWMkoqlCpiyO2j

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 4cdfbdc in 55 seconds. Click for details.
  • Reviewed 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py:338
  • Draft comment:
    This change only removes an extra whitespace after setting the vendor variable. It doesn’t alter the vendor extraction logic, which retrieves the vendor from span.attributes (defaulting to 'Langchain'). Ensure that this minimal change is intentional and that vendor labeling is handled uniformly elsewhere.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_fzPlW4dlRMg1UJpy

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed ccbbf67 in 1 minute and 46 seconds. Click for details.
  • Reviewed 212 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:298
  • Draft comment:
    Refactored vendor detection to delegate to detect_vendor_from_class (imported from vendor_detection). Ensure that _extract_class_name_from_serialized returns a consistently formatted class name to avoid mismatches.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py:11
  • Draft comment:
    The exact match check in the matches() method is case-sensitive while the fallback (pattern matching) is case-insensitive. Consider normalizing the class name for the exact match check as well for consistency.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% Looking at the exact_matches in the code, they all use specific casing that matches the actual class names (like "AzureChatOpenAI", "ChatOpenAI", etc). The case sensitivity here seems intentional since these are exact class name matches. The patterns being case-insensitive makes sense for more flexible matching. Having different sensitivity levels for exact vs pattern matching could be a deliberate design choice. I could be wrong about the intentionality - maybe the author didn't consider case sensitivity. Also, having consistent behavior might prevent future bugs. While consistency is good, the current behavior appears intentional and useful - exact matches should match exact class names, while patterns can be more flexible. The different case sensitivity behaviors appear to be a deliberate design choice rather than a bug or issue that needs fixing.
3. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py:18
  • Draft comment:
    Consider caching the vendor rules from _get_vendor_rules() to avoid re-creating the list on every call of detect_vendor_from_class. With a small rule set, the performance impact is minimal, but caching may enhance clarity and efficiency if this function is called frequently.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_kKxtgWmg83C1ui7C

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py (3)

1-16: Good design with potential performance optimization.

The frozen dataclass design is excellent for immutability. However, consider optimizing the pattern matching:

@dataclass(frozen=True)
class VendorRule:
    exact_matches: Set[str]
-    patterns: List[str]
+    patterns: Set[str]
    vendor_name: str

    def matches(self, class_name: str) -> bool:
        if class_name in self.exact_matches:
            return True
        class_lower = class_name.lower()
        return any(pattern in class_lower for pattern in self.patterns)

Using Set[str] for patterns provides consistent typing with exact_matches and slightly better performance for iteration, though the performance difference is minimal given the small number of patterns per rule.
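
As a quick illustration of the matching semantics (hypothetical usage, not code from the PR):

azure_rule = VendorRule(
    exact_matches={"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"},
    patterns=["azure"],
    vendor_name="Azure",
)

assert azure_rule.matches("AzureChatOpenAI")     # exact class-name match
assert azure_rule.matches("MyAzureCustomModel")  # case-insensitive substring match
assert not azure_rule.matches("ChatOpenAI")      # neither exact nor pattern match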


18-97: Critical ordering dependency - consider documentation enhancement.

The ordering of rules is crucial due to overlapping patterns (e.g., Azure must come before OpenAI since Azure classes contain "openai"). While the current implementation works correctly, consider enhancing maintainability:

Add inline comments to highlight critical ordering dependencies:

return [
+    # CRITICAL: Azure must come before OpenAI due to overlapping "openai" pattern
    VendorRule(
        exact_matches={"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"},
        patterns=["azure"],
        vendor_name="Azure"
    ),
    VendorRule(
        exact_matches={"ChatOpenAI", "OpenAI", "OpenAIEmbeddings"},
        patterns=["openai"],
        vendor_name="openai"
    ),

For future extensibility, consider:

  1. Moving rules to a configuration file (JSON/YAML) for easier maintenance
  2. Implementing a priority/specificity scoring system instead of relying on list order
  3. Standardizing vendor name casing for consistency

100-121: Well-implemented with good error handling.

The function correctly handles edge cases, has clear documentation, and implements the detection logic properly. The early return for empty input and fallback to "Langchain" are good defensive programming practices.

Consider verifying that the default "Langchain" casing is consistent with how it's used elsewhere in the codebase for telemetry attributes.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4cdfbdc and ccbbf67.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (9 hunks)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py (1)

1-121: Excellent modular design for vendor detection.

This new module provides a clean, extensible foundation for LangChain vendor detection. The implementation correctly separates concerns, uses immutable data structures, and handles edge cases appropriately. The rule-based approach with exact matches and patterns provides good flexibility for matching various class naming conventions.

@avivhalfon avivhalfon merged commit 85da47f into main Jul 23, 2025
10 checks passed
@avivhalfon avivhalfon deleted the ah/langchain-vendor-matching branch July 23, 2025 14:29