fix(langchain): added vendors to llm calls #3165
Conversation
Walkthrough
Two helper functions were added to extract a class name from serialized model data and detect the vendor from that class name. The span creation method was updated to accept serialized data and set the vendor dynamically instead of a fixed string. Calls to span creation and span attribute usage were modified accordingly. Tests were enhanced to assert the correct vendor attribute in spans. A new module was introduced to encapsulate vendor detection logic based on class-name patterns.
Estimated code review effort: 2 (~20 minutes)
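For orientation, a minimal sketch of the two helpers the walkthrough above describes, assuming LangChain's serialized payload carries the class path in an "id" list. The bodies below are illustrative (only the Azure/OpenAI rules are shown), not the PR's exact code:

```python
from typing import Any, Optional


def _extract_class_name_from_serialized(serialized: Optional[dict[str, Any]]) -> str:
    """Return the model class name from LangChain's serialized payload, or '' if missing."""
    class_id = (serialized or {}).get("id", [])
    if isinstance(class_id, list) and class_id:
        # e.g. ["langchain", "chat_models", "openai", "ChatOpenAI"] -> "ChatOpenAI"
        return class_id[-1]
    return str(class_id) if class_id else ""


def detect_vendor_from_class(class_name: str) -> str:
    """Map a class name to a vendor label, falling back to 'Langchain'."""
    if not class_name:
        return "Langchain"
    lowered = class_name.lower()
    if "azure" in lowered:  # Azure is checked before OpenAI, since Azure class names contain "openai"
        return "Azure"
    if "openai" in lowered:
        return "openai"
    return "Langchain"
```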
Caution
Changes requested ❌
Reviewed everything up to acd97da in 2 minutes and 8 seconds.
- Reviewed 109 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:336
- Draft comment: Good use of the vendor extraction in _create_llm_span via 'serialized or {}'. As an enhancement, consider logging or alerting when the fallback vendor ('Langchain') is used, to help diagnose cases where the vendor info might be missing.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While the suggestion could help with debugging, it violates several review rules: 1) It's not pointing out a clear code change requirement 2) It's more of an enhancement suggestion than a critical issue 3) It's speculative - assuming that fallback cases need monitoring without evidence of actual problems. The current code handles the fallback gracefully and consistently. The suggestion could genuinely help identify misconfigured or unexpected LLM implementations. Logging fallbacks could provide valuable operational insights. While logging could be helpful, the comment doesn't identify an actual problem with the current implementation. The fallback is an intentional design choice that works correctly. The comment should be deleted as it suggests an optional enhancement rather than identifying a clear issue requiring change. The current fallback behavior is working as designed.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:79
- Draft comment: Typographical note: the fallback vendor string 'Langchain' has inconsistent capitalization compared to 'LangChain' in the docstring. Consider standardizing the casing for consistency.
- Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_nT4rjSX8wlqnqX0x
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)
549-549: Fix inconsistent vendor attribution in metrics.

The metrics recording still uses hardcoded "Langchain" while spans now use dynamic vendor detection. This creates an inconsistency between traces and metrics.

Consider storing the detected vendor in the SpanHolder during span creation and using it for metrics:

```diff
 # In _create_llm_span method, after line 338:
+# Store vendor for metrics
+self.spans[run_id].vendor = vendor

 # In on_llm_end method:
-SpanAttributes.LLM_SYSTEM: "Langchain",
+SpanAttributes.LLM_SYSTEM: getattr(self.spans[run_id], 'vendor', 'Langchain'),
```

You'll also need to update the SpanHolder class to include a vendor field.

Also applies to: 559-559, 576-576
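For illustration only, a minimal sketch of this suggestion; SpanHolder's real definition in the package holds more fields, and only the added vendor field and its two touch points are the point here:

```python
from dataclasses import dataclass
from opentelemetry.trace import Span


@dataclass
class SpanHolder:  # simplified stand-in for the real holder class
    span: Span
    vendor: str = "Langchain"  # proposed field, defaulting to the previously hardcoded value


# In _create_llm_span, after detecting the vendor:
#     self.spans[run_id].vendor = vendor
#
# In on_llm_end, when recording token metrics:
#     SpanAttributes.LLM_SYSTEM: getattr(self.spans[run_id], "vendor", "Langchain")
```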
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)
66-124: LGTM with minor suggestions for robustness.

The vendor detection logic is comprehensive and well-structured. Consider these improvements:

- Handle edge case: add type checking for class_id to prevent a potential TypeError if it's neither a list nor convertible to string.
- Performance: consider using a dictionary lookup for exact matches before iterating through patterns.

```diff
 def _detect_vendor_from_class(serialized: dict[str, Any]) -> str:
     """
     Detect vendor from LangChain model class name.
     Uses unified detection rules combining exact matches and patterns.

     Args:
         serialized: Serialized model information from LangChain callback

     Returns:
         Vendor string (e.g., "openai", "Azure", "AWS", "Google", etc.)
     """
     class_id = serialized.get("id", [])
     if not class_id:
         return "Langchain"  # Fallback

     # Get the class name (last element in id array)
-    class_name = class_id[-1] if isinstance(class_id, list) else str(class_id)
+    try:
+        class_name = class_id[-1] if isinstance(class_id, list) else str(class_id)
+    except (TypeError, IndexError):
+        return "Langchain"  # Fallback for invalid class_id structure
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (5 hunks)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)
314-341: Excellent implementation of dynamic vendor detection.

The method signature update and vendor detection integration are well implemented:

- Maintains backward compatibility with an optional serialized parameter
- Proper fallback to an empty dict when serialized is None
- Clean integration with the new _detect_vendor_from_class function

451-451: Correct integration of vendor detection in method calls.

Both method calls properly pass the serialized parameter to enable dynamic vendor detection during span creation.

Also applies to: 477-477
Important
Looks good to me! 👍
Reviewed 3139236 in 2 minutes and 31 seconds.
- Reviewed 95 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:116
- Draft comment: Consider precomputing class_name.lower() once before the loop instead of inside each iteration to improve performance.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% This is a vendor detection function that likely runs infrequently and with a small number of iterations. The .lower() operation is only performed if the exact match fails. Moving it outside would save a few string operations but could make the code less readable by separating the transformation from where it's used. The optimization impact would be negligible in practice. The comment does identify a valid micro-optimization opportunity. In some contexts, reducing redundant string operations could be important for performance. While technically correct, this optimization would have negligible real-world impact given the function's usage pattern and small loop size. The current code organization keeps related operations together which aids readability. The comment should be deleted as it suggests a micro-optimization that would have negligible benefit while potentially making the code less readable.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:84
- Draft comment: Vendor strings have mixed casing (e.g., 'Azure' vs 'openai'). Ensure this inconsistency is intentional for downstream processing.
- Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_jWVovKRxKumv52FD
Caution
Changes requested ❌
Reviewed 2c973e2 in 1 minute and 45 seconds.
- Reviewed 51 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 1 draft comment; view it below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:66
- Draft comment: Refactored _detect_vendor_from_class now takes a class name (string) instead of a serialized dict. Consider trimming or validating the class_name (e.g., removing extra whitespace) to ensure robust vendor matching.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment has two parts: 1) Noting the refactoring change - this is just informative and not actionable. 2) Suggesting input validation - while this could be actionable, the function already handles empty/None inputs on line 77 with "if not class_name:", so additional validation isn't really needed. The suggestion to trim whitespace could be useful but isn't critical since the class names appear to come from internal code rather than user input. The function already has basic input validation. Could there be other edge cases I'm missing? What if class_name contains invalid characters or is malformed in some way? The class names come from LangChain's internal serialization, so they should be well-formed. The existing None/empty check is sufficient for the error cases we care about. Over-validating internal data would add complexity without much benefit. The comment should be deleted since it's partly informational (not actionable) and partly suggesting validation that isn't really needed given the existing checks and internal nature of the data.
Workflow ID: wflow_Xy0PlI91W37scpcE
Actionable comments posted: 0
🔭 Outside diff range comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)
317-344: Fix potential null pointer exception when serialized is None.

The method doesn't handle the case where serialized is None, which will cause a runtime error when calling serialized.get("id", []).

Apply this diff to add null safety:

```diff
-    class_id = serialized.get("id", [])
+    class_id = serialized.get("id", []) if serialized else []
```

Additionally, consider adding more defensive handling:

```diff
     # Get the class name (last element in id array)
     if isinstance(class_id, list) and len(class_id) > 0:
         class_name = class_id[-1]
-    elif class_id:
+    elif isinstance(class_id, str) and class_id:
         class_name = str(class_id)
     else:
         class_name = ""
```

555-585: Consider using detected vendor in metrics for consistency.

The metrics recording still uses hardcoded "Langchain" values while the span attributes now use the detected vendor. This inconsistency could impact observability.

Consider storing the detected vendor in the span holder or retrieving it from the span attributes to use in metrics:

```diff
 self.token_histogram.record(
     prompt_tokens,
     attributes={
-        SpanAttributes.LLM_SYSTEM: "Langchain",
+        SpanAttributes.LLM_SYSTEM: span.get_attribute(SpanAttributes.LLM_SYSTEM) or "Langchain",
         SpanAttributes.LLM_TOKEN_TYPE: "input",
         SpanAttributes.LLM_RESPONSE_MODEL: model_name or "unknown",
     },
 )
```

Note: This assumes a way to retrieve span attributes. If not available, consider storing the vendor in the SpanHolder class or passing it through the method chain.
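A small sketch of the span-attribute alternative mentioned in the note above, assuming an SDK span that exposes .attributes and the SpanAttributes constants used elsewhere in this package; vendor_from_span is a hypothetical helper name:

```python
from opentelemetry.semconv_ai import SpanAttributes  # assumed to be the semconv constants module used here


def vendor_from_span(span) -> str:
    """Return the vendor already recorded on the span, defaulting to 'Langchain'."""
    attributes = getattr(span, "attributes", None) or {}
    return attributes.get(SpanAttributes.LLM_SYSTEM, "Langchain")


# self.token_histogram.record(
#     prompt_tokens,
#     attributes={SpanAttributes.LLM_SYSTEM: vendor_from_span(span), ...},
# )
```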
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)
66-120: Solid vendor detection implementation with comprehensive coverage.

The function correctly handles edge cases and provides good fallback behavior. The fix for case-sensitivity ordering (exact matches before lowercasing) addresses the previous review comment appropriately.

Consider moving vendor_rules to module level as a constant for better performance, since it's recreated on every function call:

```diff
+# Vendor detection rules (order matters - most specific first)
+_VENDOR_RULES = [
+    ({"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"}, ["azure"], "Azure"),
+    # ... rest of the rules
+]
+
 def _detect_vendor_from_class(class_name: str) -> str:
     """
     Detect vendor from LangChain model class name.
     Uses unified detection rules combining exact matches and patterns.

     Args:
         class_name: The class name extracted from serialized model information

     Returns:
         Vendor string
     """
     if not class_name:  # Handles None, empty string, or other falsy values
         return "Langchain"  # Fallback

-    # Vendor detection rules (order matters - most specific first)
-    vendor_rules = [
-        ({"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"}, ["azure"], "Azure"),
-        # ... rest of the rules
-    ]
-
-    for exact_matches, patterns, vendor in vendor_rules:
+    for exact_matches, patterns, vendor in _VENDOR_RULES:
         if class_name in exact_matches:
             return vendor
         class_lower = class_name.lower()
         if any(pattern in class_lower for pattern in patterns):
             return vendor

     return "Langchain"  # Fallback
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.9)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.12)
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)
456-458: Correct integration of vendor detection for chat models.

The modification properly passes the serialized parameter to enable vendor detection while maintaining all existing functionality.

482-484: Correct integration of vendor detection for LLM completions.

The modification properly passes the serialized parameter to enable vendor detection for completion requests.
Important
Looks good to me! 👍
Reviewed e844351 in 2 minutes and 6 seconds.
- Reviewed 50 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 5 draft comments; view those below
1. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:146
- Draft comment: Ensure the vendor attribute for HuggingFace is set as intended. The test asserts 'HuggingFace' (with capital H and F); please confirm this casing is consistent with your conventions.
- Reason this comment was not posted:
Comment looked like it was already resolved.
2. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:280
- Draft comment: Verify that the vendor for OpenAI is consistently set as 'openai' (all lowercase). Confirm that this matches your vendor naming conventions.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 40% <= threshold 50%.
The comment is asking the PR author to confirm their naming conventions, which violates the rule against asking for confirmation of intentions. However, it does suggest a specific naming convention ('openai' in lowercase), which could be useful if the code is inconsistent. The comment could be rephrased to suggest checking for consistency without asking for confirmation.
3. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:662
- Draft comment: Confirm the vendor attribute for Anthropic is set to 'Anthropic'. The casing differs from the OpenAI vendor; ensure this is intended.
- Reason this comment was not posted:
Comment looked like it was already resolved.
4. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:873
- Draft comment: Check that the vendor attribute for Bedrock is correctly set to 'AWS'. Ensure that this naming aligns with your overall vendor strategy.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%.
The comment is asking the PR author to ensure that the vendor attribute aligns with their overall strategy, which is a form of asking for confirmation or double-checking. This violates the rule against asking the author to confirm their intention or ensure behavior is intended.
5. packages/opentelemetry-instrumentation-langchain/tests/test_llms.py:1089
- Draft comment: The test_trace_propagation now uses an expected_vendors dictionary for OpenAI variants. Ensure that this mapping is updated when adding further vendors and matches the intended vendor naming.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 20% <= threshold 50%.
The comment is asking the author to ensure that a mapping is updated when adding further vendors. This is a general request to ensure future updates, which is not specific to the current PR changes. It doesn't provide a specific suggestion or point out a specific issue with the current code.
Workflow ID: wflow_OIklbcfrvb7LSLTv
Important
Looks good to me! 👍
Reviewed 4c32120 in 37 seconds.
- Reviewed 13 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 1 draft comment; view it below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:332
- Draft comment: Using '(serialized or {})' safely handles cases where 'serialized' is None. This prevents a potential AttributeError when trying to call .get() on None.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
Workflow ID: wflow_ftrF3HaJBxrDsgl2
Important
Looks good to me! 👍
Reviewed 17ca083 in 41 seconds.
- Reviewed 49 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:65
- Draft comment: Nice extraction: _extract_class_name_from_serialized encapsulates the logic to retrieve the model class name from the 'id' field, improving clarity and maintainability.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:349
- Draft comment: Good refactor: Replacing inline extraction of 'id' with _extract_class_name_from_serialized() in _create_llm_span cleans up the code and reuses logic.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
Workflow ID: wflow_2mAe6yzn448FT94S
Caution
Changes requested ❌
Reviewed c8922f4 in 1 minute and 44 seconds.
- Reviewed 29 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:62
- Draft comment: Unnecessary blank line before _extract_class_name_from_serialized. Consider removing it if not aligned with style.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:81
- Draft comment: Extra blank line added before _detect_vendor_from_class. Remove it if not needed, for consistency.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
Workflow ID: wflow_D0YhwrcTRGDKmrEH
Important
Looks good to me! 👍
Reviewed 0914b67 in 1 minute and 47 seconds.
- Reviewed 40 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:560
- Draft comment: Good change replacing hardcoded 'Langchain' with a vendor fetched from span attributes. Consider computing the vendor once at the beginning of on_llm_end to avoid duplicate lookups.
- Reason this comment was not posted:
Comment looked like it was already resolved.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:589
- Draft comment: Vendor is re-fetched for the duration histogram. Reusing the earlier computed vendor (if unchanged) would reduce redundancy.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% The comment is technically correct - there is some duplication in fetching the vendor value. However, this is a very minor optimization that would save just one attribute lookup. The code is still readable and maintainable as is. The benefit of reusing the value would be minimal and could make the code slightly more complex by requiring an additional variable to be tracked. I may be undervaluing the importance of consistent code patterns and DRY principles, even for small things. Repeated code, even if minor, could indicate a lack of attention to detail. While consistency is important, not every minor duplication needs to be optimized away. The current code is clear and the duplication is so minimal that fixing it could be considered premature optimization. This comment, while technically correct, suggests an optimization that would provide minimal benefit while potentially making the code slightly more complex. The current implementation is clear and maintainable.
Workflow ID: wflow_fHV0yAQL760TZHnj
Caution
Changes requested ❌
Reviewed 00ff79d in 2 minutes.
- Reviewed 40 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 0 draft comments
Workflow ID: wflow_7RSSOPcrnLuYjPwM
Important
Looks good to me! 👍
Reviewed cf9a33f in 1 minute and 33 seconds.
- Reviewed 49 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py:339
- Draft comment: Using a fallback of 'Langchain' when reading the vendor from span attributes might cause inconsistencies if the expected vendor is 'openai'. Ensure that the vendor is set explicitly before this point, or adjust the fallback to match expected values.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment makes a speculative claim about inconsistencies without providing evidence. The code is in a Langchain instrumentation package, so using "Langchain" as the default system makes logical sense. There's no clear indication that "openai" would be a better default. The comment is making assumptions about the expected vendor without understanding the codebase context. I could be wrong if there's a broader context where OpenAI is the primary vendor being tracked, or if there are conventions established elsewhere in the codebase about vendor naming. Given this is explicitly Langchain instrumentation code (from the package name and path), and without clear evidence that OpenAI should be the default, the original default of "Langchain" appears correct. Delete the comment. It makes speculative claims about inconsistencies without evidence, and the current default makes sense given the context of Langchain instrumentation.
2. packages/opentelemetry-instrumentation-langchain/tests/metrics/test_langchain_metrics.py:47
- Draft comment: Test assertions updated the vendor from 'Langchain' to 'openai'. Confirm that instrumentation reliably sets the vendor attribute as expected.
- Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_RetWMkoqlCpiyO2j
Important
Looks good to me! 👍
Reviewed 4cdfbdc in 55 seconds.
- Reviewed 13 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 1 draft comment; view it below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py:338
- Draft comment: This change only removes extra whitespace after setting the vendor variable. It doesn't alter the vendor extraction logic, which retrieves the vendor from span.attributes (defaulting to 'Langchain'). Ensure that this minimal change is intentional and that vendor labeling is handled uniformly elsewhere.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
Workflow ID: wflow_fzPlW4dlRMg1UJpy
Important
Looks good to me! 👍
Reviewed ccbbf67 in 1 minute and 46 seconds.
- Reviewed 212 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments; view those below
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:298
- Draft comment: Refactored vendor detection to delegate to detect_vendor_from_class (imported from vendor_detection). Ensure that _extract_class_name_from_serialized returns a consistently formatted class name to avoid mismatches.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py:11
- Draft comment: The exact-match check in the matches() method is case-sensitive while the fallback pattern matching is case-insensitive. Consider normalizing the class name for the exact-match check as well for consistency.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% Looking at the exact_matches in the code, they all use specific casing that matches the actual class names (like "AzureChatOpenAI", "ChatOpenAI", etc). The case sensitivity here seems intentional since these are exact class name matches. The patterns being case-insensitive makes sense for more flexible matching. Having different sensitivity levels for exact vs pattern matching could be a deliberate design choice. I could be wrong about the intentionality - maybe the author didn't consider case sensitivity. Also, having consistent behavior might prevent future bugs. While consistency is good, the current behavior appears intentional and useful - exact matches should match exact class names, while patterns can be more flexible. The different case sensitivity behaviors appear to be a deliberate design choice rather than a bug or issue that needs fixing.
3. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py:18
- Draft comment: Consider caching the vendor rules from _get_vendor_rules() to avoid re-creating the list on every call of detect_vendor_from_class. With a small rule set the performance impact is minimal, but caching may enhance clarity and efficiency if this function is called frequently.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
Workflow ID: wflow_kKxtgWmg83C1ui7C
Actionable comments posted: 0
🧹 Nitpick comments (3)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py (3)
1-16: Good design with potential performance optimization.

The frozen dataclass design is excellent for immutability. However, consider optimizing the pattern matching:

```diff
 @dataclass(frozen=True)
 class VendorRule:
     exact_matches: Set[str]
-    patterns: List[str]
+    patterns: Set[str]
     vendor_name: str

     def matches(self, class_name: str) -> bool:
         if class_name in self.exact_matches:
             return True
         class_lower = class_name.lower()
         return any(pattern in class_lower for pattern in self.patterns)
```

Using Set[str] for patterns provides consistent typing with exact_matches and slightly better performance for iteration, though the performance difference is minimal given the small number of patterns per rule.
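For reference, a self-contained usage sketch of the matches() flow quoted above; the rule values are an illustrative subset, not the module's full rule list:

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class VendorRule:
    exact_matches: Set[str]
    patterns: List[str]
    vendor_name: str

    def matches(self, class_name: str) -> bool:
        # Exact class-name match first, then case-insensitive substring patterns
        if class_name in self.exact_matches:
            return True
        class_lower = class_name.lower()
        return any(pattern in class_lower for pattern in self.patterns)


azure_rule = VendorRule(
    exact_matches={"AzureChatOpenAI", "AzureOpenAI"},  # illustrative subset of the real rule
    patterns=["azure"],
    vendor_name="Azure",
)
assert azure_rule.matches("AzureChatOpenAI")      # exact match
assert azure_rule.matches("MyCustomAzureClient")  # pattern match on "azure"
assert not azure_rule.matches("ChatOpenAI")
```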
18-97: Critical ordering dependency - consider documentation enhancement.

The ordering of rules is crucial due to overlapping patterns (e.g., Azure must come before OpenAI since Azure classes contain "openai"). While the current implementation works correctly, consider enhancing maintainability.

Add inline comments to highlight critical ordering dependencies:

```diff
 return [
+    # CRITICAL: Azure must come before OpenAI due to overlapping "openai" pattern
     VendorRule(
         exact_matches={"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"},
         patterns=["azure"],
         vendor_name="Azure"
     ),
     VendorRule(
         exact_matches={"ChatOpenAI", "OpenAI", "OpenAIEmbeddings"},
         patterns=["openai"],
         vendor_name="openai"
     ),
```

For future extensibility, consider:

- Moving rules to a configuration file (JSON/YAML) for easier maintenance
- Implementing a priority/specificity scoring system instead of relying on list order
- Standardizing vendor name casing consistency
100-121: Well implemented with good error handling.

The function correctly handles edge cases, has clear documentation, and implements the detection logic properly. The early return for empty input and the fallback to "Langchain" are good defensive programming practices.

Consider verifying that the default "Langchain" casing is consistent with how it's used elsewhere in the codebase for telemetry attributes.
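To show how the module's entry point is meant to be consumed, a hedged usage sketch; the import path follows the new file's location, and the expected values follow the vendor names discussed in this review:

```python
# Import path inferred from packages/opentelemetry-instrumentation-langchain/
# opentelemetry/instrumentation/langchain/vendor_detection.py
from opentelemetry.instrumentation.langchain.vendor_detection import detect_vendor_from_class

print(detect_vendor_from_class("ChatOpenAI"))       # expected: "openai"
print(detect_vendor_from_class("AzureChatOpenAI"))  # expected: "Azure"
print(detect_vendor_from_class(""))                 # expected: "Langchain" (fallback)
```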
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (9 hunks)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py (1)
1-121: Excellent modular design for vendor detection.

This new module provides a clean, extensible foundation for LangChain vendor detection. The implementation correctly separates concerns, uses immutable data structures, and handles edge cases appropriately. The rule-based approach with exact matches and patterns provides good flexibility for matching various class naming conventions.
feat(instrumentation): ... or fix(instrumentation): ...

Important
Added vendor detection for LangChain models to enhance tracing spans with vendor-specific information.
- Added detect_vendor_from_class() in vendor_detection.py to identify model vendors (e.g., OpenAI, Azure) from class names.
- Updated _create_llm_span() in callback_handler.py to set SpanAttributes.LLM_SYSTEM to the detected vendor.
- Updated callback_handler.py and span_utils.py to use dynamic vendor names.
- Updated test_langchain_metrics.py and test_llms.py to check for correct vendor labeling in spans and metrics (see the sketch below).

This description was created by Ellipsis for ccbbf67. You can customize this summary. It will automatically update as commits are pushed.
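To make the test expectation concrete, a hedged sketch of the kind of assertion the updated tests appear to add; the exporter fixture and span name filter are illustrative, not the repository's exact test code:

```python
from opentelemetry.semconv_ai import SpanAttributes  # assumed semconv constants module


def test_openai_span_reports_vendor(exporter):
    # ... run a ChatOpenAI call under instrumentation (omitted) ...
    spans = exporter.get_finished_spans()
    chat_span = next(span for span in spans if "chat" in span.name)
    # The PR replaces the fixed "Langchain" value with the detected vendor.
    assert chat_span.attributes[SpanAttributes.LLM_SYSTEM] == "openai"
```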