fix(openai): dynamically import types for 1.99 #3244
Conversation
Caution
Changes requested ❌
Reviewed everything up to b920f6d in 2 minutes and 45 seconds.
- Reviewed 297 lines of code in 5 files
- Skipped 1 file when reviewing
- Skipped posting 7 draft comments; view those below
1. `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:999`
- Draft comment: Consider narrowing the broad exception catch in `_is_chat_message_function_tool_call` to specific exceptions (e.g. `ImportError`, `AttributeError`) to avoid masking unexpected errors.
- Reason this comment was not posted: decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 20% vs. threshold = 50%). The broad exception handling appears intentional and necessary: the code handles version compatibility across OpenAI SDK versions, where import paths and type structures vary, and the comments explicitly say so. Narrowing to specific exceptions could miss edge cases or break compatibility with some versions, and the failure mode (returning `False`) is appropriate regardless of the specific exception type.
2. `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:1018`
- Draft comment: Similarly, in `_is_function_call`, consider catching specific exceptions rather than a blanket `except Exception`.
- Reason this comment was not posted: decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 20% vs. threshold = 50%). The broad catch is intentional: this is a compatibility layer that must handle any failure when checking the type, not just `AttributeError` — the import could fail, the class could be missing, or other version issues could arise — and it safely returns `False` in every failure case. Making it more specific could itself cause version-compatibility problems.
3. `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py:451`
- Draft comment: The extraction of output text using `parsed_response.output[0].content[0].text` is in a try/except that silently ignores exceptions. Consider logging a debug message if extraction fails to aid future troubleshooting.
- Reason this comment was not posted: decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 20% vs. threshold = 50%). This is an intentional fallback pattern in instrumentation code: `output_text` is optional in the `TracedData` model, execution continues either way, and silent failure keeps the telemetry from interfering with the main application flow. Adding debug logging would add noise without significant value.
4. `packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py:25`
- Draft comment: Tests have been updated to assert against the new nested structure (`response.output[0].content[0].text`). Ensure that this structure is consistent and documented in the updated OpenAI SDK so future changes do not break these assertions.
- Reason this comment was not posted: comment did not seem useful (usefulness confidence = 0% <= threshold = 50%). It asks the PR author to "ensure" things, which violates the review rules, and it suggests documenting in the SDK rather than making a specific code suggestion or test request.
5. `packages/opentelemetry-instrumentation-openai/pyproject.toml:41`
- Draft comment: Pinning the openai version to 1.99.7 ensures compatibility with the new types, but be aware that future changes in the OpenAI SDK might require further adjustments.
- Reason this comment was not posted: comment did not seem useful (usefulness confidence = 0% <= threshold = 50%). It is a purely informative comment about a dependency change and offers no specific suggestion, question, or test request.
6. `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py:467`
- Draft comment: There appears to be a typographical error on line 467. The variable `parsed_response_output_text` is used, but it should likely be `parsed_response.output_text` (using a dot to access the attribute).
- Reason this comment was not posted: comment looked like it was already resolved.
7. `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py:570`
- Draft comment: Typo: `parsed_response_output_text` appears to be missing a dot. Should this be `parsed_response.output_text`?
- Reason this comment was not posted: comment looked like it was already resolved.
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Implements compatibility with OpenAI SDK 1.99.7 by replacing static type imports/checks with dynamic detection, broadening tool-call parsing to accept pydantic models, and updating response text extraction. Refactors ChatStream for robust cleanup and error handling across sync/async iterations. Updates tests to new SDK types/structures and pins the test dependency to 1.99.7.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UserCode
    participant ChatStream
    participant OpenAI.SDK
    participant Tracing
    participant Metrics
    UserCode->>ChatStream: iterate()/async iterate()
    ChatStream->>OpenAI.SDK: request stream
    loop chunks
        OpenAI.SDK-->>ChatStream: chunk
        ChatStream->>Metrics: record partial usage/tokens
        ChatStream->>Tracing: update span attributes
        ChatStream-->>UserCode: yield chunk
    end
    OpenAI.SDK-->>ChatStream: end-of-stream
    ChatStream->>Metrics: record final usage
    ChatStream->>Tracing: set status, end span
    ChatStream-->>UserCode: stop iteration
    Note over ChatStream: On error: cleanup, set span status, end span
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 1
♻️ Duplicate comments (1)

packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py (1)

Lines 377-384: Dynamic import handles version compatibility well. The dynamic import with fallback ensures tests work across different OpenAI SDK versions. While repetitive across three test functions, this approach is acceptable for test code.
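The dynamic-import-with-fallback pattern praised here can be sketched generically. The helper below is illustrative only — the function name and the stdlib demonstration module are not from the PR; the tests apply the same new-name-then-old-name idea to `openai.types.chat` type names across SDK versions:

```python
import importlib


def first_available(module_name, *candidates):
    """Return the first attribute found on the module, trying candidates in
    order -- the same new-name-then-old-name fallback the tests use."""
    mod = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(mod, name):
            return getattr(mod, name)
    return None


# In the tests, the equivalent lookup would target the SDK types, e.g.:
#   ToolCall = first_available("openai.types.chat",
#                              "ChatCompletionMessageFunctionToolCall",  # >= 1.99
#                              "ChatCompletionMessageToolCall")          # older SDKs
# Demonstration with a stdlib module:
print(first_available("math", "no_such_name", "sqrt"))  # falls back to math.sqrt
```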
🧹 Nitpick comments (2)

packages/opentelemetry-instrumentation-openai/pyproject.toml (1)

Line 41: Consider using a version range instead of exact pinning. Pinning to the exact version `1.99.7` in test dependencies could cause issues with security updates and bug fixes. Since the PR aims to support the 1.99 series, consider using `">=1.99.0,<2.0.0"` or `"~=1.99.0"` to allow patch updates while maintaining compatibility.

```diff
-openai = { extras = ["datalib"], version = "1.99.7" }
+openai = { extras = ["datalib"], version = "~=1.99.0" }
```

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py (1)

Lines 450-458: Extract duplicated output text extraction logic. The output text extraction logic is duplicated between the sync (lines 450-458) and async (lines 552-560) functions. Consider extracting it into a helper function to improve maintainability.

Add a helper function before line 450:

```python
def _extract_output_text(parsed_response):
    """Extract output text from parsed response, handling different response structures."""
    if hasattr(parsed_response, "output_text"):
        return parsed_response.output_text
    try:
        if (parsed_response.output
                and len(parsed_response.output) > 0
                and parsed_response.output[0].content
                and len(parsed_response.output[0].content) > 0):
            return parsed_response.output[0].content[0].text
    except Exception:
        pass
    return None
```

Then simplify both occurrences:

```diff
-parsed_response_output_text = None
-if hasattr(parsed_response, "output_text"):
-    parsed_response_output_text = parsed_response.output_text
-else:
-    try:
-        parsed_response_output_text = parsed_response.output[0].content[0].text
-    except Exception:
-        pass
+parsed_response_output_text = _extract_output_text(parsed_response)
```

Also applies to lines 552-560.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

⛔ Files ignored due to path filters (1)
- `packages/opentelemetry-instrumentation-openai/poetry.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (5)
- `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py` (4 hunks)
- `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py` (4 hunks)
- `packages/opentelemetry-instrumentation-openai/pyproject.toml` (1 hunk)
- `packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py` (4 hunks)
- `packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py` (4 hunks)
🧰 Additional context used

🪛 Ruff (0.12.2)

`packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py`

- 454-457: Use `contextlib.suppress(Exception)` instead of `try`-`except`-`pass` (SIM105)
- 556-559: Use `contextlib.suppress(Exception)` instead of `try`-`except`-`pass` (SIM105)
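For reference, Ruff's SIM105 rewrite looks like the sketch below, with a stand-in object in place of the real parsed response (the class is hypothetical); note that the broad `Exception` is kept here because, as discussed above, the PR treats any failure as "no output text":

```python
import contextlib


class _EmptyResponse:
    """Stand-in for a parsed response with no usable output (illustration only)."""
    output = []


parsed_response = _EmptyResponse()
parsed_response_output_text = None

# Equivalent to the try/except/pass flagged by SIM105:
with contextlib.suppress(Exception):
    parsed_response_output_text = parsed_response.output[0].content[0].text

print(parsed_response_output_text)  # None -- the IndexError was suppressed
```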
🔇 Additional comments (8)

packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py (1)

Line 26: LGTM! Test updates correctly reflect the new response structure. The test assertions have been properly updated to access the response text via the new nested structure `response.output[0].content[0].text`, aligning with OpenAI SDK 1.99.7 changes. Also applies to lines 49, 72, 81.
packages/opentelemetry-instrumentation-openai/tests/traces/test_chat.py (1)

Lines 7-8: LGTM! Import updated for the new SDK type. The import correctly uses `ChatCompletionMessageFunctionToolCall`, which is the new type name in OpenAI SDK 1.99.7.

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (6)
Line 10: LGTM! Required import for type checking. The pydantic import is necessary for handling the new BaseModel-based tool call types in OpenAI SDK 1.99.7.
Lines 962-995: API change maintains backward compatibility. The `_parse_tool_calls` signature change to accept `Union[dict, pydantic.BaseModel]` maintains backward compatibility, since `dict` is still supported. The implementation correctly handles both the legacy dict format and the new pydantic BaseModel representations.
Lines 998-1024: Well-designed type detection with version compatibility. The type detection helpers (`_is_chat_message_function_tool_call` and `_is_function_call`) are well implemented, with:
- Dynamic imports that handle missing types gracefully
- Clear comments explaining version compatibility
- Proper fallback logic for different SDK versions
Lines 603-611: Robust cleanup implementation with proper safeguards. The cleanup state tracking and `__del__` implementation include proper safeguards:
- Thread-safe cleanup with `_cleanup_lock`
- A defensive `hasattr` check in `__del__` to handle interpreter shutdown scenarios
- Prevention of duplicate cleanup operations
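A sketch of that cleanup pattern — class and attribute names here are illustrative, mirroring the description rather than copying the PR's code:

```python
import threading


class StreamWrapper:
    """Illustrates lock-guarded, one-shot cleanup with a shutdown-safe __del__."""

    def __init__(self):
        self._cleanup_lock = threading.Lock()
        self._cleaned_up = False
        self.cleanup_calls = 0  # counter for demonstration only

    def _cleanup(self):
        with self._cleanup_lock:
            if self._cleaned_up:
                return  # already cleaned up -- never end the span twice
            self._cleaned_up = True
        self.cleanup_calls += 1  # real code would end the span / flush metrics

    def __del__(self):
        # During interpreter shutdown, instance attributes may already be gone.
        if hasattr(self, "_cleanup_lock"):
            self._cleanup()


s = StreamWrapper()
s._cleanup()
s._cleanup()  # no-op: cleanup already ran
print(s.cleanup_calls)  # 1
```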
Lines 644-674: Excellent error handling in stream iterators. The enhanced error handling in `__next__` and `__anext__` properly:
- Distinguishes between normal completion (`StopIteration`) and errors
- Ensures cleanup happens in all scenarios
- Sets appropriate span status on errors
- Re-raises exceptions after cleanup
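The described `__next__` flow can be sketched like this — a hypothetical wrapper; the real implementation additionally records span attributes and metrics per chunk:

```python
class InstrumentedIterator:
    """Wraps an iterator so both normal exhaustion and errors trigger cleanup."""

    def __init__(self, inner):
        self._inner = iter(inner)
        self.ended = False
        self.error = None

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._inner)
        except StopIteration:
            self._finish()    # normal completion: end the span with OK status
            raise
        except Exception as exc:
            self.error = exc  # real code would set the span status to ERROR
            self._finish()
            raise             # re-raise after cleanup

    def _finish(self):
        self.ended = True


it = InstrumentedIterator([1, 2, 3])
print(list(it))  # [1, 2, 3]
print(it.ended)  # True
```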
Lines 784-815: Excellent addition for partial stream metrics. The `_record_partial_metrics` method ensures observability even when streams are partially consumed or interrupted. It safely records all available metrics without throwing exceptions, which is crucial for maintaining telemetry integrity.
```python
parsed_response_output_text = None
if hasattr(parsed_response, "output_text"):
    parsed_response_output_text = parsed_response.output_text
else:
    try:
        parsed_response_output_text = parsed_response.output[0].content[0].text
    except Exception:
        pass
traced_data = TracedData(
```
🛠️ Refactor suggestion

Add array bounds validation for safer nested access

The fallback logic `parsed_response.output[0].content[0].text` could raise `IndexError` if either array is empty. Consider adding bounds checking for more robust error handling.

```diff
 parsed_response_output_text = None
 if hasattr(parsed_response, "output_text"):
     parsed_response_output_text = parsed_response.output_text
 else:
     try:
-        parsed_response_output_text = parsed_response.output[0].content[0].text
+        if (parsed_response.output
+                and len(parsed_response.output) > 0
+                and parsed_response.output[0].content
+                and len(parsed_response.output[0].content) > 0):
+            parsed_response_output_text = parsed_response.output[0].content[0].text
     except Exception:
         pass
```
📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test and benchmark the code to ensure it meets the requirements.
```python
parsed_response_output_text = None
if hasattr(parsed_response, "output_text"):
    parsed_response_output_text = parsed_response.output_text
else:
    try:
        if (parsed_response.output
                and len(parsed_response.output) > 0
                and parsed_response.output[0].content
                and len(parsed_response.output[0].content) > 0):
            parsed_response_output_text = parsed_response.output[0].content[0].text
    except Exception:
        pass
traced_data = TracedData(
```
🧰 Tools

🪛 Ruff (0.12.2)

- 454-457: Use `contextlib.suppress(Exception)` instead of `try`-`except`-`pass` (SIM105)
🤖 Prompt for AI Agents

In `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py`, around lines 450 to 458, the fallback access `parsed_response.output[0].content[0].text` can raise `IndexError` if the output or content arrays are empty. Add explicit bounds checks (e.g. verify that `parsed_response.output` is list-like and `len(parsed_response.output) > 0`, and that `parsed_response.output[0].content` is list-like and non-empty) before indexing, and narrow the exception handling to `IndexError`/`AttributeError` (or use conditional attribute/`getattr` checks) so `parsed_response_output_text` safely falls back to `None` without swallowing unrelated exceptions.
Fixes #3242
The PR title follows the `feat(instrumentation): ...` / `fix(instrumentation): ...` convention.
Important

Dynamically import types for OpenAI 1.99 compatibility and update tests and dependencies accordingly.

- Dynamically import `ChatCompletionMessageFunctionToolCall` and `FunctionCall` in `chat_wrappers.py` to support OpenAI 1.99 and older versions.
- Update `_parse_tool_calls()` in `chat_wrappers.py` to use `pydantic.BaseModel` for compatibility.
- Update `responses_get_or_create_wrapper()` and `async_responses_get_or_create_wrapper()` in `responses_wrappers.py` to handle `output_text` extraction.
- Pin the `openai` version to `1.99.7` in `pyproject.toml`.
- Update `test_chat.py` and `test_responses.py` to reflect the changes in type handling and ensure compatibility with OpenAI 1.99.

This description was generated automatically for b920f6d and will update as commits are pushed.