-
Notifications
You must be signed in to change notification settings - Fork 30
do not merge: vector db work (split) #760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ibility - Changed jsonschema constraint from ~4.17.3 to >=4.17.3,<5.0 - Maintains broad compatibility while unblocking fastmcp requirements - Regenerated poetry.lock with new constraint Co-Authored-By: AJ Steers <[email protected]>
…hema 4.18+ compatibility - Replace RefResolver.from_schema() with Registry().with_resource() pattern - Update resolver.resolve() calls to use resolver.lookup().contents - Addresses jsonschema 4.18+ deprecation warnings - Maintains backward compatibility with existing functionality Co-Authored-By: AJ Steers <[email protected]>
- Auto-format YAML and Markdown files per project standards - No functional changes, only formatting consistency Co-Authored-By: AJ Steers <[email protected]>
- Added referencing>=0.30.0 dependency to pyproject.toml - Required for jsonschema 4.18+ RefResolver migration - Regenerated poetry.lock with new dependency Co-Authored-By: AJ Steers <[email protected]>
- Update constraint to >=4.18.0,<5.0 as suggested by @aaronsteers - Remove all backward compatibility code for 4.17.x - Use referencing library exclusively with DRAFT7 default specification - Fixes CannotDetermineSpecification error for schemas without property Co-Authored-By: AJ Steers <[email protected]>
Co-Authored-By: AJ Steers <[email protected]>
…cing compatibility test Co-Authored-By: AJ Steers <[email protected]>
- Try new referencing library first for jsonschema 4.17.3+ compatibility - Fall back to old RefResolver if referencing fails - Ensures compatibility across jsonschema versions while migrating to new API - Fixes pytest failure in test_transform.py Co-Authored-By: AJ Steers <[email protected]>
…r implementation Co-Authored-By: AJ Steers <[email protected]>
- Accept aaronsteers' simplified comment for referencing dependency - Maintain jsonschema constraint >=4.17.3,<5.0 for broad compatibility - Complete merge of remote changes with local RefResolver migration work Co-Authored-By: AJ Steers <[email protected]>
- Enhanced error handling in resolve function with more specific exception catching - Added explicit checks for validator_instance.resolver existence - Improved fallback logic to prevent transformation pipeline breakage - Maintains backward compatibility while preferring referencing library Co-Authored-By: AJ Steers <[email protected]>
Co-Authored-By: AJ Steers <[email protected]>
…-manager.devin.ai/proxy/github.com/airbytehq/airbyte-python-cdk into devin/1756425696-jsonschema-version-pin
Co-Authored-By: AJ Steers <[email protected]>
…n transform.py - Update Poetry version from 1.8.3 to 2.0.1 in manifest server Dockerfile for consistency with local development - Fix transform.py resolve function to properly handle both new referencing library and legacy RefResolver - Addresses CI failures in 'Manifest Server Docker Image Build' and pytest transform tests Co-Authored-By: AJ Steers <[email protected]>
…schema-version-pin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR replaces the deprecated jsonschema.RefResolver
with the newer referencing
library for JSON schema reference resolution across the codebase. The main purpose is to modernize the reference resolution mechanism while maintaining compatibility.
- Updates dependency versions for
jsonschema
and addsreferencing
library - Replaces
RefResolver
-based logic withreferencing.Registry
andResolver
- Consolidates reference expansion functionality and updates import paths
Reviewed Changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
pyproject.toml | Updates jsonschema version constraint and adds referencing dependency |
airbyte_cdk/sources/utils/schema_helpers.py | Replaces RefResolver with referencing library implementation |
airbyte_cdk/sources/utils/transform.py | Updates to use new reference resolution approach |
unit_tests/sources/utils/test_transform.py | Adds missing schema definition for test |
unit_tests/destinations/vector_db_based/config_test.py | Updates import from deprecated utility |
airbyte_cdk/destinations/vector_db_based/config.py | Updates import and function call |
airbyte_cdk/init.py | Removes deprecated function from exports |
airbyte_cdk/manifest_server/Dockerfile | Updates Poetry version |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
||
try: | ||
from jsonschema.validators import Validator | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using bare except:
clause is discouraged. Specify the expected exception type (likely ImportError
) to avoid catching unintended exceptions.
except: | |
except ImportError: |
Copilot uses AI. Check for mistakes.
expand_refs(schema) | ||
|
||
# Now we can expand $refs in the property value: | ||
if isinstance(property_value, dict): | ||
expand_refs(property_value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expand_refs
function is called multiple times and mutates the schema in-place. Consider creating a deep copy of the schema before the first call to avoid unintended side effects on the original schema object.
Copilot uses AI. Check for mistakes.
@@ -293,6 +293,6 @@ def remove_discriminator(schema: Dict[str, Any]) -> None: | |||
def schema(cls, by_alias: bool = True, ref_template: str = "") -> Dict[str, Any]: | |||
"""we're overriding the schema classmethod to enable some post-processing""" | |||
schema: Dict[str, Any] = super().schema() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function call should match the pattern used elsewhere in the codebase. The original code assigned the result back to schema
, but expand_refs
modifies the schema in-place and returns None. Consider documenting this behavior or maintaining consistency with the assignment pattern for clarity.
schema: Dict[str, Any] = super().schema() | |
schema: Dict[str, Any] = super().schema() | |
# expand_refs modifies the schema in-place and returns None. |
Copilot uses AI. Check for mistakes.
📝 WalkthroughWalkthroughReplaces JSON Schema $ref resolution with the referencing library and pre-expansion workflow. Removes resolve_refs from public API and deletes its implementation. Updates transform logic to operate on expanded schemas. Adjusts vector DB config schema generation to use expand_refs. Updates dependencies and Poetry installer version. Adds/updates unit tests. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller as Caller
participant Transform as transform.normalize_schema
participant Helpers as sources.utils.schema_helpers
participant Registry as referencing.Registry/Resolver
Caller->>Transform: normalize_schema(schema)
Transform->>Helpers: expand_refs(schema)
Helpers->>Registry: get_ref_resolver_registry(schema)
Registry-->>Helpers: Resolver
Helpers->>Helpers: _expand_refs(schema, Resolver)
Helpers-->>Transform: expanded schema (definitions removed)
Transform->>Transform: normalize using expanded subschemas
Transform-->>Caller: normalized schema
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks (2 passed, 1 inconclusive)❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. ✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
airbyte_cdk/sources/utils/transform.py (1)
241-259
: Pre-expand refs once in transform() to avoid mutation during traversalLet’s deep copy the input schema, expand refs once, and validate/normalize against that. This prevents repeated in-place mutations and improves performance.
Apply:
def transform( self, record: Dict[str, Any], schema: Mapping[str, Any], ) -> None: @@ - if TransformConfig.NoTransform in self._config: - return - normalizer = self._normalizer(schema) + if TransformConfig.NoTransform in self._config: + return + # Work on a copy to avoid mutating caller-provided schema + local_schema: Dict[str, Any] = deepcopy(schema) # type: ignore[assignment] + expand_refs(local_schema) + normalizer = self._normalizer(local_schema) for e in normalizer.iter_errors(record):airbyte_cdk/sources/utils/schema_helpers.py (1)
85-104
: Guard against cycles and non-object$ref
targets
- No cycle detection: self-referential schemas can cause infinite recursion.
- If a
$ref
resolves to a boolean schema (valid in Draft-07),schema.update(...)
will raise.Would you add a small visited set and an object check, wdyt?
-def _expand_refs(schema: Any, ref_resolver: Any) -> None: +def _expand_refs(schema: Any, ref_resolver: Any, seen: set[str] | None = None) -> None: + if seen is None: + seen = set() """Internal function to iterate over schema and replace all occurrences of $ref with their definitions. Recursive. @@ - if isinstance(schema, MutableMapping): + if isinstance(schema, MutableMapping): if "$ref" in schema: - ref_url = schema.pop("$ref") - definition = ref_resolver.lookup(ref_url).contents - _expand_refs( - definition, ref_resolver=ref_resolver - ) # expand refs in definitions as well - schema.update(definition) + ref_url = schema.pop("$ref") + if isinstance(ref_url, str) and ref_url in seen: + return # avoid infinite recursion on cyclic refs + if isinstance(ref_url, str): + seen.add(ref_url) + definition = ref_resolver.lookup(ref_url).contents + _expand_refs(definition, ref_resolver=ref_resolver, seen=seen) # expand refs in definitions as well + if not isinstance(definition, MutableMapping): + raise TypeError(f"$ref '{ref_url}' resolved to non-object schema (got {type(definition).__name__})") + schema.update(definition) @@ - for key, value in schema.items(): - _expand_refs(value, ref_resolver=ref_resolver) + for key, value in schema.items(): + _expand_refs(value, ref_resolver=ref_resolver, seen=seen) elif isinstance(schema, List): for value in schema: - _expand_refs(value, ref_resolver=ref_resolver) + _expand_refs(value, ref_resolver=ref_resolver, seen=seen)
🧹 Nitpick comments (8)
pyproject.toml (1)
46-49
: Add a conservative upper bound to referencing to reduce surprise breaks
referencing
is <1.0 and may introduce breaking changes in minors. Since we’ve pinnedjsonschema <5.0
, would you also capreferencing
(e.g.,<0.41
) to keep CI green until we explicitly bump, wdyt?Apply:
-jsonschema = ">=4.17.3,<5.0" +jsonschema = ">=4.17.3,<5.0" @@ -referencing = ">=0.36.2" +referencing = ">=0.36.2,<0.41"airbyte_cdk/sources/utils/transform.py (2)
96-104
: Drop “$ref” (and “array”) from overridden validators after pre-expansionOnce we pre-expand,
$ref
won’t appear and overriding its validator just adds overhead. Also,"array"
isn’t a Draft 7 keyword. Shall we trim these, wdyt?Apply:
- if key in ["type", "array", "$ref", "properties", "items"] + if key in ["type", "properties", "items"]
20-23
: Scope the exception to ImportErrorCatching bare exceptions hides unrelated issues during import. Shall we narrow this to
ImportError
, wdyt?Apply:
-try: - from jsonschema.validators import Validator -except: - from jsonschema import Validator +try: + from jsonschema.validators import Validator +except ImportError: + from jsonschema import Validatorairbyte_cdk/sources/utils/schema_helpers.py (3)
17-19
: Avoid importing privatereferencing._core.Resolver
Relying on a private module is brittle across versions. Would you switch the type hint to
Any
(or a local Protocol) and drop the private import, wdyt?-from referencing._core import Resolver # used for type hints +# Intentionally avoid importing private modules from `referencing` @@ -def _expand_refs(schema: Any, ref_resolver: Resolver) -> None: +def _expand_refs(schema: Any, ref_resolver: Any) -> None:Also applies to: 85-85
70-83
: Use a stable, non-empty root URI for the registryAn empty string URI can be surprising for fragment-only refs and future resource additions. Using a URN makes intent explicit. Would you switch to a stable URN, wdyt?
- Registry().with_resource( - uri="", + Registry().with_resource( + uri="urn:jsonschema:root", resource=resource, ),
85-104
: (Spec nuance) Sibling keywords with$ref
are merged viadict.update
Per JSON Schema best practices, siblings often intend additive constraints (commonly expressed as
allOf: [{$ref: ...}, {...siblings...}]
). A blindupdate
can silently override sibling or referenced keywords. Would you consider composing siblings viaallOf
to preserve semantics, wdyt?If you want to adopt
allOf
, a minimal change would be:- _expand_refs(definition, ref_resolver=ref_resolver, seen=seen) - if not isinstance(definition, MutableMapping): - raise TypeError(...) - schema.update(definition) + _expand_refs(definition, ref_resolver=ref_resolver, seen=seen) + if not isinstance(definition, MutableMapping): + raise TypeError(...) + siblings = dict(schema) # current (post-pop) siblings + schema.clear() + if siblings: + schema.update({"allOf": [definition, siblings]}) + else: + schema.update(definition)unit_tests/sources/utils/test_transform.py (1)
23-26
: Add an assertion exercising the new$ref
(my_type) path
COMPLEX_SCHEMA["properties"]["def"]["properties"]["dd"]
now references#/definitions/my_type
, but no test covers it. Shall we add a quick case to confirm inlining/ref resolution works end-to-end, wdyt?Example new test (outside the parametrized block):
def test_transform_resolves_local_ref_in_nested_def(): schema = deepcopy(COMPLEX_SCHEMA) # If transform relies on pre-expansion elsewhere, call expand_refs here if needed. data = {"def": {"dd": 123}} t = TypeTransformer(TransformConfig.DefaultSchemaNormalization) t.transform(data, schema) assert data == {"def": {"dd": "123"}}Also applies to: 36-36
unit_tests/destinations/vector_db_based/config_test.py (1)
63-66
: Optional: mirror the production schema() signature to avoid driftWould you pass through by_alias/ref_template to stay API-compatible with BaseModel.schema and the prod implementation, wdyt?
- def schema(cls): + def schema(cls, by_alias: bool = True, ref_template: str = ""): """we're overriding the schema classmethod to enable some post-processing""" - schema = super().schema() + schema = super().schema(by_alias=by_alias, ref_template=ref_template)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
📒 Files selected for processing (9)
airbyte_cdk/__init__.py
(0 hunks)airbyte_cdk/destinations/vector_db_based/config.py
(2 hunks)airbyte_cdk/manifest_server/Dockerfile
(2 hunks)airbyte_cdk/sources/utils/schema_helpers.py
(3 hunks)airbyte_cdk/sources/utils/transform.py
(2 hunks)airbyte_cdk/utils/spec_schema_transformations.py
(0 hunks)pyproject.toml
(1 hunks)unit_tests/destinations/vector_db_based/config_test.py
(2 hunks)unit_tests/sources/utils/test_transform.py
(2 hunks)
💤 Files with no reviewable changes (2)
- airbyte_cdk/utils/spec_schema_transformations.py
- airbyte_cdk/init.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
airbyte_cdk/manifest_server/Dockerfile
🧬 Code graph analysis (4)
airbyte_cdk/destinations/vector_db_based/config.py (4)
airbyte_cdk/sources/utils/schema_helpers.py (1)
expand_refs
(106-116)unit_tests/destinations/vector_db_based/config_test.py (1)
schema
(64-69)airbyte_cdk/sources/file_based/config/abstract_file_based_spec.py (1)
schema
(88-98)airbyte_cdk/sources/config.py (1)
schema
(21-27)
unit_tests/destinations/vector_db_based/config_test.py (2)
airbyte_cdk/sources/utils/schema_helpers.py (1)
expand_refs
(106-116)airbyte_cdk/destinations/vector_db_based/config.py (1)
schema
(293-298)
airbyte_cdk/sources/utils/schema_helpers.py (4)
airbyte_cdk/destinations/vector_db_based/config.py (1)
schema
(293-298)unit_tests/destinations/vector_db_based/config_test.py (1)
schema
(64-69)airbyte_cdk/sources/file_based/config/abstract_file_based_spec.py (1)
schema
(88-98)airbyte_cdk/sources/config.py (1)
schema
(21-27)
airbyte_cdk/sources/utils/transform.py (1)
airbyte_cdk/sources/utils/schema_helpers.py (2)
expand_refs
(106-116)get_ref_resolver_registry
(70-82)
🔇 Additional comments (3)
airbyte_cdk/manifest_server/Dockerfile (1)
14-14
: Install the poetry-dynamic-versioning plugin when installing Poetry 2 in the image?Poetry 2 only applies dynamic (VCS-derived) versions when the dynamic-versioning plugin is present — the plugin provides a PEP‑517 build-backend and CLI integration when installed with the "plugin" extra, and Poetry 2 can also install project plugins via [tool.poetry.requires-plugins]; there are Docker/rootless edge-cases where plugin registration can silently fail, so would you prefer to pip-install the plugin here (recommended for a pip-installed Poetry) or declare it in pyproject and verify registration in the image? (github.com)
Apply this minimal diff to install the plugin alongside Poetry:
- pip install poetry==2.0.1 + pip install poetry==2.0.1 'poetry-dynamic-versioning[plugin]<2.0.0'airbyte_cdk/destinations/vector_db_based/config.py (1)
10-10
: LGTM on switching toexpand_refs
This aligns with the new registry-based approach and keeps the method mutation-only usage consistent. Nice cleanup.
Also applies to: 296-296
unit_tests/destinations/vector_db_based/config_test.py (1)
18-18
: Import swap to expand_refs is correctThis aligns with the new registry-based resolver and mirrors the production code path; no issues spotted.
def expand_refs(schema: Any) -> None: | ||
"""Iterate over schema and replace all occurrences of $ref with their definitions. | ||
|
||
If a "definitions" section is present at the root of the schema, it will be removed | ||
after $ref resolution is complete. | ||
|
||
:param schema: schema that will be patched | ||
""" | ||
_expand_refs(schema) | ||
schema.pop("definitions", None) # remove definitions created by $ref | ||
ref_resolver = get_ref_resolver_registry(schema).resolver() | ||
_expand_refs(schema, ref_resolver) | ||
schema.pop("definitions", None) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Return the schema for ergonomic use and back-compat
Some call sites (tests, etc.) assign schema = expand_refs(schema)
. Since this mutates in place, could we also return schema
to keep both styles working, wdyt?
-def expand_refs(schema: Any) -> None:
+def expand_refs(schema: Any) -> Any:
@@
- schema.pop("definitions", None)
+ schema.pop("definitions", None)
+ return schema
Run to find assignment-style usages that would otherwise break:
🏁 Script executed:
#!/bin/bash
# Find assignment-style expand_refs usage
rg -nP --type=py '^\s*\w+\s*=\s*expand_refs\s*\(' -C1
# Find all expand_refs call sites for a quick audit
rg -nP --type=py '\bexpand_refs\s*\(' -C2
Length of output: 3461
Return schema from expand_refs for ergonomic use and back-compat?
One assignment-style caller was found at unit_tests/destinations/vector_db_based/config_test.py:67 — change the signature to return the schema so both assignment and in-place usages work.
-def expand_refs(schema: Any) -> None:
+def expand_refs(schema: Any) -> Any:
@@
- schema.pop("definitions", None)
+ schema.pop("definitions", None)
+ return schema
wdyt?
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def expand_refs(schema: Any) -> None: | |
"""Iterate over schema and replace all occurrences of $ref with their definitions. | |
If a "definitions" section is present at the root of the schema, it will be removed | |
after $ref resolution is complete. | |
:param schema: schema that will be patched | |
""" | |
_expand_refs(schema) | |
schema.pop("definitions", None) # remove definitions created by $ref | |
ref_resolver = get_ref_resolver_registry(schema).resolver() | |
_expand_refs(schema, ref_resolver) | |
schema.pop("definitions", None) | |
def expand_refs(schema: Any) -> Any: | |
"""Iterate over schema and replace all occurrences of $ref with their definitions. | |
If a "definitions" section is present at the root of the schema, it will be removed | |
after $ref resolution is complete. | |
:param schema: schema that will be patched | |
""" | |
ref_resolver = get_ref_resolver_registry(schema).resolver() | |
_expand_refs(schema, ref_resolver) | |
schema.pop("definitions", None) | |
return schema |
🤖 Prompt for AI Agents
In airbyte_cdk/sources/utils/schema_helpers.py around lines 106 to 117, update
expand_refs to return the schema after performing in-place $ref expansion so
callers that assign its result continue to work; change the function signature
to return the schema (keep existing in-place behavior), update the docstring to
mention the return value, and ensure the function ends with "return schema"
after popping "definitions"; adjust the type hint to reflect the returned value
if desired and run/update callers (e.g.,
unit_tests/destinations/vector_db_based/config_test.py) that expect an
assignment to use the returned schema.
# Very first step is to expand $refs in the schema itself: | ||
expand_refs(schema) | ||
|
||
# Now we can expand $refs in the property value: | ||
if isinstance(property_value, dict): | ||
expand_refs(property_value) | ||
|
||
def resolve(subschema: dict[str, Any]) -> dict[str, Any]: | ||
if "$ref" in subschema: | ||
_, resolved = cast( | ||
RefResolver, | ||
validator_instance.resolver, | ||
).resolve(subschema["$ref"]) | ||
return cast(dict[str, Any], resolved) | ||
return subschema | ||
# Now we can validate and normalize the values: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don’t expand refs on every validator callback (perf + possible Unresolvable)
Calling expand_refs(schema)
per-visit mutates the schema repeatedly and can raise Unresolvable
when expanding dict-valued fragments without root context. Can we remove in-callback expansion and do a single pre-expansion before validator creation, wdyt?
Apply:
- # Very first step is to expand $refs in the schema itself:
- expand_refs(schema)
-
- # Now we can expand $refs in the property value:
- if isinstance(property_value, dict):
- expand_refs(property_value)
+ # $refs are pre-expanded once before validator construction.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Very first step is to expand $refs in the schema itself: | |
expand_refs(schema) | |
# Now we can expand $refs in the property value: | |
if isinstance(property_value, dict): | |
expand_refs(property_value) | |
def resolve(subschema: dict[str, Any]) -> dict[str, Any]: | |
if "$ref" in subschema: | |
_, resolved = cast( | |
RefResolver, | |
validator_instance.resolver, | |
).resolve(subschema["$ref"]) | |
return cast(dict[str, Any], resolved) | |
return subschema | |
# Now we can validate and normalize the values: | |
# $refs are pre-expanded once before validator construction. | |
# Now we can validate and normalize the values: |
@@ -64,7 +64,7 @@ def remove_discriminator(schema: dict) -> None: | |||
def schema(cls): | |||
"""we're overriding the schema classmethod to enable some post-processing""" | |||
schema = super().schema() | |||
schema = resolve_refs(schema) | |||
schema = expand_refs(schema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix: expand_refs mutates in-place; assigning its return sets schema to None
expand_refs returns None and mutates the dict; assigning it to schema will make schema None and break remove_discriminator/assertions. Suggest removing the assignment to match the implementation in airbyte_cdk/destinations/vector_db_based/config.py, wdyt?
Apply this diff:
- schema = expand_refs(schema)
+ expand_refs(schema)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
schema = expand_refs(schema) | |
expand_refs(schema) |
🤖 Prompt for AI Agents
In unit_tests/destinations/vector_db_based/config_test.py around line 67, the
test assigns the result of expand_refs to schema even though expand_refs mutates
the dict in-place and returns None; remove the assignment and just call
expand_refs(schema) so schema remains the populated dict for subsequent
remove_discriminator/assertions.
Summary by CodeRabbit