Skip to content

Conversation

dbgold17
Copy link
Contributor

@dbgold17 dbgold17 commented Sep 9, 2025

I found these changes to be helpful when working on updating the cdk to enable new functionality in a manifest-only connector so I figured I'd contribute them back.

  1. upgrade the dagger version used by the bin/generate_component_manifest_files.py script for generating pydantic models from the declarative_component_schema.yaml file. This fixes an issue I ran into locally when using 0.13.3 where the script would error our with
error: parse selections: parse field "args": __Field.args has no such argument: "includeDeprecated"
⠴ Establishing connection to the API serverError: Sequence aborted after failed subtask 'assemble' 
  1. declare the dagger dependency as a normal dev dependency in the pyproject.toml file instead of hardcoded in the script.

  2. run poe ruff-fix automatically after regenerating pydantic models to avoid seeing unrelated spacing / indentation diffs

  3. modify the debug_manifest.py script to allow the caller to specify a path to a manifest yaml file and custom components python file, matching what exists for configs and catalogs. I found it useful when I had multiple slightly varying copies of a manifest file locally. If not specified, the script will default to looking for the file in the previously hardcoded location.

Summary by CodeRabbit

  • New Features

    • Debug tool accepts a manifest path (with a sensible default) and an optional components path for loading extra components.
  • Documentation

    • Updated debugging instructions and VSCode launch example to show manifest-path and components-path usage; added a placeholder interpreter note.
  • Chores

    • Bumped dagger-io to 0.18.6.
    • Removed explicit dependency install step from the manifest generator script.
    • Added automatic code-formatting to build tasks.

… format python files after generating pydnatic models, upgrade dagger version, install dagger normally as a dev poetry dependency, allow specifying a custom manifest file path in IDE debug configuration.
@Copilot Copilot AI review requested due to automatic review settings September 9, 2025 21:40
Copilot

This comment was marked as outdated.

Copy link

github-actions bot commented Sep 9, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@dbgold17/cdk-local-dev-improvements#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch dbgold17/cdk-local-dev-improvements

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.

Copy link
Contributor

coderabbitai bot commented Sep 9, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Removed an inline pip install from the Dagger manifest generator script, declared dagger-io in pyproject.toml and added ruff-fix to the Poe build sequence, and updated debug_manifest code and README to accept --manifest-path (defaulting to resources/manifest.yaml) and an optional --components-path for dynamic component registration.

Changes

Cohort / File(s) Summary of changes
Manifest generation script
bin/generate-component-manifest-dagger.sh
Removed explicit pip install dagger-io==0.13.3; script now sets -e and runs python bin/generate_component_manifest_files.py.
Project dependencies & build tasks
pyproject.toml
Added dagger-io = "0.18.6" to normal and dev dependencies; appended "ruff-fix" to the Poe build task sequence.
Debug manifest CLI & docs
debug_manifest/debug_manifest.py, debug_manifest/README.md
debug_manifest.py now parses args via AirbyteEntrypoint.parse_args, computes manifest_path = parsed_args.manifest_path or "resources/manifest.yaml", optionally registers components from --components-path via a new _register_components_from_file(filepath: str) helper (uses importlib to load and register module under "components" and "source_declarative_manifest.components"), and passes path_to_yaml=manifest_path into YamlDeclarativeSource and the debug_manifest invocation. README launch snippet updated to include --manifest-path resources/manifest.yaml and optional --components-path.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User/CLI
  participant DM as debug_manifest.py
  participant AE as AirbyteEntrypoint
  participant CMP as components loader
  participant YS as YamlDeclarativeSource

  U->>DM: Run debug_manifest [--manifest-path <path>] [--components-path <file>]
  DM->>AE: AirbyteEntrypoint.parse_args(args)
  AE-->>DM: parsed args (manifest_path, components_path, ...)
  DM->>DM: manifest_path = parsed_args.manifest_path or "resources/manifest.yaml"
  alt components_path provided
    DM->>CMP: _register_components_from_file(components_path)
    CMP-->>DM: module registered under "components" and "source_declarative_manifest.components"
  end
  DM->>YS: init(path_to_yaml=manifest_path, ...)
  YS-->>DM: Source initialized
  DM-->>U: debug_manifest runs using the provided/default manifest
  Note right of YS: manifest_path drives YAML loading
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • maxi297

Would you like me to also check CI images and common local dev setups for dagger-io 0.18.6 availability and suggest updating CI images if missing, wdyt?

Pre-merge checks (2 passed, 1 inconclusive)

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title Check ❓ Inconclusive The current title uses generic phrasing like “minor improvements” and doesn’t surface the key changes—upgrading the dagger dependency, declaring it in pyproject.toml, adding a ruff-fix step, and introducing CLI options for manifest and components paths—making it too vague to quickly convey the PR’s main intent. Could you consider a more descriptive title that highlights the primary changes, for example “chore: upgrade dagger to 0.18.6, add manifest/components CLI options, and integrate ruff-fix into build tasks”? Wdyt?
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dbgold17/cdk-local-dev-improvements

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
pyproject.toml (2)

151-151: Run ruff-fix before packaging so the wheel includes formatted/generated code

Right now build formats after build-package, so the artifacts may not reflect the fixes. Shall we move ruff-fix ahead of packaging, wdyt?

-build = {sequence = ["assemble", "openapi-generate", "build-package", "ruff-fix"], help = "Run all tasks to build the package."}
+build = {sequence = ["assemble", "openapi-generate", "ruff-fix", "build-package"], help = "Run all tasks to build the package."}

123-123: Loosen dagger-io pin to a patch-safe range and ensure dev extras installation?

• Change in pyproject.toml from
dagger-io = "0.18.6"
to
dagger-io = ">=0.18.6,<0.19.0"
so future 0.18.x patches (latest upstream is 0.18.17) land without manual bumps.
• Add or document a poetry install --with dev step (e.g. in README and/or a new poe install task) so bin/generate-component-manifest-dagger.sh reliably finds the dev-group dependency—wdyt?

debug_manifest/README.md (2)

25-25: Minor: mark the snippet as jsonc to match inline comments

Since the snippet uses // comments, would you switch the code fence to jsonc for accuracy and better highlighting in some renderers, wdyt?


30-33: Great addition; consider documenting --components-path and clarifying env install

Nice call adding --manifest-path. Would you also mention the optional --components-path flag (supported by the entrypoint) in this snippet and below, so users discover it easily, wdyt? Additionally, to ensure local tasks that rely on dev deps (like Dagger) work out of the box, shall we tweak the install step to suggest poetry install --with dev --all-extras, wdyt?

debug_manifest/debug_manifest.py (1)

33-39: Pass through debug and config_path for better traceability

Since YamlDeclarativeSource supports debug and config_path, would you wire them through for improved DX, wdyt?

     debug_manifest(
         YamlDeclarativeSource(
-            path_to_yaml=manifest_path,
+            path_to_yaml=manifest_path,
+            debug=getattr(parsed_args, "debug", False),
             catalog=YamlDeclarativeSource.read_catalog(catalog_path) if catalog_path else None,
             config=YamlDeclarativeSource.read_config(config_path) if config_path else None,
-            state=YamlDeclarativeSource.read_state(state_path) if state_path else None,
+            state=YamlDeclarativeSource.read_state(state_path) if state_path else None,
+            config_path=config_path,
         ),
         args,
     )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae0e8aa and 972cc60.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • bin/generate-component-manifest-dagger.sh (0 hunks)
  • debug_manifest/README.md (1 hunks)
  • debug_manifest/debug_manifest.py (1 hunks)
  • pyproject.toml (2 hunks)
💤 Files with no reviewable changes (1)
  • bin/generate-component-manifest-dagger.sh
🧰 Additional context used
🧬 Code graph analysis (1)
debug_manifest/debug_manifest.py (2)
airbyte_cdk/entrypoint.py (4)
  • AirbyteEntrypoint (53-368)
  • extract_catalog (352-356)
  • extract_config (359-363)
  • extract_state (345-349)
airbyte_cdk/sources/declarative/yaml_declarative_source.py (1)
  • YamlDeclarativeSource (17-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build

Copy link

github-actions bot commented Sep 9, 2025

PyTest Results (Fast)

3 743 tests  +1   3 732 ✅ +2   7m 2s ⏱️ +43s
    1 suites ±0      11 💤  - 1 
    1 files   ±0       0 ❌ ±0 

Results for commit b6b7768. ± Comparison against base commit ae0e8aa.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Sep 9, 2025

PyTest Results (Full)

3 746 tests  +1   3 734 ✅ +1   11m 0s ⏱️ -2s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit b6b7768. ± Comparison against base commit ae0e8aa.

♻️ This comment has been updated with latest results.

@dbgold17 dbgold17 requested a review from Copilot September 9, 2025 22:18
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR provides minor improvements to the development experience when working on manifest-only connectors locally. The changes include upgrading dependencies, improving the development workflow, and adding flexibility to debugging tools.

Key changes:

  • Updated dagger dependency version and moved it to pyproject.toml as a dev dependency
  • Enhanced the debug manifest script to support custom manifest file paths with a sensible default
  • Added automatic code formatting to the build process

Reviewed Changes

Copilot reviewed 3 out of 5 changed files in this pull request and generated 1 comment.

File Description
pyproject.toml Added dagger-io dependency and automatic ruff formatting to build task
debug_manifest/debug_manifest.py Added support for custom manifest paths via command line arguments
debug_manifest/README.md Updated documentation to include manifest-path option and clarified placeholders

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@dbgold17 dbgold17 changed the title dx: minor improvements to working on manifest only connectors locally chore: minor improvements to working on manifest only connectors locally Sep 9, 2025
@dbgold17
Copy link
Contributor Author

@maxi297 @pnilan can I get a review?

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me

@github-actions github-actions bot added the chore label Sep 10, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
debug_manifest/debug_manifest.py (3)

19-43: Harden dynamic component import to avoid partial module pollution

If exec_module fails, the pre-inserted entries in sys.modules remain and can cause confusing follow-on errors. Could we add existence checks and clean up sys.modules on failure, wdyt?

@@
-def _register_components_from_file(filepath: str) -> None:
+def _register_components_from_file(filepath: str) -> None:
@@
-    components_path = Path(filepath)
+    components_path = Path(filepath).resolve()
+    if not components_path.exists():
+        raise FileNotFoundError(f"Components file not found: {components_path}")
@@
-    sys.modules[module_name] = module
-    sys.modules[sdm_module_name] = module
-
-    spec.loader.exec_module(module)
+    sys.modules[module_name] = module
+    sys.modules[sdm_module_name] = module
+    try:
+        spec.loader.exec_module(module)
+    except Exception as e:
+        # cleanup partially initialized modules
+        sys.modules.pop(module_name, None)
+        sys.modules.pop(sdm_module_name, None)
+        raise ImportError(f"Failed to import components from {components_path}") from e

52-54: Avoid re-parsing args via extract_ helpers*

The extract_* helpers call parse_args again. Since we already have parsed_args, shall we reuse it to avoid triple parsing, wdyt?

-    catalog_path = AirbyteEntrypoint.extract_catalog(args)
-    config_path = AirbyteEntrypoint.extract_config(args)
-    state_path = AirbyteEntrypoint.extract_state(args)
+    catalog_path = getattr(parsed_args, "catalog", None)
+    config_path = getattr(parsed_args, "config", None)
+    state_path = getattr(parsed_args, "state", None)

58-62: Pass config_path through for better debug context

YamlDeclarativeSource accepts config_path; passing it can improve emitted debug info. Shall we include it, wdyt?

         YamlDeclarativeSource(
             path_to_yaml=manifest_path,
+            config_path=config_path,
             catalog=YamlDeclarativeSource.read_catalog(catalog_path) if catalog_path else None,
             config=YamlDeclarativeSource.read_config(config_path) if config_path else None,
             state=YamlDeclarativeSource.read_state(state_path) if state_path else None,
         ),
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 598fe31 and fd85d96.

📒 Files selected for processing (2)
  • debug_manifest/README.md (1 hunks)
  • debug_manifest/debug_manifest.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • debug_manifest/README.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • debug_manifest/debug_manifest.py
🧬 Code graph analysis (1)
debug_manifest/debug_manifest.py (2)
airbyte_cdk/entrypoint.py (4)
  • AirbyteEntrypoint (53-368)
  • extract_catalog (352-356)
  • extract_config (359-363)
  • extract_state (345-349)
airbyte_cdk/sources/declarative/yaml_declarative_source.py (1)
  • YamlDeclarativeSource (17-69)
🪛 GitHub Actions: Linters
debug_manifest/debug_manifest.py

[error] 16-22: Ruff formatting would reformat 1 file. Command 'poetry run ruff format --diff .' exited with code 1. Run 'poetry run ruff format' to apply code style fixes.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
🔇 Additional comments (2)
debug_manifest/debug_manifest.py (2)

48-51: Nice consolidation of arg parsing and safer defaults

Using a single parse and getattr with a default avoids AttributeError on commands like spec and removes the earlier double-parse on manifest_path. LGTM, thanks for addressing the earlier feedback, wdyt?


1-65: Fix Ruff formatting failure

CI is red on Ruff formatting. Could you run poetry run ruff format (or poe ruff-fix) and commit the result, wdyt?

@dbgold17 dbgold17 requested a review from maxi297 September 10, 2025 15:54
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
debug_manifest/debug_manifest.py (3)

29-36: Small nits: prefer is_file() and avoid re-importing sys.

  • Using Path.is_file() guards against directories.
  • sys is already imported at module scope; no need to re-import inside the function. Wdyt?

Apply:

-    import sys
     from pathlib import Path
@@
-    if not components_path.exists():
+    if not components_path.is_file():
         raise FileNotFoundError(f"Components file not found: {components_path}")

47-53: Guard against clobbering existing modules and ensure rollback on import failure.

If another module named "components" is already present, this unconditionally overwrites it and could leave sys.modules in a bad state on exceptions. Shall we wrap registration in a try/except and restore prior modules on failure, wdyt?

-    # Register then execute the module
-    # we dual-register the module to mirror what is done elsewhere in the CDK
-    sys.modules[module_name] = module
-    sys.modules[sdm_module_name] = module
-
-    spec.loader.exec_module(module)
+    # Register then execute the module (with rollback if execution fails)
+    prev_components = sys.modules.get(module_name)
+    prev_sdm_components = sys.modules.get(sdm_module_name)
+    try:
+        sys.modules[module_name] = module
+        sys.modules[sdm_module_name] = module
+        spec.loader.exec_module(module)
+    except Exception:
+        if prev_components is None:
+            sys.modules.pop(module_name, None)
+        else:
+            sys.modules[module_name] = prev_components
+        if prev_sdm_components is None:
+            sys.modules.pop(sdm_module_name, None)
+        else:
+            sys.modules[sdm_module_name] = prev_sdm_components
+        raise

59-66: Avoid re-parsing args for catalog/config/state.

extract_* re-parses the CLI; since you already have parsed_args, can we read these directly to save work and keep things consistent, wdyt?

-    catalog_path = AirbyteEntrypoint.extract_catalog(args)
-    config_path = AirbyteEntrypoint.extract_config(args)
-    state_path = AirbyteEntrypoint.extract_state(args)
+    catalog_path = getattr(parsed_args, "catalog", None)
+    config_path = getattr(parsed_args, "config", None)
+    state_path = getattr(parsed_args, "state", None)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 69dfe3b and 7192a5f.

📒 Files selected for processing (1)
  • debug_manifest/debug_manifest.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • debug_manifest/debug_manifest.py
🧬 Code graph analysis (1)
debug_manifest/debug_manifest.py (2)
airbyte_cdk/entrypoint.py (4)
  • AirbyteEntrypoint (53-368)
  • extract_catalog (352-356)
  • extract_config (359-363)
  • extract_state (345-349)
airbyte_cdk/sources/declarative/yaml_declarative_source.py (1)
  • YamlDeclarativeSource (17-69)
🪛 GitHub Actions: Linters
debug_manifest/debug_manifest.py

[error] 19-29: Ruff format would reformat 1 file: debug_manifest/debug_manifest.py. Run 'poetry run ruff format' to apply formatting.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
🔇 Additional comments (4)
debug_manifest/debug_manifest.py (4)

20-54: Solid dynamic component registration and clear error handling — LGTM.

The helper cleanly loads and dual-registers the module and surfaces useful errors. Nice touch for local dev.


57-62: Nice: single parse_args() and getattr() defaulting.

This addresses the earlier double-parse/AttributeError concern. LGTM.


67-73: Passing manifest_path through to YamlDeclarativeSource looks correct.

Keeps default behavior while enabling overrides. LGTM.


20-29: Verify Ruff formatting locally
Could you run poetry run ruff format debug_manifest/debug_manifest.py (or poe ruff-fix) in your local environment and commit any formatting changes so that this file matches the CI’s expectations? wdyt?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
debug_manifest/debug_manifest.py (1)

57-65: Parse args once; reuse parsed_args for catalog/config/state

We still re-parse via AirbyteEntrypoint.extract_*, which internally calls parse_args again. Can we read these directly from parsed_args to avoid redundant parsing, for consistency and tiny perf/clarity win, wdyt?

-    catalog_path = AirbyteEntrypoint.extract_catalog(args)
-    config_path = AirbyteEntrypoint.extract_config(args)
-    state_path = AirbyteEntrypoint.extract_state(args)
+    catalog_path = getattr(parsed_args, "catalog", None)
+    config_path = getattr(parsed_args, "config", None)
+    state_path = getattr(parsed_args, "state", None)
🧹 Nitpick comments (2)
debug_manifest/debug_manifest.py (2)

67-75: Propagate debug and config_path into YamlDeclarativeSource

Since YamlDeclarativeSource.__init__ accepts debug and config_path, would passing them improve diagnostics and parity with runtime behavior, wdyt?

     debug_manifest(
         YamlDeclarativeSource(
-            path_to_yaml=manifest_path,
+            path_to_yaml=manifest_path,
+            debug=getattr(parsed_args, "debug", False),
             catalog=YamlDeclarativeSource.read_catalog(catalog_path) if catalog_path else None,
             config=YamlDeclarativeSource.read_config(config_path) if config_path else None,
             state=YamlDeclarativeSource.read_state(state_path) if state_path else None,
+            config_path=config_path,
         ),
         args,
     )

20-53: Harden dynamic component loader (file checks; drop redundant import; minor robustness)

A couple of small guardrails could make this safer to use locally without surprises: ensure the path is a file with a .py suffix, normalize it, and avoid the redundant inner import sys. Also, passing a Path directly to spec_from_file_location is fine in newer Python, but converting to str is the most compatible. Wdyt?

 def _register_components_from_file(filepath: str) -> None:
@@
-    import importlib.util
-    import sys
-    from pathlib import Path
+    import importlib.util
+    from pathlib import Path
@@
-    components_path = Path(filepath)
-    if not components_path.exists():
+    components_path = Path(filepath).expanduser().resolve()
+    if not components_path.exists():
         raise FileNotFoundError(f"Components file not found: {components_path}")
+    if not components_path.is_file():
+        raise FileNotFoundError(f"Components path is not a file: {components_path}")
+    if components_path.suffix != ".py":
+        raise ValueError(f"Components file must be a .py file: {components_path}")
@@
-    spec = importlib.util.spec_from_file_location(module_name, components_path)
+    spec = importlib.util.spec_from_file_location(module_name, str(components_path))
@@
-    sys.modules[module_name] = module
-    sys.modules[sdm_module_name] = module
+    sys.modules[module_name] = module
+    sys.modules[sdm_module_name] = module

Optional: if you expect repeated invocations with different files in the same process, do you want to warn when overwriting existing entries in sys.modules (e.g., via if module_name in sys.modules: ... log/warn), wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7192a5f and b6b7768.

📒 Files selected for processing (1)
  • debug_manifest/debug_manifest.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • debug_manifest/debug_manifest.py
🧬 Code graph analysis (1)
debug_manifest/debug_manifest.py (2)
airbyte_cdk/entrypoint.py (4)
  • AirbyteEntrypoint (53-368)
  • extract_catalog (352-356)
  • extract_config (359-363)
  • extract_state (345-349)
airbyte_cdk/sources/declarative/yaml_declarative_source.py (1)
  • YamlDeclarativeSource (17-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Manifest Server Docker Image Build
🔇 Additional comments (1)
debug_manifest/debug_manifest.py (1)

59-62: Nice: CLI flexibility + safe defaults look good

Using getattr(..., None) or "resources/manifest.yaml" for both manifest_path and components_path gracefully supports commands like spec that don’t define these args. 👍 Wdyt about keeping this pattern consistent elsewhere too?

@dbgold17 dbgold17 merged commit fd3a330 into main Sep 10, 2025
29 checks passed
@dbgold17 dbgold17 deleted the dbgold17/cdk-local-dev-improvements branch September 10, 2025 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants