@maxi297 maxi297 commented Aug 26, 2025

What

https://github.com/airbytehq/oncall/issues/8851

https://airbytehq-team.slack.com/archives/C09C4CK3BEW

How

Ensure "$parameters" is populated from the "parameters" value.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed parameter propagation in declarative sources using concurrent cursors so per-stream parameters are correctly applied, improving incremental sync reliability and compatibility between manifest- and model-defined components.
  • Tests

    • Added a unit test to verify per-stream parameter propagation into concurrent cursor configurations.

@github-actions github-actions bot added the bug (Something isn't working) and security labels Aug 26, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/fix_parameter_propagation_for_concurrent_cursors#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/fix_parameter_propagation_for_concurrent_cursors

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerates git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment


@maxi297 maxi297 requested review from brianjlai, tolik0 and pnilan August 26, 2025 19:17

coderabbitai bot commented Aug 26, 2025

📝 Walkthrough


Copy "parameters" into "$parameters" when absent for two concurrent-cursor factory paths before parsing into Pydantic models, ensuring both manifest-shaped and model.dict-shaped component inputs propagate parameters into the concurrent cursor construction. No public signatures changed.
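The normalization described in this walkthrough can be sketched roughly as follows. This is an illustration only, not the CDK's actual code: the helper name `ensure_dollar_parameters` and the sample dictionaries are hypothetical, and the sketch assumes the component definition arrives as a plain mapping.

```python
from typing import Any, Dict, Mapping


def ensure_dollar_parameters(component_definition: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy with "$parameters" filled from "parameters" when it is absent.

    Hypothetical helper for illustration; the name is not part of the CDK.
    """
    # Shallow copy so the caller's mapping is left untouched.
    normalized = dict(component_definition)
    if "$parameters" not in normalized and "parameters" in normalized:
        normalized["$parameters"] = normalized["parameters"]
    return normalized


# Manifest-shaped input already carries "$parameters" and is left as-is.
manifest_shaped = {"type": "DatetimeBasedCursor", "$parameters": {"cursor_field": "created_at"}}
# model.dict()-shaped input only carries "parameters".
model_shaped = {"type": "DatetimeBasedCursor", "parameters": {"cursor_field": "created_at"}}

assert ensure_dollar_parameters(manifest_shaped)["$parameters"] == {"cursor_field": "created_at"}
assert ensure_dollar_parameters(model_shaped)["$parameters"] == {"cursor_field": "created_at"}
```

An existing "$parameters" value is never overwritten, which matches the "when absent" wording of the walkthrough.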

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Concurrent cursor factory: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py | Added logic in create_concurrent_cursor_from_datetime_based_cursor and create_concurrent_cursor_from_perpartition_cursor to copy parameters → $parameters when $parameters is missing, preserving parameter propagation for two input shapes; left remaining parsing logic unchanged. |
| Unit tests: unit_tests/sources/declarative/test_concurrent_declarative_source.py | Added test_parameter_propagation_for_concurrent_cursor to verify per-stream parameters propagate into the concurrent cursor's cursor_field. |

Sequence Diagram(s)

sequenceDiagram
  participant Manifest as Manifest / Model dict
  participant Factory as ModelToComponentFactory
  participant Pydantic as Pydantic Model Parser

  rect #F0F9FF
  Manifest->>Factory: provide component_definition (may have "parameters" or "$parameters")
  end

  rect #F7FFF0
  Note right of Factory: If "$parameters" missing\ncopy "parameters" -> "$parameters"
  Factory->>Factory: ensure "$parameters" present
  end

  Factory->>Pydantic: parse component_definition into model (uses $parameters)
  Pydantic-->>Factory: parsed component instance

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • pnilan
  • tolik0

Would you like me to draft a short comment suggesting unifying the two parameter input shapes inside the factory (so propagation consistently happens in one place), wdyt?


coderabbitai bot previously requested changes Aug 26, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 4475c57 and 6d6bbed.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

[error] 1286-1286: Command 'poetry run mypy --config-file mypy.ini airbyte_cdk' failed with mypy error: Unsupported target for indexed assignment ("Mapping[str, Any]") [index]


[error] 1596-1596: Command 'poetry run mypy --config-file mypy.ini airbyte_cdk' failed with mypy error: Unsupported target for indexed assignment ("Mapping[str, Any]") [index]
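For context on the two mypy errors above: `Mapping` declares no `__setitem__`, so a direct `definition["$parameters"] = ...` on a value annotated as `Mapping[str, Any]` is rejected by the type checker. One common way around this, sketched below under the assumption that a shallow copy is acceptable, is to build a mutable `dict` first. The function name `add_key` is hypothetical and this is not necessarily how the PR resolved the error.

```python
from types import MappingProxyType
from typing import Any, Dict, Mapping


def add_key(definition: Mapping[str, Any], key: str, value: Any) -> Dict[str, Any]:
    # Writing definition[key] = value here would trigger the mypy [index] error,
    # because Mapping is a read-only interface. Copy into a dict first.
    updated: Dict[str, Any] = dict(definition)
    updated[key] = value
    return updated


# MappingProxyType is a Mapping that is also read-only at runtime.
read_only = MappingProxyType({"parameters": {"cursor_field": "created_at"}})
result = add_key(read_only, "$parameters", read_only["parameters"])

assert result["$parameters"] == {"cursor_field": "created_at"}
assert "$parameters" not in read_only  # the original mapping is unchanged
```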

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)


github-actions bot commented Aug 26, 2025

PyTest Results (Fast)

3 721 tests +1 | 3 710 ✅ +1 | 11 💤 ±0 | 0 ❌ ±0 | 1 suites ±0 | 1 files ±0 | 7m 9s ⏱️ -1s

Results for commit 183262c. ± Comparison against base commit 4475c57.

♻️ This comment has been updated with latest results.


github-actions bot commented Aug 26, 2025

PyTest Results (Full)

3 724 tests +1 | 3 713 ✅ +1 | 11 💤 ±0 | 0 ❌ ±0 | 1 suites ±0 | 1 files ±0 | 10m 5s ⏱️ -10s

Results for commit 183262c. ± Comparison against base commit 4475c57.

♻️ This comment has been updated with latest results.

@maxi297 maxi297 dismissed coderabbitai[bot]’s stale review August 26, 2025 19:46

Their concern should be addressed in an upcoming PR


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (5)
unit_tests/sources/declarative/test_concurrent_declarative_source.py (5)

4740-4742: Strengthen the assertion by checking we’re actually on the concurrent path.

Would you consider asserting the stream and cursor types before the final key check? That makes the intent explicit and guards against future refactors where streams() could return a different wrapper. Wdyt?

-    streams = source.streams({})
-
-    assert streams[0].cursor.cursor_field.cursor_field_key == cursor_field_parameter_override
+    streams = source.streams({})
+    # Extra safety to ensure we hit the concurrent path
+    assert isinstance(streams[0], DefaultStream)
+    assert isinstance(streams[0].cursor, ConcurrentCursor)
+    assert streams[0].name == "stream_name"
+    assert streams[0].cursor.cursor_field.cursor_field_key == cursor_field_parameter_override

4733-4741: Optional: use _group_streams and select by name to avoid ordering assumptions.

Most tests here validate concurrent constructs via _group_streams. Using it and selecting by name will make this test more resilient if stream ordering changes. Interested in switching? Wdyt?

-    streams = source.streams({})
-
-    assert streams[0].cursor.cursor_field.cursor_field_key == cursor_field_parameter_override
+    concurrent_streams, _ = source._group_streams(config={})
+    stream = next(s for s in concurrent_streams if s.name == "stream_name")
+    assert isinstance(stream, DefaultStream)
+    assert isinstance(stream.cursor, ConcurrentCursor)
+    assert stream.cursor.cursor_field.cursor_field_key == cursor_field_parameter_override

4654-4660: Tiny readability bump: add a one-line docstring explaining the scenario.

This helps future readers quickly understand the regression being covered. Add a brief why here? Wdyt?

 def test_parameter_propagation_for_concurrent_cursor():
+    """Ensures stream $parameters['cursor_field'] propagates into the concurrent DatetimeBased cursor."""

4654-4742: Broaden coverage: add a per-partition concurrent cursor variant.

The fix also touches the per-partition concurrent cursor factory path. Would you like to add a sibling test that sets up a partition_router and verifies the same $parameters override flows into that cursor too? I can help wire it if useful. Wdyt?

Example outline you could drop below this test:

def test_parameter_propagation_for_concurrent_perpartition_cursor():
    override = "created_at"
    manifest = {
        "version": "5.0.0",
        "definitions": {
            "selector": {"type": "RecordSelector", "extractor": {"type": "DpathExtractor", "field_path": []}},
            "requester": {"type": "HttpRequester", "url_base": "https://persona.metaverse.com", "http_method": "GET"},
            "retriever": {
                "type": "SimpleRetriever",
                "record_selector": {"$ref": "#/definitions/selector"},
                "paginator": {"type": "NoPagination"},
                "requester": {"$ref": "#/definitions/requester"},
                "partition_router": {"type": "ListPartitionRouter", "cursor_field": "partition_id", "values": ["A", "B"]},
            },
            "incremental_cursor": {
                "type": "DatetimeBasedCursor",
                "start_datetime": {"datetime": "2024-01-01"},
                "end_datetime": {"datetime": "2024-12-31"},
                "datetime_format": "%Y-%m-%d",
                "cursor_datetime_formats": ["%Y-%m-%d"],
                "cursor_granularity": "P1D",
                "step": "P400D",
                "cursor_field": "{{ parameters.get('cursor_field', 'updated_at') }}",
                "start_time_option": {"type": "RequestOption", "field_name": "start", "inject_into": "request_parameter"},
                "end_time_option": {"type": "RequestOption", "field_name": "end", "inject_into": "request_parameter"},
            },
            "incremental_stream": {
                "retriever": {"$ref": "#/definitions/retriever", "requester": {"$ref": "#/definitions/requester"}},
                "incremental_sync": {"$ref": "#/definitions/incremental_cursor"},
                "$parameters": {"name": "perpartition", "primary_key": "id", "path": "/path", "cursor_field": override},
                "schema_loader": {"type": "InlineSchemaLoader", "schema": {"$schema": "https://json-schema.org/draft-07/schema#", "type": "object", "properties": {"id": {"type": ["null", "string"]}}}},
            },
        },
        "streams": ["#/definitions/incremental_stream"],
        "check": {"stream_names": ["perpartition"]},
        "concurrency_level": {"type": "ConcurrencyLevel", "default_concurrency": 5, "max_concurrency": 25},
    }

    source = ConcurrentDeclarativeSource(source_config=manifest, config={}, catalog=create_catalog("perpartition"), state=None)
    concurrent_streams, _ = source._group_streams(config={})
    stream = next(s for s in concurrent_streams if s.name == "perpartition")
    assert isinstance(stream.cursor, ConcurrentCursor)
    assert stream.cursor.cursor_field.cursor_field_key == override

4676-4678: Use object form for end_datetime for consistency

I ran a search across our declarative tests and found that every other static example in this suite uses the object shape for end_datetime—only this one remains as a bare string. Would you mind switching it to the object form so it matches the rest? Wdyt?

• Location: unit_tests/sources/declarative/test_concurrent_declarative_source.py:4677

-               "end_datetime": "2024-12-31",
+               "end_datetime": {"datetime": "2024-12-31"},
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 6d6bbed and 183262c.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2 hunks)
  • unit_tests/sources/declarative/test_concurrent_declarative_source.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🧰 Additional context used
🧬 Code graph analysis (1)
unit_tests/sources/declarative/test_concurrent_declarative_source.py (2)
airbyte_cdk/sources/declarative/incremental/concurrent_partition_cursor.py (2)
  • state (137-157)
  • cursor_field (133-134)
airbyte_cdk/sources/streams/concurrent/default_stream.py (2)
  • cursor (92-93)
  • cursor_field (52-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
unit_tests/sources/declarative/test_concurrent_declarative_source.py (1)

4654-4742: Nice, focused regression test for parameter propagation into concurrent cursor.

This directly validates that a stream-level $parameters override flows into the DatetimeBased concurrent cursor’s cursor_field. It aligns with the PR objective and should prevent a silent fallback to the default "updated_at". LGTM.
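The "silent fallback" mentioned here can be illustrated with plain Python. The sketch assumes the cursor field is defined by a template of the form `{{ parameters.get('cursor_field', 'updated_at') }}`, as in the outline earlier in this review, and models it with an ordinary `dict.get`. The function `resolve_cursor_field` is a hypothetical stand-in, not a CDK API.

```python
from typing import Any, Mapping


def resolve_cursor_field(parameters: Mapping[str, Any]) -> str:
    # Stand-in for evaluating "{{ parameters.get('cursor_field', 'updated_at') }}".
    return str(parameters.get("cursor_field", "updated_at"))


# When the stream's parameters reach the cursor, the override wins.
assert resolve_cursor_field({"cursor_field": "created_at"}) == "created_at"

# When they are dropped on the way, evaluation still succeeds and quietly
# yields the default, so nothing fails loudly.
assert resolve_cursor_field({}) == "updated_at"
```

Because the second case raises no error, a test that asserts on the resolved key is the practical way to catch the regression.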

@maxi297 maxi297 merged commit 2aa7b58 into main Aug 26, 2025
26 checks passed
@maxi297 maxi297 deleted the maxi297/fix_parameter_propagation_for_concurrent_cursors branch August 26, 2025 20:10