Conversation

pedroslopez
Contributor

@pedroslopez pedroslopez commented Aug 29, 2025

What

Adds basic Datadog tracing with ddtrace to manifest-server. This follows the conventions used for Airbyte platform services, i.e., enabling Datadog via DD_ENABLED=true. Datadog-specific configuration is already set by default in the Helm charts, so just enabling Datadog gives us visibility in cloud out of the box without any extra setup.

Demo

[two images, not reproduced here]

Summary by CodeRabbit

  • New Features

    • Optional Datadog APM tracing for the manifest server. Enable by setting DD_ENABLED=true to auto-instrument requests.
  • Documentation

    • Added guidance on enabling Datadog tracing, including an example command and a link to ddtrace configuration.
  • Chores

    • Added ddtrace as an optional dependency and included it in the manifest-server installation extra for easier setup.

@github-actions github-actions bot added the enhancement New feature or request label Aug 29, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@pedro/ddtrace-manifest-server#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch pedro/ddtrace-manifest-server

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment


github-actions bot commented Aug 29, 2025

PyTest Results (Fast)

3 763 tests  ±0   3 751 ✅ ±0   6m 36s ⏱️ -3s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit dfbdde1. ± Comparison against base commit e4b34b6.

♻️ This comment has been updated with latest results.


github-actions bot commented Aug 29, 2025

PyTest Results (Full)

3 766 tests  ±0   3 754 ✅ ±0   11m 23s ⏱️ -4s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit dfbdde1. ± Comparison against base commit e4b34b6.

♻️ This comment has been updated with latest results.

@pedroslopez pedroslopez marked this pull request as ready for review August 29, 2025 06:53
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds Datadog APM tracing support to the manifest-server component to enable monitoring and observability. The implementation follows existing Airbyte platform conventions by using the DD_ENABLED=true environment variable to enable tracing.

  • Adds ddtrace dependency as an optional extra for manifest-server
  • Implements conditional auto-instrumentation import based on DD_ENABLED environment variable
  • Documents Datadog configuration and usage in README

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pyproject.toml | Adds ddtrace dependency and includes it in manifest-server extras |
| airbyte_cdk/manifest_server/app.py | Implements conditional Datadog auto-instrumentation import |
| airbyte_cdk/manifest_server/README.md | Documents Datadog APM configuration and usage |


Contributor

coderabbitai bot commented Aug 29, 2025

📝 Walkthrough

Walkthrough

Adds optional Datadog APM auto-instrumentation gated by the DD_ENABLED env var, documents usage in README, includes ddtrace as an optional dependency and in the manifest-server extra, and makes CLI dependency-check include ddtrace in import gating.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation: airbyte_cdk/manifest_server/README.md | Adds a Datadog APM section explaining DD_ENABLED usage, notes ddtrace inclusion in the manifest-server extra, provides an example, and links to ddtrace configuration docs. |
| App & CLI instrumentation: airbyte_cdk/manifest_server/app.py, airbyte_cdk/manifest_server/cli/_common.py | app.py: conditionally imports ddtrace.auto at module import when DD_ENABLED=="true" to enable APM (will raise if enabled but ddtrace missing). cli/_common.py: adds ddtrace to the import check that sets FASTAPI_AVAILABLE. |
| Dependencies: pyproject.toml | Adds optional dependency ddtrace = { version = "^3.12.3", optional = true } and updates the manifest-server extra to include ddtrace alongside fastapi and uvicorn. |
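
The gating described for app.py can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `maybe_enable_datadog` and the injectable `importer` parameter are hypothetical and exist only so the behaviour can be exercised without ddtrace installed. It assumes the strict `== "true"` comparison mentioned in the summary above.

```python
import importlib
import os
from typing import Callable, Mapping


def maybe_enable_datadog(
    env: Mapping[str, str] = os.environ,
    importer: Callable[[str], object] = importlib.import_module,
) -> bool:
    """Import ddtrace.auto only when DD_ENABLED is exactly "true".

    Returns False when tracing is disabled and True after a successful
    import. If tracing is enabled but ddtrace is not installed, the
    ImportError from the importer propagates to the caller.
    """
    if env.get("DD_ENABLED") != "true":
        return False
    # Importing ddtrace.auto is what switches on auto-instrumentation.
    importer("ddtrace.auto")
    return True
```

In a real module this check would run at the very top of the file, before the web framework is imported, so that the instrumentation can patch it.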

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Proc as Process
  participant Env as Environment
  participant App as manifest_server.app
  participant CLI as manifest_server.cli/_common
  participant APM as ddtrace.auto
  participant API as FastAPI

  Proc->>Env: Read DD_ENABLED
  alt DD_ENABLED == "true"
    Proc->>APM: Import `ddtrace.auto` (auto-instrument)
    note right of APM: Imported early at module import time
  else DD_ENABLED not "true"
    note right of Proc: Skip `ddtrace` import
  end
  Proc->>CLI: CLI dependency check (fastapi, uvicorn, ddtrace)
  CLI-->>Proc: FASTAPI_AVAILABLE true/false
  Proc->>App: Import app module
  App->>API: Initialize FastAPI app and routers
  Proc->>API: Start server (e.g., uvicorn)
  API->>Proc: Serve requests (APM traces if enabled)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • aaronsteers — could you take a look at the ddtrace import gating and dependency update, wdyt?


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (4)
airbyte_cdk/manifest_server/README.md (3)

166-168: Clarify truthy values and casing for DD_ENABLED.

Could we note that the code treats DD_ENABLED case-insensitively and accepts true/1/yes (if we adopt the parsing change below), to avoid users tripping over "True" vs "true", wdyt?

Apply this doc tweak:

- export DD_ENABLED=true
+ export DD_ENABLED=true   # accepts: true/1/yes (case-insensitive)
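
For illustration, the case-insensitive parsing this comment proposes could look like the sketch below. The helper names `is_truthy` and `datadog_enabled` are hypothetical, and the accepted set (true/1/yes) is the one suggested in the comment, not necessarily what the merged code does.

```python
import os
from typing import Mapping, Optional

# Accepted spellings, compared after trimming whitespace and lowercasing.
_TRUTHY = frozenset({"true", "1", "yes"})


def is_truthy(value: Optional[str]) -> bool:
    """Return True for true/1/yes in any casing; False for anything else."""
    return value is not None and value.strip().lower() in _TRUTHY


def datadog_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Read DD_ENABLED from the given environment mapping."""
    return is_truthy(env.get("DD_ENABLED"))
```

With this parsing, `DD_ENABLED=True` and `DD_ENABLED=1` both enable tracing, while an unset or empty variable leaves it off.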

170-171: Document minimal recommended Datadog env vars.

Could we add a short list covering DD_SERVICE, DD_ENV, and the agent URL, so users immediately see spans under the right service name and know which agent endpoint is used, wdyt?

Apply this doc addition:

 This requires the `ddtrace` dependency, which is included in the `manifest-server` extra. For additional configuration options via environment variables, see [ddtrace configuration](https://ddtrace.readthedocs.io/en/stable/configuration.html).
+
+Recommended env vars:
+
+- `DD_SERVICE=manifest-server`
+- `DD_ENV=<dev|staging|prod>`
+- `DD_TRACE_AGENT_URL=http://<agent-host>:8126`  (or `DD_AGENT_HOST`/`DD_TRACE_AGENT_PORT`)

175-177: Show a fuller example with service/env.

Shall we include service/env to encourage consistent tagging in APM from day one, wdyt?

- DD_ENABLED=true manifest-server start
+ DD_ENABLED=true DD_SERVICE=manifest-server DD_ENV=dev manifest-server start
airbyte_cdk/manifest_server/app.py (1)

3-5: Optionally default DD_SERVICE to “manifest-server” when not set.

For consistent service naming without relying on Helm defaults, shall we set a conservative default only if DD_SERVICE is unset, wdyt?

-    try:
-        import ddtrace.auto  # noqa: F401
-        logging.getLogger(__name__).info("Datadog tracing enabled.")
+    try:
+        import ddtrace.auto  # noqa: F401
+        from ddtrace import config
+        if not os.getenv("DD_SERVICE"):
+            config.service = "manifest-server"
+        logging.getLogger(__name__).info("Datadog tracing enabled for 'manifest-server'.")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e4b34b6 and 142ad7b.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • airbyte_cdk/manifest_server/README.md (1 hunks)
  • airbyte_cdk/manifest_server/app.py (1 hunks)
  • pyproject.toml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Check: source-shopify
🔇 Additional comments (3)
pyproject.toml (2)

128-128: Including ddtrace in the manifest-server extra is perfect.

This aligns the extra with the README and ensures pip install airbyte-cdk[manifest-server] just works. 👍


92-92: ddtrace ^3.12.3 supports Python 3.10–3.13 on manylinux/musllinux
Verified on PyPI that ddtrace >= 3.12.3 publishes official cp313 wheels for both manylinux and musllinux [1][2], so our 3.10–3.13 matrix is covered—no changes needed, wdyt?

airbyte_cdk/manifest_server/app.py (1)

1-6: Earliest import location: consider moving gating to the CLI entrypoint.

Auto-instrumentation is best loaded before importing uvicorn/FastAPI. Our CLI likely imports uvicorn first; would you prefer moving this block to airbyte_cdk/manifest_server/cli/run.py (top of file) or documenting ddtrace-run manifest-server start as an alternative for maximal coverage, wdyt?
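
One way to check whether this ordering concern applies in a given process is to look at `sys.modules` just before enabling tracing. The sketch below is only an illustration: the helper name `already_imported` is hypothetical, and the module names in the usage note are examples taken from this discussion.

```python
import sys
from typing import Iterable, List, Mapping


def already_imported(
    names: Iterable[str],
    modules: Mapping[str, object] = sys.modules,
) -> List[str]:
    """Return the subset of `names` that have already been imported."""
    return [name for name in names if name in modules]
```

Calling `already_imported(["fastapi", "starlette", "uvicorn"])` right before the tracing import and logging a warning when the result is non-empty would make a late import visible.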

Contributor

@aaronsteers aaronsteers left a comment


Approving, with one suggestion inline.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte_cdk/manifest_server/cli/_common.py (1)

8-16: Don’t require ddtrace to start the server and avoid pre-importing FastAPI (breaks auto-instrumentation).

  • Adding ddtrace to the top-level dependency gate makes it mandatory, contradicting the “optional, env-gated” goal.
  • Importing fastapi at module import time can load it before ddtrace.auto runs, which can prevent auto-instrumentation from patching FastAPI/Starlette.

Could we detect availability without importing these modules and only require ddtrace when DD_ENABLED=true, wdyt?

Apply this diff to avoid early imports and to conditionally require ddtrace only when enabled:

@@
-# Import server dependencies with graceful fallback
-try:
-    import ddtrace  # noqa: F401
-    import fastapi  # noqa: F401
-    import uvicorn  # noqa: F401
-
-    FASTAPI_AVAILABLE = True
-except ImportError:
-    FASTAPI_AVAILABLE = False
+# Resolve server dependency availability without importing runtime libs (preserves ddtrace auto-instrumentation order)
+import importlib.util
+
+def _is_available(mod: str) -> bool:
+    return importlib.util.find_spec(mod) is not None
+
+FASTAPI_AVAILABLE = _is_available("fastapi") and _is_available("uvicorn")
@@
-def check_manifest_server_dependencies() -> None:
-    """Check if manifest-server dependencies are installed."""
-    if not FASTAPI_AVAILABLE:
-        click.echo(
-            "❌ Manifest runner dependencies not found. Please install with:\n\n"
-            "  pip install airbyte-cdk[manifest-server]\n"
-            "  # or\n"
-            "  poetry install --extras manifest-server\n",
-            err=True,
-        )
-        sys.exit(1)
+def check_manifest_server_dependencies() -> None:
+    """Check if manifest-server dependencies are installed."""
+    import os
+
+    missing = []
+    if not _is_available("fastapi"):
+        missing.append("fastapi")
+    if not _is_available("uvicorn"):
+        missing.append("uvicorn")
+    if os.getenv("DD_ENABLED", "").lower() == "true" and not _is_available("ddtrace"):
+        missing.append("ddtrace")
+
+    if missing:
+        click.echo(
+            f"❌ Manifest server dependencies missing: {', '.join(missing)}.\n\n"
+            "Please install with:\n\n"
+            "  pip install 'airbyte-cdk[manifest-server]'\n"
+            "  # or\n"
+            "  poetry install --extras manifest-server\n",
+            err=True,
+        )
+        sys.exit(1)

Also applies to: 19-29
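
As a small standalone check of the idea behind this suggestion, the snippet below shows that `importlib.util.find_spec` can report whether a top-level module is installed without loading it. The helper name `is_available` is illustrative here; note that for dotted names such as `pkg.sub`, `find_spec` does import the parent package, so the guarantee only holds for top-level names.

```python
import importlib.util
import sys


def is_available(module_name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The stdlib module "this" prints the Zen of Python when imported,
    # so silence here confirms that find_spec did not import it.
    loaded_before = "this" in sys.modules
    print(is_available("this"))
    print(("this" in sys.modules) == loaded_before)
```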

🧹 Nitpick comments (1)
airbyte_cdk/manifest_server/cli/_common.py (1)

23-27: Clarify the guidance text when ddtrace isn’t required.

If Datadog is disabled, users don’t need ddtrace. The diff above prints exactly which packages are missing, reducing confusion. Would you like to keep the existing message as a fallback and show the “missing: …” prefix only when we detect gaps, wdyt?


📥 Commits

Reviewing files that changed from the base of the PR and between 142ad7b and dfbdde1.

📒 Files selected for processing (2)
  • airbyte_cdk/manifest_server/app.py (1 hunks)
  • airbyte_cdk/manifest_server/cli/_common.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/manifest_server/app.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • airbyte_cdk/manifest_server/cli/_common.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.

Applied to files:

  • airbyte_cdk/manifest_server/cli/_common.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-shopify
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)

@pedroslopez pedroslopez merged commit 9ef1a3d into main Aug 29, 2025
35 of 36 checks passed
@pedroslopez pedroslopez deleted the pedro/ddtrace-manifest-server branch August 29, 2025 18:44