-
Notifications
You must be signed in to change notification settings - Fork 30
feat: add datadog support to manifest-server #739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@pedro/ddtrace-manifest-server#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch pedro/ddtrace-manifest-server Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds Datadog APM tracing support to the manifest-server component to enable monitoring and observability. The implementation follows existing Airbyte platform conventions by using the DD_ENABLED=true
environment variable to enable tracing.
- Adds ddtrace dependency as an optional extra for manifest-server
- Implements conditional auto-instrumentation import based on DD_ENABLED environment variable
- Documents Datadog configuration and usage in README
Reviewed Changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
pyproject.toml | Adds ddtrace dependency and includes it in manifest-server extras |
airbyte_cdk/manifest_server/app.py | Implements conditional Datadog auto-instrumentation import |
airbyte_cdk/manifest_server/README.md | Documents Datadog APM configuration and usage |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
📝 WalkthroughWalkthroughAdds optional Datadog APM auto-instrumentation gated by the DD_ENABLED env var, documents usage in README, includes ddtrace as an optional dependency and in the manifest-server extra, and makes CLI dependency-check include ddtrace in import gating. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Proc as Process
participant Env as Environment
participant App as manifest_server.app
participant CLI as manifest_server.cli/_common
participant APM as ddtrace.auto
participant API as FastAPI
Proc->>Env: Read DD_ENABLED
alt DD_ENABLED == "true"
Proc->>APM: Import `ddtrace.auto` (auto-instrument)
note right of APM: Imported early at module import time
else DD_ENABLED not "true"
note right of Proc: Skip `ddtrace` import
end
Proc->>CLI: CLI dependency check (fastapi, uvicorn, ddtrace)
CLI-->>Proc: FASTAPI_AVAILABLE true/false
Proc->>App: Import app module
App->>API: Initialize FastAPI app and routers
Proc->>API: Start server (e.g., uvicorn)
API->>Proc: Serve requests (APM traces if enabled)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
airbyte_cdk/manifest_server/README.md (3)
166-168
: Clarify truthy values and casing for DD_ENABLED.Could we note that the code treats DD_ENABLED case-insensitively and accepts true/1/yes (if we adopt the parsing change below), to avoid users tripping over "True" vs "true", wdyt?
Apply this doc tweak:
- export DD_ENABLED=true + export DD_ENABLED=true # accepts: true/1/yes (case-insensitive)
170-171
: Document minimal recommended Datadog env vars.Would we add a short list for DD_SERVICE, DD_ENV, and agent URL so users see spans with the right service name and the agent endpoint immediately, wdyt?
Apply this doc addition:
This requires the `ddtrace` dependency, which is included in the `manifest-server` extra. For additional configuration options via environment variables, see [ddtrace configuration](https://ddtrace.readthedocs.io/en/stable/configuration.html). + +Recommended env vars: + +- `DD_SERVICE=manifest-server` +- `DD_ENV=<dev|staging|prod>` +- `DD_TRACE_AGENT_URL=http://<agent-host>:8126` (or `DD_AGENT_HOST`/`DD_TRACE_AGENT_PORT`)
175-177
: Show a fuller example with service/env.Shall we include service/env to encourage consistent tagging in APM from day one, wdyt?
- DD_ENABLED=true manifest-server start + DD_ENABLED=true DD_SERVICE=manifest-server DD_ENV=dev manifest-server startairbyte_cdk/manifest_server/app.py (1)
3-5
: Optionally default DD_SERVICE to “manifest-server” when not set.For consistent service naming without relying on Helm defaults, shall we set a conservative default only if
DD_SERVICE
is unset, wdyt?- try: - import ddtrace.auto # noqa: F401 - logging.getLogger(__name__).info("Datadog tracing enabled.") + try: + import ddtrace.auto # noqa: F401 + from ddtrace import config + if not os.getenv("DD_SERVICE"): + config.service = "manifest-server" + logging.getLogger(__name__).info("Datadog tracing enabled for 'manifest-server'.")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
📒 Files selected for processing (3)
airbyte_cdk/manifest_server/README.md
(1 hunks)airbyte_cdk/manifest_server/app.py
(1 hunks)pyproject.toml
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Check: source-shopify
🔇 Additional comments (3)
pyproject.toml (2)
128-128
: Including ddtrace in the manifest-server extra is perfect.This aligns the extra with the README and ensures
pip install airbyte-cdk[manifest-server]
just works. 👍
92-92
: ddtrace ^3.12.3 supports Python 3.10–3.13 on manylinux/musllinux
Verified on PyPI that ddtrace >= 3.12.3 publishes official cp313 wheels for both manylinux and musllinux [1][2], so our 3.10–3.13 matrix is covered—no changes needed, wdyt?airbyte_cdk/manifest_server/app.py (1)
1-6
: Earliest import location: consider moving gating to the CLI entrypoint.Auto-instrumentation is best loaded before importing uvicorn/FastAPI. Our CLI likely imports uvicorn first; would you prefer moving this block to
airbyte_cdk/manifest_server/cli/run.py
(top of file) or documentingddtrace-run manifest-server start
as an alternative for maximal coverage, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, with one suggestion inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
airbyte_cdk/manifest_server/cli/_common.py (1)
8-16
: Don’t require ddtrace to start the server and avoid pre-importing FastAPI (breaks auto-instrumentation).
- Adding
ddtrace
to the top-level dependency gate makes it mandatory, contradicting the “optional, env-gated” goal.- Importing
fastapi
at module import time can load it beforeddtrace.auto
runs, which can prevent auto-instrumentation from patching FastAPI/Starlette.Could we detect availability without importing these modules and only require
ddtrace
whenDD_ENABLED=true
, wdyt?Apply this diff to avoid early imports and to conditionally require ddtrace only when enabled:
@@ -# Import server dependencies with graceful fallback -try: - import ddtrace # noqa: F401 - import fastapi # noqa: F401 - import uvicorn # noqa: F401 - - FASTAPI_AVAILABLE = True -except ImportError: - FASTAPI_AVAILABLE = False +# Resolve server dependency availability without importing runtime libs (preserves ddtrace auto-instrumentation order) +import importlib.util + +def _is_available(mod: str) -> bool: + return importlib.util.find_spec(mod) is not None + +FASTAPI_AVAILABLE = _is_available("fastapi") and _is_available("uvicorn") @@ -def check_manifest_server_dependencies() -> None: - """Check if manifest-server dependencies are installed.""" - if not FASTAPI_AVAILABLE: - click.echo( - "❌ Manifest runner dependencies not found. Please install with:\n\n" - " pip install airbyte-cdk[manifest-server]\n" - " # or\n" - " poetry install --extras manifest-server\n", - err=True, - ) - sys.exit(1) +def check_manifest_server_dependencies() -> None: + """Check if manifest-server dependencies are installed.""" + import os + + missing = [] + if not _is_available("fastapi"): + missing.append("fastapi") + if not _is_available("uvicorn"): + missing.append("uvicorn") + if os.getenv("DD_ENABLED", "").lower() == "true" and not _is_available("ddtrace"): + missing.append("ddtrace") + + if missing: + click.echo( + f"❌ Manifest server dependencies missing: {', '.join(missing)}.\n\n" + "Please install with:\n\n" + " pip install 'airbyte-cdk[manifest-server]'\n" + " # or\n" + " poetry install --extras manifest-server\n", + err=True, + ) + sys.exit(1)Also applies to: 19-29
🧹 Nitpick comments (1)
airbyte_cdk/manifest_server/cli/_common.py (1)
23-27
: Clarify the guidance text when ddtrace isn’t required.If Datadog is disabled, users don’t need
ddtrace
. The diff above prints exactly which packages are missing, reducing confusion. Would you like to keep the existing message as a fallback and show the “missing: …” prefix only when we detect gaps, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
airbyte_cdk/manifest_server/app.py
(1 hunks)airbyte_cdk/manifest_server/cli/_common.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/manifest_server/app.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
airbyte_cdk/manifest_server/cli/_common.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Applied to files:
airbyte_cdk/manifest_server/cli/_common.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-shopify
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
What
Adds basic datadog tracing with ddtrace to manifest-server. This follows the conventions used for airbyte platform services, ie enabling datadog via
DD_ENABLED=true
. Datadog-specific configuration is also already set by default on the helm charts, so just enabling datadog gives us visibility in cloud out of the box without any extra setup.Demo
Summary by CodeRabbit
New Features
Documentation
Chores