@amotl commented Aug 9, 2025

About

The patch concludes another round of page relocations based on the currently proposed next-generation layout, this time focusing on the Ingest/ETL and Ingest/CDC sections. In this spirit, the new ingest subsection is becoming a spot of increased gravity.

Tutorials now in “learn” reside,
With every hop, more docs to guide!

Details

  • Dissolve individual pages in category section etl, relocating them into dedicated items within the backbone section integrate instead.
  • Relocated items: Azure Functions, Apache Iceberg, InfluxDB, MongoDB, MySQL and MariaDB, RisingWave, Streamsets.
  • Dissolve weird page toc assembly on ETL and CDC category index pages, using toctree only for now.

Preview

References

@amotl added the sanding-500 (sanding medium-sized details), refactoring (changing shape or layout, or moving content around), and guidance (matters of layout, shape, and structure) labels on Aug 9, 2025
coderabbitai bot commented Aug 9, 2025

Warning

Rate limit exceeded

@amotl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 30 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between b8d438e and 6a273df.

📒 Files selected for processing (27)
  • docs/_include/card/timeseries-datashader.md (1 hunks)
  • docs/_include/links.md (2 hunks)
  • docs/conf.py (1 hunks)
  • docs/connect/configure.md (3 hunks)
  • docs/connect/index.md (2 hunks)
  • docs/connect/mcp/index.md (4 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (1 hunks)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/apache-iceberg/index.md (1 hunks)
  • docs/integrate/aws-dms/index.md (1 hunks)
  • docs/integrate/aws-dynamodb/index.md (1 hunks)
  • docs/integrate/aws-kinesis/index.md (1 hunks)
  • docs/integrate/azure-functions/index.md (1 hunks)
  • docs/integrate/azure-functions/learn.rst (1 hunks)
  • docs/integrate/cluvio/index.md (1 hunks)
  • docs/integrate/index.md (3 hunks)
  • docs/integrate/influxdb/index.md (1 hunks)
  • docs/integrate/influxdb/learn.md (1 hunks)
  • docs/integrate/meltano/index.md (1 hunks)
  • docs/integrate/mongodb/index.md (1 hunks)
  • docs/integrate/mongodb/learn.md (1 hunks)
  • docs/integrate/mysql/index.md (1 hunks)
  • docs/integrate/risingwave/apache-iceberg.md (1 hunks)
  • docs/integrate/risingwave/index.md (1 hunks)
  • docs/integrate/streamsets/index.md (1 hunks)
  • docs/integrate/streamsets/learn.rst (1 hunks)

Walkthrough

This update significantly expands and restructures the integration documentation. It introduces new integration pages for Apache Iceberg, AWS DMS, DynamoDB, Kinesis, Azure Functions, MongoDB, MySQL, and StreamSets. Several existing integration and ingestion docs are condensed or refactored, with detailed tutorials moved to dedicated "learn" pages. Link references are updated and reorganized.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Integration Index and New Integrations: docs/integrate/index.md, docs/integrate/apache-iceberg/index.md, docs/integrate/aws-dms/index.md, docs/integrate/aws-dynamodb/index.md, docs/integrate/aws-kinesis/index.md, docs/integrate/azure-functions/index.md, docs/integrate/mongodb/index.md, docs/integrate/mysql/index.md, docs/integrate/streamsets/index.md | Added new integration documentation pages for Apache Iceberg, AWS DMS, DynamoDB, Kinesis, Azure Functions, MongoDB, MySQL, and StreamSets. Updated the integration index to include these new entries. |
| InfluxDB Documentation Overhaul: docs/integrate/influxdb/index.md, docs/integrate/influxdb/learn.md | Replaced detailed InfluxDB integration tutorial with a minimal overview and toctree. Added a comprehensive new "learn" tutorial page for importing data from InfluxDB into CrateDB. |
| MongoDB and Link Reference Updates: docs/_include/links.md | Updated and expanded link references for MongoDB and InfluxDB; changed some URLs to internal references and added new entries. |
| ETL, CDC, and Telemetry Docs Refactoring: docs/ingest/etl/index.md, docs/ingest/cdc/index.md, docs/ingest/telemetry/index.md | Condensed and restructured documentation for ETL, CDC, and telemetry integrations, replacing detailed text with concise references and removing embedded explanations. |
| RisingWave and Iceberg Documentation Adjustments: docs/integrate/risingwave/index.md, docs/integrate/risingwave/apache-iceberg.md | Updated internal reference labels and streamlined the "Learn" section for RisingWave. |
| Azure Functions, StreamSets, and Miscellaneous Reference Label Changes: docs/integrate/azure-functions/learn.rst, docs/integrate/streamsets/learn.rst | Changed document reference labels for Azure Functions and StreamSets "learn" pages. |
| Miscellaneous Documentation Formatting: docs/_include/card/timeseries-datashader.md, docs/integrate/meltano/index.md, docs/integrate/cluvio/index.md | Replaced inline warnings with admonition blocks and fixed minor markdown formatting issues. |
| MCP Documentation and Connect Index Updates: docs/connect/index.md, docs/connect/mcp/index.md | Added MCP page to connect index, restructured MCP documentation sections, and improved navigation and labeling. |
| Sphinx Configuration Update: docs/conf.py | Added a URL to the linkcheck ignore list to prevent timeout errors during documentation build. |
| Documentation Structural and Content Refinements: docs/connect/configure.md | Improved semantic markup and note presentation; included external markdown snippet and rephrased introductory text. |
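
For readers unfamiliar with the linkcheck ignore list mentioned for docs/conf.py, here is a minimal sketch of how such an entry might look. Sphinx's `linkcheck_ignore` option takes a list of regular expressions; the pattern below is a placeholder and not the URL that was added in this PR, and `is_ignored` is a hypothetical helper that only approximates the builder's matching.

```python
import re

# docs/conf.py (sketch): URLs matching any of these regular expressions are
# skipped by `sphinx-build -b linkcheck`. The pattern is a placeholder,
# not the URL that this PR actually added.
linkcheck_ignore = [
    r"https://slow\.example\.org/.*",
]


def is_ignored(url: str) -> bool:
    """Hypothetical helper approximating the check: a regex match at the start of the URL."""
    return any(re.match(pattern, url) for pattern in linkcheck_ignore)
```

Escaping the dots keeps the pattern from matching unrelated hosts, which may matter when the ignore list grows.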

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Docs
    participant IntegrationPage
    participant LearnPage

    User->>Docs: Access integration index
    Docs->>IntegrationPage: Link to integration (e.g., InfluxDB, MongoDB, etc.)
    User->>IntegrationPage: View overview and references
    IntegrationPage->>LearnPage: (Optional) Navigate to detailed tutorial
    User->>LearnPage: Follow step-by-step integration guide

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

enhancement, new content

Suggested reviewers

  • hammerhead
  • surister

Poem

In fields of docs where integrations grow,
New pages sprout, and old ones flow.
From Iceberg cool to Dynamo’s might,
The rabbit hops through links so bright.
Tutorials now in “learn” reside,
With every hop, more docs to guide!
🐇📚✨

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch layout-etl-cdc

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🔭 Outside diff range comments (1)
docs/ingest/etl/index.md (1)

25-43: Add missing anchor definitions for ETL connector cross-references

I scanned the entire docs/ tree and found no MyST (label)= or RST .. _label: anchors for any of the connectors referenced in docs/ingest/etl/index.md. Without these, Sphinx will emit warnings for unresolved {ref} links. Please add one of the following in each connector’s doc page (and retain any legacy anchors for backward compatibility):

  • MyST-style (at top of docs/integrate/<connector>/index.md):

(<connector>)=

# <Connector> integration

  • RST-style (at top of docs/integrate/<connector>.rst):

.. _<connector>:

<Connector> integration
…

Missing anchors for:

  • apache-airflow
  • apache-flink
  • apache-hop
  • apache-iceberg
  • apache-kafka
  • apache-nifi
  • aws-dms
  • aws-dynamodb
  • aws-kinesis
  • azure-functions
  • dbt
  • estuary
  • influxdb
  • kestra
  • meltano
  • mongodb
  • mysql
  • node-red
  • risingwave
  • sql-server
  • streamsets
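
A scan like the one described above could be reproduced locally with a short script. This is a rough sketch, not the tool the review actually used: the function names are hypothetical, the regular expressions only cover labels that sit on a line of their own, and labels are lowercased on the assumption that Sphinx normalizes them that way.

```python
import re
from pathlib import Path

# MyST target, e.g. "(influxdb)=" on a line of its own.
MYST_LABEL = re.compile(r"^\(([^)\s]+)\)=\s*$", re.MULTILINE)
# reStructuredText target, e.g. ".. _influxdb:" on a line of its own.
# Targets followed by a URL (".. _name: https://...") are external links, not labels.
RST_LABEL = re.compile(r"^\.\. _([^:\n]+):\s*$", re.MULTILINE)


def labels_in_text(text: str) -> set[str]:
    """Return all MyST and RST labels defined in one document, lowercased."""
    found = MYST_LABEL.findall(text) + RST_LABEL.findall(text)
    return {label.strip().lower() for label in found}


def missing_labels(docs_root: Path, wanted: set[str]) -> set[str]:
    """Return the wanted labels that no .md or .rst file below docs_root defines."""
    defined: set[str] = set()
    for path in docs_root.rglob("*"):
        if path.is_file() and path.suffix in {".md", ".rst"}:
            defined |= labels_in_text(path.read_text(encoding="utf-8", errors="replace"))
    return {label for label in wanted if label.lower() not in defined}


if __name__ == "__main__":
    # Subset of the connector labels listed above, for illustration only.
    print(sorted(missing_labels(Path("docs"), {"influxdb", "mongodb", "streamsets"})))
```

Because the script reads both `.md` and `.rst` files, it also finds labels that a MyST page references across formats.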
🧹 Nitpick comments (15)
docs/integrate/apache-iceberg/index.md (1)

14-16: Confirm Sphinx todo extension is enabled or switch to a standard admonition

:::{todo} renders only if the todo extension is enabled in the build. If not, use note:

-:::{todo}
-🚧 _Please note this page is a work in progress._ 🚧
-:::
+:::{note}
+🚧 This page is a work in progress. 🚧
+:::
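
If the `todo` directive is to be kept instead, the build configuration needs both the extension and its output flag. A minimal sketch for docs/conf.py follows, assuming `extensions` may already have been populated by an earlier import from the shared theme configuration; `sphinx.ext.todo` and `todo_include_todos` are the standard Sphinx names.

```python
# docs/conf.py (sketch): make `:::{todo}` blocks render in the built pages.
try:
    extensions  # may already exist, e.g. via an import from a shared theme config (assumption)
except NameError:
    extensions = []

if "sphinx.ext.todo" not in extensions:
    extensions.append("sphinx.ext.todo")

# With the extension loaded, todo entries still produce no output
# unless this flag is set; its default is False.
todo_include_todos = True
```

Appending rather than reassigning keeps any inherited extensions intact.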
docs/ingest/cdc/index.md (1)

8-10: Tighten wording (minor style)

Consider simplifying: “CrateDB integrates with third‑party CDC applications via its PostgreSQL interface.”

-CrateDB provides a variety of options to connect and integrate with third-party
-CDC applications, mostly using [CrateDB's PostgreSQL interface].
+CrateDB integrates with third‑party CDC applications via [CrateDB's PostgreSQL interface].
docs/integrate/aws-dynamodb/index.md (1)

20-22: Optional: add internal cross-refs alongside external links

Where applicable, complement external mentions with {ref} to our integration pages for better navigation. For example:

-Kinesis Stream, and consume that from an adapter to write into an analytical
+{ref}`aws-kinesis`, and consume that from an adapter to write into an analytical
docs/integrate/streamsets/index.md (2)

7-10: Tighten phrasing (“variety of different”).

Minor style nit to remove redundancy.

- that can ingest and transform data from a variety of different sources.
+ that can ingest and transform data from a variety of sources.

11-14: Grammar and duplication polish in the “from … to …” sentence.

  • Add “in” to “on-premises or in any cloud”.
  • Avoid listing “Kafka” twice (as both source and destination) to reduce noise.
-StreamSets Data Collector Engine makes it easy to run data pipelines from Kafka,
-Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, S3, ADLS, Kafka
-and more. Data Collector Engine runs on-premises or any cloud, wherever your data
+StreamSets Data Collector Engine makes it easy to run data pipelines from Kafka,
+Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, S3, ADLS,
+and more. Data Collector Engine runs on-premises or in any cloud, wherever your data
docs/integrate/mongodb/index.md (1)

12-12: Pronoun choice (“who” vs “that”).

Use “who” when referring to people.

-[MongoDB Atlas] is a multi-cloud database service by the same people that build MongoDB.
+[MongoDB Atlas] is a multi-cloud database service by the same people who build MongoDB.
docs/integrate/aws-kinesis/index.md (1)

16-19: Minor phrasing nit (“data stream … data records”).

Reduces redundancy.

-You can use Amazon Kinesis Data Streams to collect and process large streams of data
-records in real time. A typical Kinesis Data Streams application reads data from a
-data stream as data records.
+You can use Amazon Kinesis Data Streams to collect and process large data streams
+in real time. A typical application reads records from the stream.
docs/integrate/aws-dms/index.md (3)

17-21: Tighten phrasing around deployment locations

“either on-premises, or per EC2 instance databases” reads awkward. Suggest:

- AWS DMS supports migration between 20-plus database and analytics engines, either
- on-premises, or per EC2 instance databases. Supported data migration sources are:
+ AWS DMS supports migration between 20+ database and analytics engines, either
+ on-premises or on EC2-hosted databases. Supported data migration sources include:

30-33: Grammar/style: simplify deployment options sentence

More idiomatic wording and hyphenation of on-premises:

-CrateDB provides two variants how to conduct data migrations using AWS DMS.
-Either use it standalone / on your own premises, or use it in a completely
-managed environment with services of AWS and CrateDB Cloud.
+CrateDB supports two ways to run AWS DMS migrations:
+either standalone/on‑premises, or fully managed with AWS and CrateDB Cloud.

34-36: CDC wording clarity

Optional: uppercase CDC and clarify combo mode:

-AWS DMS supports both `full-load` and `cdc` operation modes, often used in
-combination with each other (`full-load-and-cdc`).
+AWS DMS supports both `full-load` and `CDC` operation modes, which are often
+combined (`full-load-and-CDC`).
docs/ingest/etl/index.md (1)

10-13: Concise, active voice

Minor style tweak; optional:

-CrateDB provides a variety of options to connect and integrate with third-party
+CrateDB provides many options to connect and integrate with third-party
 ETL applications, mostly using [CrateDB's PostgreSQL interface].
-CrateDB also provides a few native adapter components that can be used
-to leverage its advanced features.
+CrateDB also provides native adapter components to leverage advanced features.
docs/integrate/influxdb/learn.md (4)

134-143: Prefer headings over bold labels; unify list style

Switch bold labels to headings or rubric and use dash-style lists (per MD004/MD036).

-**CrateDB Cloud**
-* Host: ```purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net```
-* Username: ```admin```
-* Password: ```dZ..qB```
+### CrateDB Cloud
+- Host: `purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net`
+- Username: `admin`
+- Password: `dZ..qB`

-**InfluxDB Cloud**
-  * Host: ```eu-central-1-1.aws.cloud2.influxdata.com```
-  * Organization ID: ```9fafc869a91a3406```
-  * All-Access API token: ```T2..==```
+### InfluxDB Cloud
+- Host: `eu-central-1-1.aws.cloud2.influxdata.com`
+- Organization ID: `9fafc869a91a3406`
+- All-Access API token: `T2..==`

171-173: Minor style tweak

Reduce “also” and tighten phrasing.

-The InfluxDB I/O subsystem is based on the [influxio] package. Please also
-check its documentation to learn about more of its capabilities, supporting
-you when working with InfluxDB.
+The InfluxDB I/O subsystem is based on the [influxio] package. See its
+documentation for additional capabilities when working with InfluxDB.

181-190: Remove unused link definition

[What are series and bucket in InfluxDB] isn’t referenced; drop it to satisfy MD053.

-[What are series and bucket in InfluxDB]: https://stackoverflow.com/questions/58190272/what-are-series-and-bucket-in-influxdb/69951376#69951376
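
To illustrate what an MD053-style check looks for, here is a simplified sketch in Python. It is not markdownlint's implementation: `unused_definitions` and the two regular expressions are hypothetical, handle only common reference forms, and ignore code blocks.

```python
import re

# A reference definition such as "[influxio]: https://example.org" at line start.
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s+\S+", re.MULTILINE)
# A reference use: "[text][label]", "[label][]", or the shortcut form "[label]".
# Inline links "[text](url)" are excluded by the trailing lookahead.
USAGE = re.compile(r"\[([^\]]+)\](?:\[([^\]]*)\])?(?![:(])")


def normalize(label: str) -> str:
    """Labels match case-insensitively, with whitespace runs collapsed."""
    return " ".join(label.lower().split())


def unused_definitions(text: str) -> set[str]:
    """Return normalized labels that are defined but never referenced in text."""
    defined = {normalize(m.group(1)) for m in DEFINITION.finditer(text)}
    body = DEFINITION.sub("", text)  # do not count a definition as its own use
    used = set()
    for match in USAGE.finditer(body):
        link_text, label = match.group(1), match.group(2)
        used.add(normalize(label or link_text))
    return defined - used
```

A per-file check of this kind cannot see uses in pages that pull in a shared definitions file through an include, so findings for such files need a manual look.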

104-106: Docker --link is deprecated; consider networks

Optional: use a user-defined network and container names instead of --link for ctk/crash.

-alias crash="docker run --rm -it --link=cratedb ghcr.io/crate/cratedb-toolkit:latest crash"
-alias ctk="docker run --rm -it --link=cratedb --link=influxdb ghcr.io/crate/cratedb-toolkit:latest ctk"
+# Assuming both services are on 'demo-net':
+alias crash="docker run --rm -it --network=demo-net ghcr.io/crate/cratedb-toolkit:latest crash"
+alias ctk="docker run --rm -it --network=demo-net ghcr.io/crate/cratedb-toolkit:latest ctk"

Follow-up: start both services with --network demo-net.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99f70bd and 3546bc4.

📒 Files selected for processing (19)
  • docs/_include/links.md (2 hunks)
  • docs/ingest/cdc/index.md (2 hunks)
  • docs/ingest/etl/index.md (1 hunks)
  • docs/ingest/telemetry/index.md (1 hunks)
  • docs/integrate/apache-iceberg/index.md (1 hunks)
  • docs/integrate/aws-dms/index.md (1 hunks)
  • docs/integrate/aws-dynamodb/index.md (1 hunks)
  • docs/integrate/aws-kinesis/index.md (1 hunks)
  • docs/integrate/azure-functions/index.md (1 hunks)
  • docs/integrate/azure-functions/learn.rst (1 hunks)
  • docs/integrate/index.md (3 hunks)
  • docs/integrate/influxdb/index.md (1 hunks)
  • docs/integrate/influxdb/learn.md (1 hunks)
  • docs/integrate/mongodb/index.md (1 hunks)
  • docs/integrate/mysql/index.md (1 hunks)
  • docs/integrate/risingwave/apache-iceberg.md (1 hunks)
  • docs/integrate/risingwave/index.md (1 hunks)
  • docs/integrate/streamsets/index.md (1 hunks)
  • docs/integrate/streamsets/learn.rst (1 hunks)
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/integrate/streamsets/learn.rst
  • docs/integrate/risingwave/apache-iceberg.md
  • docs/integrate/risingwave/index.md
  • docs/ingest/telemetry/index.md
  • docs/integrate/index.md
  • docs/integrate/apache-iceberg/index.md
  • docs/integrate/aws-dms/index.md
  • docs/ingest/cdc/index.md
  • docs/_include/links.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/integrate/streamsets/learn.rst
  • docs/integrate/risingwave/apache-iceberg.md
  • docs/integrate/risingwave/index.md
  • docs/ingest/telemetry/index.md
  • docs/integrate/index.md
  • docs/integrate/apache-iceberg/index.md
  • docs/integrate/aws-dms/index.md
  • docs/ingest/cdc/index.md
  • docs/_include/links.md
  • docs/ingest/etl/index.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-08-09T11:03:33.164Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/ingest/etl/index.md:22-23
Timestamp: 2025-08-09T11:03:33.164Z
Learning: In the CrateDB Guide documentation, MyST Markdown files can use `{ref}` directives to reference labels defined in reStructuredText files using the RST syntax `.. _label:`. This cross-format referencing between MyST and RST is supported by Sphinx.

Applied to files:

  • docs/integrate/streamsets/learn.rst
  • docs/integrate/risingwave/index.md
  • docs/integrate/apache-iceberg/index.md
  • docs/ingest/cdc/index.md
  • docs/_include/links.md
  • docs/ingest/etl/index.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-08-08T16:50:14.965Z
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.

Applied to files:

  • docs/integrate/risingwave/index.md
  • docs/integrate/index.md
  • docs/integrate/aws-dms/index.md
  • docs/ingest/cdc/index.md
  • docs/_include/links.md
  • docs/ingest/etl/index.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-05-18T13:39:58.391Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/cratedb-mcp.md:3-8
Timestamp: 2025-05-18T13:39:58.391Z
Learning: In MyST Markdown, the correct syntax for rubric directives in the CrateDB documentation is:
```
:::{rubric} Title
:::
```
followed by the content outside of the directive. This is different from other admonition blocks where content is typically wrapped inside the directive.

Applied to files:

  • docs/integrate/risingwave/index.md
  • docs/ingest/telemetry/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/integrate/risingwave/index.md
  • docs/integrate/aws-dms/index.md
  • docs/ingest/cdc/index.md
  • docs/_include/links.md
  • docs/ingest/etl/index.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/integrate/risingwave/index.md
  • docs/integrate/aws-dynamodb/index.md
  • docs/integrate/aws-kinesis/index.md
  • docs/ingest/telemetry/index.md
  • docs/integrate/index.md
  • docs/integrate/aws-dms/index.md
  • docs/ingest/cdc/index.md
  • docs/_include/links.md
  • docs/ingest/etl/index.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-08-07T23:11:09.657Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:1-1
Timestamp: 2025-08-07T23:11:09.657Z
Learning: In the CrateDB Guide repository, intersphinx mappings like `crate-admin-ui` are inherited from the root project `crate-docs-theme` rather than being defined locally in `docs/conf.py`. This allows shared documentation configurations across multiple CrateDB documentation projects.

Applied to files:

  • docs/integrate/risingwave/index.md
  • docs/integrate/index.md
  • docs/ingest/cdc/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-08-07T23:11:08.311Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:11-11
Timestamp: 2025-08-07T23:11:08.311Z
Learning: In the CrateDB Guide repository, intersphinx mappings like "cloud" are defined within the root project `crate-docs-theme` and inherited by the documentation projects, so they don't need to be explicitly defined in individual `docs/conf.py` files.

Applied to files:

  • docs/integrate/index.md
  • docs/integrate/aws-dms/index.md
  • docs/ingest/cdc/index.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/influxdb/index.md
📚 Learning: 2025-05-18T13:25:11.786Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-05T07:14:57.416Z
Learnt from: hammerhead
PR: crate/cratedb-guide#221
File: docs/connect/configure.md:58-66
Timestamp: 2025-08-05T07:14:57.416Z
Learning: In CrateDB connection strings, the user:password@ syntax is valid for HTTP Basic authentication on port 4200, but PostgreSQL JDBC drivers do not support this format and require credentials as query parameters (?user=<user>&password=<password>) instead.

Applied to files:

  • docs/ingest/cdc/index.md
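
The difference between the two credential styles in this learning can be sketched as follows. The helper names, the `https` scheme, and the empty database name are assumptions made for illustration; ports 4200 (HTTP) and 5432 (PostgreSQL wire protocol) are CrateDB's defaults.

```python
from urllib.parse import quote, urlencode


def http_url(host: str, user: str, password: str, port: int = 4200) -> str:
    """HTTP endpoint URL carrying credentials in the user:password@ form."""
    return f"https://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/"


def jdbc_url(host: str, user: str, password: str, port: int = 5432, database: str = "") -> str:
    """PostgreSQL JDBC URL carrying credentials as query parameters."""
    query = urlencode({"user": user, "password": password})
    return f"jdbc:postgresql://{host}:{port}/{database}?{query}"


if __name__ == "__main__":
    print(http_url("localhost", "admin", "p@ss"))
    print(jdbc_url("localhost", "admin", "p@ss"))
```

Percent-encoding the credentials in both forms avoids ambiguity when a password contains characters such as `@` or `&`.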
📚 Learning: 2025-06-05T14:29:15.512Z
Learnt from: amotl
PR: crate/cratedb-guide#207
File: docs/integrate/etl/iceberg-risingwave.md:205-207
Timestamp: 2025-06-05T14:29:15.512Z
Learning: The `records.Database("crate://", echo=True)` connection string for CrateDB works with defaults: localhost as host, "crate" as user, and blank password. This is valid and functional code in the records library.

Applied to files:

  • docs/ingest/cdc/index.md
📚 Learning: 2025-08-06T02:11:46.794Z
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.

Applied to files:

  • docs/integrate/mongodb/index.md
🪛 markdownlint-cli2 (0.17.2)
docs/integrate/aws-kinesis/index.md

27-27: Link fragments should be valid

(MD051, link-fragments)

docs/_include/links.md

30-30: Link and image reference definitions should be needed
Unused link or image reference definition: "influxdb"

(MD053, link-image-reference-definitions)


42-42: Link and image reference definitions should be needed
Unused link or image reference definition: "mongodb"

(MD053, link-image-reference-definitions)


43-43: Link and image reference definitions should be needed
Unused link or image reference definition: "mongodb atlas"

(MD053, link-image-reference-definitions)


44-44: Link and image reference definitions should be needed
Unused link or image reference definition: "mongodb cdc relay"

(MD053, link-image-reference-definitions)


45-45: Link and image reference definitions should be needed
Unused link or image reference definition: "mongodb change streams"

(MD053, link-image-reference-definitions)


46-46: Link and image reference definitions should be needed
Unused link or image reference definition: "mongodb collections and databases"

(MD053, link-image-reference-definitions)


47-47: Link and image reference definitions should be needed
Unused link or image reference definition: "mongodb table loader"

(MD053, link-image-reference-definitions)

docs/ingest/etl/index.md

18-18: Link fragments should be valid

(MD051, link-fragments)

docs/integrate/influxdb/learn.md

134-134: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


135-135: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


136-136: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


137-137: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


139-139: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


140-140: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


140-140: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


141-141: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


141-141: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


142-142: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


142-142: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


162-162: Bare URL used

(MD034, no-bare-urls)


189-189: Link and image reference definitions should be needed
Unused link or image reference definition: "what are series and bucket in influxdb"

(MD053, link-image-reference-definitions)

🪛 LanguageTool
docs/integrate/streamsets/index.md

[style] ~9-~9: Consider using a more concise synonym.
Context: ...that can ingest and transform data from a variety of different sources. StreamSets Data Col...

(A_VARIETY_OF)

docs/ingest/cdc/index.md

[style] ~8-~8: Consider using a more concise synonym.
Context: ...links.md ::: :::{div} CrateDB provides a variety of options to connect and integrate with t...

(A_VARIETY_OF)

docs/ingest/etl/index.md

[style] ~10-~10: Consider using a more concise synonym.
Context: ...links.md ::: :::{div} CrateDB provides a variety of options to connect and integrate with t...

(A_VARIETY_OF)


[locale-violation] ~18-~18: In American English, “take a look” is more commonly used.
Context: ... how to use them optimally. Please also have a look at support for solutions. ::: ...

(HAVE_A_LOOK)

docs/integrate/mongodb/index.md

[style] ~12-~12: Consider using “who” when you are referring to people instead of objects.
Context: ...oud database service by the same people that build MongoDB. Atlas simplifies deployi...

(THAT_WHO)

docs/integrate/influxdb/learn.md

[style] ~171-~171: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...tem is based on the [influxio] package. Please also check its documentation to learn about ...

(PLEASE_ALSO_CHECK)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (20)
docs/integrate/streamsets/learn.rst (1)

1-1: Ignore legacy anchor addition in learn.rst

The legacy streamsets anchor is already defined in docs/integrate/streamsets/index.md, and all existing {ref}`streamsets` links in the CDC and ETL guides resolve to that definition. No changes are needed to docs/integrate/streamsets/learn.rst.

Likely an incorrect or invalid review comment.

docs/integrate/risingwave/index.md (2)

78-80: LGTM: Simplified Learn section

Concise pointer to the canonical tutorial improves signal and avoids duplication.


86-92: Hidden toctree entry “apache-iceberg” validated

  • docs/integrate/risingwave/apache-iceberg.md exists and defines the (risingwave-iceberg) label.
  • Cross-references using {ref}`risingwave-iceberg` appear in the docs, confirming navigation builders pick it up.

No further changes needed.

docs/ingest/telemetry/index.md (1)

6-13: Verified anchors for Prometheus and Telegraf
Both (prometheus)= and (telegraf)= labels are still defined, so all references will resolve correctly:

  • docs/integrate/prometheus/index.md:1 (prometheus)=
  • docs/integrate/telegraf/index.md:1 (telegraf)=
  • docs/use/industrial/distributed-ml.rst:47 .. _Prometheus:

No further changes needed here.

docs/integrate/index.md (2)

16-60: Index expansion looks consistent and alphabetically placed

The additions are well slotted alphabetically and match the new pages introduced in this PR. No structural concerns here.


22-22: All new toctree targets and anchors verified

  • All eight docs/integrate/<name>/index.md files are present.
  • Each {ref} anchor ((apache-iceberg)=, (azure-functions)=, (aws-dms)=, (aws-dynamodb)=, (aws-kinesis)=, (mongodb)=, (mysql)=, (streamsets)=) is defined (line 1 of its respective index.md).

No broken links detected.
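
A check like this can be reproduced locally. The following is a minimal sketch, assuming the docs/integrate/<name>/index.md layout described above and that each label sits on the first line of its file; the helper name `missing_labels` is hypothetical and not part of the repository's tooling.

```python
from pathlib import Path


def missing_labels(docs_root, names):
    """Return the names whose index.md is absent or does not start with `(name)=`."""
    missing = []
    for name in names:
        index = Path(docs_root) / "integrate" / name / "index.md"
        if not index.is_file():
            missing.append(name)
            continue
        lines = index.read_text(encoding="utf-8").splitlines()
        first_line = lines[0] if lines else ""
        # A MyST label looks like `(mongodb)=` on a line of its own.
        if first_line.strip() != f"({name})=":
            missing.append(name)
    return missing
```

Calling `missing_labels("docs", ["mongodb", "mysql", "streamsets"])` from the repository root would return an empty list when every label is in place.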

docs/integrate/mysql/index.md (1)

1-2: Nice: legacy anchors retained for compatibility

Keeping both (mysql)= and (mariadb)= aligns with the project’s guidance to preserve legacy labels for external references.

docs/integrate/apache-iceberg/index.md (1)

11-13: Confirmed {ref} target risingwave-iceberg exists

  • Label defined at docs/integrate/risingwave/apache-iceberg.md:1
    Cross-reference will resolve correctly.
docs/ingest/cdc/index.md (1)

19-25: Use a toctree instead of a manual bullet list for navigation

The PR objective mentions standardizing on toctree; this page still lists refs manually.

Replace the list with a toctree:

-- {ref}`aws-dms`
-- {ref}`aws-dynamodb`
-- {ref}`aws-kinesis`
-- {ref}`debezium`
-- {ref}`mongodb`
-- {ref}`streamsets`
+:::{toctree}
+:maxdepth: 1
+
+integrate/aws-dms/index
+integrate/aws-dynamodb/index
+integrate/aws-kinesis/index
+integrate/debezium/index
+integrate/mongodb/index
+integrate/streamsets/index
+:::

Likely an incorrect or invalid review comment.

docs/integrate/aws-dynamodb/index.md (1)

11-15: Link reference definitions verified

I ran the verification script against docs/_include/links.md and confirmed that definitions for all referenced links (DynamoDB Table Loader, DynamoDB CDC Relay, Amazon DynamoDB Streams, Amazon Kinesis Data Streams, DynamoDB CDC Relay with AWS Lambda, Replicating CDC events from DynamoDB to CrateDB) are present. No broken links detected.

docs/integrate/azure-functions/index.md (1)

1-39: Solid addition; structure and cross-links are consistent.

Includes shared links, clear About/Learn sections, and a learn toctree. No blocking issues.

docs/integrate/mongodb/index.md (1)

40-43: Fix contradictory parenthetical; verify the CLI command name.

This row describes streaming CDC changes but labels it as “(full-load)”. That’s misleading; should be “(cdc)”. Also, please verify whether the CLI command is indeed ctk load table for CDC, or if there is a dedicated CDC command.

-*   - [MongoDB CDC Relay]
+*   - [MongoDB CDC Relay]
     - ✅
     - ✅
-    - CLI `ctk load table` for streaming changes of collections into CrateDB (`full-load`).
+    - CLI `ctk load table` for streaming changes of collections into CrateDB (`cdc`).

Follow-up:

  • If the CDC tooling uses a different subcommand (e.g., relay-specific), adjust the command name accordingly.
docs/_include/links.md (1)

30-47: Intentional central link registry; safe to keep “unused” refs.

markdownlint’s MD053 warnings about unused definitions are expected here. This file centralizes link refs and maintains backward compatibility for external consumers, per prior guidance. No change needed.

docs/integrate/aws-dms/index.md (2)

1-3: Good call keeping dual anchors for back-compat

Having both (aws-dms)= and (cdc-dms)= aligns with the repo’s pattern of retaining legacy labels. No issues here.


11-21: Verify link reference for “AWS Database Migration Service (AWS DMS)”

Ensure the reference label is defined in _include/links.md exactly as used (including parentheses). If not present, switch to a known label (e.g., [AWS DMS]) or add a new definition.

Would you like me to add the link definition in _include/links.md?

docs/ingest/etl/index.md (1)

10-20: Overall restructure LGTM

The concise index with refs aligns with the new layout strategy. Good move removing ad-hoc TOC assembly.

docs/integrate/influxdb/learn.md (1)

1-9: Comprehensive tutorial is clear and actionable

Solid structure, examples, and cloud notes. Once the minor nits above are addressed, this looks production-ready.

Also applies to: 39-57, 58-119, 120-170

docs/integrate/influxdb/index.md (3)

1-1: Anchor strategy check

You moved the legacy (integrate-influxdb)= label to learn.md and introduced (influxdb)= here. If inbound links previously targeted (integrate-influxdb)=, they now land on the tutorial, not the overview. If that’s intentional, all good; otherwise consider adding a small “Overview” at the top of learn.md or an alias note here.

Do you want me to scan the repo for references to (integrate-influxdb)= and summarize impacts?
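
Such a scan can be approximated with a short script. This is a minimal sketch, assuming references are written as single-line MyST `{ref}` or reStructuredText `:ref:` roles in `.md` and `.rst` files; `find_label_references` is a hypothetical helper, and roles that wrap across lines would be missed.

```python
import re
from pathlib import Path


def find_label_references(docs_root, label):
    """Return (path, line_number) pairs for roles that target `label`."""
    # Matches {ref}`label`, {ref}`Title <label>`, :ref:`label`, :ref:`Title <label>`.
    pattern = re.compile(
        r"(?:\{ref\}|:ref:)`(?:[^`<]*<)?" + re.escape(label) + r">?`"
    )
    hits = []
    for path in sorted(Path(docs_root).rglob("*")):
        if path.suffix not in {".md", ".rst"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path, number))
    return hits
```

For example, `find_label_references("docs", "integrate-influxdb")` lists every file and line that would now land on the tutorial instead of the overview.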


10-15: About text LGTM

Clear, succinct intro to InfluxDB’s purpose and fit. No issues.


20-23: Toctree wiring

learn target exists and is discoverable; maxdepth=1 is appropriate. Good.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (4)
docs/integrate/influxdb/learn.md (4)

15-18: Avoid embedding credentials in URLs; use env vars/placeholders

Replace inline user:password@ with environment variables to prevent leaking secrets in examples.

Apply:

 ctk load table \
   "influxdb2://example:[email protected]:8086/testdrive/demo" \
-  --cratedb-sqlalchemy-url="crate://user:[email protected]:4200/testdrive/demo"
+  --cratedb-sqlalchemy-url="crate://user:${CRATEPW}@cratedb.example.org:4200/testdrive/demo"

And define the variable before use (outside this hunk):

export CRATEPW='your-password'
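
The same idea carries over to scripts that assemble connection strings themselves. Below is a minimal Python sketch, assuming the password lives in the `CRATEPW` environment variable as suggested above; `cratedb_url` is a hypothetical helper, not a CrateDB Toolkit API.

```python
import os
from urllib.parse import quote


def cratedb_url(host, user, schema_table, password_var="CRATEPW", port=4200):
    """Build a `crate://` URL, reading the password from an environment variable."""
    password = os.environ.get(password_var)
    if not password:
        raise RuntimeError(f"Set the {password_var} environment variable first")
    # Percent-encode credentials so characters like '@' or '/' do not break the URL.
    return (
        f"crate://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{schema_table}"
    )
```

Percent-encoding matters here because a password containing `@` or `/` would otherwise be misread as part of the host or path.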

28-31: Repeat: remove cleartext credentials from example URL

Same issue in the line protocol example. Use an env var for the password.

 ctk load table \
   "https://github.com/influxdata/influxdb2-sample-data/raw/master/air-sensor-data/air-sensor-data.lp" \
-  --cratedb-sqlalchemy-url="crate://user:[email protected]:4200/testdrive/air-sensor-data"
+  --cratedb-sqlalchemy-url="crate://user:${CRATEPW}@cratedb.example.org:4200/testdrive/air-sensor-data"

154-158: Cloud-to-Cloud: don’t publish secrets; parameterize both CrateDB and InfluxDB credentials

Use environment variables for both CrateDB password and InfluxDB org/token to avoid static secrets in docs and command history.

-export CRATEPW='dZ..qB'
+export CRATEPW='…'
+export INFLUX_ORG='9f…06'
+export INFLUX_TOKEN='T2..=='
 ctk load table \
-  "influxdb2://9f..06:T2..==@eu-central-1-1.aws.cloud2.influxdata.com/testdrive/demo?ssl=true" \
-  --cratedb-sqlalchemy-url="crate://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200/testdrive/demo?ssl=true"
+  "influxdb2://${INFLUX_ORG}:${INFLUX_TOKEN}@eu-central-1-1.aws.cloud2.influxdata.com/testdrive/demo?ssl=true" \
+  --cratedb-sqlalchemy-url="crate://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200/testdrive/demo?ssl=true"

If CTK supports dedicated flags for Influx credentials, prefer those over URL embedding.


167-169: Avoid passing credentials in crash URL; use flags and env var instead

Prefer --host/--username and, if available, --passwordenv to keep secrets out of URLs and shell history.

-export CRATEPW='dZ..qB'
-crash --hosts 'https://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200' --command 'SELECT * FROM testdrive.demo;'
+export CRATEPW='…'
+# If supported by crash:
+# crash --host='purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net' --username='admin' --passwordenv CRATEPW --command 'SELECT * FROM testdrive.demo;'
+# Otherwise, omit password on CLI to get an interactive prompt:
+crash --host='purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net' --username='admin' --command 'SELECT * FROM testdrive.demo;'

If crash lacks --passwordenv, consider documenting the prompt-based flow instead.

🧹 Nitpick comments (1)
docs/ingest/cdc/index.md (1)

16-16: Minor wording: “take a look” in US English

Suggested copyedit for consistency with US English.

-Please also have a look at support for {ref}`generic ETL <etl>` solutions.
+Please also take a look at support for {ref}`generic ETL <etl>` solutions.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3546bc4 and dea439d.

📒 Files selected for processing (13)
  • docs/_include/card/timeseries-datashader.md (1 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (1 hunks)
  • docs/integrate/apache-iceberg/index.md (1 hunks)
  • docs/integrate/aws-dms/index.md (1 hunks)
  • docs/integrate/aws-kinesis/index.md (1 hunks)
  • docs/integrate/cluvio/index.md (1 hunks)
  • docs/integrate/influxdb/learn.md (1 hunks)
  • docs/integrate/meltano/index.md (1 hunks)
  • docs/integrate/mongodb/index.md (1 hunks)
  • docs/integrate/mysql/index.md (1 hunks)
  • docs/integrate/risingwave/apache-iceberg.md (1 hunks)
  • docs/integrate/streamsets/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (4)
  • docs/integrate/meltano/index.md
  • docs/_include/card/timeseries-datashader.md
  • docs/integrate/cluvio/index.md
  • docs/integrate/apache-iceberg/index.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/integrate/risingwave/apache-iceberg.md
  • docs/integrate/aws-kinesis/index.md
  • docs/integrate/mysql/index.md
  • docs/integrate/aws-dms/index.md
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-07T23:11:08.311Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:11-11
Timestamp: 2025-08-07T23:11:08.311Z
Learning: In the CrateDB Guide repository, intersphinx mappings like "cloud" are defined within the root project `crate-docs-theme` and inherited by the documentation projects, so they don't need to be explicitly defined in individual `docs/conf.py` files.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-09T11:03:33.164Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/ingest/etl/index.md:22-23
Timestamp: 2025-08-09T11:03:33.164Z
Learning: In the CrateDB Guide documentation, MyST Markdown files can use `{ref}` directives to reference labels defined in reStructuredText files using the RST syntax `.. _label:`. This cross-format referencing between MyST and RST is supported by Sphinx.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-08T16:50:14.965Z
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-07T23:11:09.657Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:1-1
Timestamp: 2025-08-07T23:11:09.657Z
Learning: In the CrateDB Guide repository, intersphinx mappings like `crate-admin-ui` are inherited from the root project `crate-docs-theme` rather than being defined locally in `docs/conf.py`. This allows shared documentation configurations across multiple CrateDB documentation projects.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-05-18T13:25:11.786Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-06T02:11:46.794Z
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-05T07:14:57.416Z
Learnt from: hammerhead
PR: crate/cratedb-guide#221
File: docs/connect/configure.md:58-66
Timestamp: 2025-08-05T07:14:57.416Z
Learning: In CrateDB connection strings, the user:password@ syntax is valid for HTTP Basic authentication on port 4200, but PostgreSQL JDBC drivers do not support this format and require credentials as query parameters (?user=<user>&password=<password>) instead.

Applied to files:

  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-06-05T14:29:15.512Z
Learnt from: amotl
PR: crate/cratedb-guide#207
File: docs/integrate/etl/iceberg-risingwave.md:205-207
Timestamp: 2025-06-05T14:29:15.512Z
Learning: The `records.Database("crate://", echo=True)` connection string for CrateDB works with defaults: localhost as host, "crate" as user, and blank password. This is valid and functional code in the records library.

Applied to files:

  • docs/integrate/influxdb/learn.md
🪛 LanguageTool
docs/ingest/cdc/index.md

[locale-violation] ~16-~16: In American English, “take a look” is more commonly used.
Context: ... how to use them optimally. Please also have a look at support for {ref}generic ETL <etl>...

(HAVE_A_LOOK)

docs/ingest/etl/index.md

[style] ~17-~17: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...and outlines how to use them optimally. Please also take a look at support for {ref}cdc s...

(PLEASE_ALSO_CHECK)

docs/integrate/streamsets/index.md

[style] ~9-~9: Consider using a more concise synonym.
Context: ...that can ingest and transform data from a variety of sources. StreamSets Data Collector Eng...

(A_VARIETY_OF)

🪛 markdownlint-cli2 (0.17.2)
docs/integrate/influxdb/learn.md

168-168: Bare URL used

(MD034, no-bare-urls)

docs/integrate/mongodb/index.md

30-30: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


30-30: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


31-31: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


31-31: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


32-32: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


32-32: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


33-33: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


33-33: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


34-34: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


34-34: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


35-35: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


35-35: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


36-36: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


36-36: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


37-37: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


37-37: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


39-39: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


39-39: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


40-40: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


40-40: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


41-41: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


41-41: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


42-42: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


42-42: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


43-43: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


43-43: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


44-44: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


44-44: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


45-45: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


45-45: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


46-46: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


46-46: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🔇 Additional comments (10)
docs/ingest/cdc/index.md (2)

8-11: LGTM: concise intro and positioning

Clear statement of CDC via PostgreSQL interface and native adapters aligns with the new layout.


19-24: Cross-reference anchors verified
All referenced labels—aws-dms, aws-dynamodb, aws-kinesis, debezium, mongodb, and streamsets—are defined in their respective docs/integrate/*/index.md files. No further changes are needed.

docs/integrate/streamsets/index.md (1)

1-22: Index + learn toctree pattern looks correct

Primary anchor on index and a learn subpage via toctree matches the intended layout. No blockers.

docs/integrate/mongodb/index.md (4)

22-24: Good scoping of full-load and CDC Learn pointers

Clear separation of full-load vs CDC concepts.


51-55: Verify MongoDB learn page and anchor
I wasn’t able to locate a learn file or the mongodb-learn anchor under docs/integrate/mongodb. Please manually verify that:

  • A tutorial file exists (e.g. docs/integrate/mongodb/learn.rst or mongodb-learn.rst).
  • It includes the anchor directive:
    .. _mongodb-learn:

Once in place, the hidden toctree entry will link correctly to that page.


43-49: Confirm Cloud intersphinx target availability

Please ensure that the cloud:integrations-mongo-cdc key actually exists in the CrateDB Cloud docs inventory (objects.inv). You can verify by fetching and grepping the remote inventory, for example:

# objects.inv is zlib-compressed after its header, so decode it with Sphinx's inspector before grepping.
python -m sphinx.ext.intersphinx https://docs.cratedb.cloud/objects.inv \
  | grep integrations-mongo-cdc

If it isn’t present (or uses a different anchor), update the reference accordingly.

Affected locations:

  • docs/ingest/index.md:41
  • docs/integrate/mongodb/index.md:43
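
Because the entries in a Sphinx `objects.inv` file are zlib-compressed after a four-line plain-text header, a check without Sphinx installed has to decompress them first. The following is a minimal stdlib sketch, assuming the version 2 inventory format; `inventory_names` is a hypothetical helper, and the entry pattern is modeled on the documented `name domain:role priority uri dispname` layout.

```python
import re
import zlib

# One inventory entry: name, domain:role, priority, URI, display name.
ENTRY = re.compile(r"^(.+?)\s+(\S+:\S+)\s+(-?\d+)\s+(\S*)\s+(.*)$")


def inventory_names(raw):
    """Return the set of object names in a Sphinx v2 `objects.inv` payload (bytes)."""
    if not raw.startswith(b"# Sphinx inventory version 2"):
        raise ValueError("not a Sphinx v2 inventory")
    # Skip the four plain-text header lines; the remainder is zlib-compressed.
    offset = 0
    for _ in range(4):
        offset = raw.index(b"\n", offset) + 1
    body = zlib.decompress(raw[offset:]).decode("utf-8")
    names = set()
    for line in body.splitlines():
        match = ENTRY.match(line)
        if match:
            names.add(match.group(1))
    return names
```

The bytes can come from `urllib.request.urlopen(url).read()`; testing `"integrations-mongo-cdc" in inventory_names(raw)` then answers whether the target exists under that name.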

40-43: Update MongoDB CDC row with correct CTK syntax

I wasn’t able to find any dedicated ctk stream command in the repo—only ctk load table is referenced. It looks like CDC streaming may be handled via a mode/flag rather than a separate subcommand. Please confirm the exact syntax (e.g., --mode cdc, a stream flag, or a different subcommand) and update accordingly.

• File: docs/integrate/mongodb/index.md
Lines: 40–43

Suggested diff:

-   - CLI `ctk load table` for streaming changes of collections into CrateDB (`full-load`).
+   - CLI `ctk load table` for streaming changes of collections into CrateDB (`cdc` mode).
+     (e.g., `ctk load table --mode cdc …`)
docs/ingest/etl/index.md (3)

10-13: LGTM: intro mirrors CDC page and follows the new layout

Concise, consistent positioning with mention of native adapters.


18-18: Nice cross-link to CDC

Cross-reference fixed to {ref}. Matches earlier guidance.


25-43: All new reference anchors verified – no broken links detected

Each newly added {ref} anchor in docs/ingest/etl/index.md is defined in its corresponding integration guide:

  • docs/integrate/apache-iceberg/index.md: (apache-iceberg)=
  • docs/integrate/aws-dms/index.md: (aws-dms)=
  • docs/integrate/aws-dynamodb/index.md: (aws-dynamodb)=
  • docs/integrate/aws-kinesis/index.md: (aws-kinesis)=
  • docs/integrate/influxdb/index.md: (influxdb)=
  • docs/integrate/mongodb/index.md: (mongodb)=
  • docs/integrate/mysql/index.md: (mysql)=
  • docs/integrate/streamsets/index.md: (streamsets)=

No broken links expected.

@amotl amotl force-pushed the layout-etl-cdc branch 2 times, most recently from c187654 to 6fb05a4 August 9, 2025 17:05

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
docs/integrate/influxdb/learn.md (3)

13-21: Avoid embedding credentials in URLs and commands

Replace inline passwords with env vars to prevent leakage and bad copy-paste habits.

-ctk load table \
-  "influxdb2://example:[email protected]:8086/testdrive/demo" \
-  --cratedb-sqlalchemy-url="crate://user:[email protected]:4200/testdrive/demo"
+export INFLUX_TOKEN='...'
+export CRATEPW='...'
+ctk load table \
+  "influxdb2://example:${INFLUX_TOKEN}@influxdb.example.org:8086/testdrive/demo" \
+  --cratedb-sqlalchemy-url="crate://user:${CRATEPW}@cratedb.example.org:4200/testdrive/demo"
@@
-export CRATEPW=password
-crash --host=cratedb.example.org --username=user --command='SELECT * FROM testdrive.demo;'
+crash --host=cratedb.example.org --username=user --command='SELECT * FROM testdrive.demo;'
+# If crash supports password env flags, prefer them; otherwise, it will prompt interactively.

Also applies to: 26-34


151-156: Redact secrets in Cloud-to-Cloud examples and use env vars

Do not include access tokens or passwords in example URLs; recommend env vars and safer crash invocation.

-export CRATEPW='dZ..qB'
-ctk load table \
-  "influxdb2://9f..06:[email protected]/testdrive/demo?ssl=true" \
-  "influxdb2://9f..06:T2..==@eu-central-1-1.aws.cloud2.influxdata.com/testdrive/demo?ssl=true" \
+export INFLUX_ORG='9f..06'
+export INFLUX_TOKEN='T2..=='
+export CRATEPW='...'
+ctk load table \
+  "influxdb2://${INFLUX_ORG}:${INFLUX_TOKEN}@eu-central-1-1.aws.cloud2.influxdata.com/testdrive/demo?ssl=true" \
+  --cratedb-sqlalchemy-url="crate://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200/testdrive/demo?ssl=true"
@@
-export CRATEPW='dZ..qB'
-crash --hosts 'https://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200' --command 'SELECT * FROM testdrive.demo;'
+# Prefer separate flags over embedding credentials in a URL; crash will prompt for password if not provided.
+crash --host='purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net' --username='admin' --command 'SELECT * FROM testdrive.demo;'

Also applies to: 165-167


1-3: Add the canonical learn-page anchor

Per convention, give the learn page its own anchor and keep (import-influxdb)= as an alias if useful.

+(integrate-influxdb-learn)=
 (import-influxdb)=
 # Import data from InfluxDB
🧹 Nitpick comments (2)
docs/integrate/streamsets/index.md (1)

11-14: Tighten phrasing; avoid duplication (“Kafka” appears twice)

Consider a crisper sentence and remove the duplicate “Kafka”.

-StreamSets Data Collector Engine makes it easy to run data pipelines from Kafka,
-Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, S3, ADLS, Kafka
-and more. Data Collector Engine runs on-premises or in any cloud, wherever your data
-lives.
+StreamSets Data Collector Engine makes it easy to run data pipelines from Kafka,
+Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, S3, and ADLS.
+It runs on‑premises or in any cloud, wherever your data lives.
docs/ingest/etl/index.md (1)

18-18: Tighten wording for concision

Reduce “also” and prefer a crisper construction.

-Please also take a look at support for {ref}`cdc` solutions.
+Additionally, see {ref}`cdc` solutions.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c187654 and 6fb05a4.

📒 Files selected for processing (14)
  • docs/_include/card/timeseries-datashader.md (1 hunks)
  • docs/ingest/cdc/index.md (1 hunks)
  • docs/ingest/etl/index.md (1 hunks)
  • docs/integrate/apache-iceberg/index.md (1 hunks)
  • docs/integrate/aws-dms/index.md (1 hunks)
  • docs/integrate/aws-kinesis/index.md (1 hunks)
  • docs/integrate/cluvio/index.md (1 hunks)
  • docs/integrate/influxdb/index.md (1 hunks)
  • docs/integrate/influxdb/learn.md (1 hunks)
  • docs/integrate/meltano/index.md (1 hunks)
  • docs/integrate/mongodb/index.md (1 hunks)
  • docs/integrate/mysql/index.md (1 hunks)
  • docs/integrate/risingwave/apache-iceberg.md (1 hunks)
  • docs/integrate/streamsets/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • docs/integrate/cluvio/index.md
  • docs/integrate/meltano/index.md
🚧 Files skipped from review as they are similar to previous changes (6)
  • docs/integrate/risingwave/apache-iceberg.md
  • docs/_include/card/timeseries-datashader.md
  • docs/integrate/apache-iceberg/index.md
  • docs/integrate/aws-kinesis/index.md
  • docs/integrate/aws-dms/index.md
  • docs/integrate/mysql/index.md
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-07T23:11:09.657Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:1-1
Timestamp: 2025-08-07T23:11:09.657Z
Learning: In the CrateDB Guide repository, intersphinx mappings like `crate-admin-ui` are inherited from the root project `crate-docs-theme` rather than being defined locally in `docs/conf.py`. This allows shared documentation configurations across multiple CrateDB documentation projects.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-08-07T23:11:08.311Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:11-11
Timestamp: 2025-08-07T23:11:08.311Z
Learning: In the CrateDB Guide repository, intersphinx mappings like "cloud" are defined within the root project `crate-docs-theme` and inherited by the documentation projects, so they don't need to be explicitly defined in individual `docs/conf.py` files.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-08T16:50:14.965Z
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-09T11:03:33.164Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/ingest/etl/index.md:22-23
Timestamp: 2025-08-09T11:03:33.164Z
Learning: In the CrateDB Guide documentation, MyST Markdown files can use `{ref}` directives to reference labels defined in reStructuredText files using the RST syntax `.. _label:`. This cross-format referencing between MyST and RST is supported by Sphinx.

Applied to files:

  • docs/integrate/influxdb/index.md
  • docs/ingest/cdc/index.md
  • docs/ingest/etl/index.md
  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-06T02:11:46.794Z
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.

Applied to files:

  • docs/ingest/cdc/index.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-05T07:14:57.416Z
Learnt from: hammerhead
PR: crate/cratedb-guide#221
File: docs/connect/configure.md:58-66
Timestamp: 2025-08-05T07:14:57.416Z
Learning: In CrateDB connection strings, the user:password@ syntax is valid for HTTP Basic authentication on port 4200, but PostgreSQL JDBC drivers do not support this format and require credentials as query parameters (?user=<user>&password=<password>) instead.

Applied to files:

  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-06-05T14:29:15.512Z
Learnt from: amotl
PR: crate/cratedb-guide#207
File: docs/integrate/etl/iceberg-risingwave.md:205-207
Timestamp: 2025-06-05T14:29:15.512Z
Learning: The `records.Database("crate://", echo=True)` connection string for CrateDB works with defaults: localhost as host, "crate" as user, and blank password. This is valid and functional code in the records library.

Applied to files:

  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-05-18T13:25:11.786Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.

Applied to files:

  • docs/integrate/mongodb/index.md
🪛 LanguageTool
docs/ingest/cdc/index.md

[style] ~15-~15: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...and outlines how to use them optimally. Please also take a look at support for {ref}`generi...

(PLEASE_ALSO_CHECK)

docs/ingest/etl/index.md

[style] ~17-~17: The word “also” tends to be overused. Consider using a formal alternative to strengthen your wording.
Context: ...and outlines how to use them optimally. Please also take a look at support for {ref}`cdc` s...

(PLEASE_ALSO_CHECK)

docs/integrate/streamsets/index.md

[style] ~9-~9: Consider using a more concise synonym.
Context: ...that can ingest and transform data from a variety of sources. StreamSets Data Collector Eng...

(A_VARIETY_OF)

🪛 markdownlint-cli2 (0.17.2)
docs/integrate/influxdb/learn.md

166-166: Bare URL used

(MD034, no-bare-urls)

docs/integrate/mongodb/index.md

30-30: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


30-30: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


31-31: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


31-31: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


32-32: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


32-32: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


33-33: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


33-33: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


34-34: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


34-34: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


35-35: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


35-35: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


36-36: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


36-36: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


37-37: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


37-37: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


39-39: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


39-39: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


40-40: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


40-40: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


41-41: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


41-41: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


42-42: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


42-42: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


43-43: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


43-43: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


44-44: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


44-44: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


45-45: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


45-45: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


46-46: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


46-46: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (9)
docs/integrate/streamsets/index.md (1)

1-3: Anchor placement and structure look correct

Placing the main anchor in the index page, with a simple toctree pointing to the learn page, aligns with the new pattern. No blockers.

docs/integrate/influxdb/index.md (1)

1-1: Index/learn split and anchors are on point

The main anchor here, with a lean intro and a toctree to “learn”, matches the next-generation layout conventions. Keeping legacy aliases like (integrate-influxdb)= is good for compatibility.

Also applies to: 22-25

docs/integrate/influxdb/learn.md (1)

158-161: Good callout on required SSL parameters

The explicit note about ssl=true on both URLs is essential. Keep it.

docs/ingest/cdc/index.md (2)

8-11: Concise hub page looks good

Lean intro with cross-refs aligns with the new structure.

Also applies to: 16-16


19-24: All {ref} targets verified
All anchor references in docs/ingest/cdc/index.md (aws-dms, aws-dynamodb, aws-kinesis, debezium, mongodb, streamsets) exist in the repository. No further action needed.

docs/integrate/mongodb/index.md (2)

1-25: Overall structure matches the pattern

Index contains main anchor, About + Learn sections, and a hidden toctree to learn content. Looks good.

Also applies to: 51-55


26-49: All link references resolve

The bracketed references in docs/integrate/mongodb/index.md are all defined in docs/_include/links.md.

No further action required.

docs/ingest/etl/index.md (2)

25-25: Anchors Verified: All Required Labels Present

All referenced anchors—apache-iceberg, aws-dms, aws-dynamodb, aws-kinesis, influxdb, mongodb, and mysql—are defined as MyST labels in their respective integration pages. No additions or changes required.


10-13: Reference-style link resolution verified

  • Found definition in docs/_include/links.md (line 13) and a duplicate in docs/integrate/index.md (line 63).
  • The “[CrateDB’s PostgreSQL interface]” key resolves correctly in builds; no broken links detected.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
docs/connect/mcp/index.md (5)

32-52: Align grid to 1-column for MCP pages (author preference)

Per prior MCP layout guidance, keep cards stacked vertically instead of side-by-side.

-::::{grid} 1 2 2 2
+::::{grid} 1 1 1 1

Referencing retrieved learning: MCP pages prefer a 1-column grid layout for multiple cards.


20-24: Normalize reference label casing (defensive consistency)

The link references are defined with capitalized labels ([Prompts], [Resources], [Tools], [Transports]) but used in lowercase in text. While most parsers are case-insensitive, aligning usage avoids edge-case breaks.

-The main entities of MCP are [prompts], [resources], and [tools].
+The main entities of MCP are [Prompts], [Resources], and [Tools].
 MCP clients call MCP servers, either by invoking them as a subprocess and
 communicating via Standard Input/Output (stdio), Server-Sent Events (sse),
-or HTTP Streams (streamable-http), see [transports].
+or HTTP Streams (streamable-http), see [Transports].

21-23: Acronym consistency: capitalize “SSE”

Minor editorial polish for standard acronym casing.

-communicating via Standard Input/Output (stdio), Server-Sent Events (sse),
+communicating via Standard Input/Output (stdio), Server-Sent Events (SSE),

66-66: Remove unused link reference to satisfy markdownlint

“Community Forum” definition is now unused; safe to drop.

-[Community Forum]: https://community.cratedb.com/

1-1: Preserve/restore legacy anchors for backward compatibility

To avoid breaking external references, consider adding explicit anchor labels (keep both legacy and current, if applicable).

Example (please verify actual historic labels before applying):

+(mcp)=
+(connect-mcp)=
 # Model Context Protocol (MCP)

If you confirm prior anchors, I can generate a targeted patch across related pages.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fb05a4 and 9efca51.

📒 Files selected for processing (4)
  • docs/connect/configure.md (3 hunks)
  • docs/connect/index.md (2 hunks)
  • docs/connect/mcp/index.md (3 hunks)
  • docs/integrate/index.md (3 hunks)
✅ Files skipped from review due to trivial changes (2)
  • docs/connect/configure.md
  • docs/connect/index.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/integrate/index.md
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-05-18T13:25:11.786Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.

Applied to files:

  • docs/connect/mcp/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/connect/mcp/index.md
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/connect/mcp/index.md
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/connect/mcp/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/connect/mcp/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/connect/mcp/index.md
🪛 markdownlint-cli2 (0.17.2)
docs/connect/mcp/index.md

66-66: Link and image reference definitions should be needed
Unused link or image reference definition: "community forum"

(MD053, link-image-reference-definitions)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (3)
docs/connect/mcp/index.md (3)

11-12: Section rename to “About” reads cleaner—LGTM

Good structural improvement and consistent with the rest of the guide.


25-27: “Usage” placement after the intro improves flow—LGTM

Clear progression from concept to application.


59-64: Specific forum thread link is a solid improvement—LGTM

Clear call-to-action to the right discussion.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (5)
docs/integrate/influxdb/learn.md (5)

1-3: Align learn-page anchor with repo convention (-learn under integrate) and keep main anchor in index.md

Per the established pattern, the primary integration anchor lives in index.md; learn pages use an “integrate-...-learn” anchor. Rename here and add the main anchor in docs/integrate/influxdb/index.md. This follows the retrieved learning for anchor placement.

-(influxdb-learn)=
+(integrate-influxdb-learn)=
 (import-influxdb)=
 # Import data from InfluxDB

If helpful, I can prep a companion snippet for index.md to host (integrate-influxdb)= and retain any legacy aliases for backward compatibility.


14-17: Avoid embedding credentials in URLs (use env vars or placeholders)

Don’t publish literal passwords/tokens in examples. Prefer env vars or placeholders.

 ctk load table \
   "influxdb2://example:[email protected]:8086/testdrive/demo" \
-  --cratedb-sqlalchemy-url="crate://user:[email protected]:4200/testdrive/demo"
+  --cratedb-sqlalchemy-url="crate://user:${CRATEPW}@cratedb.example.org:4200/testdrive/demo"

Add a short note above this snippet to export CRATEPW or point to a secrets mechanism.
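
A minimal sketch of what such a note could look like, assuming a POSIX-style shell. The password value is a placeholder and the variable name `CRATEDB_URL` is hypothetical, introduced only for illustration. The `${CRATEPW:?...}` form makes the shell stop with an error when the variable is unset or empty, instead of silently assembling a URL without a password.

```shell
# Placeholder value; in practice, source it from a secrets mechanism or a prompt.
export CRATEPW='<PASSWORD>'

# Abort with a message if CRATEPW is unset or empty, rather than
# building a connection URL that lacks a password.
CRATEDB_URL="crate://user:${CRATEPW:?CRATEPW is not set}@cratedb.example.org:4200/testdrive/demo"

# Report only the length, so the secret is not echoed to the terminal.
printf 'URL assembled (%s characters)\n' "${#CRATEDB_URL}"
```

The assembled value can then be passed as `--cratedb-sqlalchemy-url="${CRATEDB_URL}"`. Passwords containing characters such as `@` or `/` would additionally need URL-encoding, which this sketch does not handle.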


27-30: Scrub credentials in file-based import example too

Same principle: avoid hard-coded secrets.

 ctk load table \
   "https://github.com/influxdata/influxdb2-sample-data/raw/master/air-sensor-data/air-sensor-data.lp" \
-  --cratedb-sqlalchemy-url="crate://user:[email protected]:4200/testdrive/air-sensor-data"
+  --cratedb-sqlalchemy-url="crate://user:${CRATEPW}@cratedb.example.org:4200/testdrive/air-sensor-data"

153-157: Cloud import: avoid embedding tokens/passwords and keep expansion safe

Use env vars for both CrateDB password and InfluxDB token/Org, and avoid hard-coding secrets in URLs.

-export CRATEPW='dZ..qB'
-ctk load table \
-  "influxdb2://9f..06:[email protected]/testdrive/demo?ssl=true" \
-  --cratedb-sqlalchemy-url="crate://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200/testdrive/demo?ssl=true"
+export CRATEPW='<PASSWORD>'
+export INFLUX_ORG='<INFLUX_ORG_ID>'
+export INFLUX_TOKEN='<INFLUX_TOKEN>'
+ctk load table \
+  "influxdb2://${INFLUX_ORG}:${INFLUX_TOKEN}@eu-central-1-1.aws.cloud2.influxdata.com/testdrive/demo?ssl=true" \
+  --cratedb-sqlalchemy-url="crate://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200/testdrive/demo?ssl=true"

166-168: Bug: single quotes prevent ${CRATEPW} expansion; also prefer flags over URL creds

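Single quotes block variable expansion, so ${CRATEPW} won’t substitute. Also, consider using flags/env vars instead of embedding credentials in the URL. Note that crash's documented connection option is --hosts; the --host and --passwordenv flags in the suggestion below are unverified and should be checked against `crash --help` before applying.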

-export CRATEPW='dZ..qB'
-crash --hosts 'https://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200' --command 'SELECT * FROM testdrive.demo;'
+export CRATEPW='<PASSWORD>'
+# Preferred (no URL credentials):
+crash --host purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net --username admin --passwordenv=CRATEPW --command 'SELECT * FROM testdrive.demo;'
+# If you must keep URL form, ensure expansion works (not recommended):
+# crash --hosts "https://admin:${CRATEPW}@purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net:4200" --command 'SELECT * FROM testdrive.demo;'

If --passwordenv is not supported, I can suggest an alternative that still avoids committing secrets to history.
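
The quoting issue can be reproduced without any database tooling. The following sketch assumes a POSIX-style shell; the host `cratedb.example.org` and the password value are placeholders, and the variable names `single` and `double` are hypothetical.

```shell
export CRATEPW='<PASSWORD>'

# Single quotes: the shell passes the text through literally,
# so the URL still contains the characters ${CRATEPW}.
single='https://admin:${CRATEPW}@cratedb.example.org:4200'

# Double quotes: the shell substitutes the variable's value.
double="https://admin:${CRATEPW}@cratedb.example.org:4200"

printf 'single: %s\n' "$single"
printf 'double: %s\n' "$double"
```

The first line printed still shows the literal `${CRATEPW}`, while the second shows the substituted placeholder value, which is why the URL argument needs double quotes.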

🧹 Nitpick comments (4)
docs/integrate/mongodb/index.md (1)

26-49: Silence markdownlint false-positives for MyST list-table

markdownlint flags MD004/MD007 on MyST list-table syntax even though this is valid. Add inline disables around the block to keep CI green without altering content.

Apply this diff:

+<!-- markdownlint-disable MD004 MD007 -->
 :::{list-table}
 :header-rows: 1
 :widths: auto

 *   - Feature
     - CrateDB
     - CrateDB Cloud
-    - Description 
+    - Description
 *   - [MongoDB Table Loader]
     - ✅
     - ✅
     - CLI `ctk load table` for loading collections into CrateDB (`full-load`).
       Tutorial: {ref}`import-mongodb`
 *   - [MongoDB CDC Relay]
     - ✅
     - ✅
     - CLI `ctk load table` for streaming changes of collections into CrateDB (`cdc`).
 *   - {ref}`MongoDB CDC integration <cloud:integrations-mongo-cdc>`
     - ❌
     - ✅
     - Managed data loading from MongoDB and MongoDB Atlas into CrateDB Cloud
       (`full-load` and `cdc`), including advanced data translation and compensation
       strategies.
 :::
+<!-- markdownlint-enable MD004 MD007 -->
docs/integrate/streamsets/index.md (1)

7-13: Tighten wording, remove duplication, expand acronyms once

Small copyedits: Oxford comma, avoid “a variety of,” remove duplicated “Kafka” and repeated “and more,” expand ADLS/S3 once, and trim marketing fluff.

Apply this diff:

-The [StreamSets Data Collector] is a lightweight and powerful engine that
-allows you to build streaming, batch and change-data-capture (CDC) pipelines
-that can ingest and transform data from a variety of sources.
+The [StreamSets Data Collector] is a lightweight, powerful engine for building
+streaming, batch, and change data capture (CDC) pipelines that ingest and transform
+data from various sources.
 
-StreamSets Data Collector Engine makes it easy to run data pipelines from Kafka,
-Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, S3, ADLS, Kafka
-and more. It runs on‑premises or in any cloud, wherever your data lives.
+Use it to run pipelines from sources such as Kafka, Oracle, Salesforce, JDBC, and Hive
+to destinations including Snowflake, Databricks, Amazon S3, and Azure Data Lake Storage (ADLS).
+It runs on‑premises or in any cloud.
docs/integrate/influxdb/learn.md (2)

135-146: Replace realistic-looking credentials with obvious placeholders

Even anonymized tokens/passwords set a precedent. Use clear placeholders for docs.

-:::{rubric} CrateDB Cloud
-:::
-- Host: ```purple-shaak-ti.eks1.eu-west-1.aws.cratedb.net```
-- Username: ```admin```
-- Password: ```dZ..qB```
+:::{rubric} CrateDB Cloud
+:::
+- Host: ```<CRATEDB_HOST>```
+- Username: ```<CRATEDB_USER>```
+- Password: ```<CRATEPW>```
 
-:::{rubric} InfluxDB Cloud
-:::
-- Host: ```eu-central-1-1.aws.cloud2.influxdata.com```
-- Organization ID: ```9fafc869a91a3406```
-- All-Access API token: ```T2..==```
+:::{rubric} InfluxDB Cloud
+:::
+- Host: ```<INFLUX_HOST>```
+- Organization ID: ```<INFLUX_ORG_ID>```
+- All-Access API token: ```<INFLUX_TOKEN>```

50-50: Minor prose tweak: use a proper note/admonition for the “via” citation

Stylistic nit: replace the leading “--” with a blockquote or note for clarity.

Example:

> via: [What are series and bucket in InfluxDB]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9efca51 and f70fefb.

📒 Files selected for processing (5)
  • docs/connect/mcp/index.md (4 hunks)
  • docs/integrate/influxdb/learn.md (1 hunks)
  • docs/integrate/mongodb/index.md (1 hunks)
  • docs/integrate/mysql/index.md (1 hunks)
  • docs/integrate/streamsets/index.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/connect/mcp/index.md
  • docs/integrate/mysql/index.md
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
  • docs/integrate/streamsets/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-08T16:50:14.965Z
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.

Applied to files:

  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-09T11:03:33.164Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/ingest/etl/index.md:22-23
Timestamp: 2025-08-09T11:03:33.164Z
Learning: In the CrateDB Guide documentation, MyST Markdown files can use `{ref}` directives to reference labels defined in reStructuredText files using the RST syntax `.. _label:`. This cross-format referencing between MyST and RST is supported by Sphinx.

Applied to files:

  • docs/integrate/influxdb/learn.md
  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-08-07T23:11:09.657Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:1-1
Timestamp: 2025-08-07T23:11:09.657Z
Learning: In the CrateDB Guide repository, intersphinx mappings like `crate-admin-ui` are inherited from the root project `crate-docs-theme` rather than being defined locally in `docs/conf.py`. This allows shared documentation configurations across multiple CrateDB documentation projects.

Applied to files:

  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-08-05T07:14:57.416Z
Learnt from: hammerhead
PR: crate/cratedb-guide#221
File: docs/connect/configure.md:58-66
Timestamp: 2025-08-05T07:14:57.416Z
Learning: In CrateDB connection strings, the user:password@ syntax is valid for HTTP Basic authentication on port 4200, but PostgreSQL JDBC drivers do not support this format and require credentials as query parameters (?user=<user>&password=<password>) instead.

Applied to files:

  • docs/integrate/influxdb/learn.md
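
As a rough illustration of the learning above, the sketch below builds both connection-string shapes. The helper names, the `https` scheme, the PostgreSQL port `5432`, and the `doc` database name are assumptions made for this example and are not taken from the PR.

```python
from urllib.parse import quote, urlencode


def http_url(host: str, user: str, password: str, port: int = 4200) -> str:
    """Build an HTTP URL that carries credentials as `user:password@`.

    Hypothetical helper; the `https` scheme is an assumption.
    """
    # Percent-encode credentials so characters like '@' do not break the URL.
    return f"https://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def jdbc_url(host: str, user: str, password: str,
             port: int = 5432, database: str = "doc") -> str:
    """Build a PostgreSQL JDBC URL with credentials as query parameters.

    Hypothetical helper; port and database name are assumptions.
    """
    query = urlencode({"user": user, "password": password})
    return f"jdbc:postgresql://{host}:{port}/{database}?{query}"


print(http_url("cratedb.example.org", "user", "p@ss"))
print(jdbc_url("cratedb.example.org", "user", "p@ss"))
```

Keeping the two builders separate mirrors the point of the learning: the `user:password@` form applies to the HTTP endpoint only, while the JDBC form moves the credentials into the query string.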
📚 Learning: 2025-06-05T14:29:15.512Z
Learnt from: amotl
PR: crate/cratedb-guide#207
File: docs/integrate/etl/iceberg-risingwave.md:205-207
Timestamp: 2025-06-05T14:29:15.512Z
Learning: The `records.Database("crate://", echo=True)` connection string for CrateDB works with defaults: localhost as host, "crate" as user, and blank password. This is valid and functional code in the records library.

Applied to files:

  • docs/integrate/influxdb/learn.md
📚 Learning: 2025-08-07T23:11:08.311Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:11-11
Timestamp: 2025-08-07T23:11:08.311Z
Learning: In the CrateDB Guide repository, intersphinx mappings like "cloud" are defined within the root project `crate-docs-theme` and inherited by the documentation projects, so they don't need to be explicitly defined in individual `docs/conf.py` files.

Applied to files:

  • docs/integrate/mongodb/index.md
📚 Learning: 2025-08-06T02:11:46.794Z
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.

Applied to files:

  • docs/integrate/mongodb/index.md
📚 Learning: 2025-05-18T13:25:11.786Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.

Applied to files:

  • docs/integrate/mongodb/index.md
🪛 markdownlint-cli2 (0.17.2)
docs/integrate/influxdb/learn.md

167-167: Bare URL used

(MD034, no-bare-urls)

docs/integrate/mongodb/index.md

30-30: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


30-30: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


31-31: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


31-31: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


32-32: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


32-32: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


33-33: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


33-33: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


34-34: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


34-34: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


35-35: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


35-35: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


36-36: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


36-36: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


37-37: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


37-37: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


39-39: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


39-39: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


40-40: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


40-40: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


41-41: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


41-41: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


42-42: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


42-42: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


43-43: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


43-43: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


44-44: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


44-44: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


45-45: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


45-45: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


46-46: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


46-46: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🪛 LanguageTool
docs/integrate/streamsets/index.md

[style] ~9-~9: Consider using a more concise synonym.
Context: ...that can ingest and transform data from a variety of sources. StreamSets Data Collector Eng...

(A_VARIETY_OF)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (10)
docs/integrate/mongodb/index.md (4)

1-2: Anchor and title look correct; consider retaining legacy anchors for back-compat.

Pattern aligns with the new layout. If older anchors existed (e.g., from ETL/CDC pages), keep them alongside (mongodb)= to avoid breaking inbound links.

If applicable, add previous labels above the H1, for example:

(old-anchor-1)=
(old-anchor-2)=

21-24: Terminology consistency: full-load vs CDC

Good separation between initial loads (“full-load”) and streaming changes (“cdc”). No issues here.


39-42: CDC labeling is correct; please confirm the CLI verb for streaming mode

“Streaming changes … (cdc)” is right. Double-check that the command and verb are accurate for CDC streaming (e.g., whether it’s still ctk load table or a relay-style subcommand).

If the verb differs, adjust the phrasing accordingly.


43-49: Cloud intersphinx target is correct

cloud:integrations-mongo-cdc via the inherited “cloud” mapping matches the documented pattern. Looks good.

docs/integrate/streamsets/index.md (3)

1-2: Anchor placement and title LGTM

Anchor (streamsets)= in index.md matches the established integration pattern; heading is consistent.


1-1: No legacy StreamSets anchors found—no additional aliases needed
The only StreamSets label in the repository is the new (streamsets)= in docs/integrate/streamsets/index.md. No other legacy anchors or references to preserve were detected across the docs.


23-23: External link validated

The URL https://www.softwareag.com/en_corporate/platform/integration-apis/data-collector-engine.html returns HTTP 200 OK. No update to the link is needed.

docs/integrate/influxdb/learn.md (3)

167-167: markdownlint MD034 (bare URL) — verify if linting includes code blocks

The linter flagged a bare URL at this line. If your config lints inside fenced code blocks, consider suppressing or accept as-is; otherwise this may be a false positive.

Would you like me to run a quick lint check and adjust the config or snippet accordingly?


20-22: Remove the --passwordenv suggestion: crash auto-reads CRATEPW

The crash CLI automatically uses the CRATEPW environment variable for authentication, so no extra flags are required. Your original snippet already exports CRATEPW, and that will be picked up when a password is needed. You can simply clarify the placeholder:

export CRATEPW='<your_password>'
crash --host=cratedb.example.org --username=user --command='SELECT * FROM testdrive.demo;'

Likely an incorrect or invalid review comment.


33-35: Ensure the password isn’t exposed in your shell history

Rather than hard-coding the password or leaving CRATEPW unused, inject it at runtime. For example, you can use HTTP Basic auth in the URL (which crash supports) or, if available, a dedicated password flag:

• Option 1: Embed creds in the connection URL

-export CRATEPW=password
-crash --host=cratedb.example.org --username=user --command='SELECT * FROM testdrive."air-sensor-data";'
+export CRATEPW='<PASSWORD>'
+crash --hosts "https://user:${CRATEPW}@cratedb.example.org:4200" \
+      --command 'SELECT * FROM testdrive."air-sensor-data";'

• Option 2: Use a password‐env flag (verify via crash --help)

-export CRATEPW=password
-crash --host=cratedb.example.org --username=user --command='SELECT * FROM testdrive."air-sensor-data";'
+export CRATEPW='<PASSWORD>'
+crash --host=cratedb.example.org --username=user \
+      --passwordenv=CRATEPW \
+      --command 'SELECT * FROM testdrive."air-sensor-data";'

Please confirm which of these approaches (URL auth or --passwordenv) is supported by your crash client and update the example accordingly.

@amotl amotl marked this pull request as ready for review August 9, 2025 21:55

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
docs/integrate/streamsets/index.md (1)

18-21: toctree correctly points to learn page (cross-format OK)

Referencing learn (RST) from MyST toctree is supported by Sphinx and matches the pattern used elsewhere.

🧹 Nitpick comments (1)
docs/integrate/streamsets/index.md (1)

11-13: Replace non-breaking hyphen in “on‑premises” with standard hyphen

There’s a non-breaking hyphen (U+2011) in “on‑premises”. Prefer ASCII hyphen for consistency, searchability, and to avoid lint false-positives.

- It runs on‑premises or in any cloud.
+ It runs on-premises or in any cloud.
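
A minimal sketch of how such characters could be located and normalized before committing. The function names are hypothetical, and the sample sentence is the one quoted in this nitpick; this is not part of the repository's tooling.

```python
NON_BREAKING_HYPHEN = "\u2011"  # U+2011, renders like "-" but is a different code point


def find_non_breaking_hyphens(text: str) -> list[tuple[int, int]]:
    """Return 1-based (line, column) positions of every U+2011 in the text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if char == NON_BREAKING_HYPHEN:
                hits.append((line_no, col))
    return hits


def normalize_hyphens(text: str) -> str:
    """Replace every U+2011 with the ASCII hyphen-minus (U+002D)."""
    return text.replace(NON_BREAKING_HYPHEN, "-")


sample = "It runs on\u2011premises or in any cloud."
print(find_non_breaking_hyphens(sample))
print(normalize_hyphens(sample))
```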
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f70fefb and b8d438e.

📒 Files selected for processing (7)
  • docs/conf.py (1 hunks)
  • docs/connect/mcp/index.md (4 hunks)
  • docs/integrate/influxdb/learn.md (1 hunks)
  • docs/integrate/mongodb/index.md (1 hunks)
  • docs/integrate/mongodb/learn.md (1 hunks)
  • docs/integrate/mysql/index.md (1 hunks)
  • docs/integrate/streamsets/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (4)
  • docs/integrate/mongodb/learn.md
  • docs/conf.py
  • docs/integrate/mysql/index.md
  • docs/integrate/influxdb/learn.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/integrate/mongodb/index.md
  • docs/connect/mcp/index.md
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.
📚 Learning: 2025-08-09T16:19:43.947Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-09T21:40:46.125Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/streamsets/index.md:18-21
Timestamp: 2025-08-09T21:40:46.125Z
Learning: In the CrateDB Guide documentation, reStructuredText files (`.rst`) use the RST label syntax `.. _label:` while MyST Markdown files (`.md`) use the syntax `(label)=`. Both formats are used throughout the repository and labels can be cross-referenced between them.

Applied to files:

  • docs/integrate/streamsets/index.md
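
To make the two label syntaxes concrete, here is a small sketch that extracts labels from either format with regular expressions. The patterns are a simplification of what Sphinx and MyST actually accept, and the `streamsets-learn` label is a hypothetical example following the `-learn` naming pattern.

```python
import re

# MyST Markdown label: a line of the form `(label)=`
MYST_LABEL = re.compile(r"^\(([^)]+)\)=[ \t]*$", re.MULTILINE)
# reStructuredText label: a line of the form `.. _label:`
RST_LABEL = re.compile(r"^\.\. _([^:]+):[ \t]*$", re.MULTILINE)


def extract_labels(text: str, suffix: str) -> list[str]:
    """Return the labels defined in a page, choosing the pattern by file suffix."""
    pattern = MYST_LABEL if suffix == ".md" else RST_LABEL
    return pattern.findall(text)


myst_page = "(streamsets)=\n# StreamSets\n"
rst_page = ".. _streamsets-learn:\n\nTutorial\n========\n"

print(extract_labels(myst_page, ".md"))
print(extract_labels(rst_page, ".rst"))
```

A check like this could be run over both file types to confirm that a `{ref}` target exists somewhere in the tree, regardless of which format defines it.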
📚 Learning: 2025-08-09T11:03:33.164Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/ingest/etl/index.md:22-23
Timestamp: 2025-08-09T11:03:33.164Z
Learning: In the CrateDB Guide documentation, MyST Markdown files can use `{ref}` directives to reference labels defined in reStructuredText files using the RST syntax `.. _label:`. This cross-format referencing between MyST and RST is supported by Sphinx.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-06T02:11:46.794Z
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-08T16:50:14.965Z
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-09T10:59:42.048Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-09T11:00:03.537Z
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.

Applied to files:

  • docs/integrate/streamsets/index.md
📚 Learning: 2025-08-07T23:11:08.311Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:11-11
Timestamp: 2025-08-07T23:11:08.311Z
Learning: In the CrateDB Guide repository, intersphinx mappings like "cloud" are defined within the root project `crate-docs-theme` and inherited by the documentation projects, so they don't need to be explicitly defined in individual `docs/conf.py` files.

Applied to files:

  • docs/integrate/streamsets/index.md
🪛 LanguageTool
docs/integrate/streamsets/index.md

[grammar] ~13-~13: Ensure spelling is correct
Context: ... Azure Data Lake Storage (ADLS). It runs on‑premises or in any cloud. :::{rubric} Learn :::...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🔇 Additional comments (2)
docs/integrate/streamsets/index.md (2)

1-1: Anchor placement/prefix looks correct for integration landing pages

Main anchor (streamsets)= in index.md aligns with the integration docs pattern. Good.


1-3: Add missing MyST anchor (streamsets)= to docs/integrate/streamsets/index.md

We detected ``{ref}`streamsets` `` in `docs/ingest/etl/index.md` and `docs/ingest/cdc/index.md`, but no corresponding `(streamsets)=` label exists. Please insert this alias at the top of `docs/integrate/streamsets/index.md`:

+ (streamsets)=
  # StreamSets

This restores backward-compatibility for those references.

⛔ Skipped due to learnings
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/connect/ide.md:2-2
Timestamp: 2025-08-09T10:59:42.048Z
Learning: In the CrateDB Guide documentation, legacy anchor labels (like `(ide)=`) should be retained alongside newer, more specific ones (like `(connect-ide)=`) to maintain backward compatibility for external projects that might reference them, even if they appear redundant within the repository.
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/ingest/etl/index.md:22-23
Timestamp: 2025-08-09T11:03:33.164Z
Learning: In the CrateDB Guide documentation, MyST Markdown files can use `{ref}` directives to reference labels defined in reStructuredText files using the RST syntax `.. _label:`. This cross-format referencing between MyST and RST is supported by Sphinx.
Learnt from: amotl
PR: crate/cratedb-guide#236
File: docs/index.md:312-316
Timestamp: 2025-08-09T11:00:03.537Z
Learning: In the CrateDB Guide documentation, locally stale references (such as `(reference-architectures)=`, `(domain)=`, and similar anchor labels) should be retained even when they appear unused within the repository, as external projects might still reference them. This ensures backward compatibility for external documentation and projects.
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/streamsets/index.md:18-21
Timestamp: 2025-08-09T21:40:46.125Z
Learning: In the CrateDB Guide documentation, reStructuredText files (`.rst`) use the RST label syntax `.. _label:` while MyST Markdown files (`.md`) use the syntax `(label)=`. Both formats are used throughout the repository and labels can be cross-referenced between them.
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.947Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
Learnt from: amotl
PR: crate/cratedb-guide#222
File: docs/integrate/powerbi/index.md:16-23
Timestamp: 2025-08-06T02:11:46.794Z
Learning: The `project:` role in MyST Markdown syntax like `[](project:#powerbi-desktop)` and `[](project:#powerbi-service)` is valid cross-reference syntax in the CrateDB Guide repository's MyST Markdown setup.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.

Base automatically changed from layout-ng to main August 12, 2025 02:39
amotl added 7 commits August 12, 2025 04:40
- Dissolve individual pages in category section `etl`, relocating them
  into dedicated items within the backbone section `integrate` instead.

- Relocated items: Azure Functions, Apache Iceberg, InfluxDB, MongoDB,
  MySQL and MariaDB, RisingWave, Streamsets.

- Dissolve weird page toc assembly on ETL and CDC category index pages,
  using `toctree` only for now.
There’s a non-breaking hyphen (U+2011) in “on‑premises”. Prefer ASCII hyphen for consistency, searchability, and to avoid lint false-positives.

Labels

guidance Matters of layout, shape, and structure. refactoring Changing shape or layout, or moving content around. sanding-500 Sanding medium-sized details.
