From 4dd01ce6b4518ec91f5b30d5b07cfc046111b343 Mon Sep 17 00:00:00 2001 From: Brian Munkholm Date: Thu, 26 Jun 2025 21:23:32 +0200 Subject: [PATCH 1/4] Remove page-local TOC --- docs/admin/bootstrap-checks.rst | 5 --- docs/admin/circuit-breaker.rst | 34 +++++++------- .../clustering/logical-replication-setup.rst | 5 --- docs/admin/clustering/multi-node-setup.rst | 27 +++++------ docs/admin/clustering/multi-zone-setup.rst | 5 --- docs/admin/clustering/scale/kubernetes.rst | 5 --- docs/admin/going-into-production.rst | 5 --- docs/admin/sharding-partitioning.rst | 5 --- docs/admin/troubleshooting/crate-node.rst | 8 ---- docs/admin/troubleshooting/jcmd/docker.rst | 5 --- docs/admin/troubleshooting/system-tables.rst | 5 --- docs/admin/upgrade/full.rst | 5 --- docs/admin/upgrade/planning.rst | 12 ++--- docs/admin/upgrade/rolling.rst | 6 --- docs/domain/timeseries/advanced.md | 9 +--- docs/domain/timeseries/generate/cli.rst | 5 --- docs/domain/timeseries/generate/go.rst | 6 --- docs/domain/timeseries/generate/node.rst | 6 --- docs/domain/timeseries/generate/python.rst | 6 --- .../timeseries/learn/normalize-pandas.rst | 5 --- docs/domain/timeseries/longterm.md | 2 +- docs/install/cloud/aws/ec2-setup.rst | 5 --- docs/install/cloud/aws/s3-setup.rst | 5 --- docs/install/cloud/azure/vm.rst | 5 --- docs/install/container/docker.rst | 6 --- .../container/kubernetes/kubernetes.rst | 4 -- docs/integrate/etl/mysql.rst | 5 --- docs/integrate/visualize/grafana.rst | 45 +++++++++---------- docs/integrate/visualize/metabase.rst | 11 ++--- docs/performance/inserts/bulk.rst | 4 -- docs/performance/inserts/methods.rst | 5 --- docs/performance/inserts/parallel.rst | 5 --- docs/performance/inserts/testing.rst | 5 --- docs/performance/inserts/tuning.rst | 5 --- docs/performance/sharding.rst | 5 --- 35 files changed, 56 insertions(+), 230 deletions(-) diff --git a/docs/admin/bootstrap-checks.rst b/docs/admin/bootstrap-checks.rst index 6f9968ec..6d8ad01a 100644 --- 
a/docs/admin/bootstrap-checks.rst +++ b/docs/admin/bootstrap-checks.rst @@ -23,11 +23,6 @@ run, but it is still a good idea to follow these instructions. Docker. Consult the additional documentation on Docker :ref:`resource constraints ` for more information. -.. rubric:: Table of contents - -.. contents:: - :local: - System settings =============== diff --git a/docs/admin/circuit-breaker.rst b/docs/admin/circuit-breaker.rst index 5ac531b8..64b12b0c 100644 --- a/docs/admin/circuit-breaker.rst +++ b/docs/admin/circuit-breaker.rst @@ -13,14 +13,14 @@ Think of the miniature breakers inside a household fuse box: if too many applian trips and cuts power to prevent the wires from melting. The same principle applies in software, only the resource under pressure is memory, CPU, file descriptors, or an external service. -In CrateDB, the critical resource is **RAM**. Queries run in parallel across many shards; a single -oversized aggregation or JOIN can allocate gigabytes in milliseconds. The breaker detects this and aborts the query with a +In CrateDB, the critical resource is **RAM**. Queries run in parallel across many shards; a single +oversize aggregation or JOIN can allocate gigabytes in milliseconds. The breaker detects this and aborts the query with a ``CircuitBreakingException`` instead of letting the JVM run out of heap and crash the node. How Circuit Breakers Work in CrateDB ==================================== A query executes as an ordered set of operations. Before running each stage, CrateDB estimates the extra memory that step will need. -If the projected total would exceed the breaker limit, the system aborts the query and returns a ``CircuitBreakingException``. +If the projected total would exceed the breaker limit, the system aborts the query and returns a ``CircuitBreakingException``. This pre-emptive trip prevents the JVM's garbage collector from reaching an unrecoverable out-of-memory state. 
It is important to understand CrateDB doesn’t aspire to do a fully accurate memory accounting, but instead opts for a best-effort approach, @@ -29,9 +29,9 @@ since a precise estimate is tricky to achieve. Types of Circuit Breakers ========================= There are six different Circuit Breaker types which are described in detail in the `cluster settings`_ documentation page: ``query``, -``request``, ``jobs_log``, ``operations_log``, ``total`` and ``accounting``, which was deprecated and will be removed soon. The ``total`` Circuit Breaker, also -known as ``parent``, accounts for all others, meaning that it controls the general use of memory, tripping an operation if a -combination of the circuit breakers threatens the cluster. +``request``, ``jobs_log``, ``operations_log``, ``total`` and ``accounting``, which was deprecated and will be removed soon. The ``total`` Circuit Breaker, also +known as ``parent``, accounts for all others, meaning that it controls the general use of memory, tripping an operation if a +combination of the circuit breakers threatens the cluster. Monitoring & Observability ========================== @@ -45,29 +45,29 @@ deployment to collecting metrics and displaying them on a Grafana dashboard. Exception Handling ================== .. code-block:: console - + CircuitBreakingException[Allocating 2mb for 'query: mergeOnHandler' failed, breaker would use 976.4mb in total. Limit is 972.7mb. Either increase memory and limit, change the query or reduce concurrent query load] -* **Understanding the error** +* **Understanding the error** The memory estimate for **mergeOnHandler** exceeded the ``indices.breaker.query.limit``, so the query was aborted and the exception returned. -* **Immediate actions** +* **Immediate actions** * **Optimize the query** - see :ref:`Query Optimization 101 ` for detailed guidance. * **Identify memory-hungry queries** - run: - + .. 
code-block:: psql - + SELECT js.id, - stmt, - username, - sum(used_bytes) sum_bytes - FROM sys.operations op - JOIN sys.jobs js ON op.job_id = js.id - GROUP BY js.id, stmt, username + stmt, + username, + sum(used_bytes) sum_bytes + FROM sys.operations op + JOIN sys.jobs js ON op.job_id = js.id + GROUP BY js.id, stmt, username ORDER BY sum_bytes DESC; diff --git a/docs/admin/clustering/logical-replication-setup.rst b/docs/admin/clustering/logical-replication-setup.rst index a5747d90..6deba703 100644 --- a/docs/admin/clustering/logical-replication-setup.rst +++ b/docs/admin/clustering/logical-replication-setup.rst @@ -10,11 +10,6 @@ As a publish/subscribe model, it allows a publishing cluster to make certain tables available for subscription. Subscribing clusters pull changes from a publication and replay them on their side. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _requirements: Requirements diff --git a/docs/admin/clustering/multi-node-setup.rst b/docs/admin/clustering/multi-node-setup.rst index 1e84f89a..4054a9ad 100644 --- a/docs/admin/clustering/multi-node-setup.rst +++ b/docs/admin/clustering/multi-node-setup.rst @@ -15,11 +15,6 @@ process :ref:`manually `. This guide shows you how to bootstrap (set up) a multi-node CrateDB cluster using different methods. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _cluster-bootstrapping: @@ -97,11 +92,11 @@ instructions. sh$ tar -xzf crate-*.tar.gz -2. It is common to configure the :ref:`metadata gateway ` so +2. It is common to configure the :ref:`metadata gateway ` so that the cluster waits for all data nodes to be online before starting the - recovery of the shards. In this case let's set - `gateway.expected_data_nodes`_ to **3** and - `gateway.recover_after_data_nodes`_ also to **3**. You can specify these + recovery of the shards. In this case let's set + `gateway.expected_data_nodes`_ to **3** and + `gateway.recover_after_data_nodes`_ also to **3**. 
You can specify these settings in the `configuration`_ file of the unpacked directory. .. NOTE:: @@ -321,8 +316,8 @@ network partition (also known as a `split-brain`_ scenario). CrateDB (versions 4.x and above) will automatically determine the ideal `quorum size`_, but if you are using CrateDB versions 3.x and below, you must manually set -the quorum size using the `discovery.zen.minimum_master_nodes`_ setting and for -a three-node cluster, you must declare all nodes to be master-eligible. +the quorum size using the `discovery.zen.minimum_master_nodes`_ setting and for +a three-node cluster, you must declare all nodes to be master-eligible. .. _metadata-gateway: @@ -332,13 +327,13 @@ Metadata gateway When running a multi-node cluster, you can configure the :ref:`metadata gateway ` settings so that CrateDB delays recovery until a certain number of nodes is available. -This is useful because if recovery is started when some nodes are down -CrateDB will proceed on the basis the nodes that are down may not be coming -back, and it will create new replicas and rebalance shards as necessary. -This is an expensive operation that, depending on the context, may be better +This is useful because if recovery is started when some nodes are down +CrateDB will proceed on the basis the nodes that are down may not be coming +back, and it will create new replicas and rebalance shards as necessary. +This is an expensive operation that, depending on the context, may be better avoided if the nodes are only down for a short period of time. So, for instance, for a three-nodes cluster, you can decide to set -`gateway.expected_data_nodes`_ to **3**, and +`gateway.expected_data_nodes`_ to **3**, and `gateway.recover_after_data_nodes`_ also to **3**. 
You can specify both settings in your `configuration`_ file: diff --git a/docs/admin/clustering/multi-zone-setup.rst b/docs/admin/clustering/multi-zone-setup.rst index 7efa0f1f..b8f02306 100644 --- a/docs/admin/clustering/multi-zone-setup.rst +++ b/docs/admin/clustering/multi-zone-setup.rst @@ -20,11 +20,6 @@ In some cases, it may be necessary to run a cluster across multiple data centers or availability zones (*zones*, for short). This guide shows you how to set up a multi-zone CrateDB cluster. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _multi-zone-requirements: diff --git a/docs/admin/clustering/scale/kubernetes.rst b/docs/admin/clustering/scale/kubernetes.rst index 1a5af110..f473ac5d 100644 --- a/docs/admin/clustering/scale/kubernetes.rst +++ b/docs/admin/clustering/scale/kubernetes.rst @@ -26,11 +26,6 @@ Together, Docker and Kubernetes are a fantastic way to deploy and scale CrateDB. The official `CrateDB Docker image`_. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _scaling-kube-kube: diff --git a/docs/admin/going-into-production.rst b/docs/admin/going-into-production.rst index 2fda5533..d7cadfe5 100644 --- a/docs/admin/going-into-production.rst +++ b/docs/admin/going-into-production.rst @@ -7,11 +7,6 @@ Going into production Running CrateDB in different environments requires different approaches. This document outlines the basics you need to consider when going into production. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _prod-bootstrapping: diff --git a/docs/admin/sharding-partitioning.rst b/docs/admin/sharding-partitioning.rst index 69e95ca5..1fcb1ba2 100644 --- a/docs/admin/sharding-partitioning.rst +++ b/docs/admin/sharding-partitioning.rst @@ -4,11 +4,6 @@ Sharding and Partitioning ######################### -.. rubric:: Table of contents - -.. 
contents:: - :local: - Introduction ============ diff --git a/docs/admin/troubleshooting/crate-node.rst b/docs/admin/troubleshooting/crate-node.rst index 8343f514..9a10a2c6 100644 --- a/docs/admin/troubleshooting/crate-node.rst +++ b/docs/admin/troubleshooting/crate-node.rst @@ -14,14 +14,6 @@ Using this command, you can: the event that you lose too many nodes to be able to form a quorum. * Detach nodes from an old cluster so they can be moved to a new cluster. -.. rubric:: Table of contents - -.. toctree:: - :maxdepth: 1 - -.. contents:: - :local: - .. _crate-node-repurpose: diff --git a/docs/admin/troubleshooting/jcmd/docker.rst b/docs/admin/troubleshooting/jcmd/docker.rst index d0cc4379..3a6a5c0c 100644 --- a/docs/admin/troubleshooting/jcmd/docker.rst +++ b/docs/admin/troubleshooting/jcmd/docker.rst @@ -21,11 +21,6 @@ how to solve it. identical to non-containerized applications. -.. rubric:: Table of contents - -.. contents:: - :local: - Run ``jcmd`` inside container ============================= diff --git a/docs/admin/troubleshooting/system-tables.rst b/docs/admin/troubleshooting/system-tables.rst index 6431640d..747d287b 100644 --- a/docs/admin/troubleshooting/system-tables.rst +++ b/docs/admin/troubleshooting/system-tables.rst @@ -13,11 +13,6 @@ analyze, identify the problem, and start mitigating it. While there is :ref:`detailed information about all system tables `, this guide runs you through the most common situations. -.. rubric:: Table of contents - -.. contents:: - :local: - Step 1: Inspect health checks ============================= diff --git a/docs/admin/upgrade/full.rst b/docs/admin/upgrade/full.rst index 5001227f..9b16650e 100644 --- a/docs/admin/upgrade/full.rst +++ b/docs/admin/upgrade/full.rst @@ -6,11 +6,6 @@ Full Restart Upgrade ==================== -.. rubric:: Table of contents - -.. 
contents:: - :local: - Introduction ============ diff --git a/docs/admin/upgrade/planning.rst b/docs/admin/upgrade/planning.rst index af79a736..61ec75ed 100644 --- a/docs/admin/upgrade/planning.rst +++ b/docs/admin/upgrade/planning.rst @@ -8,12 +8,6 @@ General Upgrade Guidelines ========================== -.. rubric:: Table of contents - -.. contents:: - :local: - - Upgrade Planning ================ Before kicking off an upgrade, there is a set of guidelines to ensure the best outcome. Below you may find the fundamental steps to prepare for an upgrade. @@ -42,7 +36,7 @@ Perform a cluster-wide backup of your production CrateDB and ensure you have a r For the newly written records, you should consider using a mechanism to queue them (e.g. message queue), so these messages can be replayed if needed. .. WARNING:: - + Before starting the upgrade process, ensure no backup processes are triggered, so disable any scheduled backup. Define a rollback plan @@ -50,7 +44,7 @@ Define a rollback plan The rollback plan may vary depending on the specific infrastructure and upgrade process in use. It is also essential to adapt this outline to your organization's specific needs and incorporate any additional steps or considerations that are relevant to your environment. A set of steps to serve as an example is listed below: -* **Identify the issue:** Determine the specific problem that occurred during the upgrade. This could be related to data corruption, performance degradation, application errors, or any other issue that affects the normal functioning of CrateDB. Identify if there are any potential risks to the system's stability, security, or performance. +* **Identify the issue:** Determine the specific problem that occurred during the upgrade. This could be related to data corruption, performance degradation, application errors, or any other issue that affects the normal functioning of CrateDB. 
Identify if there are any potential risks to the system's stability, security, or performance. * **Communicate the situation:** Notify all relevant stakeholders, including individuals involved in the upgrade process. Clearly explain the problem and the decision to initiate a rollback. @@ -67,6 +61,6 @@ Upgrade Execution Choose the upgrade strategy below that works best for your scenario. -- :ref:`rolling_upgrade` +- :ref:`rolling_upgrade` - :ref:`full_restart_upgrade` diff --git a/docs/admin/upgrade/rolling.rst b/docs/admin/upgrade/rolling.rst index 3b7f67a4..f6f02213 100644 --- a/docs/admin/upgrade/rolling.rst +++ b/docs/admin/upgrade/rolling.rst @@ -5,11 +5,6 @@ Rolling Upgrade =============== -.. rubric:: Table of contents - -.. contents:: - :local: - Introduction ============ @@ -286,4 +281,3 @@ again that have been disabled in the first step: cr> SET GLOBAL TRANSIENT "cluster.routing.allocation.enable" = 'all'; SET OK, 1 row affected (... sec) - diff --git a/docs/domain/timeseries/advanced.md b/docs/domain/timeseries/advanced.md index c90672dd..fbb8e604 100644 --- a/docs/domain/timeseries/advanced.md +++ b/docs/domain/timeseries/advanced.md @@ -12,13 +12,6 @@ with CrateDB. {tags-primary}`Exploratory data analysis` {tags-primary}`Metadata integration` - - -:::{contents} -:local: -:depth: 2 -::: - :::{include} /_include/links.md ::: @@ -199,7 +192,7 @@ operations. This tutorial illustrates how to augment time series data with metadata, in order to enable more comprehensive analysis. It uses a time series dataset that -captures various device readings, such as battery, CPU, and memory information. +captures various device readings, such as battery, CPU, and memory information. 
{{ '{}(#timeseries-objects)'.format(tutorial) }} ::: diff --git a/docs/domain/timeseries/generate/cli.rst b/docs/domain/timeseries/generate/cli.rst index 2afb7243..4f12ee35 100644 --- a/docs/domain/timeseries/generate/cli.rst +++ b/docs/domain/timeseries/generate/cli.rst @@ -12,11 +12,6 @@ This tutorial will show you how to generate :ref:`mock time series data :ref:`gen-ts` -.. rubric:: Table of contents - -.. contents:: - :local: - Prerequisites ============= diff --git a/docs/domain/timeseries/generate/go.rst b/docs/domain/timeseries/generate/go.rst index 290c7804..f9ec1889 100644 --- a/docs/domain/timeseries/generate/go.rst +++ b/docs/domain/timeseries/generate/go.rst @@ -12,12 +12,6 @@ This tutorial will show you how to generate some :ref:`mock time series data :ref:`gen-ts` -.. rubric:: Table of contents - -.. contents:: - :local: - - Prerequisites ============= diff --git a/docs/domain/timeseries/generate/node.rst b/docs/domain/timeseries/generate/node.rst index 28a43737..84723b24 100644 --- a/docs/domain/timeseries/generate/node.rst +++ b/docs/domain/timeseries/generate/node.rst @@ -11,12 +11,6 @@ This tutorial will show you how to generate :ref:`mock time series data :ref:`gen-ts` -.. rubric:: Table of contents - -.. contents:: - :local: - - Prerequisites ============= diff --git a/docs/domain/timeseries/generate/python.rst b/docs/domain/timeseries/generate/python.rst index 59da7803..737c21c5 100644 --- a/docs/domain/timeseries/generate/python.rst +++ b/docs/domain/timeseries/generate/python.rst @@ -11,12 +11,6 @@ This tutorial will show you how to generate :ref:`mock time series data :ref:`gen-ts` -.. rubric:: Table of contents - -.. 
contents:: - :local: - - Prerequisites ============= diff --git a/docs/domain/timeseries/learn/normalize-pandas.rst b/docs/domain/timeseries/learn/normalize-pandas.rst index 376989f0..cda78a92 100644 --- a/docs/domain/timeseries/learn/normalize-pandas.rst +++ b/docs/domain/timeseries/learn/normalize-pandas.rst @@ -48,11 +48,6 @@ using SQL. :ref:`Tutorials for generating mock time series data ` -.. rubric:: Table of contents - -.. contents:: - :local: - .. _ni-prereq: diff --git a/docs/domain/timeseries/longterm.md b/docs/domain/timeseries/longterm.md index 103ce799..0194bff3 100644 --- a/docs/domain/timeseries/longterm.md +++ b/docs/domain/timeseries/longterm.md @@ -90,7 +90,7 @@ Wetterdienst uses CrateDB for mass storage of weather data, allowing you to query it efficiently. It provides access to data at more than ten canonical sources of raw weather data from domestic weather agencies. -[![Wetterdienst Documentation](https://img.shields.io/badge/Documentation-Data%20Export-darkgreen?logo=Markdown)](https://wetterdienst.readthedocs.io/en/latest/usage/python-api/#export) +[![Wetterdienst Documentation](https://img.shields.io/badge/Documentation-Data%20Export-darkgreen?logo=Markdown)](https://wetterdienst.readthedocs.io/en/latest/usage/python-api.html#export) [![Wetterdienst Project](https://img.shields.io/badge/Repository-Wetterdienst-darkblue?logo=GitHub)](https://github.com/earthobservations/wetterdienst) ::: diff --git a/docs/install/cloud/aws/ec2-setup.rst b/docs/install/cloud/aws/ec2-setup.rst index 1338c935..93739558 100644 --- a/docs/install/cloud/aws/ec2-setup.rst +++ b/docs/install/cloud/aws/ec2-setup.rst @@ -5,11 +5,6 @@ Running CrateDB on Amazon EC2 ============================= -.. rubric:: Table of contents - -.. 
contents:: - :local: - Introduction ============ diff --git a/docs/install/cloud/aws/s3-setup.rst b/docs/install/cloud/aws/s3-setup.rst index 372a3022..ae0e4996 100644 --- a/docs/install/cloud/aws/s3-setup.rst +++ b/docs/install/cloud/aws/s3-setup.rst @@ -9,11 +9,6 @@ CrateDB supports using the `Amazon S3`_ (Amazon Simple Storage Service) as a snapshot repository. For this, you need to register the AWS plugin with CrateDB. -.. rubric:: Table of contents - -.. contents:: - :local: - Basic configuration =================== diff --git a/docs/install/cloud/azure/vm.rst b/docs/install/cloud/azure/vm.rst index 6166e48c..e8630b75 100644 --- a/docs/install/cloud/azure/vm.rst +++ b/docs/install/cloud/azure/vm.rst @@ -8,11 +8,6 @@ Getting CrateDB working on Azure with Linux or Windows is a simple process. You can use Azure's management console or CLI interface (`Learn how to install here`_). -.. rubric:: Table of contents - -.. contents:: - :local: - Azure and Linux =============== diff --git a/docs/install/container/docker.rst b/docs/install/container/docker.rst index 5323ef92..44261a59 100644 --- a/docs/install/container/docker.rst +++ b/docs/install/container/docker.rst @@ -24,12 +24,6 @@ This document covers the essentials of running CrateDB on Docker. The official `CrateDB Docker image`_. -.. rubric:: Table of contents - -.. contents:: - :local: - - Quick start =========== diff --git a/docs/install/container/kubernetes/kubernetes.rst b/docs/install/container/kubernetes/kubernetes.rst index 617bd310..b3b5cb48 100644 --- a/docs/install/container/kubernetes/kubernetes.rst +++ b/docs/install/container/kubernetes/kubernetes.rst @@ -29,10 +29,6 @@ Together, Docker and Kubernetes are a fantastic way to deploy and scale CrateDB. The official `CrateDB Docker image`_. -.. rubric:: Table of contents - -.. 
contents:: - :local: Managing Kubernetes =================== diff --git a/docs/integrate/etl/mysql.rst b/docs/integrate/etl/mysql.rst index 215701cd..e23c1293 100644 --- a/docs/integrate/etl/mysql.rst +++ b/docs/integrate/etl/mysql.rst @@ -11,11 +11,6 @@ Various ways exist to migrate your existing data from MySQL_ to CrateDB_. However, these methods may differ in performance. A fast and reliable way to migrate is to use CrateDB's existing export and import tools. -.. rubric:: Table of contents - -.. contents:: - :local: - Setting up the example table ============================ diff --git a/docs/integrate/visualize/grafana.rst b/docs/integrate/visualize/grafana.rst index 99ccff30..7fd774ea 100644 --- a/docs/integrate/visualize/grafana.rst +++ b/docs/integrate/visualize/grafana.rst @@ -5,9 +5,9 @@ Visualize data with Grafana =========================== -`Grafana`_ is an open-source tool that helps you build real-time dashboards, -graphs, and all sorts of data visualizations. It is the perfect complement -to CrateDB, which is purpose-built for monitoring large volumes of machine +`Grafana`_ is an open-source tool that helps you build real-time dashboards, +graphs, and all sorts of data visualizations. It is the perfect complement +to CrateDB, which is purpose-built for monitoring large volumes of machine data in real-time. For the purposes of this guide, it is assumed that you @@ -15,30 +15,25 @@ have a cluster up and running and can access the Console. If not, please refer to the :ref:`tutorial on how to deploy a cluster for the first time `. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _grafana-load-dataset: Load a sample dataset ===================== -To visualize data with Grafana, a dataset is needed first. In this sample, +To visualize data with Grafana, a dataset is needed first. In this sample, demo data is added directly via the CrateDB Cloud Console. To import the data -go to the Overview page of your deployed cluster. 
+go to the Overview page of your deployed cluster. .. image:: /_assets/img/integrations/cloud-cluster-overview.png :alt: Cloud Console Clusters overview Once on the Overview page, click on the *import the demo data* link in the "Next steps" section of the Console. A window with 2 SQL statements will -appear. The first of them creates a table that will host the data from NYC +appear. The first of them creates a table that will host the data from NYC Taxi & Limousine Commission which is used in this example. The second statement imports the data into the table created in the first step. These -statements must be executed in the shown order. First "1. Create the table" +statements must be executed in the shown order. First "1. Create the table" and then "2. Import the data". .. image:: /_assets/img/integrations/grafana/grafana-import.png @@ -78,8 +73,8 @@ Grafana Home page. :alt: Grafana Home page To visualize the data, you must add a data source. To do this, click on the -cogwheel "Settings" icon in the left menu bar. This should take you to the -Data sources Configuration page. +cogwheel "Settings" icon in the left menu bar. This should take you to the +Data sources Configuration page. .. image:: /_assets/img/integrations/grafana/grafana-settings.png :alt: Grafana Settings @@ -99,9 +94,9 @@ screenshot below. The *host* and *user* credentials may appear differently to you. The host can be found on the Overview page of your cluster on CrateDB Cloud under the -*Learn how to connect to the cluster* link. You will want to use the psql +*Learn how to connect to the cluster* link. You will want to use the psql link. Depending on the region where your cluster is deployed it might look -something like: +something like: .. code-block:: console @@ -120,10 +115,10 @@ on to creating some dashboards. 
Build your first Grafana dashboard ================================== -Now that you've got the data imported to CrateDB Cloud and Grafana connected +Now that you've got the data imported to CrateDB Cloud and Grafana connected to it, it's time to visualize that data. In Grafana this is done using Dashboards. To create a new dashboard click on the *Create your first -dashboard* on the Grafana homepage. You will be greeted by a dashboard +dashboard* on the Grafana homepage. You will be greeted by a dashboard creation page. .. image:: /_assets/img/integrations/grafana/grafana-new-dashboard.png @@ -131,10 +126,10 @@ creation page. In Grafana, dashboards are composed of individual blocks called panels, to which you can assign different visualization types and individual queries. -First, click on *Add new panel*. +First, click on *Add new panel*. That will bring you to the panel creation page. Here you define the -query for your panel, the type of visualization (like graphs, stats, tables, +query for your panel, the type of visualization (like graphs, stats, tables, or bar charts), and the time range. Grafana offers a lot of options for data visualization, so this guide will showcase two simple use-cases. It is recommended to look into the documentation on `Grafana panels`_. @@ -160,9 +155,9 @@ plot the number of rides per day in the first week of July 2019: .. NOTE:: Something important to know about the "Time series" format mode in Grafana - is that your query needs to return a column called "time". Grafana will - identify this as your time metric, so make sure the column has the proper - datatype (any datatype representing an `epoch time`_). In this query, + is that your query needs to return a column called "time". Grafana will + identify this as your time metric, so make sure the column has the proper + datatype (any datatype representing an `epoch time`_). In this query, we're labeling pickup_datetime as "time" for this reason. 
Once you input these SQL statements, there are a couple of adjustments you can @@ -206,7 +201,7 @@ of the new panel: Under the graph itself, click on the *average_distance_per_ride*. This will show only the value we are interested in. Also, in the right menu under "Graph -style" select "Bars" once again. After that, you should have a panel similar +style" select "Bars" once again. After that, you should have a panel similar to this: .. image:: /_assets/img/integrations/grafana/grafana-panel2.png @@ -218,7 +213,7 @@ Dashboard overview, you will have a collection of two very useful graphs. .. image:: /_assets/img/integrations/grafana/grafana-dashboard-final.png :alt: Grafana completed dashboard -Now you know how to get started with data visualization in Grafana. To find +Now you know how to get started with data visualization in Grafana. To find out more, refer to the `Grafana documentation`_. diff --git a/docs/integrate/visualize/metabase.rst b/docs/integrate/visualize/metabase.rst index 8ecc971e..d8d5bf7a 100644 --- a/docs/integrate/visualize/metabase.rst +++ b/docs/integrate/visualize/metabase.rst @@ -6,11 +6,6 @@ Visualize data with Metabase This tutorial introduces `Metabase`_, an ultimate data analysis and visualization tool that unlocks the full potential of your data. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _metabase-prereqs: Prerequisites @@ -31,7 +26,7 @@ Initial configuration Metabase offers both cloud version and local installation. Whichever you choose, the first step will be adding your CrateDB cluster as a new database. -To do that, go to the ``Admin Settings`` -> ``Setup``, and choose +To do that, go to the ``Admin Settings`` -> ``Setup``, and choose the ``Add a database`` option. .. image:: /_assets/img/integrations/metabase/metabase-add-database.png @@ -70,10 +65,10 @@ Now you are ready to visualize your data. Metabase works by asking questions. You ask a question, and Metabase answers it in a visual form. 
These questions can then be saved to form dashboards. To ask a question, go to ``Home`` and click on ``New`` -> ``Question`` in the upper right corner. Then select the -database and a table from it. +database and a table from it. As an example, we ask about the Average tip amount, -sorted by the passenger count. +sorted by the passenger count. .. image:: /_assets/img/integrations/metabase/metabase-question.png :alt: Asking a question diff --git a/docs/performance/inserts/bulk.rst b/docs/performance/inserts/bulk.rst index a749e355..0ce279ea 100644 --- a/docs/performance/inserts/bulk.rst +++ b/docs/performance/inserts/bulk.rst @@ -37,10 +37,6 @@ The rest of this document goes into more detail. nodes. CrateDB is a :ref:`distributed database `, and so, increasing overall cluster size is generally a good way to improve performance. -.. rubric:: Table of contents - -.. contents:: - :local: .. SEEALSO:: diff --git a/docs/performance/inserts/methods.rst b/docs/performance/inserts/methods.rst index fe4be156..42c7975d 100644 --- a/docs/performance/inserts/methods.rst +++ b/docs/performance/inserts/methods.rst @@ -9,11 +9,6 @@ CrateDB supports multiple ways to insert data. Some insert methods can be faster than others, depending on your setup. Choosing the best insert method is an easy way to improve insert performance. -.. rubric:: Table of contents - -.. contents:: - :local: - .. _insert_statement_types: diff --git a/docs/performance/inserts/parallel.rst b/docs/performance/inserts/parallel.rst index 351f23d0..22acbb6f 100644 --- a/docs/performance/inserts/parallel.rst +++ b/docs/performance/inserts/parallel.rst @@ -25,11 +25,6 @@ response before sending another insert. even better performance from bulk opperations. -.. rubric:: Table of contents - -.. 
contents:: - :local: - Example ======= diff --git a/docs/performance/inserts/testing.rst b/docs/performance/inserts/testing.rst index 06537b44..41c429d9 100644 --- a/docs/performance/inserts/testing.rst +++ b/docs/performance/inserts/testing.rst @@ -12,11 +12,6 @@ by testing insert performance on a single node. You should only increase the size of your cluster for testing once you have established the baseline performance on a single node. -.. rubric:: Table of contents - -.. contents:: - :local: - Test data ========= diff --git a/docs/performance/inserts/tuning.rst b/docs/performance/inserts/tuning.rst index c8c6f822..4ce9edf6 100644 --- a/docs/performance/inserts/tuning.rst +++ b/docs/performance/inserts/tuning.rst @@ -7,11 +7,6 @@ Configuration Tuning for Inserts This document outlines a number of hardware and software configuration changes you can make to tune your setup for inserts performance. -.. rubric:: Table of contents - -.. contents:: - :local: - Hardware ======== diff --git a/docs/performance/sharding.rst b/docs/performance/sharding.rst index 51fe382c..94e06833 100644 --- a/docs/performance/sharding.rst +++ b/docs/performance/sharding.rst @@ -21,11 +21,6 @@ the type of hardware you're using. If you are looking for an intro to sharding, see :ref:`sharding `. -.. rubric:: Table of contents - -.. 
contents:: - :local: - Optimising for query performance ================================ From 2109c32c3ed343fec4a0f14d82b2fa2a20d3e1ef Mon Sep 17 00:00:00 2001 From: Brian Munkholm Date: Thu, 26 Jun 2025 21:58:16 +0200 Subject: [PATCH 2/4] Wording tweaks after Coderabbit review --- docs/admin/circuit-breaker.rst | 2 +- docs/admin/clustering/multi-node-setup.rst | 8 +-- docs/admin/upgrade/planning.rst | 69 ++++++++++++++++------ docs/integrate/visualize/grafana.rst | 8 +-- docs/integrate/visualize/metabase.rst | 9 ++- 5 files changed, 63 insertions(+), 33 deletions(-) diff --git a/docs/admin/circuit-breaker.rst b/docs/admin/circuit-breaker.rst index 64b12b0c..b96210e7 100644 --- a/docs/admin/circuit-breaker.rst +++ b/docs/admin/circuit-breaker.rst @@ -20,7 +20,7 @@ oversize aggregation or JOIN can allocate gigabytes in milliseconds. The breaker How Circuit Breakers Work in CrateDB ==================================== A query executes as an ordered set of operations. Before running each stage, CrateDB estimates the extra memory that step will need. -If the projected total would exceed the breaker limit, the system aborts the query and returns a ``CircuitBreakingException``. +If the projected total exceeds the breaker limit, the system aborts the query and returns a ``CircuitBreakingException``. This pre-emptive trip prevents the JVM's garbage collector from reaching an unrecoverable out-of-memory state. It is important to understand CrateDB doesn’t aspire to do a fully accurate memory accounting, but instead opts for a best-effort approach, diff --git a/docs/admin/clustering/multi-node-setup.rst b/docs/admin/clustering/multi-node-setup.rst index 4054a9ad..71f19168 100644 --- a/docs/admin/clustering/multi-node-setup.rst +++ b/docs/admin/clustering/multi-node-setup.rst @@ -94,7 +94,7 @@ instructions. 2. It is common to configure the :ref:`metadata gateway ` so that the cluster waits for all data nodes to be online before starting the - recovery of the shards. 
In this case let's set + recovery of the shards. In this case, let's set `gateway.expected_data_nodes`_ to **3** and `gateway.recover_after_data_nodes`_ also to **3**. You can specify these settings in the `configuration`_ file of the unpacked directory. @@ -316,7 +316,7 @@ network partition (also known as a `split-brain`_ scenario). CrateDB (versions 4.x and above) will automatically determine the ideal `quorum size`_, but if you are using CrateDB versions 3.x and below, you must manually set -the quorum size using the `discovery.zen.minimum_master_nodes`_ setting and for +the quorum size using the `discovery.zen.minimum_master_nodes`_ setting. For a three-node cluster, you must declare all nodes to be master-eligible. .. _metadata-gateway: @@ -328,8 +328,8 @@ When running a multi-node cluster, you can configure the :ref:`metadata gateway settings so that CrateDB delays recovery until a certain number of nodes is available. This is useful because if recovery is started when some nodes are down -CrateDB will proceed on the basis the nodes that are down may not be coming -back, and it will create new replicas and rebalance shards as necessary. +CrateDB will proceed on the basis that the nodes that are down may not come +back, creating new replicas and rebalancing shards as necessary. This is an expensive operation that, depending on the context, may be better avoided if the nodes are only down for a short period of time. So, for instance, for a three-nodes cluster, you can decide to set diff --git a/docs/admin/upgrade/planning.rst b/docs/admin/upgrade/planning.rst index 61ec75ed..dd3ffcda 100644 --- a/docs/admin/upgrade/planning.rst +++ b/docs/admin/upgrade/planning.rst @@ -10,49 +10,80 @@ General Upgrade Guidelines Upgrade Planning ================ -Before kicking off an upgrade, there is a set of guidelines to ensure the best outcome. Below you may find the fundamental steps to prepare for an upgrade.
+Before kicking off an upgrade, consider the following steps to prepare for an +upgrade. .. NOTE:: - This is not an exhaustive list, so you should consider your organization's specific needs and incorporate any additional steps or considerations that are relevant to your environment. + This is not an exhaustive list, so you should consider your organization's + specific needs and incorporate any additional steps or considerations that + are relevant to your environment. Acknowledge breaking changes ---------------------------- -Review the :ref:`release notes ` and documentation for the target version to understand any potential impact on existing functionality. -Ensure to review the intermediate versions' documentation also. For example, when upgrading from 4.8 to 5.3, besides reviewing 5.3 release notes, check for version 5.0, 5.1, and so on. +Review the :ref:`release notes ` and documentation +for the target version to understand any potential impact on existing functionality. +Be sure to also review the intermediate versions' documentation. For example, when +upgrading from 4.8 to 5.3, besides reviewing 5.3 release notes, check for version +5.0, 5.1, and so on. Set up a test environment ------------------------- -Create a test environment that closely resembles your production environment, including the same CrateDB version, hardware, and network configuration. Populate the test environment with representative data and perform thorough testing to ensure compatibility and functionality, including functional and non-functional testing. +Create a test environment that closely resembles your production environment, +including the same CrateDB version, hardware, and network configuration. +Populate the test environment with representative data and perform thorough +testing to ensure compatibility and functionality, including functional and +non-functional testing.
Back up and plan recovery ------------------------- -Perform a cluster-wide backup of your production CrateDB and ensure you have a reliable recovery mechanism in place. Read more in the :ref:`snapshots ` documentation. +Perform a cluster-wide backup of your production CrateDB and ensure you have a +reliable recovery mechanism in place. Read more in the +:ref:`snapshots ` documentation. -For the newly written records, you should consider using a mechanism to queue them (e.g. message queue), so these messages can be replayed if needed. +For the newly written records, you should consider using a mechanism to queue +them (e.g. message queue), so these messages can be replayed if needed. .. WARNING:: - Before starting the upgrade process, ensure no backup processes are triggered, so disable any scheduled backup. + Before starting the upgrade, ensure no backup jobs are started by disabling + any scheduled backup. Define a rollback plan ---------------------- -The rollback plan may vary depending on the specific infrastructure and upgrade process in use. It is also essential to adapt this outline to your organization's specific needs and incorporate any additional steps or considerations that are relevant to your environment. A set of steps to serve as an example is listed below: - -* **Identify the issue:** Determine the specific problem that occurred during the upgrade. This could be related to data corruption, performance degradation, application errors, or any other issue that affects the normal functioning of CrateDB. Identify if there are any potential risks to the system's stability, security, or performance. - -* **Communicate the situation:** Notify all relevant stakeholders, including individuals involved in the upgrade process. Clearly explain the problem and the decision to initiate a rollback. - -* **Execute the rollback:** The rollback process may differ depending on the version jump. 
If upgrading from one patch release to another and there is no data corruption, only a performance issue, a simple in-place downgrade to the previous patch release is sufficient. For major/minor version jumps or in case of data corruption, restoring from a backup is required. - -* **Perform data validation:** Conduct a thorough data validation process to ensure the integrity of the CrateDB Cluster. Verify that all critical data is intact and accurate. If needed, replay the messages from the message queue. - -* **Share insights:** Communicate any findings and the defined plan to retry the upgrade. +The rollback plan may vary depending on the specific infrastructure and upgrade +process in use. It is also essential to adapt this outline to your organization's +specific needs and incorporate any additional steps or considerations that are +relevant to your environment. A set of steps to serve as an example is listed +below: + +* **Identify the issue:** Determine the specific problem that occurred during +the upgrade. This could be related to data corruption, performance degradation, +application errors, or any other issue that affects the normal functioning of +CrateDB. Identify if there are any potential risks to the system's stability, +security, or performance. + +* **Communicate the situation:** Notify all relevant stakeholders, including +individuals involved in the upgrade process. Clearly explain the problem and the +decision to initiate a rollback. + +* **Execute the rollback:** The rollback process may differ depending on the +version jump. If upgrading from one patch release to another and there is no data +corruption, only a performance issue, a simple in-place downgrade to the previous +patch release is sufficient. For major/minor version jumps or in case of data +corruption, restoring from a backup is required. + +* **Perform data validation:** Conduct a thorough data validation process to +ensure the integrity of the CrateDB Cluster. 
Verify that all critical data is +intact and accurate. If needed, replay the messages from the message queue. + +* **Share insights:** Communicate any findings and the defined plan to retry the +upgrade. diff --git a/docs/integrate/visualize/grafana.rst b/docs/integrate/visualize/grafana.rst index 7fd774ea..80a333bb 100644 --- a/docs/integrate/visualize/grafana.rst +++ b/docs/integrate/visualize/grafana.rst @@ -31,7 +31,7 @@ go to the Overview page of your deployed cluster. Once on the Overview page, click on the *import the demo data* link in the "Next steps" section of the Console. A window with 2 SQL statements will appear. The first of them creates a table that will host the data from NYC -Taxi & Limousine Commission which is used in this example. The second +Taxi & Limousine Commission which is used in this example. The second statement imports the data into the table created in the first step. These statements must be executed in the shown order. First "1. Create the table" and then "2. Import the data". @@ -117,7 +117,7 @@ Build your first Grafana dashboard Now that you've got the data imported to CrateDB Cloud and Grafana connected to it, it's time to visualize that data. In Grafana this is done using -Dashboards. To create a new dashboard click on the *Create your first +Dashboards. To create a new dashboard, click on the *Create your first dashboard* on the Grafana homepage. You will be greeted by a dashboard creation page. @@ -126,12 +126,12 @@ creation page. In Grafana, dashboards are composed of individual blocks called panels, to which you can assign different visualization types and individual queries. -First, click on *Add new panel*. +First, click *Add new panel*. That will bring you to the panel creation page. Here you define the query for your panel, the type of visualization (like graphs, stats, tables, or bar charts), and the time range. Grafana offers a lot of options for data -visualization, so this guide will showcase two simple use-cases. 
It is +visualization, so this guide will showcase two simple use cases. It is recommended to look into the documentation on `Grafana panels`_. To create a panel, you start by defining the query. To do that click on the diff --git a/docs/integrate/visualize/metabase.rst b/docs/integrate/visualize/metabase.rst index d8d5bf7a..6ff8659d 100644 --- a/docs/integrate/visualize/metabase.rst +++ b/docs/integrate/visualize/metabase.rst @@ -24,9 +24,9 @@ import your own data similarly to how it's done `in this how-to`_ . Initial configuration --------------------- -Metabase offers both cloud version and local installation. Whichever you -choose, the first step will be adding your CrateDB cluster as a new database. -To do that, go to the ``Admin Settings`` -> ``Setup``, and choose +Metabase offers both a cloud version and a local installation. Whichever you +choose, the first step will be to add your CrateDB cluster as a new database. +To do that, go to ``Admin Settings`` -> ``Setup`` and choose the ``Add a database`` option. .. image:: /_assets/img/integrations/metabase/metabase-add-database.png @@ -67,8 +67,7 @@ can then be saved to form dashboards. To ask a question, go to ``Home`` and click on ``New`` -> ``Question`` in the upper right corner. Then select the database and a table from it. -As an example, we ask about the Average tip amount, -sorted by the passenger count. +As an example, we ask for the average tip amount, sorted by the passenger count. .. 
image:: /_assets/img/integrations/metabase/metabase-question.png :alt: Asking a question From 70b0d5deda2401689ba38076a046e97148351f35 Mon Sep 17 00:00:00 2001 From: Brian Munkholm Date: Thu, 26 Jun 2025 22:22:05 +0200 Subject: [PATCH 3/4] fix --- docs/admin/upgrade/planning.rst | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/admin/upgrade/planning.rst b/docs/admin/upgrade/planning.rst index dd3ffcda..9f9d1856 100644 --- a/docs/admin/upgrade/planning.rst +++ b/docs/admin/upgrade/planning.rst @@ -63,27 +63,27 @@ relevant to your environment. A set of steps to serve as an example is listed below: * **Identify the issue:** Determine the specific problem that occurred during -the upgrade. This could be related to data corruption, performance degradation, -application errors, or any other issue that affects the normal functioning of -CrateDB. Identify if there are any potential risks to the system's stability, -security, or performance. + the upgrade. This could be related to data corruption, performance degradation, + application errors, or any other issue that affects the normal functioning of + CrateDB. Identify if there are any potential risks to the system's stability, + security, or performance. * **Communicate the situation:** Notify all relevant stakeholders, including -individuals involved in the upgrade process. Clearly explain the problem and the -decision to initiate a rollback. + individuals involved in the upgrade process. Clearly explain the problem and the + decision to initiate a rollback. * **Execute the rollback:** The rollback process may differ depending on the -version jump. If upgrading from one patch release to another and there is no data -corruption, only a performance issue, a simple in-place downgrade to the previous -patch release is sufficient. For major/minor version jumps or in case of data -corruption, restoring from a backup is required. + version jump. 
If upgrading from one patch release to another and there is no data + corruption, only a performance issue, a simple in-place downgrade to the previous + patch release is sufficient. For major/minor version jumps or in case of data + corruption, restoring from a backup is required. * **Perform data validation:** Conduct a thorough data validation process to -ensure the integrity of the CrateDB Cluster. Verify that all critical data is -intact and accurate. If needed, replay the messages from the message queue. + ensure the integrity of the CrateDB Cluster. Verify that all critical data is + intact and accurate. If needed, replay the messages from the message queue. * **Share insights:** Communicate any findings and the defined plan to retry the -upgrade. + upgrade. From d31447efebd81bf348a48c78c0e2b29b2c203250 Mon Sep 17 00:00:00 2001 From: Brian Munkholm Date: Thu, 26 Jun 2025 22:47:32 +0200 Subject: [PATCH 4/4] Disable linkcheck for web.archive --- docs/conf.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/conf.py b/docs/conf.py index 5298fd7e..75121740 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -51,6 +51,8 @@ r"https://www.softwareag.com/.*", # 403 Client Error: Forbidden for url r"https://dzone.com/.*", + # 504 Server Error: Gateway Timeout for url + r"https://web.archive.org/.*", ] linkcheck_anchors_ignore_for_url += [
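Note on PATCH 4/4: the `linkcheck_ignore` entries are regular expressions that Sphinx's linkcheck builder matches against each checked URI, anchored at the start of the string; any URI that matches is skipped rather than fetched. A minimal standalone sketch of that matching behaviour, using a hypothetical `is_ignored` helper (not part of Sphinx) to mimic the ignore test:

```python
import re

# Patterns in the style of the docs/conf.py linkcheck_ignore list.
# Sphinx compiles each entry and skips any URI the pattern matches;
# matching is anchored at the start of the URI, like re.match.
linkcheck_ignore = [
    # 403 Client Error: Forbidden for url
    r"https://dzone.com/.*",
    # 504 Server Error: Gateway Timeout for url
    r"https://web.archive.org/.*",
]


def is_ignored(uri: str) -> bool:
    """Hypothetical helper mimicking linkcheck's ignore test."""
    return any(re.match(pattern, uri) for pattern in linkcheck_ignore)


# Wayback Machine links are now skipped; ordinary links are still checked.
print(is_ignored("https://web.archive.org/web/2020/https://example.com/"))
print(is_ignored("https://cratedb.com/docs/"))
```

Because `re.match` anchors only at the start, the trailing `.*` covers every path under the host, which is why one pattern is enough to silence the intermittent 504s from the Wayback Machine.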