|
| 1 | +.. _monitoring-grafana_dashboard-page: |
| 2 | + |
| 3 | +=============================================================================== |
| 4 | +Grafana dashboard |
| 5 | +=============================================================================== |
| 6 | + |
| 7 | +.. IMPORTANT:: |
| 8 | + |
| 9 | + TODO: |
| 10 | + |
| 11 | + - Update to 3.x config |
| 12 | + - Use built-in 3.x roles for **crud** |
| 13 | + - Use built-in 3.x roles for **expirationd** |
| 14 | + - Remove Cartridge content |
| 15 | + - Remove TDG content |
| 16 | + |
| 17 | +The Tarantool Grafana dashboard is available as part of
| 18 | +`Grafana Official & community built dashboards <https://grafana.com/grafana/dashboards>`_. |
| 19 | +There's a version |
| 20 | +`for Prometheus data source <https://grafana.com/grafana/dashboards/13054>`_ |
| 21 | +and one `for InfluxDB data source <https://grafana.com/grafana/dashboards/12567>`_. |
| 22 | +There are also separate dashboards for TDG applications: |
| 23 | +`for Prometheus data source <https://grafana.com/grafana/dashboards/16406>`_ |
| 24 | +and `for InfluxDB data source <https://grafana.com/grafana/dashboards/16405>`_. |
| 25 | +The Tarantool Grafana dashboard is a ready-to-import template with basic memory,
| 26 | +space operation, and HTTP load panels, based on the default functionality of the
| 27 | +`metrics <https://github.com/tarantool/metrics>`_ package.
| 28 | + |
| 29 | +The dashboard requires ``metrics`` **0.15.0** or newer for the complete experience.
| 30 | +The ``'alias'`` :ref:`global label <metrics-api_reference-labels>` must be set on each instance
| 31 | +for the panels to display properly (for example, the ``cartridge.roles.metrics`` role provides it).
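|  | +
|  | +If the label is not provided by a role, it can be set by hand. The following is a
|  | +minimal sketch, assuming the ``metrics`` module is installed; the alias value
|  | +``my-instance-1`` is a placeholder and not something this page prescribes.
|  | +
|  | +.. code-block:: lua
|  | +
|  | +    local metrics = require('metrics')
|  | +
|  | +    -- 'my-instance-1' is a placeholder; use a name that is unique per instance.
|  | +    metrics.set_global_labels({alias = 'my-instance-1'})
|  | +
|  | +    -- Enable the default Tarantool metrics that the dashboard panels rely on.
|  | +    metrics.enable_default_metrics()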
| 32 | + |
| 33 | +To support `CRUD <https://github.com/tarantool/crud>`_ statistics, install ``CRUD``
| 34 | +**0.11.1** or newer. Call ``crud.cfg`` on the router to enable collection of CRUD statistics
| 35 | +with latency quantiles.
| 36 | + |
| 37 | +.. code-block:: lua |
| 38 | +
|
| 39 | +    crud.cfg{
| 40 | +        stats = true,
| 41 | +        stats_driver = 'metrics',
| 42 | +        stats_quantiles = true
| 43 | +    }
| 44 | +
|
| 45 | +To support `expirationd <https://github.com/tarantool/expirationd>`_ statistics, |
| 46 | +install ``expirationd`` **1.2.0** or newer. Call ``expirationd.cfg`` on the instance
| 47 | +to enable statistics export. |
| 48 | + |
| 49 | +.. code-block:: lua |
| 50 | +
|
| 51 | + expirationd.cfg{metrics = true} |
| 52 | +
|
| 53 | +.. image:: images/Prometheus_dashboard_1.png |
| 54 | + :width: 30% |
| 55 | + |
| 56 | +.. image:: images/Prometheus_dashboard_2.png |
| 57 | + :width: 30% |
| 58 | + |
| 59 | +.. image:: images/Prometheus_dashboard_3.png |
| 60 | + :width: 30% |
| 61 | + |
| 62 | +.. _monitoring-grafana_dashboard-monitoring_stack: |
| 63 | + |
| 64 | +------------------------------------------------------------------------------- |
| 65 | +Prepare a monitoring stack |
| 66 | +------------------------------------------------------------------------------- |
| 67 | + |
| 68 | +Because the dashboards are available for both Prometheus and InfluxDB data sources,
| 69 | +you can use either of the following stacks:
| 70 | + |
| 71 | +- `Telegraf <https://www.influxdata.com/time-series-platform/telegraf/>`_ |
| 72 | + as a server agent for collecting metrics, `InfluxDB <https://www.influxdata.com/>`_ |
| 73 | + as a time series database for storing metrics, and `Grafana <https://grafana.com/>`_ |
| 74 | + as a visualization platform; or |
| 75 | +- `Prometheus <https://prometheus.io/>`_ as both a server agent for collecting metrics |
| 76 | + and a time series database for storing metrics, and `Grafana <https://grafana.com/>`_ |
| 77 | + as a visualization platform. |
| 78 | + |
| 79 | +For help with setting up Prometheus, Telegraf, InfluxDB, or Grafana instances,
| 80 | +refer to the corresponding project's documentation.
| 81 | + |
| 82 | +.. _monitoring-grafana_dashboard-collect_metrics: |
| 83 | + |
| 84 | +------------------------------------------------------------------------------- |
| 85 | +Collect metrics with server agents |
| 86 | +------------------------------------------------------------------------------- |
| 87 | + |
| 88 | +To collect metrics for Prometheus, first set up metrics output in
| 89 | +``prometheus`` format. You can use the :ref:`cartridge.roles.metrics <monitoring-getting_started-cartridge_role>`
| 90 | +configuration or set up the :ref:`Prometheus output plugin <metrics-plugins-available>`
| 91 | +manually. To start collecting metrics,
| 92 | +`add a job <https://prometheus.io/docs/prometheus/latest/getting_started/#configure-prometheus-to-monitor-the-sample-targets>`_
| 93 | +to the Prometheus configuration, with each Tarantool instance URI as a target and
| 94 | +the metrics path as configured on the Tarantool instances:
| 95 | + |
| 96 | +.. code-block:: yaml |
| 97 | +
|
| 98 | +    scrape_configs:
| 99 | +      - job_name: tarantool
| 100 | +        static_configs:
| 101 | +          - targets:
| 102 | +            - "example_project:8081"
| 103 | +            - "example_project:8082"
| 104 | +            - "example_project:8083"
| 105 | +        metrics_path: "/metrics/prometheus"
| 106 | +
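|  | +The manual output plugin setup mentioned above might look like the following sketch.
|  | +It assumes that the ``metrics`` and `http <https://github.com/tarantool/http>`_ modules
|  | +are installed. The listen address, port ``8081``, and route paths are illustrative values
|  | +chosen to match the agent configurations in this section. The ``/metrics/json`` route is
|  | +only needed for the Telegraf setup.
|  | +
|  | +.. code-block:: lua
|  | +
|  | +    local metrics = require('metrics')
|  | +    local prometheus = require('metrics.plugins.prometheus')
|  | +    local json_exporter = require('metrics.plugins.json')
|  | +
|  | +    metrics.enable_default_metrics()
|  | +
|  | +    -- Address and port are assumptions; use the ones your agents scrape.
|  | +    local httpd = require('http.server').new('0.0.0.0', 8081)
|  | +
|  | +    -- Prometheus text format for the Prometheus scrape job.
|  | +    httpd:route({path = '/metrics/prometheus'}, prometheus.collect_http)
|  | +
|  | +    -- JSON format for the Telegraf http input.
|  | +    httpd:route({path = '/metrics/json'}, function(req)
|  | +        return req:render({text = json_exporter.export()})
|  | +    end)
|  | +
|  | +    httpd:start()
|  | +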
|
| 107 | +
|
| 108 | +To collect metrics for InfluxDB, use the Telegraf agent.
| 109 | +First, configure Tarantool metrics output in ``json`` format
| 110 | +with the :ref:`cartridge.roles.metrics <monitoring-getting_started-cartridge_role>`
| 111 | +configuration or the corresponding :ref:`JSON output plugin <metrics-plugins-available>`.
| 112 | +To start collecting metrics, add an `http input <https://github.com/influxdata/telegraf/blob/release-1.17/plugins/inputs/http/README.md>`_
| 113 | +to the Telegraf configuration, including the metrics URL of each Tarantool instance:
| 114 | + |
| 115 | +.. code-block:: toml |
| 116 | +
|
| 117 | + [[inputs.http]] |
| 118 | + urls = [ |
| 119 | + "http://example_project:8081/metrics/json", |
| 120 | + "http://example_project:8082/metrics/json", |
| 121 | + "http://example_project:8083/metrics/json" |
| 122 | + ] |
| 123 | + timeout = "30s" |
| 124 | + tag_keys = [ |
| 125 | + "metric_name", |
| 126 | + "label_pairs_alias", |
| 127 | + "label_pairs_quantile", |
| 128 | + "label_pairs_path", |
| 129 | + "label_pairs_method", |
| 130 | + "label_pairs_status", |
| 131 | + "label_pairs_operation", |
| 132 | + "label_pairs_level", |
| 133 | + "label_pairs_id", |
| 134 | + "label_pairs_engine", |
| 135 | + "label_pairs_name", |
| 136 | + "label_pairs_index_name", |
| 137 | + "label_pairs_delta", |
| 138 | + "label_pairs_stream", |
| 139 | + "label_pairs_thread", |
| 140 | + "label_pairs_kind" |
| 141 | + ] |
| 142 | + insecure_skip_verify = true |
| 143 | + interval = "10s" |
| 144 | + data_format = "json" |
| 145 | + name_prefix = "tarantool_" |
| 146 | + fieldpass = ["value"] |
| 147 | +
|
| 148 | +Be sure to include each label key as ``label_pairs_<key>`` so that the plugin
| 149 | +extracts it as a tag. For example, if you use :code:`{ state = 'ready' }` labels
| 150 | +somewhere in metric collectors, add the ``label_pairs_state`` tag key.
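|  | +
|  | +To see where such a tag key comes from, consider the sketch below. The gauge name
|  | +``app_state`` and its help text are hypothetical and serve only as an illustration.
|  | +The JSON output lists each observation with a ``label_pairs`` object, and Telegraf
|  | +flattens the nested ``state`` key into ``label_pairs_state``.
|  | +
|  | +.. code-block:: lua
|  | +
|  | +    local metrics = require('metrics')
|  | +    local json_exporter = require('metrics.plugins.json')
|  | +
|  | +    -- Hypothetical gauge, used only to illustrate a custom label.
|  | +    local app_state = metrics.gauge('app_state', 'Hypothetical application state flag')
|  | +    app_state:set(1, {state = 'ready'})
|  | +
|  | +    -- Print the JSON output; the 'app_state' observation carries
|  | +    -- the 'state' key inside its label_pairs object.
|  | +    print(json_exporter.export())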
| 151 | + |
| 152 | +For the TDG dashboard, use the following configuration:
| 153 | + |
| 154 | +.. code-block:: toml |
| 155 | +
|
| 156 | + [[inputs.http]] |
| 157 | + urls = [ |
| 158 | + "http://example_tdg_project:8081/metrics/json", |
| 159 | + "http://example_tdg_project:8082/metrics/json", |
| 160 | + "http://example_tdg_project:8083/metrics/json" |
| 161 | + ] |
| 162 | + timeout = "30s" |
| 163 | + tag_keys = [ |
| 164 | + "metric_name", |
| 165 | + "label_pairs_alias", |
| 166 | + "label_pairs_quantile", |
| 167 | + "label_pairs_path", |
| 168 | + "label_pairs_method", |
| 169 | + "label_pairs_status", |
| 170 | + "label_pairs_operation", |
| 171 | + "label_pairs_level", |
| 172 | + "label_pairs_id", |
| 173 | + "label_pairs_engine", |
| 174 | + "label_pairs_name", |
| 175 | + "label_pairs_index_name", |
| 176 | + "label_pairs_delta", |
| 177 | + "label_pairs_stream", |
| 178 | + "label_pairs_thread", |
| 179 | + "label_pairs_type", |
| 180 | + "label_pairs_connector_name", |
| 181 | + "label_pairs_broker_name", |
| 182 | + "label_pairs_topic", |
| 183 | + "label_pairs_request", |
| 184 | + "label_pairs_kind", |
| 185 | + "label_pairs_thread_name", |
| 186 | + "label_pairs_type_name", |
| 187 | + "label_pairs_operation_name", |
| 188 | + "label_pairs_schema", |
| 189 | + "label_pairs_entity", |
| 190 | + "label_pairs_status_code" |
| 191 | + ] |
| 192 | + insecure_skip_verify = true |
| 193 | + interval = "10s" |
| 194 | + data_format = "json" |
| 195 | + name_prefix = "tarantool_" |
| 196 | + fieldpass = ["value"] |
| 197 | +
|
| 198 | +If you connect a Telegraf instance to InfluxDB storage, metrics are stored
| 199 | +under the ``"<name_prefix>http"`` measurement (``"tarantool_http"`` in our example).
| 200 | + |
| 201 | +.. _monitoring-grafana_dashboard-import: |
| 202 | + |
| 203 | +------------------------------------------------------------------------------- |
| 204 | +Import the dashboard |
| 205 | +------------------------------------------------------------------------------- |
| 206 | +Open the Grafana import menu.
| 207 | + |
| 208 | +.. image:: images/grafana_import.png |
| 209 | + :align: left |
| 210 | + |
| 211 | +To import a specific dashboard, choose one of the following options: |
| 212 | + |
| 213 | +- paste the dashboard id (``12567`` for InfluxDB dashboard, ``13054`` for Prometheus dashboard, |
| 214 | + ``16405`` for InfluxDB TDG dashboard, ``16406`` for Prometheus TDG dashboard), or |
| 215 | +- paste a link to the dashboard ( |
| 216 | + https://grafana.com/grafana/dashboards/12567 for InfluxDB dashboard, |
| 217 | + https://grafana.com/grafana/dashboards/13054 for Prometheus dashboard, |
| 218 | + https://grafana.com/grafana/dashboards/16405 for InfluxDB TDG dashboard, |
| 219 | + https://grafana.com/grafana/dashboards/16406 for Prometheus TDG dashboard), or |
| 220 | +- paste the dashboard JSON file contents, or |
| 221 | +- upload the dashboard JSON file. |
| 222 | + |
| 223 | +Set the dashboard name, folder, and uid (if needed).
| 224 | + |
| 225 | +.. image:: images/grafana_import_setup.png |
| 226 | + :align: left |
| 227 | + |
| 228 | +You can choose the data source and data source variables after import.
| 229 | + |
| 230 | +.. image:: images/grafana_variables_setup.png |
| 231 | + :align: left |
| 232 | + |
| 233 | +.. _monitoring-grafana_dashboard-troubleshooting: |
| 234 | + |
| 235 | +------------------------------------------------------------------------------- |
| 236 | +Troubleshooting |
| 237 | +------------------------------------------------------------------------------- |
| 238 | + |
| 239 | +If the graphs show no data, make sure that you picked the correct data source and job/measurement.
| 240 | +
| 241 | +If the graphs still show no data, make sure that the ``info`` group of Tarantool metrics
| 242 | +is exported (in particular, ``tnt_info_uptime``).
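|  | +
|  | +One way to check for the metric is from the instance console. The sketch below is
|  | +an assumption about how to do this with the ``metrics`` module API and is not part of
|  | +the dashboard itself. It also prints the ``alias`` label, which the panels need.
|  | +
|  | +.. code-block:: lua
|  | +
|  | +    local metrics = require('metrics')
|  | +
|  | +    -- Refresh callback-based metrics before reading them.
|  | +    metrics.invoke_callbacks()
|  | +
|  | +    local found = false
|  | +    for _, obs in ipairs(metrics.collect()) do
|  | +        if obs.metric_name == 'tnt_info_uptime' then
|  | +            found = true
|  | +            -- The alias is nil if the global label is not set.
|  | +            print(obs.metric_name, obs.value, (obs.label_pairs or {}).alias)
|  | +        end
|  | +    end
|  | +
|  | +    if not found then
|  | +        print('tnt_info_uptime not found: check that default metrics are enabled')
|  | +    end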
| 243 | + |
| 244 | +If some Prometheus graphs show no data because of ``parse error: missing unit character in duration``, |
| 245 | +ensure that you use Grafana 7.2 or newer. |
| 246 | + |
| 247 | +If some Prometheus graphs display ``parse error: bad duration syntax "1m0"`` or similar error, you need |
| 248 | +to update your Prometheus version. See |
| 249 | +`grafana/grafana#44542 <https://github.com/grafana/grafana/issues/44542>`_ for more details. |