* Update roadmap for 2021
Implemented since last update:
- Per-tenant retention,
- Soft Multitenancy,
- Prometheus metadata support,
- Bulk loading Thanos data,
- Alertmanager sharding
I also made some changes for better readability.
Signed-off-by: Bryan Boreham <[email protected]>
* Remove bulk-loading from roadmap
It can be done using thanosconvert tool
Signed-off-by: Bryan Boreham <[email protected]>
* Fixed whitespace noise
Signed-off-by: Marco Pracucci <[email protected]>
Co-authored-by: Marco Pracucci <[email protected]>
docs/roadmap.md (13 additions & 14 deletions)
@@ -5,7 +5,9 @@ weight: 10
 slug: roadmap
 ---
 
-The following is only a selection of some of the major features we plan to implement in the near future. To get a more complete overview of planned features and current work, see the issue trackers for the various repositories, for example, the [Cortex repo](https://github.com/cortexproject/cortex/issues). Note that these are not ordered by priority.
+This document highlights some ideas for major features we'd like to implement in the near future.
+To get a more complete overview of planned features and current work, see the [issue tracker](https://github.com/cortexproject/cortex/issues).
+Note that these are not ordered by priority.
 
 ## Helm charts and other packaging
 
@@ -27,22 +29,19 @@ tenants:
 
 We have all the metrics to track how many series, samples and queries each tenant is sending but don't have dashboards that help with this. We plan to have dashboards and UIs that will help operators monitor and control each tenants usage out of the box.
 
-## Downsampling and Per tenant/metric retention
+## Downsampling
+Downsampling means storing fewer samples, e.g. one per minute instead of one every 15 seconds.
+This makes queries over long periods more efficient. It can reduce storage space slightly if the full-detail data is discarded.
 
-Currently, we only support a single retention period for all metrics and tenants. For most operators, the ability to set per tenant retention and also custom retention for subsets of metrics is important. We will add support per tenant and metric retention policies. Also, we currently store all the samples we ingested, and there is no way to reduce the resolution for the metrics. We plan to add downsampling to allow users to store less data when needed.
+## Per-metric retention
 
-## Soft Multitenancy
+Cortex blocks storage supports deleting all data for a tenant after a time period (e.g. 3 months, 1 year), but we would also like to have custom retention for subsets of metrics (e.g. delete server metrics but retain business metrics).
 
-Currently our multitenancy allows a tenant to view _all_ their metrics and only their metrics. There is no way for an "admin" tenant to view all the metrics in the system but for particular teams to only view theirs. This is another feature we plan to add into Cortex.
-
-## Exemplar and Prometheus metadata support
-
-There is currently an ongoing effort in Prometheus to add [exemplar support](https://docs.google.com/document/d/1ymZlc9yuTj8GvZyKz1r3KDRrhaOjZ1W1qZVW_5Gj7gA/edit) and we should be an active stakeholder in the discussion. The plan is to propagate the exemplars through remote write and make them available for querying in Cortex. We currently have experimental metadata support for Prometheus but this is using the Grafana Cloud Agent. We should help move this [PR forward](https://github.com/prometheus/prometheus/pull/6815) and also add persistence of the metadata (right now it's only in-mem).
-
-## Bulk loading historical data
-
-This is another highly requested features. There is currently no way to backfill the existing data in local Prometheus to Cortex. The plan is to add an API for users to ship the TSDB blocks to Cortex and a side-car / command to do this.
+[...] let you link metric samples to other data, such as distributed tracing.
+As of early 2021 Prometheus will collect exemplars and send them via remote write, but Cortex needs to be extended to handle them.
 
 ## Scalability
 
-Scalability has always been a focus for the project, but there is a lot more work to be done. We can now scale to 100s of Millions of active series but 1 Billion active series is still an unknown. We also need to make the Alertmanager horizontally scalable with the number of users.
+Scalability has always been a focus for the project, but there is a lot more work to be done. We can now scale to 100s of Millions of active series but 1 Billion active series is still an unknown.
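
For readers unfamiliar with the "Downsampling" roadmap entry in this diff, the idea of going from one sample every 15 seconds to one per minute can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Sample` type and `Downsample` function are hypothetical and are not part of Cortex, timestamps are assumed to be non-negative and sorted, and keeping the last sample of each window is just one simple policy (a real implementation might store aggregates instead).

```go
package main

import "fmt"

// Sample is a hypothetical (timestamp, value) pair used only for this
// illustration; it is not a Cortex or Prometheus type.
type Sample struct {
	TimestampMs int64
	Value       float64
}

// Downsample keeps the last sample in each window of stepMs milliseconds.
// It assumes the input is sorted by timestamp and timestamps are >= 0.
func Downsample(in []Sample, stepMs int64) []Sample {
	if stepMs <= 0 {
		return in
	}
	var out []Sample
	for _, s := range in {
		window := s.TimestampMs / stepMs
		if n := len(out); n > 0 && out[n-1].TimestampMs/stepMs == window {
			// A later sample in the same window replaces the earlier one.
			out[n-1] = s
			continue
		}
		out = append(out, s)
	}
	return out
}

func main() {
	// Two minutes of samples scraped every 15 seconds.
	var raw []Sample
	for t := int64(0); t < 120000; t += 15000 {
		raw = append(raw, Sample{TimestampMs: t, Value: float64(t) / 1000})
	}
	// Reduce to one sample per minute.
	down := Downsample(raw, 60000)
	fmt.Println(len(raw), len(down)) // prints "8 2"
}
```

Queries over long ranges then touch a quarter as many samples in this example. As the roadmap text notes, storage only shrinks if the full-detail data is discarded afterwards.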