Commit ec2d25b
Experimental TSDB: Enable Azure Storage Backend (#2083)
* Adding Microsoft Azure backend support for the TSDB storage engine. Signed-off-by: Ken Haines <[email protected]>
* Correcting minor typo caught by linter. Signed-off-by: Ken Haines <[email protected]>
* Updating the vendored Thanos module to include the Azure files. Signed-off-by: Ken Haines <[email protected]>
* A few more doc tweaks for consistency. Signed-off-by: Ken Haines <[email protected]>
1 parent 21c7e24 commit ec2d25b

File tree

10 files changed: +514 −15 lines

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -13,6 +13,7 @@
 * [ENHANCEMENT] Experimental TSDB: Export TSDB Syncer metrics from Compactor component, they are prefixed with `cortex_compactor_`. #2023
 * [ENHANCEMENT] Experimental TSDB: Added dedicated flag `-experimental.tsdb.bucket-store.tenant-sync-concurrency` to configure the maximum number of concurrent tenants for which blocks are synched. #2026
 * [ENHANCEMENT] Experimental TSDB: Expose metrics for objstore operations (prefixed with `cortex_<component>_thanos_objstore_`, component being one of `ingester`, `querier` and `compactor`). #2027
+* [ENHANCEMENT] Experimental TSDB: Added support for Azure Storage to be used for block storage, in addition to S3 and GCS. #2083
 * [ENHANCEMENT] Cassandra Storage: added `max_retries`, `retry_min_backoff` and `retry_max_backoff` configuration options to enable retrying recoverable errors. #2054
 * [ENHANCEMENT] Allow to configure HTTP and gRPC server listen address, maximum number of simultaneous connections and connection keepalive settings.
   * `-server.http-listen-address`
```

docs/architecture.md

Lines changed: 12 additions & 10 deletions
```diff
@@ -33,15 +33,16 @@ The chunks storage stores each single time series into a separate object called
 For this reason, the chunks storage consists of:
 
 * An index for the Chunks. This index can be backed by:
-  * [Amazon DynamoDB](https://aws.amazon.com/dynamodb)
-  * [Google Bigtable](https://cloud.google.com/bigtable)
-  * [Apache Cassandra](https://cassandra.apache.org)
+  * [Amazon DynamoDB](https://aws.amazon.com/dynamodb)
+  * [Google Bigtable](https://cloud.google.com/bigtable)
+  * [Apache Cassandra](https://cassandra.apache.org)
 * An object store for the Chunk data itself, which can be:
-  * [Amazon DynamoDB](https://aws.amazon.com/dynamodb)
-  * [Google Bigtable](https://cloud.google.com/bigtable)
-  * [Apache Cassandra](https://cassandra.apache.org)
-  * [Amazon S3](https://aws.amazon.com/s3)
-  * [Google Cloud Storage](https://cloud.google.com/storage/)
+  * [Amazon DynamoDB](https://aws.amazon.com/dynamodb)
+  * [Google Bigtable](https://cloud.google.com/bigtable)
+  * [Apache Cassandra](https://cassandra.apache.org)
+  * [Amazon S3](https://aws.amazon.com/s3)
+  * [Google Cloud Storage](https://cloud.google.com/storage/)
+  * [Microsoft Azure Storage](https://azure.microsoft.com/en-us/services/storage/)
 
 Internally, the access to the chunks storage relies on a unified interface called "chunks store". Unlike other Cortex components, the chunk store is not a separate service, but rather a library embedded in the services that need to access the long-term storage: [ingester](#ingester), [querier](#querier) and [ruler](#ruler).

@@ -59,6 +60,7 @@ The blocks storage doesn't require a dedicated storage backend for the index. Th
 * [Amazon S3](https://aws.amazon.com/s3)
 * [Google Cloud Storage](https://cloud.google.com/storage/)
+* [Microsoft Azure Storage](https://azure.microsoft.com/en-us/services/storage/)
 
 For more information, please check out the [Blocks storage](operations/blocks-storage.md) documentation.

@@ -142,7 +144,7 @@ We recommend randomly load balancing write requests across distributor instances
 ### Ingester
 
-The **ingester** service is responsible for writing incoming series to a [long-term storage backend](#storage) on the write path and returning in-memory series samples for queries on the read path.
+The **ingester** service is responsible for writing incoming series to a [long-term storage backend](#storage) on the write path and returning in-memory series samples for queries on the read path.
 
 Incoming series are not immediately written to the storage but kept in memory and periodically flushed to the storage (by default, 12 hours for the chunks storage and 2 hours for the experimental blocks storage). For this reason, the [queriers](#querier) may need to fetch samples both from ingesters and long-term storage while executing a query on the read path.

@@ -154,7 +156,7 @@ Ingesters contain a **lifecycler** which manages the lifecycle of an ingester an
 3. `ACTIVE` is an ingester's state when it is fully initialized. It may receive both write and read requests for tokens it owns.
 
-4. `LEAVING` is an ingester's state when it is shutting down. It cannot receive write requests anymore, while it could still receive read requests for series it has in memory. While in this state, the ingester may look for a `PENDING` ingester to start a hand-over process with, used to transfer the state from the `LEAVING` ingester to the `PENDING` one during a rolling update (the `PENDING` ingester moves to the `JOINING` state during the hand-over process). If there is no new ingester to accept the hand-over, the ingester in `LEAVING` state will flush its data to storage instead.
+4. `LEAVING` is an ingester's state when it is shutting down. It cannot receive write requests anymore, while it could still receive read requests for series it has in memory. While in this state, the ingester may look for a `PENDING` ingester to start a hand-over process with, used to transfer the state from the `LEAVING` ingester to the `PENDING` one during a rolling update (the `PENDING` ingester moves to the `JOINING` state during the hand-over process). If there is no new ingester to accept the hand-over, the ingester in `LEAVING` state will flush its data to storage instead.
 
 5. `UNHEALTHY` is an ingester's state when it has failed to heartbeat to the ring's KV Store. While in this state, distributors skip the ingester while building the replication set for incoming series and the ingester does not receive write or read requests.
```
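The hand-over behaviour described for the `LEAVING` state is a small state machine. As a rough illustration only — these types and names are a sketch, not Cortex's actual ring internals — the decision could be modelled like this:

```go
package main

import "fmt"

// State models the ingester lifecycle states described above.
// Illustrative sketch only; not Cortex's actual ring types.
type State int

const (
	Pending State = iota
	Joining
	Active
	Leaving
	Unhealthy
)

// shutdown sketches the LEAVING behaviour: hand over to a PENDING
// ingester if one exists, otherwise flush in-memory data to storage.
func shutdown(peers []State) string {
	for _, p := range peers {
		if p == Pending {
			return "hand-over to PENDING ingester (it moves to JOINING)"
		}
	}
	return "no PENDING ingester found: flush data to storage"
}

func main() {
	fmt.Println(shutdown([]State{Active, Pending})) // hand-over path
	fmt.Println(shutdown([]State{Active, Active}))  // flush path
}
```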

docs/operations/blocks-storage.md

Lines changed: 22 additions & 2 deletions
````diff
@@ -11,6 +11,7 @@ The supported backends for the blocks storage are:
 * [Amazon S3](https://aws.amazon.com/s3)
 * [Google Cloud Storage](https://cloud.google.com/storage/)
+* [Microsoft Azure Storage](https://azure.microsoft.com/en-us/services/storage/)
 
 _Internally, this storage engine is based on [Thanos](https://thanos.io), but no Thanos knowledge is required in order to run it._

@@ -28,7 +29,6 @@ When the blocks storage is used, each **ingester** creates a per-tenant TSDB and
 The in-memory samples are periodically flushed to disk - and the WAL truncated - when a new TSDB Block is cut, which by default occurs every 2 hours. Each new Block cut is then uploaded to the long-term storage and kept in the ingester for some more time, in order to give queriers enough time to discover the new Block from the storage and download its index header.
 
-
 In order to effectively use the **WAL** and be able to recover the in-memory series upon abrupt ingester termination, the WAL needs to be stored on a persistent local disk which can survive an ingester failure (e.g. an AWS EBS volume or a GCP persistent disk when running in the cloud). For example, if you're running the Cortex cluster in Kubernetes, you may use a StatefulSet with a persistent volume claim for the ingesters.
 
 ### The read path

@@ -138,7 +138,7 @@ tsdb:
     # storage. 0 disables the limit.
     # CLI flag: -experimental.tsdb.bucket-store.max-sample-count
     [max_sample_count: <int> | default = 0]
-
+
     # Max number of concurrent queries to execute against the long-term storage
     # on a per-tenant basis.
     # CLI flag: -experimental.tsdb.bucket-store.max-concurrent

@@ -189,6 +189,26 @@ tsdb:
     # Google SDK default logic.
     # CLI flag: -experimental.tsdb.gcs.service-account string
     [ service_account: <string>]
+
+  # Configures the Azure storage backend.
+  # Required only when the "azure" backend has been selected.
+  azure:
+    # Azure storage account name
+    # CLI flag: -experimental.tsdb.azure.account-name
+    account_name: <string>
+    # Azure storage account key
+    # CLI flag: -experimental.tsdb.azure.account-key
+    account_key: <string>
+    # Azure storage container name
+    # CLI flag: -experimental.tsdb.azure.container-name
+    container_name: <string>
+    # Azure storage endpoint suffix without schema.
+    # The account name will be prefixed to this value to create the FQDN.
+    # CLI flag: -experimental.tsdb.azure.endpoint-suffix
+    endpoint_suffix: <string>
+    # Number of retries for recoverable errors
+    # CLI flag: -experimental.tsdb.azure.max-retries
+    [ max_retries: <int> | default=20 ]
 ```
 
 ### `compactor_config`
````
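To make the mapping between these YAML keys and the Go config concrete, here is a small sketch (not part of the commit; the sample values are hypothetical) that unmarshals such an `azure` block with the same `gopkg.in/yaml.v2` package the backend itself uses, into a struct mirroring the `azure.Config` added below:

```go
package main

import (
	"fmt"

	yaml "gopkg.in/yaml.v2"
)

// Config mirrors the azure.Config struct added by this commit.
type Config struct {
	StorageAccountName string `yaml:"account_name"`
	StorageAccountKey  string `yaml:"account_key"`
	ContainerName      string `yaml:"container_name"`
	Endpoint           string `yaml:"endpoint_suffix"`
	MaxRetries         int    `yaml:"max_retries"`
}

func main() {
	// A hypothetical config block matching the reference above.
	raw := `
account_name: mycortexaccount
account_key: bXktc2VjcmV0LWtleQ==
container_name: cortex-blocks
endpoint_suffix: blob.core.windows.net
max_retries: 20
`
	var cfg Config
	if err := yaml.Unmarshal([]byte(raw), &cfg); err != nil {
		panic(err)
	}
	// The account name is prefixed to the endpoint suffix to build the FQDN,
	// e.g. mycortexaccount.blob.core.windows.net.
	fmt.Printf("%s.%s (container %q)\n", cfg.StorageAccountName, cfg.Endpoint, cfg.ContainerName)
}
```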
pkg/storage/tsdb/backend/azure/bucket_client.go

Lines changed: 27 additions & 0 deletions

```go
package azure

import (
	"github.com/go-kit/kit/log"
	"github.com/thanos-io/thanos/pkg/objstore"
	"github.com/thanos-io/thanos/pkg/objstore/azure"
	yaml "gopkg.in/yaml.v2"
)

// NewBucketClient creates a Thanos objstore.Bucket client backed by Azure Blob Storage.
func NewBucketClient(cfg Config, name string, logger log.Logger) (objstore.Bucket, error) {
	bucketConfig := azure.Config{
		StorageAccountName: cfg.StorageAccountName,
		StorageAccountKey:  cfg.StorageAccountKey,
		ContainerName:      cfg.ContainerName,
		Endpoint:           cfg.Endpoint,
		MaxRetries:         cfg.MaxRetries,
	}

	// Thanos currently doesn't support passing the config as is, but expects a YAML,
	// so we're going to serialize it.
	serialized, err := yaml.Marshal(bucketConfig)
	if err != nil {
		return nil, err
	}

	return azure.NewBucket(logger, serialized, name)
}
```
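A hedged usage sketch (not from the commit; the account values are made up) of how a component could obtain a bucket client and exercise the returned `objstore.Bucket` interface:

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os"

	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/azure"
	"github.com/go-kit/kit/log"
)

func main() {
	logger := log.NewLogfmtLogger(os.Stderr)

	// Hypothetical credentials; in Cortex these come from the
	// -experimental.tsdb.azure.* flags or the YAML config.
	cfg := azure.Config{
		StorageAccountName: "mycortexaccount",
		StorageAccountKey:  "bXktc2VjcmV0LWtleQ==",
		ContainerName:      "cortex-blocks",
		MaxRetries:         20,
	}

	bucket, err := azure.NewBucketClient(cfg, "ingester", logger)
	if err != nil {
		panic(err)
	}

	// objstore.Bucket is Thanos' generic object-store interface;
	// Upload and Exists are part of it.
	ctx := context.Background()
	if err := bucket.Upload(ctx, "debug/hello.txt", bytes.NewReader([]byte("hello"))); err != nil {
		panic(err)
	}
	ok, err := bucket.Exists(ctx, "debug/hello.txt")
	fmt.Println(ok, err)
}
```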
pkg/storage/tsdb/backend/azure/config.go

Lines changed: 23 additions & 0 deletions

```go
package azure

import (
	"flag"
)

// Config holds the config options for an Azure backend
type Config struct {
	StorageAccountName string `yaml:"account_name"`
	StorageAccountKey  string `yaml:"account_key"`
	ContainerName      string `yaml:"container_name"`
	Endpoint           string `yaml:"endpoint_suffix"`
	MaxRetries         int    `yaml:"max_retries"`
}

// RegisterFlags registers the flags for TSDB Azure storage
func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
	f.StringVar(&cfg.StorageAccountName, "experimental.tsdb.azure.account-name", "", "Azure storage account name")
	f.StringVar(&cfg.StorageAccountKey, "experimental.tsdb.azure.account-key", "", "Azure storage account key")
	f.StringVar(&cfg.ContainerName, "experimental.tsdb.azure.container-name", "", "Azure storage container name")
	f.StringVar(&cfg.Endpoint, "experimental.tsdb.azure.endpoint-suffix", "", "Azure storage endpoint suffix without schema. The account name will be prefixed to this value to create the FQDN")
	f.IntVar(&cfg.MaxRetries, "experimental.tsdb.azure.max-retries", 20, "Number of retries for recoverable errors")
}
```
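To illustrate the flag wiring (a standalone sketch with a hypothetical flag set, not how Cortex parses its CLI), `RegisterFlags` can be exercised like this:

```go
package main

import (
	"flag"
	"fmt"

	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/azure"
)

func main() {
	var cfg azure.Config
	fs := flag.NewFlagSet("example", flag.ExitOnError)
	cfg.RegisterFlags(fs)

	// Hypothetical command line; account-key is omitted so it stays "".
	_ = fs.Parse([]string{
		"-experimental.tsdb.azure.account-name=mycortexaccount",
		"-experimental.tsdb.azure.container-name=cortex-blocks",
	})

	// MaxRetries keeps its registered default of 20.
	fmt.Println(cfg.StorageAccountName, cfg.ContainerName, cfg.MaxRetries)
	// Output: mycortexaccount cortex-blocks 20
}
```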

pkg/storage/tsdb/bucket_client.go

Lines changed: 3 additions & 0 deletions
```diff
@@ -3,6 +3,7 @@ package tsdb
 import (
 	"context"
 
+	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/azure"
 	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/gcs"
 	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/s3"
 	"github.com/go-kit/kit/log"

@@ -16,6 +17,8 @@ func NewBucketClient(ctx context.Context, cfg Config, name string, logger log.Lo
 		return s3.NewBucketClient(cfg.S3, name, logger)
 	case BackendGCS:
 		return gcs.NewBucketClient(ctx, cfg.GCS, name, logger)
+	case BackendAzure:
+		return azure.NewBucketClient(cfg.Azure, name, logger)
 	default:
 		return nil, errUnsupportedBackend
 	}
```

pkg/storage/tsdb/config.go

Lines changed: 9 additions & 3 deletions
```diff
@@ -8,6 +8,7 @@ import (
 	"time"
 
 	"github.com/alecthomas/units"
+	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/azure"
 	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/gcs"
 	"github.com/cortexproject/cortex/pkg/storage/tsdb/backend/s3"
 )

@@ -19,6 +20,9 @@ const (
 	// BackendGCS is the value for the GCS storage backend
 	BackendGCS = "gcs"
 
+	// BackendAzure is the value for the Azure storage backend
+	BackendAzure = "azure"
+
 	// TenantIDExternalLabel is the external label set when shipping blocks to the storage
 	TenantIDExternalLabel = "__org_id__"
 )

@@ -43,8 +47,9 @@ type Config struct {
 	MaxTSDBOpeningConcurrencyOnStartup int `yaml:"max_tsdb_opening_concurrency_on_startup"`
 
 	// Backends
-	S3  s3.Config  `yaml:"s3"`
-	GCS gcs.Config `yaml:"gcs"`
+	S3    s3.Config    `yaml:"s3"`
+	GCS   gcs.Config   `yaml:"gcs"`
+	Azure azure.Config `yaml:"azure"`
 }
 
 // DurationList is the block ranges for a tsdb

@@ -88,6 +93,7 @@ func (d *DurationList) ToMilliseconds() []int64 {
 func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
 	cfg.S3.RegisterFlags(f)
 	cfg.GCS.RegisterFlags(f)
+	cfg.Azure.RegisterFlags(f)
 	cfg.BucketStore.RegisterFlags(f)
 
 	if len(cfg.BlockRanges) == 0 {

@@ -105,7 +111,7 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
 
 // Validate the config
 func (cfg *Config) Validate() error {
-	if cfg.Backend != BackendS3 && cfg.Backend != BackendGCS {
+	if cfg.Backend != BackendS3 && cfg.Backend != BackendGCS && cfg.Backend != BackendAzure {
 		return errUnsupportedBackend
 	}
```
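Putting the pieces together: a sketch (not from the commit; the values are hypothetical) of how the new backend constant flows through `Validate` and the `NewBucketClient` dispatch shown above:

```go
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/cortexproject/cortex/pkg/storage/tsdb"
	"github.com/go-kit/kit/log"
)

func main() {
	// An unsupported backend fails validation up front: the check shown
	// above returns errUnsupportedBackend before anything else runs.
	bad := tsdb.Config{Backend: "swift"} // hypothetical unsupported value
	fmt.Println(bad.Validate())          // prints the unsupported-backend error

	// With Backend set to BackendAzure, NewBucketClient dispatches to the
	// new azure.NewBucketClient added by this commit.
	cfg := tsdb.Config{Backend: tsdb.BackendAzure}
	cfg.Azure.StorageAccountName = "mycortexaccount" // hypothetical values
	cfg.Azure.StorageAccountKey = "bXktc2VjcmV0LWtleQ=="
	cfg.Azure.ContainerName = "cortex-blocks"
	cfg.Azure.MaxRetries = 20

	// With real credentials this returns a working bucket client; the fake
	// values here would fail once the Azure client validates them.
	bucket, err := tsdb.NewBucketClient(context.Background(), cfg, "querier", log.NewLogfmtLogger(os.Stderr))
	if err != nil {
		panic(err)
	}
	_ = bucket // used by ingester/querier/compactor for block operations
}
```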
