|
| 1 | +--- |
| 2 | +title: "Schema Configuration" |
| 3 | +linkTitle: "Schema Configuration" |
| 4 | +weight: 1 |
| 5 | +slug: schema-configuration |
| 6 | +--- |
| 7 | + |
| 8 | +Cortex uses a NoSQL store to store its index and, optionally, an object store to store its chunks. Over time, Cortex has evolved its schema to better fit the use cases and query patterns that arose. |
| 9 | + |
| 10 | +There are currently 9 schemas in use in production, but we recommend running with the `v9` schema when possible. You can move from one schema to another if a new schema fits your purpose better, but you must still configure Cortex so that it can read the old data written with the old schemas. |
| 11 | + |
| 12 | +You configure the schemas in a YAML config file, which you point Cortex to with the `-schema-config-file` flag. The file has the following YAML spec: |
| 13 | + |
| 14 | +```yaml |
| 15 | +configs: []<period_config> |
| 16 | +``` |
| 17 | + |
| 18 | +Where `period_config` is: |
| 19 | +```yaml |
| 20 | +# In YYYY-MM-DD format, for example: 2020-03-01. |
| 21 | +from: <string> |
| 22 | +# The index client to use, valid options: aws-dynamo, bigtable, bigtable-hashed, cassandra, boltdb. |
| 23 | +store: <string> |
| 24 | +# The object client to use. If none is specified, `store` is used for storing chunks as well. Valid options: s3, aws-dynamo, bigtable, bigtable-hashed, gcs, cassandra, filesystem. |
| 25 | +object_store: <string> |
| 26 | +# The schema version to use. Valid values are v1, v2, v3, ..., v6, v9, v10, v11. Recommended for production: v9. |
| 27 | +schema: <string> |
| 28 | +index: <periodic_table_config> |
| 29 | +chunks: <periodic_table_config> |
| 30 | +``` |
| 31 | + |
| 32 | +Where `periodic_table_config` is: |
| 33 | +```yaml |
| 34 | +# The prefix to use for the tables. |
| 35 | +prefix: <string> |
| 36 | +# Cortex is typically run with new tables every week to keep the index size low and to make retention easier. This sets the period at which new tables are created and used. Typically 168h (1 week). |
| 37 | +period: <duration> |
| 38 | +# The tags that can be set on the DynamoDB table. |
| 39 | +tags: <map[string]string> |
| 40 | +``` |
| 41 | + |
| 42 | +An example of this file, which is also a recommended starting point: |
| 43 | +```yaml |
| 44 | +configs: |
| 45 | +  - from: "2020-03-01" # Or typically a week before the Cortex cluster was created. |
| 46 | +    schema: v9 |
| 47 | +    index: |
| 48 | +      period: 168h |
| 49 | +      prefix: cortex_index_ |
| 50 | +    # Chunks section is optional and required only if you're not using a |
| 51 | +    # separate object store. |
| 52 | +    chunks: |
| 53 | +      period: 168h |
| 54 | +      prefix: cortex_chunks |
| 55 | +    store: aws-dynamo/bigtable-hashed/cassandra/boltdb |
| 56 | +    object_store: <above options>/s3/gcs/azure/filesystem |
| 57 | +``` |
| 58 | + |
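As a concrete instance of the template above, the sketch below picks one option for each slash-separated placeholder. The pairing of `aws-dynamo` for the index and `s3` for the chunks is an assumption made purely for illustration; any valid combination from the spec is written the same way.

```yaml
configs:
  - from: "2020-03-01"
    schema: v9
    index:
      period: 168h
      prefix: cortex_index_
    # Assumed pairing for illustration: DynamoDB for the index, S3 for chunks.
    # The chunks section is left out because a separate object store is used.
    store: aws-dynamo
    object_store: s3
```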
| 59 | +An example of an advanced schema file with a lot of changes: |
| 60 | +```yaml |
| 61 | +configs: |
| 62 | +  # Starting from 2018-08-23 Cortex should store chunks and indexes |
| 63 | +  # on Google BigTable using weekly periodic tables. The chunks table |
| 64 | +  # names will be prefixed with "dev_chunks_", while index tables will be |
| 65 | +  # prefixed with "dev_index_". |
| 66 | +  - from: "2018-08-23" |
| 67 | +    schema: v9 |
| 68 | +    chunks: |
| 69 | +      period: 168h0m0s |
| 70 | +      prefix: dev_chunks_ |
| 71 | +    index: |
| 72 | +      period: 168h0m0s |
| 73 | +      prefix: dev_index_ |
| 74 | +    store: gcp-columnkey |
| 75 | + |
| 76 | +  # Starting 2019-02-13 we moved from BigTable to GCS for storing the chunks. |
| 77 | +  - from: "2019-02-13" |
| 78 | +    schema: v9 |
| 79 | +    chunks: |
| 80 | +      period: 168h |
| 81 | +      prefix: dev_chunks_ |
| 82 | +    index: |
| 83 | +      period: 168h |
| 84 | +      prefix: dev_index_ |
| 85 | +    object_store: gcs |
| 86 | +    store: gcp-columnkey |
| 87 | + |
| 88 | +  # Starting 2019-02-24 we moved our index from bigtable-columnkey to bigtable-hashed |
| 89 | +  # which improves the distribution of keys. |
| 90 | +  - from: "2019-02-24" |
| 91 | +    schema: v9 |
| 92 | +    chunks: |
| 93 | +      period: 168h |
| 94 | +      prefix: dev_chunks_ |
| 95 | +    index: |
| 96 | +      period: 168h |
| 97 | +      prefix: dev_index_ |
| 98 | +    object_store: gcs |
| 99 | +    store: bigtable-hashed |
| 100 | + |
| 101 | +  # Starting 2019-03-05 we moved from v9 schema to v10 schema. |
| 102 | +  - from: "2019-03-05" |
| 103 | +    schema: v10 |
| 104 | +    chunks: |
| 105 | +      period: 168h |
| 106 | +      prefix: dev_chunks_ |
| 107 | +    index: |
| 108 | +      period: 168h |
| 109 | +      prefix: dev_index_ |
| 110 | +    object_store: gcs |
| 111 | +    store: bigtable-hashed |
| 112 | +``` |
| 113 | + |
| 114 | +Note how we started out with v9 and just Bigtable, later migrated to GCS as the object store, and finally moved to v10. This is a complex schema file showing several changes over time; a typical schema config file usually has just one or two schema versions. |
| 115 | + |
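For comparison, a more typical file with a single schema change might look like the sketch below. The dates, prefixes, and store choices are hypothetical; the point is that the earlier entry stays in the file so Cortex can still read data written before the change.

```yaml
configs:
  # Hypothetical original entry, kept so data written under v9 stays readable.
  - from: "2019-01-01"
    schema: v9
    index:
      period: 168h
      prefix: cortex_index_
    store: bigtable-hashed
    object_store: gcs
  # Hypothetical later entry: only the schema version changes.
  - from: "2020-03-01"
    schema: v10
    index:
      period: 168h
      prefix: cortex_index_
    store: bigtable-hashed
    object_store: gcs
```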
| 116 | +### Migrating from flags to schema file |
| 117 | +
|
| 118 | +Legacy versions of Cortex supported configuring the schema via flags. If you are still using flags, you need to migrate your configuration from flags to the schema config file. |
| 119 | +
|
| 120 | +If you're using: |
| 121 | +
|
| 122 | +* `chunk.storage-client`: then set the corresponding `object_store` field correctly in the schema file. |
| 123 | +* `dynamodb.daily-buckets-from`: then set the corresponding `from` date with `v2` schema. |
| 124 | +* `dynamodb.base64-buckets-from`: then set the corresponding `from` date with `v3` schema. |
| 125 | +* `dynamodb.v{4,5,6,9}-schema-from`: then set the corresponding `from` date with schema `v{4,5,6,9}`. |
| 126 | +* `bigtable.column-key-from`: then set the corresponding `from` date and use the `store` as `bigtable-columnkey`. |
| 127 | +* `dynamodb.use-periodic-tables`: then set the `index` and `chunks` fields with the corresponding values from the `dynamodb.periodic-table.{prefix, period, tag}` and `dynamodb.chunk-table.{prefix, period, tag}` flags. Note that the default period is 7 days, so set the `period` to `168h` in the config file if none is set in the flags. |
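To make the mapping concrete, here is a sketch for a hypothetical deployment that ran with `-dynamodb.use-periodic-tables` and `-dynamodb.v9-schema-from=2019-01-01`, with no period flag set. The date, both prefixes, and the choice of `aws-dynamo` as the store are illustrative assumptions, not values taken from a real cluster.

```yaml
configs:
  - from: "2019-01-01"        # date formerly passed to -dynamodb.v9-schema-from (hypothetical)
    schema: v9
    index:
      prefix: cortex_index_   # assumed value of -dynamodb.periodic-table.prefix
      period: 168h            # the 7-day flag default, now stated explicitly
    chunks:
      prefix: cortex_chunks_  # assumed value of -dynamodb.chunk-table.prefix
      period: 168h
    store: aws-dynamo         # assumed index store; chunks use it too, since no object_store is set
```

If `-chunk.storage-client` was also set, an `object_store` entry would be added alongside `store`, as described in the first bullet.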