Commit d00e46b

[Config] Document 'snapshot' and 'wal' configuration settings (#4023)
- Added `snapshot` and `wal` reference sections
- Updated `box.cfg` reference sections related to snapshots or WAL
- Updated `WAL extensions` topic in the Enterprise section
- Added `Persistence` topic to `Configuration` section
- Updated old links
- Slightly updated `File formats` page

Fixes #4013
Fixes #3668
Fixes #3727
Fixes #3396
1 parent fe9585d commit d00e46b

19 files changed

+920
-243
lines changed

doc/book/admin/backups.rst

Lines changed: 3 additions & 3 deletions
@@ -37,15 +37,15 @@ that are made after the last snapshot are incremental backups. Therefore taking
3737
a backup is a matter of copying the snapshot and WAL files.
3838

3939
1. Use ``tar`` to make a (possibly compressed) copy of the latest .snap and .xlog
40-
files on the :ref:`memtx_dir <cfg_basic-memtx_dir>` and
41-
:ref:`wal_dir <cfg_basic-wal_dir>` directories.
40+
files on the :ref:`snapshot.dir <configuration_reference_snapshot_dir>` and
41+
:ref:`wal.dir <configuration_reference_wal_dir>` directories.
4242

4343
2. If there is a security policy, encrypt the .tar file.
4444

4545
3. Copy the .tar file to a safe place.
4646

4747
Later, restoring the database is a matter of taking the .tar file and putting
48-
its contents back in the ``memtx_dir`` and ``wal_dir`` directories.
48+
its contents back in the ``snapshot.dir`` and ``wal.dir`` directories.
4949

5050
.. _admin-backups-hot_backup_vinyl_memtx:
5151

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
groups:
2+
  group001:
3+
    replicasets:
4+
      replicaset001:
5+
        instances:
6+
          instance001:
7+
            snapshot:
8+
              dir: 'var/lib/{{ instance_name }}/snapshots'
9+
              count: 3
10+
              by:
11+
                interval: 7200
12+
                wal_size: 1000000000000000000
13+
            iproto:
14+
              listen:
15+
              - uri: '127.0.0.1:3301'
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
instance001:
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
groups:
2+
  group001:
3+
    replicasets:
4+
      replicaset001:
5+
        instances:
6+
          instance001:
7+
            wal:
8+
              dir: 'var/lib/{{ instance_name }}/wals'
9+
              mode: 'write'
10+
              dir_rescan_delay: 3
11+
              cleanup_delay: 18000
12+
              max_size: 268435456
13+
              ext:
14+
                new: true
15+
                old: true
16+
                spaces:
17+
                  bands:
18+
                    old: false
19+
            iproto:
20+
              listen:
21+
              - uri: '127.0.0.1:3301'
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
instance001:

doc/concepts/atomic/thread_model.rst

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ There are also several supplementary threads that serve additional capabilities:
6969
Separate threads are required because each replica can point to a different position in the log and can run at different speeds.
7070

7171
* There is a thread pool for ad hoc asynchronous tasks,
72-
such as a DNS resolver or :ref:`fsync <cfg_binary_logging_snapshots-wal_mode>`.
72+
such as a DNS resolver or :ref:`fsync <configuration_reference_wal_mode>`.
7373

7474
* There are OpenMP threads used to parallelize sorting
7575
(hence, to parallelize building :ref:`indexes <concepts-data_model_indexes>`).

doc/concepts/configuration.rst

Lines changed: 4 additions & 1 deletion
@@ -426,7 +426,8 @@ When the limit is reached, ``INSERT`` or ``UPDATE`` requests fail with :ref:`ER_
426426
Snapshots and write-ahead logs
427427
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
428428

429-
The ``snapshot.dir`` and ``wal.dir`` options can be used to configure directories for storing snapshots and write-ahead logs.
429+
The :ref:`snapshot.dir <configuration_reference_snapshot_dir>` and :ref:`wal.dir <configuration_reference_wal_dir>`
430+
options can be used to configure directories for storing snapshots and write-ahead logs.
430431
For example, you can place snapshots and write-ahead logs on different hard drives for better reliability.
431432

432433
.. code-block:: yaml
@@ -438,6 +439,7 @@ For example, you can place snapshots and write-ahead logs on different hard driv
438439
dir: '/media/drive2/wals'
439440
440441
To learn more about the persistence mechanism in Tarantool, see the :ref:`Persistence <concepts-data_model-persistence>` section.
442+
Read more about snapshot and WAL configuration: :ref:`Persistence <configuration_persistence>`.
441443

442444

443445

@@ -447,6 +449,7 @@ To learn more about the persistence mechanism in Tarantool, see the :ref:`Persis
447449

448450
configuration/configuration_etcd
449451
configuration/configuration_code
452+
configuration/configuration_persistence
450453
configuration/configuration_connections
451454
configuration/configuration_credentials
452455
configuration/configuration_authentication
Lines changed: 284 additions & 0 deletions
@@ -0,0 +1,284 @@
1+
.. _configuration_persistence:
2+
3+
Persistence
4+
===========
5+
6+
To ensure data persistence, Tarantool provides the ability to:
7+
8+
* Record each data change request into a :ref:`write-ahead log <internals-wal>` (WAL) file (``.xlog`` files).
9+
* Take :ref:`snapshots <internals-snapshot>` that contain an on-disk copy of the entire data set for a given moment
10+
(``.snap`` files). You can set up automatic snapshot creation using the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>`.
11+
12+
During the recovery process, Tarantool can load the latest snapshot file and then read the requests from the WAL files
13+
produced after this snapshot was made.
14+
This topic describes how to configure:
15+
16+
* the snapshot creation in the :ref:`snapshot <configuration_reference_snapshot>` section of a :ref:`YAML configuration <configuration>`.
17+
* the recording to the write-ahead log in the :ref:`wal <configuration_reference_wal>` section of a YAML configuration.
18+
19+
To learn more about the persistence mechanism in Tarantool, see the :ref:`Persistence <concepts-data_model-persistence>` section.
20+
The formats of WAL and snapshot files are described in detail in the :ref:`File formats <internals-data_persistence>` section.
21+
22+
.. _configuration_persistence_snapshot:
23+
24+
Configure the snapshots
25+
-----------------------
26+
27+
**Example on GitHub**: `snapshot <https://github.com/tarantool/doc/tree/latest/doc/code_snippets/snippets/config/instances.enabled/persistence_snapshot>`_
28+
29+
This section describes how to define snapshot settings in the :ref:`snapshot <configuration_reference_snapshot>` section of a YAML configuration.
30+
31+
.. _configuration_persistence_snapshot_creation:
32+
33+
Set up automatic snapshot creation
34+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
35+
36+
In Tarantool, it is possible to automate the :ref:`snapshot creation </reference/reference_lua/box_snapshot>`.
37+
Automatic creation is enabled by default and can be configured in two ways:
38+
39+
* A new snapshot is taken once in a given period (see :ref:`snapshot.by.interval <configuration_reference_snapshot_by_interval>`).
40+
* A new snapshot is taken once the size of all WAL files created since the last snapshot exceeds a given limit
41+
(see :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>`).
42+
43+
The ``snapshot.by.interval`` option sets up the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>`
44+
that takes a new snapshot every ``snapshot.by.interval`` seconds.
45+
If the ``snapshot.by.interval`` option is set to zero, the checkpoint daemon is disabled.
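
A minimal sketch of the disabled state, assuming the ``instance001`` instance name used elsewhere in this topic.
With this setting, the checkpoint daemon is off; a snapshot can still be taken manually, for example by calling ``box.snapshot()``:

.. code-block:: yaml

    instance001:
      snapshot:
        by:
          interval: 0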
46+
47+
The ``snapshot.by.wal_size`` option defines the maximum size in bytes for all WAL files created since the last snapshot was taken.
48+
Once this size is exceeded, the checkpoint daemon takes a snapshot. Then, :ref:`Tarantool garbage collector <configuration_persistence_garbage_collector>`
49+
deletes the old WAL files.
50+
51+
The example shows how to specify the ``snapshot.by.interval`` and the ``snapshot.by.wal_size`` options:
52+
53+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_snapshot/config.yaml
54+
:language: yaml
55+
:start-at: by:
56+
:end-at: 1000000000000000000
57+
:dedent:
58+
59+
In the example, a new snapshot is created in two cases:
60+
61+
* every 2 hours (every 7200 seconds)
62+
* when the total size of all WAL files created since the last snapshot reaches 1e18 (1000000000000000000) bytes.
63+
64+
.. _configuration_persistence_snapshot_dir:
65+
66+
Specify a directory for snapshot files
67+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68+
69+
To configure a directory where the snapshot files are stored, use the :ref:`snapshot.dir <configuration_reference_snapshot_dir>`
70+
configuration option.
71+
The example below shows how to specify a snapshot directory for ``instance001`` explicitly:
72+
73+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_snapshot/config.yaml
74+
:language: yaml
75+
:start-at: instance001:
76+
:end-at: 'var/lib/{{ instance_name }}/snapshots'
77+
:dedent:
78+
79+
By default, WAL files and snapshot files are stored in the same directory ``var/lib/{{ instance_name }}``.
80+
However, you can specify different directories for them.
81+
For example, you can place snapshots and write-ahead logs on different hard drives for better reliability:
82+
83+
.. code-block:: yaml
84+
85+
    instance001:
86+
      snapshot:
87+
        dir: '/media/drive1/snapshots'
88+
      wal:
89+
        dir: '/media/drive2/wals'
90+
91+
.. _configuration_persistence_snapshot_count:
92+
93+
Configure a maximum number of stored snapshots
94+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
95+
96+
You can set a limit on the number of snapshots stored in the :ref:`snapshot.dir <configuration_reference_snapshot_dir>`
97+
directory using the :ref:`snapshot.count <configuration_reference_snapshot_count>` option.
98+
Once the number of snapshots reaches the given limit, :ref:`Tarantool garbage collector <configuration_persistence_garbage_collector>`
99+
deletes the oldest snapshot file and any associated WAL files after the new snapshot is taken.
100+
101+
In the example below, a snapshot is created every two hours (every 7200 seconds) until there are three snapshots in the
102+
``snapshot.dir`` directory.
103+
After creating a new snapshot (the fourth one), the oldest snapshot and the corresponding WALs are deleted.
104+
105+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_snapshot/config.yaml
106+
:language: yaml
107+
:start-at: count:
108+
:end-at: 7200
109+
:dedent:
110+
111+
.. _configuration_persistence_wal:
112+
113+
Configure the write-ahead log
114+
-----------------------------
115+
116+
**Example on GitHub**: `wal <https://github.com/tarantool/doc/tree/latest/doc/code_snippets/snippets/config/instances.enabled/persistence_wal>`_
117+
118+
This section describes how to define WAL settings in the :ref:`wal <configuration_reference_wal>` section of a YAML configuration.
119+
120+
.. _configuration_persistence_wal_mode:
121+
122+
Set the WAL mode
123+
~~~~~~~~~~~~~~~~
124+
125+
Recording to the write-ahead log is enabled by default.
126+
This means that the data can be recovered after an instance restart.
127+
The recording to the WAL can be configured using the :ref:`wal.mode <configuration_reference_wal_mode>` configuration option.
128+
129+
There are two modes that enable writing to the WAL:
130+
131+
* ``write`` (default) -- enable WAL and write the data without waiting for the data to be flushed to the storage device.
132+
* ``fsync`` -- enable WAL and ensure that the record is written to the storage device.
133+
134+
The example below shows how to specify the ``write`` WAL mode:
135+
136+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_wal/config.yaml
137+
:language: yaml
138+
:start-at: mode:
139+
:end-at: 'write'
140+
:dedent:
141+
142+
To turn the WAL writer off, set the ``wal.mode`` option to ``none``.
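
For comparison, a sketch of the stricter ``fsync`` mode, assuming the same ``instance001`` instance name as in the examples above.
This mode typically costs some write latency in exchange for ensuring that each record reaches the storage device.
Replacing the value with ``'none'`` would turn the WAL writer off instead:

.. code-block:: yaml

    instance001:
      wal:
        mode: 'fsync'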
143+
144+
.. _configuration_persistence_wal_dir:
145+
146+
Specify a directory for WAL files
147+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
148+
149+
To configure a directory where the WAL files are stored, use the :ref:`wal.dir <configuration_reference_wal_dir>` configuration option.
150+
The example below shows how to specify a directory for ``instance001`` explicitly:
151+
152+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_wal/config.yaml
153+
:language: yaml
154+
:start-at: instance001:
155+
:end-at: 'var/lib/{{ instance_name }}/wals'
156+
:dedent:
157+
158+
159+
.. _configuration_persistence_wal_rescan:
160+
161+
Set an interval between scans
162+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
163+
164+
In case of :ref:`replication <replication>` or :ref:`hot standby <configuration_reference_database_hot_standby>` mode,
165+
Tarantool scans for changes in the WAL files every :ref:`wal.dir_rescan_delay <configuration_reference_wal_dir_rescan_delay>`
166+
seconds. The example below shows how to specify the interval between scans:
167+
168+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_wal/config.yaml
169+
:language: yaml
170+
:start-at: dir_rescan_delay
171+
:end-before: cleanup_delay
172+
:dedent:
173+
174+
.. _configuration_persistence_wal_maxsize:
175+
176+
Set a maximum size for the WAL file
177+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
178+
179+
A new WAL file is created when the current one reaches the :ref:`wal.max_size <configuration_reference_wal_max_size>`
180+
size. The configuration for this option might look as follows:
181+
182+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_wal/config.yaml
183+
:language: yaml
184+
:start-at: max_size
185+
:end-at: 268435456
186+
:dedent:
187+
188+
.. _configuration_persistence_wal_cleanup_delay:
189+
190+
Set a delay for the garbage collector
191+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
192+
193+
In Tarantool, the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>`
194+
takes new snapshots at the given interval (see :ref:`snapshot.by.interval <configuration_reference_snapshot_by_interval>`).
195+
After an instance restart, the Tarantool garbage collector deletes the old WAL files.
196+
197+
To delay the immediate deletion of WAL files, use the :ref:`wal.cleanup_delay <configuration_reference_wal_cleanup_delay>`
198+
configuration option. The delay prevents the master from deleting, right after its restart, WALs
199+
that :ref:`replicas <replication-roles>` still need.
200+
As a consequence, replicas sync with the master faster after its restart and
201+
don't need to download all the data again.
202+
203+
In the example, the delay is set to 5 hours (18000 seconds):
204+
205+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_wal/config.yaml
206+
:language: yaml
207+
:start-at: cleanup_delay
208+
:end-at: 18000
209+
:dedent:
210+
211+
.. _configuration_persistence_wal_ext:
212+
213+
Specify the WAL extensions
214+
~~~~~~~~~~~~~~~~~~~~~~~~~~
215+
216+
In Tarantool Enterprise, you can store the old and new tuples for each CRUD operation performed.
217+
A detailed description and examples of the WAL extensions are provided in the :ref:`WAL extensions <wal_extensions>` section.
218+
219+
See also: :ref:`wal.ext.* <configuration_reference_wal_ext>` configuration options.
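
As an illustration only, the ``wal.ext`` fragment from the ``persistence_wal`` example configuration is reproduced below.
It appears to enable storing both old and new tuples for all spaces and to turn off old tuples for a space named ``bands``;
the exact semantics of each key are defined in the reference linked above:

.. code-block:: yaml

    wal:
      ext:
        new: true
        old: true
        spaces:
          bands:
            old: false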
220+
221+
.. _configuration_persistence_checkpoint_daemon:
222+
223+
Checkpoint daemon
224+
-----------------
225+
226+
The checkpoint daemon (snapshot daemon) is a constantly running :ref:`fiber <app-fibers>`.
227+
The checkpoint daemon creates a schedule for the periodic snapshot creation based on
228+
the :ref:`configuration options <configuration_reference_snapshot_by>` and the speed of file size growth.
229+
If enabled, the daemon makes new :ref:`snapshot <concepts-data_model-persistence>` (``.snap``) files according to this schedule.
230+
231+
The work of the checkpoint daemon is based on the following configuration options:
232+
233+
* :ref:`snapshot.by.interval <configuration_reference_snapshot_by_interval>` -- a new snapshot is taken once in a given period.
234+
* :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>` -- a new snapshot is taken once the size
235+
of all WAL files created since the last snapshot exceeds a given limit.
236+
237+
If necessary, the checkpoint daemon also activates the :ref:`Tarantool garbage collector <configuration_persistence_garbage_collector>`
238+
that deletes old snapshots and WAL files.
239+
240+
.. _configuration_persistence_garbage_collector:
241+
242+
Tarantool garbage collector
243+
---------------------------
244+
245+
Tarantool garbage collector can be activated by the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>`.
246+
The garbage collector tracks the snapshots that are to be :ref:`relayed to a replica <memtx-replication>` or needed
247+
by other consumers. When the files are no longer needed, Tarantool garbage collector deletes them.
248+
249+
.. NOTE::
250+
251+
The garbage collector called by the checkpoint daemon is distinct from the `Lua garbage collector <https://www.lua.org/manual/5.1/manual.html#2.10>`_
252+
which is for Lua objects, and distinct from the Tarantool garbage collector that specializes in :ref:`handling shard buckets <vshard-gc>`.
253+
254+
This garbage collector is called in the following cases:
255+
256+
* When the number of snapshots reaches the :ref:`snapshot.count <configuration_reference_snapshot_count>` limit.
257+
After a new snapshot is taken, Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files.
258+
259+
* When the size of all WAL files created since the last snapshot reaches the limit of :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>`.
260+
Once this size is exceeded, the checkpoint daemon takes a snapshot, then the garbage collector deletes the old WAL files.
261+
262+
If an old snapshot file is deleted, the Tarantool garbage collector also deletes
263+
any :ref:`write-ahead log (.xlog) <internals-wal>` files that meet the following conditions:
264+
265+
* The WAL files are older than the snapshot file.
266+
* The WAL files contain information present in the snapshot file.
267+
268+
Tarantool garbage collector also deletes obsolete vinyl ``.run`` files.
269+
270+
Tarantool garbage collector doesn't delete a file in the following cases:
271+
272+
* A backup is running, and the file has not been backed up
273+
(see :ref:`Hot backup <admin-backups-hot_backup_vinyl_memtx>`).
274+
275+
* Replication is running, and the file has not been relayed to a replica
276+
(see :ref:`Replication architecture <replication-architecture>`).
277+
278+
* A replica is connecting.
279+
280+
* A replica has fallen behind.
281+
The progress of each replica is tracked; if a replica's position is far
282+
from being up to date, then the server stops to give it a chance to catch up.
283+
If an administrator concludes that a replica is permanently down, then the
284+
correct procedure is to restart the server, or (preferably) :ref:`remove the replica from the cluster <replication-remove_instances>`.

doc/concepts/data_model/operations.rst

Lines changed: 1 addition & 1 deletion
@@ -150,7 +150,7 @@ resource usage of each function.
150150
important than the others.
151151
* - WAL settings
152152
- The important setting for the write-ahead log is
153-
:ref:`wal_mode <cfg_binary_logging_snapshots-wal_mode>`.
153+
:ref:`wal.mode <configuration_reference_wal_mode>`.
154154
If the setting causes no writing or
155155
delayed writing, this factor is unimportant. If the
156156
setting causes every data-change request to wait
