Commit 95e0690

Document on_recovery_state and on_election (#3580)
1 parent 9f1781e commit 95e0690

10 files changed

+183
-84
lines changed

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
local fio = require('fio')
2+
local server = require('luatest.server')
3+
local t = require('luatest')
4+
local g = t.group()
5+
6+
local run_before_cfg = [[
7+
    local log = require('log')
8+
    local log_recovery_state = function(state)
9+
        log.info(state .. ' state reached')
10+
    end
11+
    box.ctl.on_recovery_state(log_recovery_state)
12+
]]
13+
14+
g.before_each(function(cg)
15+
    cg.server = server:new {
16+
        workdir = fio.cwd() .. '/tmp',
17+
        env = {
18+
            ['TARANTOOL_RUN_BEFORE_BOX_CFG'] = run_before_cfg,
19+
        }
20+
    }
21+
    cg.server:start()
22+
end)
23+
24+
g.after_each(function(cg)
25+
    cg.server:stop()
26+
    cg.server:drop()
27+
end)
28+
29+
local function find_in_log(cg, str, must_be_present)
30+
    t.helpers.retrying({timeout = 0.3, delay = 0.1}, function()
31+
        local found = cg.server:grep_log(str) ~= nil
32+
        t.assert(found == must_be_present)
33+
    end)
34+
end
35+
36+
g.test_log_contains_reached_states = function(cg)
37+
    find_in_log(cg, 'indexes_built state reached', true)
38+
    find_in_log(cg, 'snapshot_recovered state reached', true)
39+
    find_in_log(cg, 'wal_recovered state reached', true)
40+
    find_in_log(cg, 'synced state reached', true)
41+
end

doc/dev_guide/internals/recovery_internals.rst

Lines changed: 46 additions & 41 deletions
@@ -4,7 +4,7 @@
44
The recovery process
55
--------------------------------------------------------------------------------
66

7-
The recovery process begins when box.cfg{} happens for the
7+
The recovery process begins when ``box.cfg{}`` happens for the
88
first time after the Tarantool server instance starts.
99

1010
The recovery process must recover the databases
@@ -19,43 +19,48 @@ make a checkpoint, and the snapshot operation is rolled back if
1919
anything goes wrong, so vinyl's checkpoint is at least as fresh
2020
as the snapshot file.)
2121

22-
Step 1
23-
Read the configuration parameters in the ``box.cfg{}`` request.
24-
Parameters which affect recovery may include :ref:`work_dir <cfg_basic-work_dir>`,
25-
:ref:`wal_dir <cfg_basic-wal_dir>`, :ref:`memtx_dir <cfg_basic-memtx_dir>`,
26-
:ref:`vinyl_dir <cfg_basic-vinyl_dir>`
27-
and :ref:`force_recovery <cfg_binary_logging_snapshots-force_recovery>`.
28-
29-
Step 2
30-
Find the latest snapshot file. Use its data to reconstruct the in-memory
31-
databases. Instruct the vinyl engine to recover to the latest checkpoint.
32-
33-
There are actually two variations of the reconstruction procedure for memtx
34-
databases, depending on whether the recovery process is "default".
35-
36-
If the recovery process is default (``force_recovery`` is ``false``),
37-
memtx can read data in the snapshot with all indexes disabled.
38-
First, all tuples are read into memory. Then, primary keys are built in bulk,
39-
taking advantage of the fact that the data is already sorted by primary key
40-
within each space.
41-
42-
If the recovery process is non-default (``force_recovery`` is ``true``),
43-
Tarantool performs additional checking. Indexes are enabled at
44-
the start, and tuples are added one by one. This means that any unique-key
45-
constraint violations will be caught, and any duplicates will be skipped.
46-
Normally there will be no constraint violations or duplicates, so these checks
47-
are only made if an error has occurred.
48-
49-
Step 3
50-
Find the WAL file that was made at the time of, or after, the snapshot file.
51-
Read its log entries until the log-entry LSN is greater than the LSN of the
52-
snapshot, or greater than the LSN of the vinyl checkpoint. This is the
53-
recovery process's "start position"; it matches the current state of the
54-
engines.
55-
56-
Step 4
57-
Redo the log entries, from the start position to the end of the WAL. The
58-
engine skips a redo instruction if it is older than the engine's checkpoint.
59-
60-
Step 5
61-
For the memtx engine, re-create all secondary indexes.
22+
**Step 1**
23+
24+
Read the configuration parameters in the ``box.cfg{}`` request.
25+
Parameters which affect recovery may include :ref:`work_dir <cfg_basic-work_dir>`,
26+
:ref:`wal_dir <cfg_basic-wal_dir>`, :ref:`memtx_dir <cfg_basic-memtx_dir>`,
27+
:ref:`vinyl_dir <cfg_basic-vinyl_dir>`
28+
and :ref:`force_recovery <cfg_binary_logging_snapshots-force_recovery>`.
29+
30+
**Step 2**
31+
32+
Find the latest snapshot file. Use its data to reconstruct the in-memory
33+
databases. Instruct the vinyl engine to recover to the latest checkpoint.
34+
35+
There are actually two variations of the reconstruction procedure for memtx
36+
databases, depending on whether the recovery process is "default".
37+
38+
If the recovery process is default (``force_recovery`` is ``false``),
39+
memtx can read data in the snapshot with all indexes disabled.
40+
First, all tuples are read into memory. Then, primary keys are built in bulk,
41+
taking advantage of the fact that the data is already sorted by primary key
42+
within each space.
43+
44+
If the recovery process is non-default (``force_recovery`` is ``true``),
45+
Tarantool performs additional checking. Indexes are enabled at
46+
the start, and tuples are added one by one. This means that any unique-key
47+
constraint violations will be caught, and any duplicates will be skipped.
48+
Normally there will be no constraint violations or duplicates, so these checks
49+
are only made if an error has occurred.
50+
51+
**Step 3**
52+
53+
Find the WAL file that was made at the time of, or after, the snapshot file.
54+
Read its log entries until the log-entry LSN is greater than the LSN of the
55+
snapshot, or greater than the LSN of the vinyl checkpoint. This is the
56+
recovery process's "start position"; it matches the current state of the
57+
engines.
58+
59+
**Step 4**
60+
61+
Redo the log entries, from the start position to the end of the WAL. The
62+
engine skips a redo instruction if it is older than the engine's checkpoint.
63+
64+
**Step 5**
65+
66+
For the memtx engine, re-create all secondary indexes.
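
As a hedged illustration of Step 1, the parameters listed there might be passed to ``box.cfg{}`` as in the sketch below. The directory names are hypothetical and are assumed to exist already; ``force_recovery = false`` selects the default procedure described in Step 2.

```lua
-- A configuration sketch, not a recommended layout.
-- All paths below are hypothetical and assumed to exist.
box.cfg{
    work_dir = '/var/lib/tarantool/example', -- base for the relative paths below
    wal_dir = 'wal',         -- directory with WAL files
    memtx_dir = 'snap',      -- directory with snapshot files
    vinyl_dir = 'vinyl',     -- directory with vinyl files
    force_recovery = false,  -- default recovery; true enables the additional checking
}
```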

doc/reference/reference_lua/box_ctl.rst

Lines changed: 8 additions & 0 deletions
@@ -44,6 +44,12 @@ Below is a list of all ``box.ctl`` functions.
4444
* - :doc:`./box_ctl/on_shutdown`
4545
- Create a "shutdown trigger"
4646

47+
* - :doc:`./box_ctl/on_recovery_state`
48+
- Create a trigger executed at different stages of node recovery or initial configuration
49+
50+
* - :doc:`./box_ctl/on_election`
51+
- Create a :ref:`trigger <triggers>` executed every time the current state of a replica set node in regard to :ref:`leader election <repl_leader_elect>` changes
52+
4753
* - :doc:`./box_ctl/set_on_shutdown_timeout`
4854
- Set a timeout in seconds for the ``on_shutdown`` trigger
4955

@@ -63,6 +69,8 @@ Below is a list of all ``box.ctl`` functions.
6369
box_ctl/wait_rw
6470
box_ctl/on_schema_init
6571
box_ctl/on_shutdown
72+
box_ctl/on_recovery_state
73+
box_ctl/on_election
6674
box_ctl/set_on_shutdown_timeout
6775
box_ctl/is_recovery_finished
6876
box_ctl/promote
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
1+
.. _box_ctl-on_election:
2+
3+
===============================================================================
4+
box.ctl.on_election()
5+
===============================================================================
6+
7+
.. module:: box.ctl
8+
9+
.. function:: on_election(trigger-function)
10+
11+
**Since:** :doc:`2.10.0 </release/2.10.0>`
12+
13+
Create a :ref:`trigger <triggers>` executed every time
14+
the current state of a replica set node in regard to :ref:`leader election <repl_leader_elect>` changes.
15+
The current state is available in the :ref:`box.info.election <box_info_election>` table.
16+
17+
The trigger function receives no arguments.
18+
To see what changed, check ``box.info.election`` and
19+
:ref:`box.info.synchro <box_info_synchro>`.
20+
21+
:param function trigger-function: a trigger function
22+
23+
:return: ``nil`` or a function pointer
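
A minimal sketch of such a trigger follows, assuming that ``box.info.election`` exposes the ``state``, ``term``, and ``leader`` fields. The helper name ``describe_election`` is hypothetical, and the ``rawget`` guard is there only so that the snippet also loads outside Tarantool.

```lua
-- Build a one-line summary from a table shaped like box.info.election.
-- describe_election is a hypothetical helper, not part of Tarantool.
local function describe_election(info)
    return string.format('election state=%s term=%s leader=%s',
        tostring(info.state), tostring(info.term), tostring(info.leader))
end

-- Register the trigger only when running inside Tarantool.
if rawget(_G, 'box') ~= nil then
    local log = require('log')
    box.ctl.on_election(function()
        -- The trigger receives no arguments, so read the state explicitly.
        log.info(describe_election(box.info.election))
    end)
end
```

Keeping the formatting in a separate function makes it possible to check the message without starting a replica set.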
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
.. _box_ctl-on_recovery_state:
2+
3+
===============================================================================
4+
box.ctl.on_recovery_state()
5+
===============================================================================
6+
7+
.. module:: box.ctl
8+
9+
.. function:: on_recovery_state(trigger-function)
10+
11+
**Since:** :doc:`2.11.0 </release/2.11.0>`
12+
13+
Create a :ref:`trigger <triggers>` executed at different stages of node :ref:`recovery <internals-recovery_process>` or initial configuration.
14+
Note that you need to set the ``box.ctl.on_recovery_state`` trigger before the initial :ref:`box.cfg <box_introspection-box_cfg>` call.
15+
16+
:param function trigger-function: a trigger function
17+
18+
:return: ``nil`` or a function pointer
19+
20+
A registered trigger function is run on each of the supported recovery
21+
states and receives the state name as a parameter:
22+
23+
* ``snapshot_recovered``: the node has recovered the snapshot files.
24+
* ``wal_recovered``: the node has recovered the WAL files.
25+
* ``indexes_built``: the node has built secondary indexes for memtx spaces.
26+
This stage might come before any actual data is recovered. This means that the
27+
indexes are available right after the first tuple is recovered.
28+
* ``synced``: the node has synced with enough remote peers.
29+
This means that the node changes the state from :ref:`orphan <internals-replication-orphan_status>` to ``running``.
30+
31+
The node passes through all these states during the initial ``box.cfg`` call when recovering
32+
from the snapshot and WAL files.
33+
Note that the ``synced`` state might be reached after the initial ``box.cfg`` call finishes.
34+
For example, if :ref:`replication_sync_timeout <cfg_replication-replication_sync_timeout>`
35+
is set to 0, the node finishes ``box.cfg`` without reaching ``synced`` and stays ``orphan``.
36+
Once the node is synced with enough remote peers, the ``synced`` state is reached.
37+
38+
.. NOTE::
39+
40+
When bootstrapping a fresh cluster with no data, all the instances in this cluster
41+
execute triggers on the same stages for consistency.
42+
For example, ``snapshot_recovered`` and ``wal_recovered``
43+
run when the node finishes bootstrapping a cluster or finishes joining an existing cluster.
44+
45+
46+
**Example:**
47+
48+
The example below shows how to :ref:`log <log-module>` a specified message when each state is reached.
49+
50+
.. literalinclude:: /code_snippets/test/triggers/on_recovery_state_test.lua
51+
:language: lua
52+
:lines: 7-11
53+
:dedent:
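
A further sketch, assuming only what this page states (the trigger receives the state name, and ``synced`` may arrive after ``box.cfg`` returns): the trigger records each state so that other code can check whether ``synced`` has been reached. The names ``reached``, ``record_state``, and ``is_synced`` are hypothetical, and the ``rawget`` guard only lets the snippet load outside Tarantool.

```lua
-- States reached so far, keyed by name (hypothetical bookkeeping table).
local reached = {}

-- Trigger function: the state name is passed as the only argument.
local function record_state(state)
    reached[state] = true
end

-- True once the 'synced' state has been reported.
local function is_synced()
    return reached.synced == true
end

-- Register before the initial box.cfg{} call; skipped outside Tarantool.
if rawget(_G, 'box') ~= nil then
    box.ctl.on_recovery_state(record_state)
end
```

With ``replication_sync_timeout`` set to 0, ``is_synced()`` could still return ``false`` right after ``box.cfg`` finishes and turn ``true`` later, once the node leaves the ``orphan`` state.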

doc/reference/reference_lua/box_ctl/on_schema_init.rst

Lines changed: 0 additions & 4 deletions
@@ -6,10 +6,6 @@ box.ctl.on_schema_init()
66

77
.. module:: box.ctl
88

9-
The ``box.ctl`` submodule also contains two functions for the two
10-
:ref:`server trigger <triggers>` definitions: ``on_shutdown`` and ``on_schema_init``.
11-
Please, familiarize yourself with the mechanism of trigger functions before using them.
12-
139
.. function:: on_schema_init(trigger-function [, old-trigger-function])
1410

1511
Create a "schema_init :ref:`trigger <triggers>`".

doc/reference/reference_lua/box_ctl/on_shutdown.rst

Lines changed: 0 additions & 5 deletions
@@ -6,11 +6,6 @@ box.ctl.on_shutdown()
66

77
.. module:: box.ctl
88

9-
The ``box.ctl`` submodule also contains two functions for the two
10-
:ref:`server trigger <triggers>` definitions: ``on_shutdown`` and ``on_schema_init``.
11-
Please, familiarize yourself with the mechanism of trigger functions before using them.
12-
Details about trigger characteristics are in the :ref:`triggers <triggers-box_triggers>` section.
13-
149
.. function:: on_shutdown(trigger-function [, old-trigger-function])
1510

1611
Create a "shutdown :ref:`trigger <triggers>`".

locale/ru/LC_MESSAGES/dev_guide/internals/recovery_internals.po

Lines changed: 12 additions & 12 deletions
@@ -3,10 +3,10 @@ msgid "The recovery process"
33
msgstr "Процесс восстановления"
44

55
msgid ""
6-
"The recovery process begins when box.cfg{} happens for the first time after "
6+
"The recovery process begins when ``box.cfg{}`` happens for the first time after "
77
"the Tarantool server instance starts."
88
msgstr ""
9-
"Процесс восстановления начинается, когда box.cfg{} впервые используется "
9+
"Процесс восстановления начинается, когда ``box.cfg{}`` впервые используется "
1010
"после запуска экземпляра Tarantool-сервера."
1111

1212
msgid ""
@@ -32,8 +32,8 @@ msgstr ""
3232
"операция создания снимка откатывается в случае какой-либо ошибки, поэтому "
3333
"контрольная точка vinyl'а будет настолько же актуальной, как и файл снимка.)"
3434

35-
msgid "Step 1"
36-
msgstr "Шаг 1"
35+
msgid "**Step 1**"
36+
msgstr "**Шаг 1**"
3737

3838
msgid ""
3939
"Read the configuration parameters in the ``box.cfg{}`` request. Parameters "
@@ -48,8 +48,8 @@ msgstr ""
4848
"<cfg_basic-memtx_dir>`, :ref:`vinyl_dir <cfg_basic-vinyl_dir>` и "
4949
":ref:`force_recovery <cfg_binary_logging_snapshots-force_recovery>`."
5050

51-
msgid "Step 2"
52-
msgstr "Шаг 2"
51+
msgid "**Step 2**"
52+
msgstr "**Шаг 2**"
5353

5454
msgid ""
5555
"Find the latest snapshot file. Use its data to reconstruct the in-memory "
@@ -95,8 +95,8 @@ msgstr ""
9595
"ограничений или повторяющихся значений, поэтому такие проверки проводятся "
9696
"только в случае ошибки."
9797

98-
msgid "Step 3"
99-
msgstr "Шаг 3"
98+
msgid "**Step 3**"
99+
msgstr "**Шаг 3**"
100100

101101
msgid ""
102102
"Find the WAL file that was made at the time of, or after, the snapshot file."
@@ -111,8 +111,8 @@ msgstr ""
111111
"будет начальной точкой для процесса восстановления, которая соответствует "
112112
"текущему состоянию движков."
113113

114-
msgid "Step 4"
115-
msgstr "Шаг 4"
114+
msgid "**Step 4**"
115+
msgstr "**Шаг 4**"
116116

117117
msgid ""
118118
"Redo the log entries, from the start position to the end of the WAL. The "
@@ -121,8 +121,8 @@ msgstr ""
121121
"Повторить записи журнала с начальной точки до конца WAL. Движок пропускает "
122122
"команду повторения, если данные старше контрольной точки движка."
123123

124-
msgid "Step 5"
125-
msgstr "Шаг 5"
124+
msgid "**Step 5**"
125+
msgstr "**Шаг 5**"
126126

127127
msgid "For the memtx engine, re-create all secondary indexes."
128128
msgstr "Повторно создать все вторичные индексы для движка memtx."

locale/ru/LC_MESSAGES/reference/reference_lua/box_ctl/on_schema_init.po

Lines changed: 0 additions & 11 deletions
@@ -2,17 +2,6 @@
22
msgid "box.ctl.on_schema_init()"
33
msgstr ""
44

5-
msgid ""
6-
"The ``box.ctl`` submodule also contains two functions for the two "
7-
":ref:`server trigger <triggers>` definitions: ``on_shutdown`` and "
8-
"``on_schema_init``. Please, familiarize yourself with the mechanism of "
9-
"trigger functions before using them."
10-
msgstr ""
11-
"Встроенный модуль ``box.ctl`` также содержит две функции для определения "
12-
"двух :ref:`серверных триггеров <triggers>`: ``on_shutdown`` и "
13-
"``on_schema_init``. Пожалуйста, ознакомьтесь с механизмом триггерных функций"
14-
" перед их использованием."
15-
165
msgid ""
176
"Create a \"schema_init :ref:`trigger <triggers>`\". The ``trigger-function``"
187
" will be executed when :ref:`box.cfg{} <index-book_cfg>` happens for the "

locale/ru/LC_MESSAGES/reference/reference_lua/box_ctl/on_shutdown.po

Lines changed: 0 additions & 11 deletions
@@ -2,17 +2,6 @@
22
msgid "box.ctl.on_shutdown()"
33
msgstr ""
44

5-
msgid ""
6-
"The ``box.ctl`` submodule also contains two functions for the two "
7-
":ref:`server trigger <triggers>` definitions: ``on_shutdown`` and "
8-
"``on_schema_init``. Please, familiarize yourself with the mechanism of "
9-
"trigger functions before using them."
10-
msgstr ""
11-
"Встроенный модуль ``box.ctl`` также содержит две функции для определения "
12-
"двух :ref:`серверных триггеров <triggers>`: ``on_shutdown`` и "
13-
"``on_schema_init``. Пожалуйста, ознакомьтесь с механизмом триггерных функций"
14-
" перед их использованием."
15-
165
msgid ""
176
"Details about trigger characteristics are in the :ref:`triggers <triggers-"
187
"box_triggers>` section."
