
[3pt] Add wal_cleanup_delay configuration parameter #2022


Closed
5 tasks
Tracked by #2665
TarantoolBot opened this issue Mar 31, 2021 · 0 comments · Fixed by #3350
Labels:

  • feature: A new functionality
  • reference: [location] Tarantool manual, Reference part
  • server: [area] Task relates to Tarantool's server (core) functionality


TarantoolBot commented Mar 31, 2021

Related dev. issue(s): tarantool/tarantool#5806

Product: Tarantool
Since: 2.8.1
Audience/target: dev, admin
Root document: https://www.tarantool.io/en/doc/latest/reference/configuration/#binary-logging-and-snapshots
SME: @cyrillos

Details

What should we write about?
Describe the wal_cleanup_delay configuration option:

  • what it does
  • what problem it solves
  • how to choose its value depending on the use case

The wal_cleanup_delay option defines a delay, in seconds,
before write-ahead log files (*.xlog) start being pruned
after a node restart.

This option is ignored if a node is running as
an anonymous replica (replication_anon = true). Similarly,
if replication is not used and there are no plans to use it,
this option can be left out of consideration.

The original problem is the case where a node works
so fast that its replicas cannot keep up with its state.
If the node is restarted at that moment (for example,
due to a power outage), *.xlog files might be pruned
during the restart. As a result, replicas will not find these
files on the main node and will have to reread all the data,
which is a very expensive procedure.

Since replicas are tracked via the _cluster system space, its
content is used to count subscribed replicas. When all of them are
up and running, the cleanup procedure is enabled automatically, even
if wal_cleanup_delay has not expired yet.
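
To get a rough picture of which replicas the node is waiting for, the content of _cluster can be compared with the replication state reported in box.info.replication. The following is a minimal console sketch that assumes a configured replica set. The helper name registered_replicas is hypothetical, and treating a downstream with status 'follow' as "subscribed" is only an approximation of what the server checks internally.

```lua
-- Hypothetical helper: list every replica registered in _cluster (except
-- this instance) and guess whether it is currently subscribed.
local function registered_replicas()
    local result = {}
    for _, tuple in box.space._cluster:pairs() do
        local id, uuid = tuple[1], tuple[2]
        if id ~= box.info.id then
            local info = box.info.replication[id]
            local downstream = info and info.downstream
            table.insert(result, {
                id = id,
                uuid = uuid,
                -- Assumption: status 'follow' means the replica is subscribed.
                subscribed = downstream ~= nil and downstream.status == 'follow',
            })
        end
    end
    return result
end

return registered_replicas()
```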

The wal_cleanup_delay option should be set to:

  • 0 to disable the cleanup delay;
  • > 0 to wait for the specified number of seconds.

By default, it is set to 14400 seconds (i.e. 4 hours).
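
As an illustration, the option can be passed to box.cfg like any other configuration parameter. The value of 3600 seconds below is an arbitrary example, not a recommendation.

```lua
-- Wait up to one hour (example value) for replicas to reconnect
-- before *.xlog files may be pruned after a restart.
box.cfg{
    wal_cleanup_delay = 3600,
}

-- Alternative: disable the delay entirely, for example on a node
-- that has no replicas.
-- box.cfg{wal_cleanup_delay = 0}
```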

If a registered replica is lost forever and the timeout is set to
infinity, the preferred way to enable the cleanup procedure is not to
set a small timeout value but to delete this replica from the _cluster
space manually.
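
Removing a lost replica might look like the sketch below, run on a writable instance. The id 3 is an assumption for illustration only; the real id should be looked up in _cluster first.

```lua
-- Inspect the registered replicas to find the id of the lost one.
box.space._cluster:select()

-- Assumption: the lost replica is registered with id 3.
local lost_replica_id = 3
box.space._cluster:delete(lost_replica_id)
```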

Note that the option does not prevent the WAL engine from removing
old *.xlog files if there is no space left on the storage device:
in that case the WAL engine can remove them forcibly.

The current state of the *.xlog garbage collector can be found in
the box.info.gc() output. For example:

 tarantool> box.info.gc()
 ---
   ...
   is_paused: false

The is_paused field shows whether the cleanup fiber is paused.
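
The same field can be checked from application code. This is a small sketch, assuming it runs on an instance where box.cfg has already been called; the log message text is arbitrary.

```lua
local log = require('log')

-- Report when *.xlog cleanup is still being held back.
if box.info.gc().is_paused then
    log.info('xlog cleanup is paused, wal_cleanup_delay = %s',
             tostring(box.cfg.wal_cleanup_delay))
end
```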
Requested by @cyrillos in tarantool/tarantool@2fd51ae.

Definition of done

  • add an option description
  • specify Since version with a link to release notes
  • add links to related docs
  • make sure all option properties (Type, Default, etc.) are specified
  • add an option anchor to the top of the section
Onvember added the feature label on Mar 31, 2021
NickVolynkin changed the title from "Add wal_cleanup_delay configuration parameter" to "[5pt] Add wal_cleanup_delay configuration parameter" on Apr 14, 2021
NickVolynkin added the reference and server labels on Apr 14, 2021
veod32 changed the title from "[5pt] Add wal_cleanup_delay configuration parameter" to "[3pt] Add wal_cleanup_delay configuration parameter" on Feb 3, 2022
andreyaksenov self-assigned this on Feb 17, 2023
veod32 removed the 3sp label on Feb 17, 2023
andreyaksenov added a commit that referenced this issue Feb 27, 2023
Document the 'wal_cleanup_delay' option.

Resolves #2022
p7nov pushed a commit that referenced this issue Mar 24, 2023
Document the 'wal_cleanup_delay' option.

Resolves #2022