Update the upgrade instructions #3080

Merged
merged 11 commits into from
Aug 26, 2022
7 changes: 7 additions & 0 deletions doc/book/admin/_includes/1.6-to-2.x-condition.rst
@@ -0,0 +1,7 @@
To perform a **live** upgrade from Tarantool 1.6 to a more recent version,
like :doc:`2.8.4 </release/2.8.4>`, :doc:`2.10.1 </release/2.10.1>` and such,
it is necessary to take an intermediate step by upgrading 1.6 -> 1.10 -> 2.x.
This is the only way to perform the upgrade without downtime.

However, a direct upgrade of a replica set from 1.6 to 2.x is also possible, but only
Collaborator

direct upgrade of a replica set from 1.6 to 2.x

Better to give a link to the description of this procedure.

Contributor Author

This text is included both in the direct and stepwise instructions with the include directive. So I think the link won't be necessary.

**with downtime**.
54 changes: 54 additions & 0 deletions doc/book/admin/_includes/script_find_indices.rst
@@ -0,0 +1,54 @@
local fiber = require('fiber')
local decimal = require('decimal')

local function isnan(val)
    return type(val) == 'number' and val ~= val -- NaN is the only value not equal to itself
end

local function isinf(val)
    return val == math.huge or val == -math.huge
end

local function vinyl(id)
    return box.space[id].engine == 'vinyl'
end

require_rebuild = {} -- global on purpose: the rebuild script reads it
local iters = 0
for _, v in box.space._index:pairs({512, 0}, {iterator='GE'}) do -- start at space id 512
    local id = v[1]
    iters = iters + 1
    if iters % 1000 == 0 then
        fiber.yield() -- yield periodically so other fibers can run
    end
    if vinyl(id) then
        local format = v[6]
        local check_fields = {}
        for _, fmt in pairs(v[6]) do -- v[6] holds the index parts
            if fmt[2] == 'number' or fmt[2] == 'scalar' then
                table.insert(check_fields, fmt[1] + 1) -- 0-based part number to 1-based field
            end
        end
        local have_decimal = {}
        local have_nan = {}
        if #check_fields > 0 then
            for k, tuple in box.space[id]:pairs() do
                for _, i in pairs(check_fields) do
                    iters = iters + 1
                    if iters % 1000 == 0 then
                        fiber.yield()
                    end
                    have_decimal[i] = have_decimal[i] or
                                      decimal.is_decimal(tuple[i])
                    have_nan[i] = have_nan[i] or isnan(tuple[i]) or
                                  isinf(tuple[i])
                    if have_decimal[i] and have_nan[i] then
                        table.insert(require_rebuild, v)
                        goto out
                    end
                end
            end
        end
    end
    ::out::
end
25 changes: 25 additions & 0 deletions doc/book/admin/_includes/script_rebuild_indices.rst
@@ -0,0 +1,25 @@
local log = require('log')

local function rebuild_index(idx)
    local index_name = idx[3]
    local space_name = box.space[idx[1]].name
    log.info("Rebuilding index %s on space %s", index_name, space_name)
    if (idx[2] == 0) then
        log.error("Cannot rebuild primary index %s on space %s. Please, "..
                  "recreate the space manually", index_name, space_name)
        return
    end
    log.info("Deleting index %s on space %s", index_name, space_name)
    local v = box.space._index:delete{idx[1], idx[2]}
    if v == nil then
        log.error("Couldn't find index %s on space %s", index_name, space_name)
        return
    end
    log.info("Done")
    log.info("Creating index %s on space %s", index_name, space_name)
    box.space._index:insert(v)
end

for _, idx in pairs(require_rebuild) do
    rebuild_index(idx)
end
21 changes: 21 additions & 0 deletions doc/book/admin/_includes/upgrade_procedure.rst
@@ -0,0 +1,21 @@
1. Pick any replica in the replica set.
Collaborator

Should it be an RO instance? If yes, let's make this clear.


2. Upgrade this replica to the new Tarantool version. See details in
:ref:`Upgrading a Tarantool instance <admin-upgrades_instance>`.
Collaborator

Suggested change
:ref:`Upgrading a Tarantool instance <admin-upgrades_instance>`.
:ref:`Upgrading Tarantool on a standalone instance <admin-upgrades_instance>`.

Collaborator

Given that we send the user to the "Upgrading Tarantool on a standalone instance" procedure, where the instance is stopped and therefore has explicit downtime, it may make sense to explain why we call the replica set procedure "Upgrading ... with no downtime": the replica set keeps writing and serving data consistently on the remaining running replicas, and the replica that we stopped and upgraded is later reconnected to the replica set and catches up with the others. That is, if I understood the idea of this upgrade with no downtime correctly.

Maybe this is already obvious to the user, maybe not.


3. Make sure the replica is connected to the rest of the replica set:
Collaborator

As I wrote in the comment on "Upgrading Tarantool on a standalone instance", it is better to state explicitly that the single-instance upgrade procedure stops the instance and then starts it again.
This step checks the instance's connection to all replicas in the set, so the instance clearly must be running.


.. code-block:: tarantoolsession

   box.info.replication[id].upstream
   box.info.replication[id].downstream

The ``status`` field in both outputs should have the value ``follow``.

4. :ref:`Upgrade <admin-upgrades_instance>` all the replicas by repeating steps 1--3
Collaborator

all the replicas

RO replicas?

until only the master keeps running the old Tarantool version.

5. Make one of the updated replicas the new master.
Collaborator

Make one of the updated replicas the new master

How? What is the procedure?

If a user needs to set box.cfg{ read_only = false } on this instance, let's make it clear.
If yes:

  • Can it be done on a running instance or not? It's better to make this clear as well.
  • If we're setting this instance to be the new master (RW), what should we do with the former master and when? My point: if we set box.cfg{ read_only = false } on one of the RO instances to make it the master (RW), there will be a period of time when there are 2 RW instances in the replica set: this one (the new master) and the current master, until we make the latter RO.

Sure, we have an explanation of the read_only option in the config reference https://www.tarantool.io/en/doc/latest/reference/configuration/#cfg-basic-read-only
but it's better to explain the essential to-do point here and then refer to the reference for other details if necessary.

Check that it continues following and being followed by all other replicas.
Collaborator

How? What is the procedure?

If it's by checking

box.info.replication[id].upstream
box.info.replication[id].downstream

like in step #3 above, let's make it clear by referring to that step.


6. :ref:`Upgrade <admin-upgrades_instance>` the former master.
Collaborator

See the comment on step #5 above.
Let's make it clearer whether we need to make the former master RO, and if so, how and when.
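
The connectivity check from step 3 of the procedure above can be sketched as a small helper. This is only an illustration and not part of the PR: the function name ``peers_following`` and the sample table are hypothetical, and the sketch assumes a full-mesh replica set in which every peer entry of ``box.info.replication`` has both ``upstream`` and ``downstream``.

```lua
-- Hypothetical helper: checks a box.info.replication-shaped table.
-- `self_id` is the id of the instance the check runs on; its own entry
-- has no upstream/downstream, so it is skipped.
local function peers_following(replication, self_id)
    local problems = {}
    for id, peer in pairs(replication) do
        if id ~= self_id then
            local up = peer.upstream and peer.upstream.status or 'missing'
            local down = peer.downstream and peer.downstream.status or 'missing'
            if up ~= 'follow' or down ~= 'follow' then
                table.insert(problems, {id = id, upstream = up, downstream = down})
            end
        end
    end
    return #problems == 0, problems
end

-- Made-up sample data that mimics the shape of box.info.replication.
local sample = {
    [1] = {id = 1}, -- the local instance
    [2] = {id = 2, upstream = {status = 'follow'}, downstream = {status = 'follow'}},
    [3] = {id = 3, upstream = {status = 'disconnected'}, downstream = {status = 'follow'}},
}

local ok, problems = peers_following(sample, 1)
print(ok, #problems)
```

On a live instance the call would presumably be ``peers_following(box.info.replication, box.info.id)``.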

193 changes: 89 additions & 104 deletions doc/book/admin/upgrades.rst
@@ -1,53 +1,20 @@
.. _admin-upgrades:

================================================================================
Upgrades
================================================================================
========

For information about backwards compatibility,
see the :ref:`compatibility guarantees <compatibility_guarantees>` description.

.. _admin-upgrades_db:

--------------------------------------------------------------------------------
Upgrading a Tarantool database
--------------------------------------------------------------------------------

If you created a database with an older Tarantool version and have now installed
a newer version, make the request ``box.schema.upgrade()``. This updates
Tarantool system spaces to match the currently installed version of Tarantool.

For example, here is what happens when you run ``box.schema.upgrade()`` with a
database created with Tarantool version 1.6.4 to version 1.7.2 (only a small
part of the output is shown):

.. code-block:: tarantoolsession

tarantool> box.schema.upgrade()
alter index primary on _space set options to {"unique":true}, parts to [[0,"unsigned"]]
alter space _schema set options to {}
create view _vindex...
grant read access to 'public' role for _vindex view
set schema version to 1.7.0
---
...

.. _admin-upgrades_instance:

--------------------------------------------------------------------------------
Upgrading a Tarantool instance
--------------------------------------------------------------------------------

Tarantool is backward compatible between two adjacent versions. For example, you
should have no or little trouble when upgrading from Tarantool 1.6 to 1.7, or
from Tarantool 1.7 to 2.x. Meanwhile Tarantool 2.x may have incompatible changes
when migrating from Tarantool 1.6. to 2.x directly.

.. _admin-upgrades_instance_17_to_20:
Upgrading Tarantool on a standalone instance
--------------------------------------------

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
How to upgrade from Tarantool 1.7 to 2.x
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This procedure is for upgrading a standalone Tarantool instance in production.
Notice that this will **always imply a downtime**.
To upgrade **without downtime**, you need several Tarantool servers running in a
replication cluster (see :ref:`below <admin-upgrades_replication_cluster>`).

1. Stop the Tarantool server.

@@ -58,95 +25,113 @@ How to upgrade from Tarantool 1.7 to 2.x
3. Update the Tarantool server. See installation instructions at Tarantool
`download page <http://tarantool.org/download.html>`_.

4. Launch the updated Tarantool server using ``tarantoolctl`` or ``systemctl``.
After that, make sure to :ref:`finish the upgrade properly <admin-upgrades_db>`.
Collaborator

I suggest making this item explicitly number 4 in the numbered list again and giving the full text of the finalization procedure here, either via an include or written out as plain text.

Motivation:

In this short four-step instruction, we send the user elsewhere three times to follow other instructions. That is always inconvenient, and it is better to do as much as possible so that the user reads the instruction on a single page. Sending the user elsewhere for steps 2 and 3 is more or less fine, but I would spell out step 4 here explicitly.

Besides, the next section (Upgrading Tarantool in a replica set with no downtime) refers to this instruction for upgrading each instance. The result is an overly complex nesting doll of redirects between instructions. That is one more argument for keeping as much of the instruction as possible in this one section.

Another argument: in step 1 we stop the instance. Logically, we should start it at the end, but the current description does not make clear where. Maybe in step 3, maybe in step 4.
It is better to make this explicit so that the reader sees that we stopped the instance at the beginning and started it at the end. This matters especially because the standalone upgrade procedure is referenced from the steps of "Upgrading ... with no downtime", where we check the instance's connection to the other replicas. For that the instance must be running, so it is better to say so explicitly in the standalone upgrade procedure.

Contributor Author

Then we need to update the docs for box.schema.upgrade :)


.. _admin-upgrades_instance_16_to_20:
.. _admin-upgrades_replication_cluster:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
How to upgrade from Tarantool 1.6 to 2.x
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Upgrading Tarantool in a replica set with no downtime
-----------------------------------------------------

The procedure is fully analogous to
:ref:`upgrading from 1.7 to 2.x <admin-upgrades_instance_17_to_20>`.
Below are the general instructions for upgrading Tarantool in a replica set.
Upgrading from some versions can involve certain specifics. You can find
instructions for individual versions :ref:`in the list below <admin-upgrades_version_specifics>`.

.. _admin-upgrades_instance_16_to_17:
.. important::

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
How to upgrade from Tarantool 1.6 to 1.7
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The only way to upgrade Tarantool from version 1.6, 1.7, or 1.9 to 2.x **without downtime** is
taking an intermediate step by upgrading to 1.10 and then to 2.x.

This procedure is for upgrading a standalone Tarantool instance in production
from 1.6.x to 1.7.x. Notice that this will **always imply a downtime**.
To upgrade **without downtime**, you need several Tarantool servers running in a
replication cluster (see :ref:`below <admin-upgrades_replication_cluster>`).
Before upgrading Tarantool from 1.6 to 2.x, please read about the associated
:ref:`caveats <admin-upgrades-1.6-1.10>`.

Tarantool 1.7 has an incompatible :ref:`.snap <internals-snapshot>` and
:ref:`.xlog <internals-wal>` file format: 1.6 files are
supported during upgrade, but you won’t be able to return to 1.6 after running
under 1.7 for a while. It also renames a few configuration parameters, but old
parameters are supported. The full list of breaking changes is available in
`release notes for Tarantool 1.7 <https://github.com/tarantool/tarantool/releases>`_.
Preparations
~~~~~~~~~~~~

1. Check with application developers whether application files need to be
updated due to incompatible changes (see
`1.7 release notes <https://github.com/tarantool/tarantool/releases>`_).
If yes, back up the old application files.
#. Check the replica set health by running the following code on every instance:

2. Stop the Tarantool server.
.. code-block:: tarantoolsession

3. Make a copy of all data (see an appropriate hot backup procedure in
:ref:`Backups <admin-backups>`) and the package from which the current (old)
version was installed (for rollback purposes).
box.info.ro -- "false" at least on one instance
box.info.status -- should be "running"

If all instances have ``box.info.ro = true``, this means there are no writable nodes.
If you're running Tarantool :doc:`v. 2.10.0 </release/2.10.0>` or later,
you can find out the reason by running ``box.info.ro_reason``.
If it has the value ``orphan``, the instance doesn't see the rest of the replica set.
Similarly, if ``box.info.status`` has the value ``orphan``, the instance doesn't see the rest of the replica set.
First resolve the replication issues and only then continue.

4. Update the Tarantool server. See installation instructions at Tarantool
`download page <http://tarantool.org/download.html>`_.
If you're running Cartridge, you can also check node health in the UI.

5. Update the Tarantool database. Put the request ``box.schema.upgrade()``
inside a :doc:`box.once() </reference/reference_lua/box_once>` function in your Tarantool
:ref:`initialization file <index-init_label>`.
On startup, this will create new system spaces, update data type names (e.g.
num -> unsigned, str -> string) and options in Tarantool system spaces.
#. Make sure each replica is connected to the rest of the replica set.
Run ``box.info.replication`` and check the output table.
For each instance ``id``, there are ``upstream`` and ``downstream`` tables.
The ``status`` field in both of them should have the value ``follow``, except in the entry for the instance where you run this code.
This means that the replicas are connected and there are no errors in the data flow.

6. Update application files, if needed.
The value of the ``lag`` field can be less than or equal to ``box.cfg.replication_timeout``,
Collaborator

I would make this paragraph the third item in the numbered list, to make the preparation checkpoints clearer:

  1. ro/rw instances
  2. upstream/downstream
  3. replication lag

Contributor Author

I think that it is unnecessary to divide the output of a single command into two separate steps.
Instead, I'll elaborate on the table output.

but it can also be moderately larger.
For example, if ``box.cfg.replication_timeout`` is 1 second, it's generally OK to have a lag of about 10 seconds.
It is up to the user to decide what lag values are fine.

7. Launch the updated Tarantool server using ``tarantoolctl`` or ``systemctl``.
If the replica set is healthy, proceed to the upgrade.
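
The preparation checks above can be sketched as two small helpers. This is a minimal illustration under stated assumptions, not part of the documented procedure: the names ``replica_set_ready`` and ``lagging_peers`` and the ``tolerance`` parameter are hypothetical, and the per-instance values are assumed to be collected by hand from ``box.info.ro``, ``box.info.status``, and ``box.info.replication`` on each node.

```lua
-- `instances` maps an instance name to {ro = box.info.ro, status = box.info.status}
-- as observed on that instance. Returns true, or false plus a reason.
local function replica_set_ready(instances)
    local writable = 0
    for name, info in pairs(instances) do
        if info.status ~= 'running' then
            return false, name .. ' has status ' .. tostring(info.status)
        end
        if info.ro == false then
            writable = writable + 1
        end
    end
    if writable == 0 then
        return false, 'no writable instance'
    end
    return true
end

-- Returns the sorted ids of peers whose upstream lag exceeds
-- timeout * tolerance. `tolerance` is an assumed knob: the text above
-- treats a lag of about 10x the timeout as generally acceptable.
local function lagging_peers(replication, timeout, tolerance)
    local slow = {}
    for id, peer in pairs(replication) do
        local lag = peer.upstream and peer.upstream.lag
        if lag and lag > timeout * tolerance then
            table.insert(slow, id)
        end
    end
    table.sort(slow)
    return slow
end
```

On a live instance the second helper would presumably be called as ``lagging_peers(box.info.replication, box.cfg.replication_timeout, 10)``. Which lag is acceptable remains the operator's decision.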

.. _admin-upgrades_replication_cluster:
Upgrade procedure
~~~~~~~~~~~~~~~~~

.. include:: ./_includes/upgrade_procedure.rst

After upgrading the replica set, make sure to run ``box.schema.upgrade()`` on the new master
Collaborator

I would make this paragraph step #7, since it's part of the entire procedure.

as described below in the section ":ref:`Finishing the upgrade <admin-upgrades_db>`".
Collaborator

Suggested change
as described below in the section ":ref:`Finishing the upgrade <admin-upgrades_db>`".
as described below in the section ":ref:`Finalizing the upgrade <admin-upgrades_db>`".

There is no need to run ``box.schema.upgrade()`` on every node:
changes are propagated to other nodes via the regular replication mechanism.

--------------------------------------------------------------------------------
Upgrading Tarantool in a replication cluster
--------------------------------------------------------------------------------
Finally, run ``box.snapshot()`` on every node in the replica set
Collaborator

I would make this paragraph step #8, since it's part of the entire procedure.

to make sure that the replicas immediately see the upgraded database state in case of restart.

.. _admin-upgrades_db:

Tarantool 1.7 can work as a :ref:`replica <replication-architecture>`
for Tarantool 1.6 and vice versa. Replicas
perform capability negotiation on handshake, and new 1.7 replication features
are not used with 1.6 replicas. This allows upgrading clustered configurations.
Finalizing the upgrade
----------------------

This procedure allows for a rolling upgrade **without downtime** and works for
any cluster configuration: master-master or master-replica.
1. If you created a database with an older Tarantool version and have now installed
a newer version, make the request ``box.schema.upgrade()``. This updates
Tarantool system spaces to match the currently installed version of Tarantool.

1. Upgrade Tarantool at all replicas (or at any master in a master-master
cluster). See details in
:ref:`Upgrading a Tarantool instance <admin-upgrades_instance>`.
For example, here is what happens when you run ``box.schema.upgrade()`` with a
database created with Tarantool version 1.6.4 to version 1.7.2 (only a small
part of the output is shown):

2. Verify installation on the replicas:
.. code-block:: tarantoolsession

a. Start Tarantool.
tarantool> box.schema.upgrade()
alter index primary on _space set options to {"unique":true}, parts to [[0,"unsigned"]]
alter space _schema set options to {}
create view _vindex...
grant read access to 'public' role for _vindex view
set schema version to 1.7.0
---
...

You can also put the request ``box.schema.upgrade()``
inside a :doc:`box.once() </reference/reference_lua/box_once>` function in your Tarantool
:ref:`initialization file <index-init_label>`.
On startup, this will create new system spaces, update data type names (for example,
``num`` -> ``unsigned``, ``str`` -> ``string``) and options in Tarantool system spaces.

b. Attach to the master and start working as before.
2. Update your application files, if needed.

The master runs the old Tarantool version, which is always compatible with
the next major version.
3. Launch the updated Tarantool server using ``tarantoolctl``, ``tt``, or ``systemctl``.
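
The ``box.once()`` variant mentioned in step 1 could look like the initialization file below. This is a minimal sketch: the ``listen`` port and the key ``schema_upgrade_example`` are assumptions made for illustration, and the snippet only runs inside Tarantool.

```lua
-- Hypothetical initialization file.
box.cfg{
    listen = 3301, -- assumed port; use your own settings here
}

-- box.once() stores the key in the _schema space and skips the function
-- on later starts that use the same key.
box.once('schema_upgrade_example', function()
    box.schema.upgrade()
end)
```

Because the key is recorded, a later upgrade to another version would need a different key for the function to run again.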

3. Upgrade the master. The procedure is similar to upgrading a replica.

4. Verify master installation:
.. _admin-upgrades_version_specifics:

a. Start Tarantool with replica configuration to catch up.
Version specifics
-----------------

b. Switch to master mode.
.. toctree::
:maxdepth: 1

5. Upgrade the database on any master node in the cluster. Make the request
``box.schema.upgrade()``. This updates Tarantool system spaces to match the
currently installed version of Tarantool. Changes are propagated to other
nodes via the regular replication mechanism.
upgrades/1.6-1.10
upgrades/1.6-2.0-downtime
upgrades/2.10.1