Upgrade notes
=============

.. highlight:: shell-example

This document details the steps needed to upgrade a cluster to newer versions
of Ganeti.

As a general rule the node daemons need to be restarted after each software
upgrade; if using the provided example init.d script, this means running the
following command on all nodes::

    $ /etc/init.d/ganeti restart

2.11 and above
--------------

Starting with 2.10, Ganeti supports versions installed in parallel and
automated upgrades. The default configuration for 2.11 and higher is to
install as a parallel version without changing the running version. If both
versions, the installed one and the one to upgrade to, are 2.10 or higher, the
actual switch of the live version can be carried out by running the following
command on the master node::

   $ gnt-cluster upgrade --to 2.11

This will carry out the steps described below in the section on upgrades from
2.1 and above. Downgrades to the previous minor version can be done in the same
way, specifying the lower version in the ``--to`` argument.
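
The version requirement above can be checked in a script before calling
``gnt-cluster upgrade``. The following is a minimal sketch and not part of
Ganeti: the function names ``version_at_least`` and ``can_auto_upgrade`` are
hypothetical, and both versions are passed in by hand as ``MAJOR.MINOR``
strings rather than being detected automatically.

.. code-block:: sh

   # Hypothetical helper: succeeds if version $1 is at least version $2.
   # Only the MAJOR and MINOR components are compared.
   version_at_least() {
       have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
       want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
       [ "$have_major" -gt "$want_major" ] ||
           { [ "$have_major" -eq "$want_major" ] &&
             [ "$have_minor" -ge "$want_minor" ]; }
   }

   # Hypothetical helper: $1 is the installed version, $2 the target version.
   # Both must be 2.10 or higher for the automated switch.
   can_auto_upgrade() {
       version_at_least "$1" 2.10 && version_at_least "$2" 2.10
   }

With these definitions, ``can_auto_upgrade 2.10 2.11 && gnt-cluster upgrade
--to 2.11`` would only attempt the switch when both versions qualify.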

Note that ``gnt-cluster upgrade`` only manages the actual switch between
versions as described below on upgrades from 2.1 and above. It does not install
or remove any binaries. Having the new binaries installed is a prerequisite of
calling ``gnt-cluster upgrade`` (and the command will abort if the prerequisite
is not met). The old binaries can be used to downgrade back to the previous
version; once the system administrator decides that going back to the old
version is not needed any more, they can be removed. Addition and removal of
the Ganeti binaries should happen in the same way as for all other binaries on
your system.

2.13
----

When upgrading to 2.13, first apply the instructions of ``2.11 and
above``. 2.13 comes with the new feature of enhanced SSH security
through individual SSH keys. This feature needs to be enabled
after the upgrade by running::

   $ gnt-cluster renew-crypto --new-ssh-keys --no-ssh-key-check

Note that new SSH keys are generated automatically without warning when
upgrading with ``gnt-cluster upgrade``.

If you instructed Ganeti to not touch the SSH setup (by using the
``--no-ssh-init`` option of ``gnt-cluster init``), the changes in the
handling of SSH keys will not affect your cluster.

If you want to be prompted for each newly created SSH key, leave out
the ``--no-ssh-key-check`` option in the command listed above.

Note that after a downgrade from 2.13 to 2.12, the individual SSH keys
will not get removed automatically. This can lead to reachability
errors under very specific circumstances (Issue 1008). If you plan
to keep 2.12 for a while and not upgrade to 2.13 again soon, we recommend
replacing the SSH key pairs of all non-master nodes with the master node's
SSH key pair.

2.12
----

Due to issue #1094 in Ganeti 2.11 and 2.12 up to version 2.12.4, we
advise rerunning ``gnt-cluster renew-crypto --new-node-certificates``
after an upgrade to 2.12.5 or higher.


2.11
----

When upgrading to 2.11, first apply the instructions of ``2.11 and
above``. 2.11 comes with the new feature of enhanced RPC security
through client certificates. This feature needs to be enabled after the
upgrade by running::

   $ gnt-cluster renew-crypto --new-node-certificates

Note that new node certificates are generated automatically without
warning when upgrading with ``gnt-cluster upgrade``.


2.1 and above
-------------

Starting with Ganeti 2.0, upgrades between revisions (e.g. 2.1.0 to 2.1.1)
should not need manual intervention. As a safety measure, minor releases (e.g.
2.1.3 to 2.2.0) require the ``cfgupgrade`` command for changing the
configuration version. Below you find the steps necessary to upgrade between
minor releases.

To run commands on all nodes, the `distributed shell (dsh)
<http://www.netfort.gr.jp/~dancer/software/dsh.html.en>`_ can be used, e.g.
``dsh -M -F 8 -f /var/lib/ganeti/ssconf_online_nodes gnt-cluster --version``.
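
Where ``dsh`` is not available, a plain ``ssh`` loop over the same node list
is one possible substitute. The following is a minimal sketch, assuming
password-less SSH from the master node to all nodes. The function name
``run_on_all_nodes`` and the ``REMOTE_SHELL`` override are hypothetical and
not part of Ganeti. Unlike ``dsh``, it runs the command on one node after the
other and does not prefix the output with the machine name.

.. code-block:: sh

   # Hypothetical helper: run a command on every node listed in a file
   # (one host name per line), one node after the other.
   # REMOTE_SHELL is an assumed override for dry runs, e.g. REMOTE_SHELL=echo.
   run_on_all_nodes() {
       node_file=$1
       shift
       while read -r node || [ -n "$node" ]; do
           [ -n "$node" ] || continue
           # Detach stdin so the remote shell cannot consume the node list.
           ${REMOTE_SHELL:-ssh} "$node" "$@" < /dev/null
       done < "$node_file"
   }

A call equivalent to the ``dsh`` example would be ``run_on_all_nodes
/var/lib/ganeti/ssconf_online_nodes gnt-cluster --version``.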

#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Backup old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

   (``/var/lib/ganeti`` can also contain exported instances, so make sure to
   back up only the files you are interested in. Use ``--exclude export``,
   for example.)

#. Install new Ganeti version on all nodes
#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose

   (``cfgupgrade`` supports a number of parameters, run it with
   ``--help`` for more information)

#. Upgrade the directory permissions on all nodes::

    $ /usr/lib/ganeti/ensure-dirs --full-run

#. Create the (missing) required users and make users part of the required
   groups on all nodes::

    $ /usr/lib/ganeti/tools/users-setup

   This will ask for confirmation. To execute directly, add the ``--yes-do-it``
   option.

#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. If you use file storage, check that ``/etc/ganeti/file-storage-paths``
   is correct on all nodes. For security reasons it's not copied
   automatically, but it can be copied manually via::

   $ gnt-cluster copyfile /etc/ganeti/file-storage-paths

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
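
The master-node backup and ``cfgupgrade`` steps above lend themselves to a
small wrapper. The sketch below is an illustration under stated assumptions,
not a Ganeti tool: ``backup_ganeti_config`` and ``run_cfgupgrade`` are
hypothetical names, the ``CFGUPGRADE`` override is assumed for dry runs, and
``--exclude export`` follows the hint given in the backup step.

.. code-block:: sh

   # Hypothetical helper: archive LIBDIR/ganeti (default /var/lib/ganeti)
   # next to it, skipping anything named "export", and print the archive
   # name. LIBDIR should be an absolute path.
   backup_ganeti_config() {
       libdir=${1:-/var/lib}
       archive=$libdir/ganeti-$(date +%FT%T).tar.gz
       tar czf "$archive" -C "$libdir" --exclude export ganeti &&
           echo "$archive"
   }

   # Hypothetical helper: apply cfgupgrade only if the dry run succeeds.
   # Extra arguments (for example --downgrade) are passed to both runs.
   run_cfgupgrade() {
       tool=${CFGUPGRADE:-/usr/lib/ganeti/tools/cfgupgrade}
       "$tool" --verbose --dry-run "$@" && "$tool" --verbose "$@"
   }

Running ``backup_ganeti_config && run_cfgupgrade`` on the master node, after
the daemons have been stopped and the new version installed, would then cover
both steps and stop at the first failure.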

Reverting an upgrade
~~~~~~~~~~~~~~~~~~~~

For going back between revisions (e.g. 2.1.1 to 2.1.0) no manual
intervention is required, as for upgrades.

Starting from version 2.8, ``cfgupgrade`` supports the ``--downgrade``
option to bring the configuration back to the previous stable version.
This is useful if you upgrade Ganeti and after some time you run into
problems with the new version. You can downgrade the configuration
without losing the changes made since the upgrade. Any feature not
supported by the old version will be removed from the configuration, of
course, but you get a warning about it. If there is a new feature whose
default value you have not changed, you don't have to worry about it, as
it will get the same value when you upgrade again.

Automatic downgrades
....................

From version 2.11 onwards, downgrades can be done by using the
``gnt-cluster upgrade`` command::

   $ gnt-cluster upgrade --to 2.10

Manual downgrades
.................

The procedure is similar to upgrading, but please note that you have to
revert the configuration **before** installing the old version.

#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Backup old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade

   You may want to copy all the messages about features that have been
   removed during the downgrade, in case you want to restore them when
   upgrading again.

#. Install the old Ganeti version on all nodes

   NB: in Ganeti 2.8, the ``cmdlib.py`` file was split into a series of files
   contained in the ``cmdlib`` directory. If Ganeti is installed from sources
   and not from a package, while downgrading Ganeti to a pre-2.8
   version it is important to remember to remove the ``cmdlib`` directory
   from the directory containing the Ganeti python files (which usually is
   ``${PREFIX}/lib/python${VERSION}/dist-packages/ganeti``).
   A simpler upgrade/downgrade procedure will be made available in future
   versions of Ganeti.

#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify

Specific tasks for 2.11 to 2.10 downgrade
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

After running ``cfgupgrade``, the ``client.pem`` and
``ssconf_master_candidates_certs`` files need to be removed
from Ganeti's data directory on all nodes. While this step is
not necessary for 2.10 to run cleanly, leaving them will cause
problems when upgrading again after the downgrade.
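
On each node, the cleanup could look like the sketch below. It assumes the
data directory is :file:`/var/lib/ganeti`; adjust the path if Ganeti was
installed with a different prefix. The function name
``remove_client_cert_files`` is hypothetical.

.. code-block:: sh

   # Hypothetical helper: remove the two files named above from the data
   # directory given as $1 (assumed default: /var/lib/ganeti).
   # Run this on every node after cfgupgrade has been run on the master.
   remove_client_cert_files() {
       datadir=${1:-/var/lib/ganeti}
       rm -f "$datadir/client.pem" "$datadir/ssconf_master_candidates_certs"
   }

Because ``rm -f`` is used, the function also succeeds on nodes where the
files are already absent.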

2.0 releases
------------

2.0.3 to 2.0.4
~~~~~~~~~~~~~~

No changes needed except restarting the daemon; but rollback to 2.0.3 might
require configuration editing.

If you're using Xen-HVM instances, please double-check the network
configuration (``nic_type`` parameter) as the defaults might have changed:
2.0.4 adds any missing configuration items and depending on the version of the
software the cluster has been installed with, some new keys might have been
added.

2.0.1 to 2.0.2/2.0.3
~~~~~~~~~~~~~~~~~~~~

Between 2.0.1 and 2.0.2 there have been some changes in the handling of block
devices, which can cause some issues. 2.0.3 was then released which adds two
new options/commands to fix this issue.

If you use DRBD-type instances and see problems in instance start or
activate-disks with messages from DRBD about "lower device too small" or
similar, it is recommended to:

#. Run ``gnt-instance activate-disks --ignore-size $instance`` for each
   of the affected instances
#. Then run ``gnt-cluster repair-disk-sizes`` which will check that
   instances have the correct disk sizes
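
The two steps above can be combined into one loop. This is a minimal sketch:
the function name ``fix_drbd_disk_sizes`` and the ``GNT_INSTANCE`` and
``GNT_CLUSTER`` overrides are hypothetical, and the affected instance names
have to be supplied by hand.

.. code-block:: sh

   # Hypothetical helper: re-activate the disks of each affected instance
   # while ignoring the recorded size, then let Ganeti re-check the sizes.
   # GNT_INSTANCE and GNT_CLUSTER are assumed overrides for dry runs.
   fix_drbd_disk_sizes() {
       for instance in "$@"; do
           ${GNT_INSTANCE:-gnt-instance} activate-disks --ignore-size \
               "$instance" || return 1
       done
       ${GNT_CLUSTER:-gnt-cluster} repair-disk-sizes
   }

For example, ``fix_drbd_disk_sizes instance1 instance2`` stops at the first
instance whose disks cannot be activated and only runs
``repair-disk-sizes`` once all of them succeeded.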

1.2 to 2.0
----------

Prerequisites:

- Ganeti 1.2.7 is currently installed
- All instances have been migrated from DRBD 0.7 to DRBD 8.x (i.e. no
  ``remote_raid1`` disk template)
- Upgrade to Ganeti 2.0.0~rc2 or later (~rc1 and earlier don't have the needed
  upgrade tool)

In the below steps, replace :file:`/var/lib` with ``$libdir`` if Ganeti was not
installed with this prefix (e.g. :file:`/usr/local/var`). Same for
:file:`/usr/lib`.

Execution (all steps are required in the order given):

#. Make a backup of the current configuration, for safety::

    $ cp -a /var/lib/ganeti /var/lib/ganeti-1.2.backup

#. Stop all instances::

    $ gnt-instance stop --all

#. Make sure no DRBD devices are in use; the following command should show no
   active minors::

    $ gnt-cluster command grep cs: /proc/drbd | grep -v cs:Unconf

#. Stop the node daemons and rapi daemon on all nodes (note: you should be
   logged in via the master node name, not the cluster name, as the command
   below will remove the cluster IP from the master node)::

    $ gnt-cluster command /etc/init.d/ganeti stop

#. Install the new software on all nodes, either from packaging (if available)
   or from sources; the master daemon will not start but give error messages
   about wrong configuration file, which is normal
#. Upgrade the configuration file::

    $ /usr/lib/ganeti/tools/cfgupgrade12 -v --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade12 -v

#. Make sure ``ganeti-noded`` is running on all nodes (and start it if
   not)
#. Start the master daemon::

    $ ganeti-masterd

#. Check that a simple node-list works::

    $ gnt-node list

#. Redistribute updated configuration to all nodes::

    $ gnt-cluster redist-conf
    $ gnt-cluster copyfile /var/lib/ganeti/known_hosts

#. Optional: if needed, install RAPI-specific certificates under
   :file:`/var/lib/ganeti/rapi.pem` and run::

    $ gnt-cluster copyfile /var/lib/ganeti/rapi.pem

#. Run a cluster verify; this should show no problems::

    $ gnt-cluster verify

#. Remove some obsolete files::

    $ gnt-cluster command rm /var/lib/ganeti/ssconf_node_pass
    $ gnt-cluster command rm /var/lib/ganeti/ssconf_hypervisor

#. Update the xen pvm (if this was a pvm cluster) setting for 1.2
   compatibility::

    $ gnt-cluster modify -H xen-pvm:root_path=/dev/sda

#. Depending on your setup, you might also want to reset the initrd parameter::

    $ gnt-cluster modify -H xen-pvm:initrd_path=/boot/initrd-2.6-xenU

#. Reset the instance autobalance setting to default::

    $ for i in $(gnt-instance list -o name --no-headers); do \
       gnt-instance modify -B auto_balance=default $i; \
      done

#. Optional: start the RAPI daemon::

    $ ganeti-rapi

#. Restart instances::

    $ gnt-instance start --force-multiple --all

At this point, ``gnt-cluster verify`` should show no errors and the migration
is complete.

1.2 releases
------------

1.2.4 to any other higher 1.2 version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

No changes needed. Rollback will usually require manual edit of the
configuration file.

1.2.3 to 1.2.4
~~~~~~~~~~~~~~

No changes needed. Note that going back from 1.2.4 to 1.2.3 will require manual
edit of the configuration file (since we added some HVM-related new
attributes).

1.2.2 to 1.2.3
~~~~~~~~~~~~~~

No changes needed. Note that the drbd7-to-8 upgrade tool does a disk format
change for the DRBD metadata, so in theory this might be **risky**. It is
advised to have (good) backups before doing the upgrade.

1.2.1 to 1.2.2
~~~~~~~~~~~~~~

No changes needed.

1.2.0 to 1.2.1
~~~~~~~~~~~~~~

No changes needed. Only some bugfixes and new additions that don't affect
existing clusters.

1.2.0 beta 3 to 1.2.0
~~~~~~~~~~~~~~~~~~~~~

No changes needed.

1.2.0 beta 2 to beta 3
~~~~~~~~~~~~~~~~~~~~~~

No changes needed. A new version of the debian-etch-instance OS (0.3) has been
released, but upgrading it is not required.

1.2.0 beta 1 to beta 2
~~~~~~~~~~~~~~~~~~~~~~

Beta 2 switched the config file format to JSON. Steps to upgrade:

#. Stop the daemons (``/etc/init.d/ganeti stop``) on all nodes
#. Disable the cron job (default is :file:`/etc/cron.d/ganeti`)
#. Install the new version
#. Make a backup copy of the config file
#. Upgrade the config file using the following command::

    $ /usr/share/ganeti/cfgupgrade --verbose /var/lib/ganeti/config.data

#. Start the daemons and run ``gnt-cluster info``, ``gnt-node list`` and
   ``gnt-instance list`` to check if the upgrade process finished successfully

The OS definition also needs to be upgraded. There is a new version of the
debian-etch-instance OS (0.2) that goes along with beta 2.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: