HROLLER(1) Ganeti | Version @GANETI_VERSION@
============================================

NAME
----

hroller \- Cluster rolling maintenance scheduler for Ganeti

SYNOPSIS
--------

**hroller** {backend options...} [algorithm options...] [reporting options...]

**hroller** \--version


Backend options:

{ **-m** *cluster* | **-L[** *path* **]** | **-t** *data-file* |
**-I** *path* }

**[ --force ]**

Algorithm options:

**[ -G *name* ]**
**[ -O *name...* ]**
**[ --node-tags** *tag,..* **]**
**[ --skip-non-redundant ]**

**[ --offline-maintenance ]**
**[ --ignore-non-redundant ]**
**[ --full-evacuation ]**

Reporting options:

**[ -v... | -q ]**
**[ -S *file* ]**
**[ --one-step-only ]**
**[ --print-moves ]**

DESCRIPTION
-----------

hroller is a cluster maintenance reboot scheduler. It can calculate
which sets of nodes can be rebooted at the same time without ever
rebooting both the primary and the secondary node of an instance
simultaneously.

For backends that support identifying the master node (currently
RAPI and LUXI), the master node is scheduled as the last node
in the last reboot group. Apart from this restriction, larger reboot
groups are put first.
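
For instance, querying a running cluster over LUXI might yield output
like the following (host names and grouping are purely illustrative);
the master node, node1.example.com here, comes last.
::

    $ hroller -L
    'Node Reboot Groups'
    node2.example.com,node4.example.com,node6.example.com
    node3.example.com,node5.example.com
    node1.example.com
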
ALGORITHM FOR CALCULATING OFFLINE REBOOT GROUPS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

hroller will view the nodes as vertices of an undirected graph,
with two kinds of edges. Firstly, there are edges from the primary
to the secondary node of every instance. Secondly, two nodes are
connected by an edge if they are the primary nodes of two instances
that have the same secondary node. It will then color the graph using
a few different heuristics, and return the minimum-size coloring
found. Nodes with the same color can then simultaneously migrate all
instances off to their respective secondary nodes, and it is safe to
reboot them simultaneously.
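
As a purely illustrative sketch, consider four nodes A, B, C and D,
where instance I1 has primary A and secondary B, instance I2 has
primary C and secondary B, and D hosts nothing.
::

    I1: A -> B        edges:  A -- B  (primary/secondary of I1)
    I2: C -> B                C -- B  (primary/secondary of I2)
                              A -- C  (A and C share the secondary B)

A minimum coloring is {A, D}, {B}, {C}, so A and D form the first
(largest) reboot group, after migrating I1 to B, while B and C each
form a reboot group of their own.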

OPTIONS
-------

For a description of the standard options check **htools**\(1) and
**hbal**\(1).

\--force
  Do not fail, even if the master node cannot be determined.

\--node-tags *tag,...*
  Restrict to nodes having at least one of the given tags.

\--full-evacuation
  Also plan moving secondaries out of the nodes to be rebooted. For
  each instance the move is at most a migrate (if it was primary
  on that node) followed by a replace secondary.

\--skip-non-redundant
  Restrict to nodes not hosting any non-redundant instance.

\--offline-maintenance
  Pretend that all instances are shut down before the reboots are
  carried out. That is, only edges from the primary to the secondary
  node of an instance are considered.

\--ignore-non-redundant
  Pretend that the non-redundant instances do not exist, and only take
  instances with primary and secondary node into account.

\--one-step-only
  Restrict to the first reboot group. Output the group, one node per line.

\--print-moves
  After each reboot group, list for each affected instance a node to
  which it can be evacuated. The moves are computed under the
  assumption that after each reboot group, all instances are moved
  back to their initial position.

BUGS
----

If instances are online, the tool should refuse to do an offline
rolling maintenance, unless explicitly requested.

End-to-end shelltests should be provided.

EXAMPLES
--------

Online Rolling reboots, using tags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Selecting by tags and getting output for one step only can be used for
planning the next maintenance step.
::

   $ hroller --node-tags needsreboot --one-step-only -L
   'First Reboot Group'
    node1.example.com
    node3.example.com

Typically these nodes would be drained and migrated.
::

   $ GROUP=`hroller --node-tags needsreboot --one-step-only --no-headers -L`
   $ for node in $GROUP; do gnt-node modify -D yes $node; done
   $ for node in $GROUP; do gnt-node migrate -f --submit $node; done

After maintenance, the tags would be removed and the nodes undrained.
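
With the node list from above, this might be done along the following
lines (a sketch only, reusing the same *needsreboot* tag).
::

   $ for node in $GROUP; do gnt-node remove-tags $node needsreboot; done
   $ for node in $GROUP; do gnt-node modify -D no $node; done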

Offline Rolling node reboot output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If all instances are shut down, usually larger node groups can be found.
::

    $ hroller --offline-maintenance -L
    'Node Reboot Groups'
    node1.example.com,node3.example.com,node5.example.com
    node8.example.com,node6.example.com,node2.example.com
    node7.example.com,node4.example.com

Rolling reboots with non-redundant instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, hroller plans capacity to move the non-redundant instances
out of the nodes to be rebooted. If requested, appropriate locations for
the non-redundant instances can be shown. The assumption is that instances
are moved back to their original node after each reboot; these back moves
are not part of the output.
::

    $ hroller --print-moves -L
    'Node Reboot Groups'
    node-01-002,node-01-003
      inst-20 node-01-001
      inst-21 node-01-000
      inst-30 node-01-005
      inst-31 node-01-004
    node-01-004,node-01-005
      inst-40 node-01-001
      inst-41 node-01-000
      inst-50 node-01-003
      inst-51 node-01-002
    node-01-001,node-01-000
      inst-00 node-01-002
      inst-01 node-01-003
      inst-10 node-01-005
      inst-11 node-01-004



.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: