======================================
Improving location awareness of Ganeti
======================================

This document describes an enhancement of Ganeti's instance
placement by taking into account that some nodes are vulnerable
to common failures.

.. contents:: :depth: 4

Current state and shortcomings
------------------------------

Currently, Ganeti considers all nodes in a single node group as
equal. However, this is not true for some setups. Nodes might share
common causes of failure or even be located in different places,
with spatial redundancy being a desired feature.

The similar problem for instances, i.e., that instances providing the
same external service should not be placed on the same nodes, is
solved by means of exclusion tags. However, there is no mechanism
for a good choice of node pairs for a single instance. Moreover,
while instances providing the same service run on different nodes,
they are not spread out location-wise.

Proposed changes
----------------

We propose to extend the cluster metric (as used, e.g., by ``hbal`` and
``hail``) to honor additional node tags indicating nodes that might have
a common cause of failure.

Failure tags
~~~~~~~~~~~~

As for exclusion tags, cluster tags will determine which tags are considered
to denote a source of common failure. More precisely, a cluster tag of the
form *htools:nlocation:x* will make node tags starting with *x:* indicate a
common cause of failure that redundant instances should avoid.
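As an illustration of this tag convention, the following sketch (a
hypothetical Python helper, not the actual htools code, which is written in
Haskell) derives the common-failure tags of a node from the cluster tags:

```python
# Sketch: derive a node's common-failure tags from cluster tags of the
# form "htools:nlocation:<prefix>". Helper names are illustrative only.

NLOCATION_PREFIX = "htools:nlocation:"

def failure_tags(cluster_tags, node_tags):
    """Return the node tags that denote a common cause of failure."""
    # Prefixes declared relevant by the cluster, e.g. "rack".
    prefixes = {t[len(NLOCATION_PREFIX):]
                for t in cluster_tags if t.startswith(NLOCATION_PREFIX)}
    # A node tag "rack:r13" matches the declared prefix "rack".
    return {t for t in node_tags
            if any(t.startswith(p + ":") for p in prefixes)}
```

For example, with the cluster tag ``htools:nlocation:rack``, a node tagged
``rack:r13`` and ``power:a`` has exactly one common-failure tag, ``rack:r13``.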

Metric changes
~~~~~~~~~~~~~~

The following components will be added to the cluster metric, weighed
appropriately.

- The number of pairs of an instance and a common-failure tag, where primary
  and secondary node both have this tag.

- The number of pairs of exclusion tags and common-failure tags where there
  exist at least two instances with the given exclusion tag with the primary
  node having the given common-failure tag.
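The two components above can be sketched as follows (a hypothetical Python
illustration with made-up data shapes, not the actual Haskell metric code):

```python
# Sketch of the two common-failure metric components. The instance and
# node representations here are illustrative assumptions.
from collections import Counter

def common_failure_components(instances, node_failure_tags):
    """instances: list of dicts with keys 'pnode', 'snode', 'excl_tags';
    node_failure_tags: node name -> set of common-failure tags."""
    # Component 1: (instance, failure tag) pairs where primary and
    # secondary node both carry the tag.
    shared = sum(len(node_failure_tags[i["pnode"]]
                     & node_failure_tags[i["snode"]])
                 for i in instances)
    # Component 2: (exclusion tag, failure tag) pairs for which at least
    # two instances with that exclusion tag have a primary node carrying
    # that failure tag.
    pair_count = Counter()
    for i in instances:
        for ex in i["excl_tags"]:
            for ft in node_failure_tags[i["pnode"]]:
                pair_count[(ex, ft)] += 1
    clustered = sum(1 for c in pair_count.values() if c >= 2)
    return shared, clustered
```

For instance, two ``svc:web`` instances whose primary nodes sit in the same
rack contribute to the second component, and an instance whose primary and
secondary node share a rack contributes to the first.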

The weights for these components might have to be tuned as experience with these
setups grows, but as a starting point, both components will have a weight of
1.0 each. In this way, any common-failure violations are less important than
any hard constraints missed (like instances on offline nodes) so that
the hard constraints will be restored first when balancing a cluster.
Nevertheless, with weight 1.0 the new common-failure components will
still be significantly more important than all the balancedness components
(cpu, disk, memory), as the latter are standard deviations of fractions.
It will also dominate the disk load component which, when only taking
static information into account, essentially amounts to counting disks. In
this way, Ganeti will be willing to sacrifice equal numbers of disks on every
node in order to fulfill location requirements.

Apart from changing the balancedness metric, common-failure tags will
not have any other effect. In particular, as opposed to exclusion tags,
no hard guarantees are made: ``hail`` will try to allocate an instance in
a common-failure avoiding way if possible, but will still allocate the
instance if not.

Additional migration restrictions
---------------------------------

Inequality between nodes can also restrict the set of possible instance
migrations. The most prominent example is a hypervisor upgrade, where
migrations from the new to the old hypervisor version are usually not
possible.

Migration tags
~~~~~~~~~~~~~~

As for exclusion tags, cluster tags will determine which tags are considered
to restrict migration. More precisely, a cluster tag of the form
*htools:migration:x* will make node tags starting with *x:* a
migration-relevant node property. Additionally, cluster tags of the form
*htools:allowmigration:y::z*, where *y* and *z* are migration tags not
containing *::*, specify a unidirectional migration possibility from *y* to *z*.


An instance migration will only be considered by ``htools`` if, for all
migration tags *y* present on the node migrated from, either the tag
is also present on the node migrated to or there is a cluster tag
*htools:allowmigration:y::z* and the target node is tagged *z* (or both).


For the simple hypervisor upgrade, where migration from old to new is possible,
but not the other way round, tagging all already upgraded nodes suffices.
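The migration rule above can be sketched as a predicate (a hypothetical
Python illustration; the actual check is part of the Haskell htools code):

```python
# Sketch: decide whether htools would consider a migration, given the
# migration tags on both nodes and the declared one-way allowances.

def migration_allowed(src_tags, dst_tags, allowed_pairs):
    """src_tags/dst_tags: migration tags on source and target node;
    allowed_pairs: set of (y, z) from htools:allowmigration:y::z tags."""
    # Every migration tag on the source must either also be present on
    # the target, or be explicitly allowed to migrate to some target tag.
    return all(y in dst_tags
               or any((y, z) in allowed_pairs for z in dst_tags)
               for y in src_tags)
```

In the hypervisor-upgrade example, tagging only the upgraded nodes (say,
with an illustrative tag ``hv:new``) allows migrations from untagged (old)
to tagged (new) nodes, but not the reverse, unless an *allowmigration*
cluster tag explicitly permits it.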

Advise only
~~~~~~~~~~~

These tags are of an advisory nature only. That is, all ``htools`` will
strictly obey the restrictions imposed by those tags, but Ganeti will not
prevent users from manually instructing other migrations.

Instance pinning
----------------

Sometimes, administrators want specific instances located in a particular,
typically geographic, location. To support this kind of request, instances
can be assigned tags of the form *htools:desiredlocation:x* where *x* is a
failure tag. Those tags indicate that the instance wants to be placed on a
node tagged *x*. To make ``htools`` honor those desires, the metric is
extended, appropriately weighted, by the following component.

- The number of dissatisfied desired locations over all cluster instances.
  A desired location of an instance is dissatisfied when the instance
  carries a desired-location tag *x* but its node is not tagged with the
  location tag *x*.

This metric extension allows specifying multiple desired locations for each
instance. These desired locations may also contradict each other;
contradictory desired locations mean that we do not care which of the
desired locations is satisfied.
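The component above can be sketched as follows (a hypothetical Python
illustration with assumed data shapes, not the actual metric code):

```python
# Sketch: count dissatisfied desired locations across all instances.
# An (instance, desired tag) pair counts once when the instance's node
# does not carry that location tag.

def dissatisfied_locations(instances, node_tags):
    """instances: list of dicts with 'node' and 'desired' (set of
    desired-location failure tags); node_tags: node -> set of tags."""
    return sum(1
               for i in instances
               for tag in i["desired"]
               if tag not in node_tags[i["node"]])
```

Note that an instance with two contradictory desired locations always
contributes at least one dissatisfied pair, but that contribution is the
same for every placement satisfying either location, so the metric still
treats such placements as equally good.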

Again, instance pinning is just a heuristic, not a hard requirement;
it will only be achieved by the cluster metric favouring such placements.