Commit fb6d0a7d authored by Klaus Aehlig

Add design document on allocation changes for dedicated

Add a design document detailing how allocation is changed
on dedicated clusters in order to avoid not being able
to put a big instance on the cluster as all nodes are taken
up by small instances.
Signed-off-by: Klaus Aehlig <>
Reviewed-by: Petr Pudlak <>
parent c8b133cb
@@ -651,6 +651,7 @@ docinput = \
doc/design-cpu-pinning.rst \
doc/design-cpu-speed.rst \
doc/design-daemons.rst \
doc/design-dedicated-allocation.rst \
doc/design-device-uuid-name.rst \
doc/design-disk-conversion.rst \
doc/design-disks.rst \

=================================
Allocation for Partitioned Ganeti
=================================

.. contents:: :depth: 4

Current state and shortcomings
==============================

The introduction of :doc:`design-partitioned` made it possible to
dedicate resources, in particular storage, exclusively to
an instance. The advantage is that such instances have
guaranteed latency that is not affected by other
instances. Typically, those instances are created once
and never moved. Also, typically large chunks (full, half,
or quarter) of a node are handed out to individual
partitioned instances.

Ganeti's allocation strategy is to keep the cluster as
balanced as possible. In particular, as long as empty nodes
are available, new instances, regardless of their size,
will be placed there. Therefore, if a couple of small
instances are placed on the cluster first, it will no longer
be possible to place a big instance on the cluster despite
the total usage of the cluster being low.

Proposed changes
================

We propose to change the allocation strategy of hail for
node groups that have the ``exclusive_storage`` flag set,
as detailed below; nothing will be changed for non-exclusive
node groups. The new strategy will try to keep the cluster
as available for new instances as possible.

Dedicated Allocation Metric
---------------------------

The instance policy is a set of intervals in which the resources
of the instance have to be. Typical choices for dedicated clusters
have disjoint intervals with the same monotonicity in every dimension.
In this case, the order is obvious. In order to make it well-defined
in every case, we specify that we sort the intervals by the lower
bound of the disk size. This is motivated by the fact that disk is
the most critical aspect of partitioned Ganeti.
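
As an illustration of this ordering rule, the following sketch sorts a set of
policy intervals by the lower bound of their disk size, largest first. The
``Interval`` type, its field names, and the sample values are assumptions made
for this example only; they do not reflect Ganeti's actual data structures.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Hypothetical policy interval; only the disk bounds are modelled."""
    name: str
    min_disk: int  # lower bound of the disk size (arbitrary units)
    max_disk: int  # upper bound of the disk size (arbitrary units)


def order_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals by the lower bound of the disk size, largest first."""
    return sorted(intervals, key=lambda iv: iv.min_disk, reverse=True)


# Made-up sample policy: quarter-, half-, and full-node instances.
policy = [
    Interval("quarter", 200, 250),
    Interval("full", 900, 1000),
    Interval("half", 400, 500),
]
ordered_names = [iv.name for iv in order_intervals(policy)]
```

Sorting on a single key keeps the order well-defined even when the intervals
are not monotonic in the other dimensions.
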

For a node, the *allocation vector* lists, for each instance policy
interval in decreasing order, the number of instances minimally
compliant with that interval that can still be placed on that node.
For the drbd template, it is assumed that all newly placed instances
have new secondaries.
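
A minimal sketch of the allocation vector follows. It is deliberately
simplified: it assumes a single resource (disk), measures everything in
quarter-node units, and ignores the drbd secondary. The function name
``allocation_vector`` and the unit choice are assumptions for illustration,
not part of hail.

```python
from typing import Sequence, Tuple


def allocation_vector(free: int, sizes: Sequence[int]) -> Tuple[int, ...]:
    """For each size, largest first, count how many instances still fit."""
    return tuple(free // size for size in sorted(sizes, reverse=True))


# Assumed units: a full node is 4, a half node is 2, a quarter node is 1.
SIZES = [4, 2, 1]
empty_node = allocation_vector(4, SIZES)
quarter_filled_node = allocation_vector(3, SIZES)
```

Under these assumptions an empty node can still take one full, two half, or
four quarter instances, while a quarter-filled node can take no full, one
half, or three quarter instances.
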

The *lost-allocations vector* for an instance on a node is the
difference of the allocation vectors for that node before and
after placing that instance on that node. Lost-allocations vectors
are ordered lexicographically, i.e., the loss of an allocation of a
larger instance size dominates the loss of allocations of smaller
instance sizes.
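
The definition can be sketched as follows, again under the simplifying
assumptions of a single resource measured in quarter-node units. The helper
names are hypothetical. Python tuples compare lexicographically, so the
built-in ``<`` already gives the ordering described here.

```python
from typing import Sequence, Tuple


def allocation_vector(free: int, sizes: Sequence[int]) -> Tuple[int, ...]:
    """For each size, largest first, count how many instances still fit."""
    return tuple(free // size for size in sorted(sizes, reverse=True))


def lost_allocations(free: int, inst: int,
                     sizes: Sequence[int]) -> Tuple[int, ...]:
    """Allocation vector before placing the instance minus the one after."""
    before = allocation_vector(free, sizes)
    after = allocation_vector(free - inst, sizes)
    return tuple(b - a for b, a in zip(before, after))


# Assumed units: a full node is 4, a half node is 2, a quarter node is 1.
SIZES = [4, 2, 1]
lost_on_empty = lost_allocations(4, 1, SIZES)        # quarter on empty node
lost_on_half_filled = lost_allocations(2, 1, SIZES)  # quarter on half-filled
```

Losing the ability to host a full-node instance outweighs any number of lost
smaller allocations, which is why the empty node compares as the worse choice.
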

If allocating in a node group with ``exclusive_storage`` set
to true, hail will try to minimise the pair of the lost-allocations
vector and the remaining disk space on the node afterwards, ordered
lexicographically.

Consider the already mentioned scenario where only full, half, and quarter
nodes are given to instances. Here, for the placement of a
quarter-node--sized instance we would prefer a three-quarter-filled node (lost
allocations: 0, 0, 1 and no left-overs) over a quarter-filled node (lost
allocations: 0, 0, 1 and half a node left over)
over a half-filled node (lost allocations: 0, 1, 1) over an empty
node (lost allocations: 1, 1, 1). A half-node--sized instance, however,
would prefer a half-filled node (lost allocations: 0, 1, 2 and no left-overs)
over a quarter-filled node (lost allocations: 0, 1, 2 and a quarter node left
over) over an empty node (lost allocations: 1, 1, 2).
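
The preferences in this example can be reproduced with a short sketch. It
assumes a single resource measured in quarter-node units and ranks nodes by
the pair of lost-allocations vector and leftover space; ``rank_nodes`` and the
node names are hypothetical and only mirror the example, not hail's
implementation.

```python
from typing import Dict, List, Sequence, Tuple


def allocation_vector(free: int, sizes: Sequence[int]) -> Tuple[int, ...]:
    """For each size, largest first, count how many instances still fit."""
    return tuple(free // size for size in sorted(sizes, reverse=True))


def lost_allocations(free: int, inst: int,
                     sizes: Sequence[int]) -> Tuple[int, ...]:
    """Allocation vector before placing the instance minus the one after."""
    before = allocation_vector(free, sizes)
    after = allocation_vector(free - inst, sizes)
    return tuple(b - a for b, a in zip(before, after))


def rank_nodes(nodes: Dict[str, int], inst: int,
               sizes: Sequence[int]) -> List[str]:
    """Order the nodes that can host the instance, most preferred first."""
    def key(name: str):
        free = nodes[name]
        # Tuples compare lexicographically: lost allocations, then leftover.
        return (lost_allocations(free, inst, sizes), free - inst)
    return sorted((n for n in nodes if nodes[n] >= inst), key=key)


# Free space per node in quarter-node units (assumed for the example).
SIZES = [4, 2, 1]
NODES = {"empty": 4, "quarter-filled": 3,
         "half-filled": 2, "three-quarter-filled": 1}
quarter_ranking = rank_nodes(NODES, 1, SIZES)
half_ranking = rank_nodes(NODES, 2, SIZES)
```

The leftover space only acts as a tie-breaker: it separates the
three-quarter-filled node from the quarter-filled node for a quarter-sized
instance, where both lose the same allocations.
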

Note that the presence of additional policy intervals affects the preferences
of instances of other sizes as well. This is by design, as additional available
instance sizes make additional remaining node sizes attractive. If, in the
given example, we also allowed three-quarter-node--sized instances, it would
now be better for a quarter-node--sized instance to be placed on a
half-full node (lost allocations: 0, 0, 1, 1) than on a quarter-filled
node (lost allocations: 0, 1, 0, 1).
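
The effect of the extra interval can be checked with the same simplified
model (one resource, quarter-node units, hypothetical helper names). The
sketch computes the lost allocations of a quarter-sized instance on a
quarter-filled and on a half-filled node, once with three and once with four
allowed sizes.

```python
from typing import Sequence, Tuple


def lost_allocations(free: int, inst: int,
                     sizes: Sequence[int]) -> Tuple[int, ...]:
    """Per size, largest first: instances that fit before minus after."""
    ordered = sorted(sizes, reverse=True)
    return tuple(free // s - (free - inst) // s for s in ordered)


THREE_SIZES = [4, 2, 1]     # full, half, quarter
FOUR_SIZES = [4, 3, 2, 1]   # full, three-quarter, half, quarter

# Free space in quarter units: quarter-filled has 3, half-filled has 2.
three_quarter_filled = lost_allocations(3, 1, THREE_SIZES)
three_half_filled = lost_allocations(2, 1, THREE_SIZES)
four_quarter_filled = lost_allocations(3, 1, FOUR_SIZES)
four_half_filled = lost_allocations(2, 1, FOUR_SIZES)
```

With three sizes the quarter-filled node has the smaller vector; with four
sizes the comparison flips, because placing the instance there would use up
the only slot for a three-quarter-sized instance.
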
@@ -26,6 +26,7 @@ Design document drafts
.. vim: set textwidth=72 :
.. Local Variables: