From 25ee7fd8456f96954a8015d02db5c2d7e2836ccc Mon Sep 17 00:00:00 2001
From: Michael Hanselmann <hansmi@google.com>
Date: Mon, 6 Jun 2011 17:10:43 +0200
Subject: [PATCH] Update iallocator design for node group-aware operations

A while ago a new ``multi-relocate`` mode was proposed and documented.
As it turned out, the interface had some deficiencies. With this patch,
the relocation modes are reduced to two and split into separate
iallocator request modes: ``node-evacuate`` and ``change-group``. Some
request and response requirements are also clarified in the
documentation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
---
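
Reviewer note: the request arguments and the result requirement
documented in this patch can be sketched in Python as below. This is a
minimal illustration, not Ganeti code. The helper names
(``make_node_evacuate_request``, ``make_change_group_request``,
``result_covers_request``) are hypothetical, and treating the two
instance lists as flat lists of names is an assumed simplification of
the result format.

```python
# Hypothetical helpers, not part of Ganeti. Key names ("type",
# "instances", "evac_mode", "target_groups") follow the documentation
# touched by this patch.
EVAC_MODES = ("primary-only", "secondary-only", "all")


def make_node_evacuate_request(instances, evac_mode):
    """Build the request arguments for a node-evacuate call."""
    if evac_mode not in EVAC_MODES:
        raise ValueError("unknown evac_mode: %r" % (evac_mode,))
    return {
        "type": "node-evacuate",
        "instances": list(instances),
        "evac_mode": evac_mode,
    }


def make_change_group_request(instances, target_groups=()):
    """Build the request arguments for a change-group call.

    An empty target_groups list means "any group but the original one".
    """
    return {
        "type": "change-group",
        "instances": list(instances),
        "target_groups": list(target_groups),
    }


def result_covers_request(requested, moved, failed):
    """Check that moved and failed together equal the requested set.

    Assumes both result lists have been reduced to plain instance names.
    """
    return set(moved) | set(failed) == set(requested)
```

A caller could reject an allocator response for which
``result_covers_request`` returns ``False``, since the union of the two
instance lists must equal the set of instances in the request.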
 doc/design-multi-reloc.rst | 62 ++++++++++++++++++++++----------------
 doc/iallocator.rst         | 45 ++++++++++++++++-----------
 2 files changed, 63 insertions(+), 44 deletions(-)

diff --git a/doc/design-multi-reloc.rst b/doc/design-multi-reloc.rst
index f4b581c9e..4f8efe091 100644
--- a/doc/design-multi-reloc.rst
+++ b/doc/design-multi-reloc.rst
@@ -23,38 +23,39 @@ groups so that, for example, it is possible to move a set of instances
 to another group for policy reasons, or completely empty a given group
 to perform maintenance operations.
 
-To implement this, we propose a new ``multi-relocate`` IAllocator call
-that will be able to compute inter-group instance moves, taking into
-account mobility domains as appropriate. The interface proposed below
-should be enough to cover the use cases mentioned above.
+To implement this, we propose the addition of new IAllocator calls to
+compute inter-group instance moves and group-aware node evacuation,
+taking into account mobility domains as appropriate. The interface
+proposed below should be enough to cover the use cases mentioned above.
+
+With the implementation of this design proposal, the previous
+``multi-evacuate`` mode will be deprecated.
 
 .. _multi-reloc-detailed-design:
 
 Detailed design
 ===============
 
-We introduce a new ``multi-relocate`` IAllocator call whose input will
-be a list of instances to move, and a "mode of operation" that will
-determine what groups will be candidates to receive the new instances.
-
-The mode of operation will be one of:
+All requests honor the groups' ``alloc_policy`` attribute.
 
-- *Stay in group*: the instances will be moved off their current nodes,
-  but will stay in the same group; this is what the ``relocate`` call
-  does, but here it can act on multiple instances. (Typically, the
-  source nodes will be marked as drained, to avoid just exchanging
-  instances among them.)
+Changing instances' groups
+--------------------------
 
-- *Change group*: this mode accepts one extra parameter, a list of node
-  group UUIDs; the instances will be moved away from their current
-  group, to any of the groups in this list. If the list is empty, the
-  request is, simply, "change group": the instances are placed in any
-  group but their original one.
+Takes a list of instances and a list of node group UUIDs; the instances
+will be moved away from their current group, to any of the groups in the
+target list. All instances need to have their primary node in the same
+group, which must not be a target group. If the target group list is
+empty, the request is simply "change group" and the instances are placed
+in any group but their original one.
 
-- *Any*: for each instance, any group is valid, including its current
-  one.
+Node evacuation
+---------------
 
-In all modes, the groups' ``alloc_policy`` attribute will be honored.
+Evacuates instances off their primary nodes. The evacuation mode
+can be given as ``primary-only``, ``secondary-only`` or
+``all``. The call is given a list of instances whose primary nodes need
+to be in the same node group. The returned nodes need to be in the same
+group as the original primary node.
 
 .. _multi-reloc-result:
 
@@ -66,8 +67,17 @@ of **replace secondary**, **migration** and **failover** operations
 (when shared storage is used, they will all be failover or migration
 operations within the corresponding mobility domain).
 
-The result is expected to be a list of jobsets. Each jobset contains
-lists of serialized opcodes. Example::
+The result of the operations described above must contain two lists of
+instances and a list of jobsets.
+
+The two lists of instances describe which instances could be
+moved/migrated and which couldn't for some reason ("unsuccessful"). The
+union of the two lists must be equal to the set of instances given in
+the original request.
+
+The list of jobsets contained in the result describes how to actually
+execute the operation. Each jobset contains lists of serialized opcodes.
+Example::
 
   [
     [
@@ -101,8 +111,8 @@ Accepted opcodes:
 Starting with the first set, Ganeti will submit all jobs of a set at the
 same time, enabling execution in parallel. Upon completion of all jobs
 in a set, the process is repeated for the next one. Ganeti is at liberty
-to abort the execution of the relocation after any jobset. In such a
-case the user is notified and can restart the relocation.
+to abort the execution after any jobset. In such a case the user is
+notified and can restart the operation.
 
 .. vim: set textwidth=72 :
 .. Local Variables:
diff --git a/doc/iallocator.rst b/doc/iallocator.rst
index 6ecd62696..e8c371604 100644
--- a/doc/iallocator.rst
+++ b/doc/iallocator.rst
@@ -1,7 +1,7 @@
 Ganeti automatic instance allocation
 ====================================
 
-Documents Ganeti version 2.1
+Documents Ganeti version 2.4
 
 .. contents::
 
@@ -193,11 +193,17 @@ In all cases, it includes:
     ``multi-relocate`` or ``multi-evacuate``. The ``allocate`` request
     is used when a new instance needs to be placed on the cluster. The
     ``relocate`` request is used when an existing instance needs to be
-    moved within its node group, while the ``multi-relocate`` one is
-    able to relocate multiple instances across multiple node groups. The
-    ``multi-evacuate`` protocol requests that the script computes the
-    optimal relocate solution for all secondary instances of the given
-    nodes.
+    moved within its node group.
+
+    The ``multi-evacuate`` protocol used to request that the script
+    computes the optimal relocate solution for all secondary instances
+    of the given nodes. It is now deprecated and should no longer be
+    used.
+
+    The ``change-group`` request is used to relocate multiple instances
+    across multiple node groups. ``node-evacuate`` evacuates instances
+    off their node(s). These are described in a separate :ref:`design
+    document <multi-reloc-detailed-design>`.
 
 For both allocate and relocate mode, the following extra keys are needed
 in the ``request`` dictionary:
@@ -276,23 +282,26 @@ Relocation:
      Ganeti 2.0, this list will always contain a single node, the
      current secondary of the instance); type *list of strings*
 
-As for ``multi-relocate``, it needs the three following request
-arguments:
+As for ``node-evacuate``, it needs the following request arguments:
 
   instances
-    a list of instance names to relocate; type *list of strings*
+    a list of instance names to evacuate; type *list of strings*
+
+  evac_mode
+    specifies which instances to evacuate; one of ``primary-only``,
+    ``secondary-only`` or ``all``; type *string*
 
-  reloc_mode
-    a string indicating the relocation mode; there are three possible
-    values for this string: *keep_group*, *change_group*, and
-    *any_group*, the semantics or which are explained in :ref:`the
-    design document <multi-reloc-detailed-design>`
+
+``change-group`` needs the following request arguments:
+
+  instances
+    a list of instance names whose group to change; type
+    *list of strings*
 
   target_groups
-    this argument is only accepted when ``reloc_mode``, as explained
-    above, is *change_group*; if present, it must either be the empty
-    list, or contain a list of group UUIDs that should be considered for
-    relocating instances to; type *list of strings*
+    must either be the empty list, or contain a list of group UUIDs that
+    should be considered for relocating instances to; type
+    *list of strings*
 
 Finally, in the case of multi-evacuate, there's one single request
 argument (in addition to ``type``):
-- 
GitLab