Commit b1eb71c7 authored by Dato Simó

design-autorepair.rst: clarify tag precedence and conflict

This commit clarifies one particular point of the auto-repair workflow:
what to do when multiple, conflicting administrator-set tags exist in an
object; and how tags at different levels (cluster, node group and instance)
interact.

For conflict within an object, we choose to always let the most restrictive
tag win (i.e. the least destructive repair, and the longest suspension
time). For tags at different levels, we follow a simple "nearest tag wins"
rule.

Signed-off-by: Dato Simó <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
parent e79f576c
@@ -81,6 +81,14 @@ error condition that requires a more risky or drastic solution, but
never vice versa (if a worse solution is allowed then so is a better
one).
If there are multiple ``ganeti:watcher:autorepair:<type>`` tags in an
object (cluster, node group or instance), the least destructive tag
takes precedence. When multiplicity happens across objects, the nearest
tag wins. For example, if in a cluster with two instances, *I1* and
*I2*, *I1* has ``failover``, and the cluster itself has both
``fix-storage`` and ``reinstall``, *I1* will end up with ``failover``
and *I2* with ``fix-storage``.
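
The two rules above can be sketched in a few lines of Python. This is only an illustration, not code from Ganeti: the function names are hypothetical, and the ordering of repair types (including ``migrate``, which this hunk does not mention) is an assumption about how "least destructive" is ranked.

```python
PREFIX = "ganeti:watcher:autorepair:"

# Assumed ranking, from least to most destructive. The text above only
# confirms that fix-storage is less destructive than reinstall.
REPAIR_ORDER = ["fix-storage", "migrate", "failover", "reinstall"]


def least_destructive(tags):
    """Return the least destructive repair type among one object's tags.

    Tags that are not plain repair-type tags (e.g. suspend or pending
    tags) are ignored. Returns None if the object has no such tag.
    """
    types = [t[len(PREFIX):] for t in tags if t.startswith(PREFIX)]
    types = [t for t in types if t in REPAIR_ORDER]
    if not types:
        return None
    return min(types, key=REPAIR_ORDER.index)


def effective_repair(instance_tags, group_tags, cluster_tags):
    """Apply "nearest tag wins": instance, then node group, then cluster."""
    for tags in (instance_tags, group_tags, cluster_tags):
        chosen = least_destructive(tags)
        if chosen is not None:
            return chosen
    return None


# The example from the text: I1 carries failover, the cluster carries
# both fix-storage and reinstall, and I2 carries nothing.
cluster = [PREFIX + "fix-storage", PREFIX + "reinstall"]
i1 = [PREFIX + "failover"]
i2 = []
print(effective_repair(i1, [], cluster))
print(effective_repair(i2, [], cluster))
```

Resolving conflicts inside each object first, and only then walking from the instance outwards, keeps the two rules independent of each other.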
ganeti:watcher:autorepair:suspend[:<timestamp>]
+++++++++++++++++++++++++++++++++++++++++++++++
@@ -102,6 +110,17 @@ It might also be useful to easily have an operation that tags all
instances matching a filter on some characteristic. But again, this
wouldn't be specific to this tag.
If there are multiple
``ganeti:watcher:autorepair:suspend[:<timestamp>]`` tags in an object,
the form without timestamp takes precedence (permanent suspension); or,
if all object tags have a timestamp, the one with the highest timestamp.
When multiplicity happens across objects, the nearest tag wins, as
above. This makes it possible to suspend cluster-enabled repairs with a
single tag in the cluster object; or to suspend them only for a certain
node group or instance. At the same time, it is possible to re-enable
cluster-suspended repairs in a particular instance or group by applying
an enable tag to them.
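
As a rough sketch of the suspend rules just described (again hypothetical, not Ganeti code), the snippet below assumes that ``<timestamp>`` is an integer such as Unix epoch seconds, and represents a permanent suspension as ``math.inf``. It deliberately leaves out the interaction with enable tags at a nearer level.

```python
import math

SUSPEND = "ganeti:watcher:autorepair:suspend"


def resolve_suspend(tags):
    """Resolve the suspend tags of a single object.

    Returns None if there is no suspend tag, math.inf if a tag without
    timestamp is present (permanent suspension), and otherwise the
    highest timestamp found. Integer timestamps are an assumption.
    """
    until = None
    for tag in tags:
        if tag == SUSPEND:
            return math.inf  # the form without timestamp always wins
        if tag.startswith(SUSPEND + ":"):
            ts = int(tag[len(SUSPEND) + 1:])
            until = ts if until is None else max(until, ts)
    return until


def effective_suspend(instance_tags, group_tags, cluster_tags):
    """Apply "nearest tag wins": instance, then node group, then cluster."""
    for tags in (instance_tags, group_tags, cluster_tags):
        until = resolve_suspend(tags)
        if until is not None:
            return until
    return None
```

For instance, an instance tagged ``suspend:100`` inside a cluster tagged with a bare ``suspend`` resolves to ``100``, because the instance-level tag is nearer than the cluster-level one.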
ganeti:watcher:autorepair:pending:<type>:<id>:<timestamp>:<jobs>
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++