Commits · 8fada090ba4e8917800c1d6fa334ab73085433a3 · itminedu / snf-ganeti

May 13, 2013

QA: factor out some instance management functions · 8fada090

Michele Tartara authored 12 years ago


Some instance management functions will be needed by upcoming unit tests,
so they are moved out of the instances QA file into a new utilities file
that other QA files can use as well.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Add inst-status-xen to the monitoring daemon · 8a049311

Michele Tartara authored 12 years ago


Enable the monitoring daemon to invoke the Xen instance status data collector.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Run the monitoring daemon as root · ebcbcfee

Michele Tartara authored 12 years ago


The monitoring daemon needs to be able to run some commands that require root
access (such as "xm") in order to fulfill its duties.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Export the Instance Status collector report · e8b46463

Michele Tartara authored 12 years ago


It will need to be accessed by the monitoring daemon.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Add instance status collector to mon-collector man page · 1f53be84

Michele Tartara authored 12 years ago


Add a section related to the new collector.

Also, fix some formatting issues (whitespace, lines longer than 80 chars)
in the DRBD collector section.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Add global status field to the instance status collector · 79731e21

Michele Tartara authored 12 years ago


The global status is computed from the statuses of the single instances.

The output json format is adapted to include this piece of information, as
prescribed by the design document.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
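
As a rough illustration of how a global status could be derived from the
per-instance statuses, the Python sketch below lets the most severe code win
and keeps every message. The numeric codes, their ordering, and the name
merge_statuses are assumptions made for this example only; the collector
itself is written in Haskell and follows the design document.

```python
from typing import Iterable, List, Tuple

# Hypothetical status codes for this sketch; a higher value is more severe.
OK, TEMPORARY, UNKNOWN, BAD = 0, 1, 2, 4

Status = Tuple[int, List[str]]  # (code, explanatory messages)


def merge_statuses(statuses: Iterable[Status]) -> Status:
    """Combine per-instance statuses into a single global status."""
    code = OK
    messages: List[str] = []
    for inst_code, inst_messages in statuses:
        code = max(code, inst_code)     # the worst code wins
        messages.extend(inst_messages)  # keep every explanation
    return code, messages


per_instance = [
    (OK, []),
    (BAD, ["inst2 is down"]),
    (TEMPORARY, ["inst3 is restarting"]),
]
print(merge_statuses(per_instance))
```

With no instances at all, this sketch reports OK with an empty message list.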

Factor out the mergeStatuses function · dd69cd3c

Michele Tartara authored 12 years ago


It will be used by multiple data collectors, not only the DRBD collector.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Monitoring design doc: better specify field names · 42b50796

Michele Tartara authored 12 years ago


The name of the list of instances was not specified.

Also, fix a line that was longer than 80 characters.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Use dcName in mon-collector · 6ab6b19a

Michele Tartara authored 12 years ago


Instead of manually specifying the names of the data collectors in
mon-collector, just use the dcName field each of them exports.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Factor out function for building report · 4fe04580

Michele Tartara authored 12 years ago


Instead of building the report as part of the "Main" function, have it
built by its own dedicated function, so that the report can be exported
directly to the monitoring daemon when needed.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Export Instance Status collector information · 7660aaf3

Michele Tartara authored 12 years ago


Name, version, format version, category and kind of the Instance Status data
collector are now exported.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Include the reason trail in the instance collector output · 17ae9cdb

Michele Tartara authored 12 years ago


Fetch the reason trail from file, failing gracefully if it is not found, and
include it in the output of the instance status data collector.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
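
The "failing gracefully" behavior can be sketched in Python as follows. The
function name read_reason_trail and the assumption that the file holds a JSON
list are hypothetical and chosen for illustration; the real collector is
implemented in Haskell.

```python
import json
from typing import Any, List


def read_reason_trail(path: str) -> List[Any]:
    """Return the reason trail stored at `path`, or [] if it is unavailable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError):
        # Missing or unreadable file, or content that is not valid JSON:
        # report an empty trail instead of aborting the whole collector.
        return []
```

Returning an empty trail keeps the rest of the report usable even when no
reason has been recorded for an instance yet.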

Determine status of one instance · fc4be2bf

Michele Tartara authored 12 years ago


Add a function that determines whether the status of an instance is ok and
represents this information in the corresponding field of the report.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Export the actual instance state · d4de2ea8

Michele Tartara authored 12 years ago


Compute the actual state of the instance and export it.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Add the core of the instance status collector · d7e9323b

Michele Tartara authored 12 years ago


Add the Xen instance status data collector with only its core features.
The next commits will add more reporting functionality.

The collector is accessible through the mon-collector tool.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Add module containing function for getting info from Xen · 45ee8676

Michele Tartara authored 12 years ago

The Xen instance status data collector will need to get some information
from the hypervisor. This commit introduces a module providing such functions.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

Add HS functions for getting the instance reason path · 74b25887

Michele Tartara authored 12 years ago


The getInstReasonFilename function is built to resemble the corresponding
Python function.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Add dependency on the process library · 74685117

Michele Tartara authored 12 years ago


The tests are already using this library, so it's not really a new build
dependency, but it was not specified explicitly.

Furthermore, it's going to be used by the instance status collector, so it's
added to the requirements for the monitoring subsystem.

Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add example for online rolling reboots using tags · b24e516d

Klaus Aehlig authored 12 years ago


While this use case was described in the design document, and
mentioned several times as motivation for changes in commit messages,
it has never been added to the user-facing documentation. This commit
adds at least an example to the man page.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

May 10, 2013

Extend hroller test to also verify tag-based node selection · 7b2d4001

Klaus Aehlig authored 12 years ago


While the multiple-tags test was added to verify that coloring is done
only after node selection (otherwise it wouldn't be possible to get in
both cases a single reboot group), it can easily be extended to also
verify that the correct nodes are selected by --node-tags.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add a test for online rolling reboot scheduling · 2a1737eb

Klaus Aehlig authored 12 years ago


In the example configuration, the graph constructed by just connecting
the primary and secondary nodes of each instance is two-colorable.
However, when taking conflicting locations of secondary nodes into
account, three reboot groups are needed. Moreover, these reboot groups
are not subordinated to any two-coloring of the first-mentioned graph.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

Support online-maintenance in hroller · 8d38fb72

Klaus Aehlig authored 12 years ago


Make hroller take into account the nodes that (redundant) instances
will be migrated to. This behavior can be overridden by the
--offline-maintenance option, which makes hroller plan under
the assumption that all instances will be shut down before the
rolling reboots start.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>

Support construction of the graph of all reboot constraints · 30fded87

Klaus Aehlig authored 12 years ago


For online rolling reboots, there are two kinds of restrictions. First,
we cannot reboot the primary and secondary nodes of an instance
together. Secondly, two nodes cannot be rebooted simultaneously, if
they are the primary nodes of two instances with the same secondary
node. The second condition requires knowledge of all nodes, not only
those the graph is to be constructed on.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
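
The two restrictions can be sketched in Python as below. The function name
reboot_constraints and the representation of an instance as a (primary,
secondary) pair are assumptions for this example; it also assumes that every
instance has distinct primary and secondary nodes. Note that the second rule
looks at all instances, including those whose secondary node is outside the
selected nodes.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Instance = Tuple[str, str]  # (primary node, secondary node)


def reboot_constraints(nodes: Iterable[str],
                       instances: Iterable[Instance]) -> Set[FrozenSet[str]]:
    """Return the pairs of selected nodes that must not reboot together."""
    selected = set(nodes)
    edges: Set[FrozenSet[str]] = set()
    primaries_of: Dict[str, Set[str]] = {}
    for primary, secondary in instances:
        # Rule 1: primary and secondary node of the same instance.
        if primary in selected and secondary in selected:
            edges.add(frozenset((primary, secondary)))
        # Remember the primaries of every secondary, selected or not.
        primaries_of.setdefault(secondary, set()).add(primary)
    # Rule 2: primary nodes of instances that share a secondary node.
    for primaries in primaries_of.values():
        for first, second in combinations(sorted(primaries & selected), 2):
            edges.add(frozenset((first, second)))
    return edges


# n3 is not selected, yet it still forces n1 and n2 into different groups.
print(reboot_constraints(["n1", "n2"], [("n1", "n3"), ("n2", "n3")]))
```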

Add option --one-step-only to hroller · 2207220d

Klaus Aehlig authored 12 years ago

Add a new option to hroller to only output information about the first
reboot group. Together with the option --node-tags this allows for the
following workflow. First tag all nodes; then repeatedly compute the
first node group, handle these nodes and remove the tags. In between
these steps, other operations can be carried out on the cluster.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
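
The loop described above can be simulated in a few lines of Python. This is
only a sketch: the greedy selection in first_reboot_group is a stand-in for
hroller's actual planning, and all function and node names are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set


def first_reboot_group(tagged: Set[str],
                       conflicts: Set[FrozenSet[str]]) -> List[str]:
    """Greedily pick tagged nodes that do not conflict with each other."""
    group: List[str] = []
    for node in sorted(tagged):
        if all(frozenset((node, other)) not in conflicts for other in group):
            group.append(node)
    return group


def rolling_maintenance(nodes: Iterable[str],
                        conflicts: Set[FrozenSet[str]]) -> List[List[str]]:
    """Tag all nodes, then handle one reboot group per round."""
    tagged = set(nodes)  # step 1: tag every node
    rounds: List[List[str]] = []
    while tagged:
        # Plays the role of --node-tags combined with --one-step-only.
        group = first_reboot_group(tagged, conflicts)
        rounds.append(group)   # handle (e.g. reboot) these nodes
        tagged -= set(group)   # then remove their tags
        # Other cluster operations could be carried out here.
    return rounds


pairs = {frozenset(("n1", "n2")), frozenset(("n3", "n4"))}
print(rolling_maintenance(["n1", "n2", "n3", "n4"], pairs))
```

Because only the first group is computed in each round, the plan is always
based on the cluster state at that moment rather than on a stale schedule.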

Sort reboot groups by size · a39779f6

Klaus Aehlig authored 12 years ago


Make hroller output the node groups not containing the master node
sorted by size, largest group first. The master node still remains
the last node of the last reboot group. In this way, most progress
is made when switching back to normal cluster operations after the
first reboot group.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
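
A minimal Python sketch of this ordering, assuming reboot groups are plain
lists of node names and that the master appears in at most one group (the
function name order_reboot_groups is hypothetical):

```python
from typing import List, Sequence


def order_reboot_groups(groups: Sequence[Sequence[str]],
                        master: str) -> List[List[str]]:
    """Largest master-free groups first; the master is rebooted last of all."""
    others = [list(group) for group in groups if master not in group]
    others.sort(key=len, reverse=True)  # stable: ties keep their order
    ordered = others
    for group in groups:
        if master in group:
            # Move the master to the end of its group, and the group to the end.
            ordered.append([node for node in group if node != master] + [master])
    return ordered


print(order_reboot_groups([["n1"], ["m", "n2"], ["n3", "n4", "n5"]], "m"))
```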

Fix expectation in hroller test · 361f2719

Klaus Aehlig authored 12 years ago


Regular expressions are not shell globs. So "any symbol" is expressed
by a dot, not a question mark. In this case, the confusion led to an
overly liberal expectation, hence the test passed. Fix it nevertheless.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
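
The difference is easy to reproduce in Python with the standard fnmatch and
re modules. The patterns below are made up for illustration and are not the
ones used in the hroller test.

```python
import fnmatch
import re

# As a shell glob, "?" stands for exactly one arbitrary character.
glob_match = fnmatch.fnmatchcase("node1", "node?")

# As a regular expression, "?" makes the preceding character optional, so
# "node?" already matches the "nod" in this text: the pattern is too liberal.
liberal_match = re.search(r"node?", "nod-without-e") is not None

# In a regular expression, "any one character" is written as a dot.
strict_miss = re.search(r"node.", "nod-without-e") is not None
strict_match = re.search(r"node.", "node1") is not None

print(glob_match, liberal_match, strict_miss, strict_match)
```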

May 09, 2013

Refactor check for exclusive_storage in LUInstanceCreate · 5a9c7c34

Bernardo Dal Seno authored 12 years ago


The order of evaluation of the conditions is changed, so it's easier to add
more (foreseen) checks for exclusive_storage.

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

Refactor disk checks in LUInstanceSetParams · 8064c1af

Bernardo Dal Seno authored 12 years ago


Prereq checks related to disks are grouped together and moved into a
separate method. This reduces the clutter in CheckPrereq().

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

May 07, 2013

Fix lint errors (redundant bracket) · 004398d0

Klaus Aehlig authored 12 years ago


Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add a test demonstrating the --node-tags option of hroller · 62441832

Klaus Aehlig authored 12 years ago


The example is a cluster of 6 nodes, paired into 3 groups by three
instances. So the whole cluster would need two reboot groups. The two
tags select, in two different ways, one node of each group. So, when
restricting to one tag, a single reboot group suffices, but no
coloring of the whole cluster would achieve this.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
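
The situation can be reproduced with a small Python sketch. The node names,
the two tag sets, and the greedy coloring in reboot_groups are assumptions
made for illustration; they are not taken from the test data or from
hroller's implementation.

```python
from typing import Dict, Iterable, List, Set, Tuple


def reboot_groups(nodes: Iterable[str],
                  instances: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Greedily color the conflict graph restricted to `nodes`."""
    selected = set(nodes)
    neighbours: Dict[str, Set[str]] = {node: set() for node in selected}
    for primary, secondary in instances:
        # Instances touching a node outside the selection add no edge.
        if primary in selected and secondary in selected:
            neighbours[primary].add(secondary)
            neighbours[secondary].add(primary)
    groups: List[List[str]] = []
    for node in sorted(selected):
        for group in groups:
            if neighbours[node].isdisjoint(group):
                group.append(node)
                break
        else:
            groups.append([node])
    return groups


instances = [("n1", "n2"), ("n3", "n4"), ("n5", "n6")]
all_nodes = ["n1", "n2", "n3", "n4", "n5", "n6"]
tag_a = ["n1", "n3", "n5"]  # one node of each pair
tag_b = ["n1", "n4", "n6"]  # a different choice of one node per pair

print(reboot_groups(all_nodes, instances))  # whole cluster
print(reboot_groups(tag_a, instances))      # restricted to the first tag
print(reboot_groups(tag_b, instances))      # restricted to the second tag
```

Each tag on its own yields a single reboot group, while the whole cluster
needs two. Since n3 and n4 can never share a group, no single coloring of the
whole cluster contains both tag sets as groups.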

Add option to hroller to select nodes based on tags · 313fdabc

Klaus Aehlig authored 12 years ago


Add option --node-tags to tell hroller to consider only nodes
with these tags. A use case would be a tag tracking on which
nodes the maintenance has not yet been carried out, e.g., if
rolling reboots are interleaved with other cluster operations.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Make Rapi backend set node tags correctly · 267bc1f4

Klaus Aehlig authored 12 years ago


Since the htools representation of a node now allows adding
the node tags, populate this field correctly in the Rapi
backend.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Make LUXI backend set node tags correctly · f33c06b8

Klaus Aehlig authored 12 years ago


Since the htools representation of a node now allows adding
the node tags, populate this field correctly in the LUXI
backend.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Extend the text format to contain node tags · 4b542ebc

Klaus Aehlig authored 12 years ago


In order to allow htools to make use of node tags, add them to the
text format. This is done by adding a new column at the end of the
node lines. If this column is missing, the default value (which
is the empty list) is left unchanged, thus yielding the current
behavior.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
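
The backward-compatible handling of the extra column can be sketched in
Python as follows. The simplified layout used here (three fixed columns
separated by "|", followed by an optional comma-separated tag column) is a
hypothetical stand-in for the real node line, as is the name parse_node_line.

```python
from typing import List, Tuple


def parse_node_line(line: str,
                    fixed_columns: int = 3) -> Tuple[List[str], List[str]]:
    """Split a node line into its fixed fields and an optional tag list."""
    fields = line.rstrip("\n").split("|")
    tags: List[str] = []  # default when the tag column is missing
    if len(fields) > fixed_columns:
        tags = [tag for tag in fields[fixed_columns].split(",") if tag]
    return fields[:fixed_columns], tags


print(parse_node_line("node1|4096|1024"))
print(parse_node_line("node1|4096|1024|rack1,needs-reboot"))
```

Files written before the column existed therefore still parse and simply
yield nodes without tags.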

Extend the Node in the htools to allow adding node tags · 07ea9bf5

Klaus Aehlig authored 12 years ago


Since hroller (and probably other tools in the future) will support
node selection based on node tags, extend the node data structure to
allow adding this information.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Make hroller filter the nodes before coloring the graph · 442d5aae

Klaus Aehlig authored 12 years ago


Hroller used to first compute a coloring of the node graph and then
filter out the nodes that it had to work on. As long as the only
filtering was by node group, this did not make a difference, as there
shouldn't be any instance with primary and secondary node in different
node groups. With more elaborate filtering, however, reducing the graph
first can lead to better reboot groups.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Make mkNodeGraph ignore edges to non-present nodes · 318c0a6c

Klaus Aehlig authored 12 years ago


Change the behavior of mkNodeGraph to tacitly ignore all instances
where one of the nodes is not in the list of nodes. In this way, we
can construct sub-graphs by filtering the nodes and ignoring any
possibly added isolated nodes for the missing indexes.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Add tests for the -O option of hroller · e6e2d4a5

Klaus Aehlig authored 12 years ago


In hroller, the option -O can be used to mark certain nodes as offline.
These nodes should then not be part of any reboot group. Add tests
to verify this behavior.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Update hroller man page · 52278ef9

Klaus Aehlig authored 12 years ago


In commit 7dbe4c72 the new option --force was introduced to
hroller. Change the man page to reflect this change.

Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Mention DRBD 8.4 support in NEWS · d4b6d97b

Thomas Thrainer authored 12 years ago


Mention the main features of DRBD 8.4 support in the NEWS file.

Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
