Commits · ebacb943ea465a23261b9da780b99a6d66739e35 · itminedu / snf-ganeti

Apr 09, 2010

Make watcher request the max coverage · ebacb943

Iustin Pop authored 15 years ago


Since the actions are potentially destructive, we should try to get a
consistent view of the cluster, so it's better to get the most coverage
possible.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


ConfdClient.SendRequest: allow max coverage · cc6484c4

Iustin Pop authored 15 years ago


This patch changes the coverage parameter to allow specification of max
coverage (via -1), versus auto-computation (default, 0) and manual
specification.

Unittests are updated for this case too.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
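
A minimal sketch of how the three coverage modes described above could be resolved into a number of peers to query. The function name, the `auto_limit` cap and the auto-computation rule are assumptions made for illustration; they are not taken from the ConfdClient code.

```python
def resolve_coverage(coverage, peer_count, auto_limit=6):
    """Return how many peers a request should be sent to.

    coverage == -1: maximum coverage (all known peers)
    coverage ==  0: auto-computed (hypothetical rule: half the peers plus
                    one, capped at auto_limit)
    coverage  >  0: manual value, capped at the number of known peers
    """
    if coverage == -1:
        return peer_count
    if coverage == 0:
        # Assumed policy, for illustration only.
        return min(peer_count // 2 + 1, auto_limit, peer_count)
    if coverage < -1:
        raise ValueError("invalid coverage value: %r" % (coverage,))
    return min(coverage, peer_count)
```

With ten known peers, `resolve_coverage(-1, 10)` selects all ten, while an explicit `resolve_coverage(3, 10)` selects three.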


Apr 08, 2010

Document the watcher node maintenance feature · 6328fea3

Iustin Pop authored 15 years ago


The patch significantly changes the watcher man page, as it was very
simplistic.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Watcher: automatic shutdown of orphan resources · 50273051

Iustin Pop authored 15 years ago


This patch changes the watcher so that it maintains (on all nodes) the
list of instances and DRBD devices by shutting down ones that confd
daemons indicate should not be running on this node.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Export the maintain_node_health option in ssconf · 5c465a95

Iustin Pop authored 15 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Add a new cluster parameter maintain_node_health · 3953242f

Iustin Pop authored 15 years ago


This will be used to conditionally enable the watcher node maintenance
feature.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Add a new confd callback (StoreResultCallback) · aa2efc52

Iustin Pop authored 15 years ago


This new callback simply stores (without calling any lower-level
callback) the last result; coupled with the filtering callback, this
ensures that it has the 'best' response after all have been received.

The result can then be retrieved via the GetResponse method.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
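
A minimal sketch of a store-only callback as described above. Only the `StoreResultCallback` and `GetResponse` names come from the commit message; the shape of the object passed to the callback is an assumption.

```python
class StoreResultCallback:
    """Keep the most recent result instead of forwarding it anywhere."""

    def __init__(self):
        self._result = None

    def __call__(self, upcall):
        # No lower-level callback is invoked; we only remember the value.
        # When chained behind a filtering callback, the last value seen
        # is the best one that passed the filter.
        self._result = upcall

    def GetResponse(self):
        """Return the last stored result, or None if nothing arrived."""
        return self._result
```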


ConfdClient: add synchronous wait for replies mode · bfbbc223

Iustin Pop authored 15 years ago


Currently, there is no way for a user of the confd client library to
know how many replies there should be, whether all have been received,
etc. This is bad since we can't reliably detect the consistency of the
results.

This patch attempts to fix this by adding a synchronous WaitForReply
function that will wait until either a timeout expires, or until a
minimum number of replies have been received (interested users should
add similar functionality for the async case). The callback
functionality will still do call-backs into the user-provided code
during the wait, but after this function has returned, we know that we
received all possible replies.

Note: To account for the interval between initial send of the request,
and calling of this function, we modify the expiration time of the
request.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
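
The waiting logic described above can be sketched as a loop bounded by both a deadline and a minimum reply count. This is an illustration only: `wait_for_replies` and the `receive_one` callable (assumed to block for at most the given number of seconds and return True when a reply was processed) are hypothetical and not part of the confd client library.

```python
import time


def wait_for_replies(receive_one, min_replies, timeout, clock=time.monotonic):
    """Wait until min_replies have arrived or the timeout has expired.

    Returns a (enough_replies, received_count) tuple.
    """
    deadline = clock() + timeout
    received = 0
    while received < min_replies:
        remaining = deadline - clock()
        if remaining <= 0:
            break  # timed out before collecting enough replies
        # Callbacks into user code would run inside receive_one().
        if receive_one(remaining):
            received += 1
    return received >= min_replies, received
```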


ConfdClient: unify some internal variables · 71e114da

Iustin Pop authored 15 years ago


Currently the requests are tracked in _request and in _expire_requests.
This is convenient, but it restricts the ability to extend the request
tracking, e.g. via packet stats and/or extension of expiration time.

This patch introduces a new simple class _Request that holds all
properties of pending requests; it then uses instances of this class as
values in _request instead of tuples, and removes the _expire_requests.

The only drawback is the change in behaviour of _ExpireRequests:
previously, it used to scan the list only up to the first non-expired
request, after which it aborted. Now it will scan the entire dict, which
(depending on workload) could change the time behaviour. I don't think
this is a problem, as:
- deleting from the head of a list is very expensive (list.pop(0);
  list.append() is an order of magnitude more expensive than deleting
  an element from a dictionary and re-adding it)
- we should not have more than tens or hundreds of pending requests; in case
  this assumption changes, we could introduce a no-more-often-than-X
  expiration policy, etc.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Apr 07, 2010

Fix consistency checks in ConfdFilterCallback · 39292d3a

Iustin Pop authored 15 years ago


Commit 49b3fdac added consistency checks, but these are wrongly triggered
for old responses - we need to make sure to check that we have the same
serial.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>


Fix utils.WaitForFdCondition inner retry loop · 1b429e2a

Iustin Pop authored 15 years ago


Commit dfdc4060 added WaitForFdCondition which uses utils.Retry without
handling timeout exceptions. This breaks any nested retry loops.

This patch fixes the above function, and also changes utils.Retry to
detect and warn about similar cases in the future. In addition, we add a
few small unittests for utils.Retry.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>


Fix bug introduced in : mkdir mode · cc2f004d

Michael Hanselmann authored 15 years ago


After commit 76e5f8b5, mkdir_mode in utils.RenameFile is
no longer passed to Makedirs. This is fixed by this patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


utils: Move wrapper code around os.makedirs into separate function · 76e5f8b5

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Fix unittest for the rapi client library · 2004e673

Iustin Pop authored 15 years ago


The unittest used a wrong escape, so we make sure to use proper escapes
(we want the backslashes to be embedded, not interpreted). Also change "
to ' to make the strings easier to read.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: David Knowles <dknowles@google.com>


Apr 06, 2010

Adding RAPI client library. · 95ab4de9

David Knowles authored 15 years ago


Signed-off-by: David Knowles <dknowles@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
(modified slightly the unittest to account for
 missing httplib2 library)


Extend ConfdFilterCallback with consistency checks · 49b3fdac

Iustin Pop authored 15 years ago


Note that users of the callback will have to manually check the
attribute.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Abstract the confd client creation · 5b349fd1

Iustin Pop authored 15 years ago


Most creation of confd clients will do the same steps: read MC file,
parse it, read HMAC key, etc. We abstract this functionality so that
we don't duplicate the code.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Mar 31, 2010

Remove unused import from test file · e065714c

Guido Trotter authored 15 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


kvm_flag hypervisor parameter · 7ba594c0

Guido Trotter authored 15 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Move the runas user at execution time · cef34868

Guido Trotter authored 15 years ago


Everything still works the same way, but the user is calculated each
time we start kvm, rather than stored in the config file. This makes it
easier to implement the "pool" security model.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Mar 30, 2010

Send "501 Not Implemented" back when method not found · 33664046

René Nussbaumer authored 15 years ago


Previously the response was "400 Bad Request", which did not reflect
what actually went wrong.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
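
A small sketch of the status choice described above: when a resource handler has no implementation for the requested method, the reply is 501 rather than 400. The `Handler` class and `dispatch` function are hypothetical and do not mirror the RAPI server code.

```python
class Handler:
    """Hypothetical resource that only implements GET."""

    def GET(self):
        return "ok"


def dispatch(handler, method):
    """Return a (status_code, body) tuple for the given HTTP method name."""
    fn = getattr(handler, method.upper(), None)
    if fn is None:
        # The request itself is well-formed; the method is simply not
        # supported here, so 501 is more accurate than 400.
        return 501, "Not Implemented"
    return 200, fn()
```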


Mar 26, 2010

Adding QA RAPI tests for activate-disks and deactivate-disks calls · e6ce18ac

René Nussbaumer authored 15 years ago


* This also adds support for authenticated RAPI calls
* It also adds support for HTTP methods other than GET/POST

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Mar 25, 2010

SerializableConfigParser: Make Loads class indep · b39bf4bb

Guido Trotter authored 15 years ago


Currently SerializableConfigParser.Loads is a static method that returns
a SerializableConfigParser. With this patch we change it to a class
method that returns a member of the class. This way a subclass calling
Loads on itself will get its own member, rather than a bare
SerializableConfigParser.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
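
The difference between the static and the class method can be sketched as follows. This example uses the Python 3 `configparser` module as a stand-in and a hypothetical `MyParser` subclass; it is not the Ganeti implementation.

```python
import configparser


class SerializableConfigParser(configparser.ConfigParser):
    @classmethod
    def Loads(cls, data):
        """Build a parser of the calling class from a string."""
        # cls is whatever class Loads was called on, so subclasses get
        # an instance of themselves rather than of the base class.
        parser = cls()
        parser.read_string(data)
        return parser


class MyParser(SerializableConfigParser):
    """Hypothetical subclass used only to show the effect."""


parsed = MyParser.Loads("[section]\nkey = value\n")
```

Here `parsed` is a `MyParser` instance; with a static method that hard-coded the base class, it would have been a plain `SerializableConfigParser`.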


Mar 23, 2010

Unbreak command line job submission · 71834b2a

Guido Trotter authored 15 years ago


A change introduced in 5299e61f modified the contents of
JobExecutor.jobs, missing a place where this tuple was deconstructed.
This creates a traceback in gnt-* <any> --submit, fixed by this patch.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Allow file storage to be grown · 2c42c5df

Guido Trotter authored 15 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Write grow support for file storage · 91e2d9ec

Guido Trotter authored 15 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Watcher: fix some doc typos · 55c85950

Iustin Pop authored 15 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>


Watcher: do not warn for missing hooks dir · 10e689d4

Iustin Pop authored 15 years ago


If the hooks dir does not exist, do not warn needlessly. This is similar
to commit a9b7e346 (for backend.py).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>


Extend the hypervisor API with name-only shutdown · bbcf7ad0

Iustin Pop authored 15 years ago


Currently the ShutdownInstance method of the hypervisors takes a full
instance object. However, when doing instance shutdowns from the node
only, we don't have a full object, just the name.

To handle this use case, we add a new ‘name’ argument to the method,
which makes the shutdown not use/rely on the ‘instance’ argument. The
KVM and fake hypervisors need a little bit of work, otherwise the change
is straightforward.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>


Distribute list of enabled hypervisors in ssconf · 4f7a6a10

Iustin Pop authored 15 years ago


This can be used by nodes to know which hypervisors they are supposed to
support.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>


ganeti-confd: Call pyinotify flags correctly · 675bf1b7

Guido Trotter authored 15 years ago


The "apparently pylint was right" commit.

Although the pyinotify constants work on old distributions, they fail on
new ones, with new python. Fixing this by calling them in a way that
works everywhere.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Fix burnin error when trying to grow a file volume · 728489a3

Guido Trotter authored 15 years ago


Abstract the growable disk types into a ganeti constant, and only run
disk grow, from burnin, on them.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Some epydoc fixes · 3a488770

Iustin Pop authored 15 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


A rewrite of LUClusterVerify · 02c521e4

Iustin Pop authored 15 years ago


Per issue 90, current cluster verify is very very brittle. It's one of
the oldest pieces of code, with only additions without cleanups over the
last years.

Among its problems:

- data initialization interspersed with verification of RPC results,
  leading to non-initialized data for some branches
- due to the above, we order strictly some checks and we have the case
  where a bad node time result will skip checking of node volumes
- many many local variables, with each new check adding a new dict,
  leading to a spaghetti of dicts in the main Exec function
- monolithic code, both Exec() and _NodeVerify() do a lot of
  independent checks

This patch does an imperfect rewrite, but at least we gain:

- a clear infrastructure for adding more checks (the new NodeImage
  class, with its clear and documented fields), and removal of most
  per-node dicts from the Exec() function
- the new NodeImage object should allow better type safety, e.g. by
  allowing pylint to check the actual object attributes rather than
  strings as dict keys
- a-priori initialization of data fields, eliminating the need to
  introduce dependencies between checks
- per-result-key status field, allowing elimination of duplicate error
  messages (where we want)
- split of most independent checks into separate functions, for greater
  clarity

The new code, being new, will probably introduce more bugs in the short
term than it removes. However, it should offer a much better way of
extending cluster verify in the future.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
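
To illustrate the structure described above, here is a rough sketch of a per-node object with fields initialized up front, plus two independent functions that fill and check it. Only the `NodeImage` name comes from the commit message; every field and function below is an assumption, not the actual cmdlib code.

```python
class NodeImage:
    """Per-node verification state, with all fields set a priori."""

    def __init__(self, name):
        self.name = name
        self.volumes = {}      # volume name -> size, filled from RPC data
        self.rpc_fail = False  # the whole RPC call to the node failed
        self.lvm_fail = False  # status flag for the volume list result


def update_node_volumes(nimg, result):
    """Store the volume list, or flag the failure (assumed: a string
    result means the node could not provide the list)."""
    if isinstance(result, str):
        nimg.lvm_fail = True
    else:
        nimg.volumes = dict(result)


def verify_node_volumes(nimg, expected):
    """Return the expected volumes missing on the node."""
    if nimg.rpc_fail or nimg.lvm_fail:
        return []  # failure already reported once; avoid duplicate errors
    return sorted(set(expected) - set(nimg.volumes))
```

Since every field exists from construction, a check never has to depend on another check having run first, and attribute names can be verified statically in a way dict keys cannot.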


Introduce a bool CLI option type · e7b61bb0

Iustin Pop authored 15 years ago


This option type enforces its value to either True or False, relieving
the scripts from manually parsing the values in each function.

We also update the bash completion code to use the option type if
possible.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
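
One way such an option type can be built is with the standard `optparse` extension mechanism (`TYPES` and `TYPE_CHECKER`). The sketch below is an illustration under assumptions: the accepted spellings, the `CliOption` class name and the `--feature` flag are hypothetical.

```python
import copy
from optparse import Option, OptionParser, OptionValueError


def check_bool(option, opt, value):
    """Convert a command line string to True or False, or reject it."""
    value = value.lower()
    if value in ("yes", "true"):
        return True
    if value in ("no", "false"):
        return False
    raise OptionValueError("option %s: invalid boolean value %r" % (opt, value))


class CliOption(Option):
    """Option class extended with a 'bool' value type."""
    TYPES = Option.TYPES + ("bool",)
    TYPE_CHECKER = copy.copy(Option.TYPE_CHECKER)
    TYPE_CHECKER["bool"] = check_bool


parser = OptionParser(option_class=CliOption)
parser.add_option("--feature", type="bool", dest="feature", default=None)
opts, _ = parser.parse_args(["--feature", "yes"])
```

After parsing, `opts.feature` is already a real boolean, so individual scripts no longer need to interpret the string themselves.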


Fix backend.VerifyNode behaviour for VG problems · ed904904

Iustin Pop authored 15 years ago


In case LVM is broken, backend.GetVolumeList will raise an RPC exception
(as expected since it's a function exposed over RPC). Therefore we must
be prepared to catch any such exceptions, so that we don't fail the
whole verify call in this case. cmdlib is already prepared to handle
string results for this response key.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>


Adding missing documentation to make the docs better · 2263aec2

René Nussbaumer authored 15 years ago


Also fixed a typo I noticed.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>


Mar 22, 2010

Remove race condition in FileStorage.Create · cdeefd9b

Guido Trotter authored 15 years ago


Rather than checking that the file doesn't exist, and then creating it,
we create it with O_CREAT | O_EXCL, making sure the checking/creation is
atomic.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
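
A minimal sketch of the atomic create described above, using `os.open` with `O_CREAT | O_EXCL`. The function name, the file mode and the use of `ftruncate` to set the size are assumptions for illustration, not the FileStorage code.

```python
import os


def create_exclusive(path, size):
    """Create a new file of the given size, failing if it already exists.

    O_CREAT | O_EXCL makes the existence check and the creation a single
    atomic operation, so there is no window between "check" and "create".
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
```

If the file already exists, `os.open` raises `FileExistsError`, which the caller can turn into a proper error instead of silently overwriting data.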


KVM: Check instances for actual liveness · 263b8de6

Guido Trotter authored 15 years ago


Currently if we find a live process with the pid we saved we assume kvm
is alive. What could happen, though, is that the pidfile has been
reused.

In order to avoid that we change the check to make sure, everywhere,
that the process we see is our actual kvm process. In order to do so we
open its cmdline, and check that it contains the correct instance name
in the -name argument passed to kvm.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
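
The check described above can be sketched as follows, assuming a Linux `/proc` filesystem where `/proc/<pid>/cmdline` holds the NUL-separated arguments. The function names are hypothetical, and the sketch assumes the value after `-name` is exactly the instance name.

```python
def cmdline_matches_instance(cmdline, instance_name):
    """Check whether a raw cmdline contains '-name <instance_name>'."""
    args = cmdline.split(b"\0")
    wanted = instance_name.encode()
    # Stop one short of the end so args[index + 1] always exists.
    for index, arg in enumerate(args[:-1]):
        if arg == b"-name" and args[index + 1] == wanted:
            return True
    return False


def is_instance_alive(pid, instance_name):
    """Return True only if pid is running with our instance name."""
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as handle:
            data = handle.read()
    except OSError:
        return False  # no such process, or not readable
    return cmdline_matches_instance(data, instance_name)
```

A live but unrelated process that happens to have the saved pid fails the `-name` comparison and is therefore not mistaken for the instance.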


KVM: improve GetInstanceInfo docstring · 4fbb3c60

Guido Trotter authored 15 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
