- Apr 09, 2010
-
-
Iustin Pop authored
Since the actions are potentially destructive, we should try to get a consistent view of the cluster, so it's better to get the most coverage possible. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This patch changes the coverage parameter to allow specification of max coverage (via -1), versus auto-computation (default, 0) and manual specification. Unittests are updated for this case too. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Apr 08, 2010
-
-
Iustin Pop authored
The patch significantly changes the watcher man page, as it was very simplistic. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This patch changes the watcher so that it maintains (on all nodes) the list of instances and DRBD devices by shutting down ones that confd daemons indicate should not be running on this node. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This will be used to conditionally enable the watcher node maintenance feature. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This new callback simply stores (without calling any lower-level callback) the last result; coupled with the filtering callback, this ensures that it has the 'best' response after all have been received. The result can then be retrieved via the GetResponse method. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
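
A minimal sketch of a "store the last result" callback as described above. Only the `GetResponse` method name comes from the commit message; the class name and the shape of the update object are assumptions, and this is not the actual confd client code.

```python
class StoreResultCallback:
    """Remembers the most recent update without calling any further callback.

    Hypothetical class name; only GetResponse is named in the commit message.
    """

    def __init__(self):
        self._result = None

    def __call__(self, update):
        # Each call overwrites the previous value, so after all replies have
        # arrived the stored value is the last one let through by any filter.
        self._result = update

    def GetResponse(self):
        """Return the last stored update, or None if nothing was received."""
        return self._result


callback = StoreResultCallback()
for reply in ("serial=3", "serial=5"):  # stand-ins for filtered replies
    callback(reply)
last = callback.GetResponse()
```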
-
Iustin Pop authored
Currently, there is no way for a user of the confd client library to know how many replies there should be, whether all have been received, etc. This is bad since we can't reliably detect the consistency of the results. This patch attempts to fix this by adding a synchronous WaitForReply function that will wait until either a timeout expires, or until a minimum number of replies have been received (interested users should add similar functionality for the async case). The callback functionality will still do call-backs into the user-provided code during the wait, but after this function has returned, we know that we received all possible replies. Note: To account for the interval between initial send of the request, and calling of this function, we modify the expiration time of the request. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
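
The waiting logic described above can be sketched roughly as follows. This is an illustration only: `wait_for_replies`, `receive_once`, and the `clock` parameter are hypothetical names, and the real `WaitForReply` operates on the confd client's sockets and request state rather than on a plain callable.

```python
import time


def wait_for_replies(receive_once, min_replies, timeout, clock=time.monotonic):
    """Wait until min_replies replies arrived or the timeout expired.

    receive_once(remaining) is assumed to block for at most `remaining`
    seconds and to return True if a reply was processed during that time.
    Returns a (got_enough, reply_count) tuple.
    """
    deadline = clock() + timeout
    count = 0
    while count < min_replies:
        remaining = deadline - clock()
        if remaining <= 0:
            break  # timed out before enough replies were seen
        if receive_once(remaining):
            count += 1
    return count >= min_replies, count


# Usage with a fake receiver that always delivers a reply immediately.
result = wait_for_replies(lambda remaining: True, min_replies=3, timeout=10)
```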
-
Iustin Pop authored
Currently the requests are tracked in _request and in _expire_requests. This is convenient, but it restricts the ability to extend the request tracking, e.g. via packet stats and/or extension of expiration time. This patch introduces a new simple class _Request that holds all properties of pending requests; it then uses instances of this class as values in _request instead of tuples, and removes the _expire_requests. The only drawback is the change in behaviour of _ExpireRequests: previously, it used to scan the list only up to the first non-expired request, after which it aborted. Now it will scan the entire dict, which (depending on workload) could change the time behaviour. I don't think this is a problem, as: - deleting from the head of a list is very expensive (list.pop(0); list.append() is an order of magnitude more expensive than deleting an element from a dictionary and re-adding it) - we should not have more than tens or hundreds of pending requests; in case this assumption changes, we could introduce a no-more-often-than-X expiration policy, etc. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
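
A rough sketch of the dict-of-objects tracking described above, assuming requests are keyed by a salt string. Only the `_Request` name comes from the commit message; `RequestTracker`, its methods, and the field names are hypothetical.

```python
import time


class _Request:
    """Holds the properties of one pending request (fields are illustrative)."""

    def __init__(self, payload, expiry):
        self.payload = payload
        self.expiry = expiry
        self.sent = 0  # hypothetical packet statistics
        self.rcvd = 0


class RequestTracker:
    """Hypothetical container standing in for the client's request table."""

    def __init__(self):
        self._requests = {}

    def add(self, salt, payload, timeout, now=None):
        now = time.time() if now is None else now
        self._requests[salt] = _Request(payload, now + timeout)

    def extend(self, salt, extra):
        # Extending the expiration is a plain attribute update on the object.
        self._requests[salt].expiry += extra

    def expire(self, now=None):
        """Scan the whole dict and drop every request past its expiry."""
        now = time.time() if now is None else now
        expired = [s for s, r in self._requests.items() if r.expiry < now]
        for salt in expired:
            del self._requests[salt]
        return expired


tracker = RequestTracker()
tracker.add("a", "query-1", timeout=10, now=0)
tracker.add("b", "query-2", timeout=20, now=0)
tracker.extend("a", 15)  # "a" now expires at 25, after "b"
first = tracker.expire(now=21)
```

Because the expiry lives on the object, extending it does not require re-sorting a separate list, at the price of a full scan in `expire`.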
-
- Apr 07, 2010
-
-
Iustin Pop authored
Commit 49b3fdac added consistency checks, but these are wrongly triggered for old responses - we need to make sure to check that we have the same serial. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Commit dfdc4060 added WaitForFdCondition which uses utils.Retry without handling timeout exceptions. This breaks any nested retry loops. This patch fixes the above function, and also changes utils.Retry to detect and warn about similar cases in the future. In addition, we add a few small unittests for utils.Retry. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
After commit 76e5f8b5, mkdir_mode in utils.RenameFile is no longer passed to Makedirs. This is fixed by this patch. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Wrong escape, so we make sure to use proper escapes (we want the backslashes to be embedded, not interpreted). Also change " to ' to be easier to read. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
David Knowles <dknowles@google.com>
-
- Apr 06, 2010
-
-
David Knowles authored
Signed-off-by:
David Knowles <dknowles@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> (modified slightly the unittest to account for missing httplib2 library)
-
Iustin Pop authored
Note that users of the callback will have to manually check the attribute. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
Most creation of confd clients will do the same steps: read MC file, parse it, read HMAC key, etc. We abstract this functionality so that we don't duplicate the code. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Mar 31, 2010
-
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Everything still works the same way, but the user is calculated each time we start kvm, rather than stored in the config file. This makes it easier to implement the "pool" security model. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Mar 30, 2010
-
-
René Nussbaumer authored
Previously this was "400 Bad Request", which did not reflect reality. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Mar 26, 2010
-
-
René Nussbaumer authored
* This also adds support for authenticated RAPI calls
* Other HTTP methods than GET/POST Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Mar 25, 2010
-
-
Guido Trotter authored
Currently SerializableConfigParser.Loads is a static method that returns a SerializableConfigParser. With this patch we change it to a class method that returns a member of the class. This way a subclass calling Loads on itself will get its own member, rather than a bare SerializableConfigParser. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Balazs Lecz <leczb@google.com>
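
The difference a class method makes can be sketched as follows. This is an illustration written against the Python 3 `configparser` module, not Ganeti's actual code; the base class here is a stand-in sharing only the name, and `MyParser` is a hypothetical subclass.

```python
import configparser
import io


class SerializableConfigParser(configparser.ConfigParser):
    """Stand-in for the class named in the commit message."""

    @classmethod
    def Loads(cls, data):
        """Parse a string and return an instance of the calling class."""
        parser = cls()  # cls is the subclass when called on a subclass
        parser.read_file(io.StringIO(data))
        return parser


class MyParser(SerializableConfigParser):
    """Hypothetical subclass used only for illustration."""


parsed = MyParser.Loads("[section]\nkey = value\n")
```

With a static method that hard-codes `SerializableConfigParser()`, the same call would return the base class instead of `MyParser`.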
-
- Mar 23, 2010
-
-
Guido Trotter authored
A change introduced in 5299e61f modified the contents of JobExecutor.jobs, missing a place where this tuple was deconstructed. This creates a traceback in gnt-* <any> --submit, fixed by this patch. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
If the hooks dir does not exist, do not warn needlessly. This is similar to commit a9b7e346 (for backend.py). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
Currently the ShutdownInstance method of the hypervisors takes a full instance object. However, when doing instance shutdowns from the node only, we don't have a full object, just the name. To handle this use case, we add a new ‘name’ argument to the method, which makes the shutdown not use/rely on the ‘instance’ argument. The KVM and fake hypervisors need a little bit of work, otherwise the change is straightforward. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
This can be used by nodes to know which hypervisors they are supposed to support. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Guido Trotter authored
The "apparently pylint was right" commit. Although the pyinotify constants work on old distributions, they fail on new ones, with new python. Fixing this by calling them in a way that works everywhere. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Abstract the growable disk types into a Ganeti constant, and only run disk grow from burnin on them. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
Per issue 90, current cluster verify is very brittle. It's one of the oldest pieces of code, with only additions without cleanups over the last years. Among its problems: - data initialization interspersed with verification of RPC results, leading to non-initialized data for some branches - due to the above, we order strictly some checks and we have the case where a bad node time result will skip checking of node volumes - many local variables, with each new check adding a new dict, leading to a spaghetti of dicts in the main Exec function - monolithic code, both Exec() and _NodeVerify() do a lot of independent checks This patch does an imperfect rewrite, but at least we gain: - a clear infrastructure for adding more checks (the new NodeImage class, with its clear and documented fields), and removal of most per-node dicts from the Exec() function - the new NodeImage object should allow better type safety, e.g. by allowing pylint to check the actual object attributes rather than strings as dict keys - a-priori initialization of data fields, eliminating the need to introduce dependencies between checks - per-result-key status field, allowing elimination of duplicate error messages (where we want) - split of most independent checks into separate functions, for greater clarity The new code, being new, will probably introduce more bugs in the short term than it removes. However, it should offer a much better way for extending cluster verify in the future. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This option type enforces its value to either True or False, relieving the scripts from manually parsing the values in each function. We also update the bash completion code to use the option type if possible. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
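
One way to build such an option type is the standard `optparse` extension mechanism (extending `TYPES` and `TYPE_CHECKER` on an `Option` subclass). The sketch below assumes that approach; the accepted spellings, the `BoolOption` name, and the `--drained` flag are illustrative and may differ from what the patch actually does.

```python
import copy
from optparse import Option, OptionParser, OptionValueError


def check_bool(option, opt, value):
    """Convert a yes/no style string into True or False."""
    value = value.lower()
    if value in ("yes", "true"):
        return True
    if value in ("no", "false"):
        return False
    raise OptionValueError("option %s: invalid boolean value %r" % (opt, value))


class BoolOption(Option):
    """Option subclass that understands type="bool" (hypothetical name)."""

    TYPES = Option.TYPES + ("bool",)
    TYPE_CHECKER = copy.copy(Option.TYPE_CHECKER)
    TYPE_CHECKER["bool"] = check_bool


parser = OptionParser(option_class=BoolOption)
parser.add_option("--drained", type="bool", dest="drained")  # example flag
opts, _ = parser.parse_args(["--drained", "yes"])
```

Scripts then read `opts.drained` as a real boolean instead of parsing the string themselves.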
-
Iustin Pop authored
In case LVM is broken, backend.GetVolumeList will raise an RPC exception (as expected since it's a function exposed over RPC). Therefore we must be prepared to catch any such exceptions, so that we don't fail the whole verify call in this case. cmdlib is already prepared to handle string results for this response key. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
René Nussbaumer authored
Also fixed a typo I noticed. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Mar 22, 2010
-
-
Guido Trotter authored
Rather than checking that the file doesn't exist, and then creating it, we create it with O_CREAT | O_EXCL, making sure the checking/creation is atomic. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
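
The atomic check-and-create described above can be sketched with `os.open`. The helper name `create_exclusive` and the file mode are assumptions made for this example, not the code from the patch.

```python
import os
import tempfile


def create_exclusive(path, mode=0o600):
    """Create path atomically; return True if created, False if it existed."""
    try:
        # O_EXCL together with O_CREAT makes the call fail if the file is
        # already there, so there is no window between checking and creating.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, mode)
    except FileExistsError:
        return False
    os.close(fd)
    return True


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "example.lock")
first_attempt = create_exclusive(target)
second_attempt = create_exclusive(target)
```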
-
Guido Trotter authored
Currently if we find a live process with the pid we saved we assume kvm is alive. What could happen, though, is that the pidfile has been reused. In order to avoid that we change the check to make sure, everywhere, that the process we see is our actual kvm process. In order to do so we open its cmdline, and check that it contains the correct instance name in the -name argument passed to kvm. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
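
A minimal sketch of the cmdline check, assuming a Linux `/proc` filesystem and that the instance name is passed as the separate argument directly following `-name`. Both function names are hypothetical and this is not the hypervisor code from the patch.

```python
def cmdline_has_instance_name(cmdline_bytes, instance_name):
    """Check NUL-separated cmdline data for '-name <instance_name>'."""
    args = cmdline_bytes.split(b"\0")
    wanted = instance_name.encode()
    # Pair every argument with its successor and look for the flag/value pair.
    for arg, following in zip(args, args[1:]):
        if arg == b"-name" and following == wanted:
            return True
    return False


def is_our_kvm(pid, instance_name):
    """Return True if the process with this pid looks like our kvm instance."""
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as handle:
            data = handle.read()
    except OSError:
        return False  # process is gone or cmdline is not readable
    return cmdline_has_instance_name(data, instance_name)


sample = b"kvm\0-name\0instance1.example.com\0-m\0512\0"
matches = cmdline_has_instance_name(sample, "instance1.example.com")
```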
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-