- Feb 23, 2010
-
-
Michael Hanselmann authored
If activating disks failed for some reason, the watcher didn't catch the exception. With this patch the exception is caught and logged. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
According to “coverage”, this covers 99% of the code. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 22, 2010
-
-
Guido Trotter authored
This class doesn't need its constructor to be called. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
This function is a generic pythonic version of runparts. We currently use it in the backend HooksRunner, but we'll use it for running different directories as well. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
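A minimal sketch of what a run-parts style helper could look like, assuming a hypothetical `run_parts` function and a hypothetical name filter modelled on run-parts' usual naming rule. It is an illustration only, not the function added by this commit.

```python
import os
import re
import subprocess

# Hypothetical name filter: letters, digits, underscores and hyphens only.
_VALID_NAME = re.compile(r"^[a-zA-Z0-9_-]+$")


def run_parts(dir_name, env=None):
    """Run every executable file in dir_name, in sorted order.

    Returns a list of (name, status, returncode) tuples. status is "ran"
    or "skipped"; returncode is None for skipped entries.
    """
    results = []
    for name in sorted(os.listdir(dir_name)):
        path = os.path.join(dir_name, name)
        runnable = (_VALID_NAME.match(name) and os.path.isfile(path)
                    and os.access(path, os.X_OK))
        if not runnable:
            results.append((name, "skipped", None))
            continue
        # env=None means the child inherits the caller's environment.
        proc = subprocess.run([path], env=env)
        results.append((name, "ran", proc.returncode))
    return results
```

Returning the skipped entries as well lets a caller such as a hooks runner log why a file was ignored instead of silently dropping it.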
-
Guido Trotter authored
And save lots of lines of code in the process. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Guido Trotter authored
This allows running a command with only the passed-in environment, rather than just updating the default one with it. Now with unit testing. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
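The difference between updating the default environment and replacing it can be sketched as follows. The function name `run_cmd` and the `reset_env` parameter are assumptions made for this illustration and may not match the real signature.

```python
import os
import subprocess


def run_cmd(cmd, env=None, reset_env=False):
    """Run cmd and return its standard output as text.

    With reset_env=False the given variables are layered on top of the
    current environment; with reset_env=True the child sees only `env`.
    """
    if reset_env:
        cmd_env = dict(env or {})
    else:
        cmd_env = os.environ.copy()
        cmd_env.update(env or {})
    proc = subprocess.run(cmd, env=cmd_env, stdout=subprocess.PIPE,
                          universal_newlines=True, check=True)
    return proc.stdout
```

Calling `run_cmd(["/usr/bin/env"], env={"FOO": "bar"}, reset_env=True)` would print only the `FOO` variable, while the default mode prints the inherited variables plus `FOO`.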
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
Jobs submitted via the standard command line utilities didn't give any indication that anything was happening while they were waiting in the job queue (e.g. due to other jobs using all worker threads) or acquiring locks. This could be very confusing for people not familiar with Ganeti's architecture. Now they'll show a message after the first WaitForJobChanges timeout. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
If too many clients try to connect to the master at the same time, some of them might fail if the master doesn't accept the connections fast enough. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This should be rewritten from a 'change document' (e.g. "Ganeti only supports...") to a 'current implementation document', but in the meantime we can at least update it with the multi-evac changes. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This switches gnt-node to the new opcode, and in the process also enables multi-node arguments for it. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
We add this as a new opcode since we don't want to alter the behaviour of current opcodes/lus. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This is a new mode that requests a solution for the evacuation of multiple nodes. The external script will be fed a list of names, and is expected to return a list of [instance, new_node(s)] lists, detailing the evacuation path of each instance. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
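As an illustration of the expected reply shape, here is a small validator. It assumes each entry is a flat list holding the instance name followed by one or more node names; the message does not spell out the exact layout, and the function name is hypothetical.

```python
def validate_multi_evac_result(result):
    """Return True if result looks like a list of [instance, node, ...] lists.

    Assumption: every entry is a flat list of strings with the instance
    name first and at least one new node name after it.
    """
    if not isinstance(result, list):
        return False
    for item in result:
        if not isinstance(item, list) or len(item) < 2:
            return False
        if not all(isinstance(name, str) for name in item):
            return False
    return True
```

For example, `[["inst1", "node3"], ["inst2", "node3", "node4"]]` passes, while an entry with no target node, such as `["inst1"]`, is rejected.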
-
Iustin Pop authored
This patch switches the default result key from 'nodes' to 'result'. The old name is still accepted for backwards compatibility, and should be removed in later versions. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
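The fallback between the new and the old key might look roughly like this. The helper name `get_allocator_result` and the reply being a plain dict are assumptions for the sketch.

```python
def get_allocator_result(reply):
    """Return the allocator's answer, preferring the new 'result' key.

    The legacy 'nodes' key is only consulted when 'result' is absent.
    """
    if "result" in reply:
        return reply["result"]
    if "nodes" in reply:
        # Backwards compatibility with scripts using the old key name.
        return reply["nodes"]
    raise KeyError("allocator reply has neither 'result' nor 'nodes'")
```

Checking for the new key first means a script that sends both keys is treated as a new-style script.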
-
Iustin Pop authored
Currently the 'name' parameter in the constructor is required (as a non-keyword argument). Since the (to follow) node evac IAllocator mode doesn't have 'name' as a valid argument, we're moving this one into the per-request key, leaving the constructor required arguments more abstract. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This moves the setting of the request member on the in_data, of the request type, and of the branching based on request type outside of individual functions and directly into the constructor. Since the values we're using externally are identical to the constants.py values, we're also using those directly. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 19, 2010
-
-
Michael Hanselmann authored
Until now this was only done for the master node, though the problem originally fixed in 8f215968 also occurs for other node daemons. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 18, 2010
-
-
Michael Hanselmann authored
* stable-2.1: Fix ssh host key checking with no-key-check
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This function could be useful in other places and this way we can easily unittest it. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
On fork, the tempfile module's pseudo random generator is not reset. If several processes (e.g. two children or parent and child) try to create a temporary file, they'll conflict. This function can be used to reset the name generator which contains the pseudo random generator. A unittest is included. It is in a separate script because it changes a variable in the tempfile module to speed up the test. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
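A sketch of how such a reset could be done in CPython. It relies on the private attributes `tempfile._once_lock` and `tempfile._name_sequence`, which are implementation details and an assumption here, so the function refuses to run if they are missing. The function name is hypothetical.

```python
import tempfile


def reset_tempfile_module():
    """Drop tempfile's cached name generator so the next use rebuilds it.

    Intended to be called in a child process right after fork.
    """
    # Both attributes are private CPython details (assumption); fail
    # loudly rather than silently doing nothing if they are absent.
    if not (hasattr(tempfile, "_once_lock")
            and hasattr(tempfile, "_name_sequence")):
        raise RuntimeError("unsupported tempfile implementation")
    with tempfile._once_lock:
        tempfile._name_sequence = None
```

After the reset, the next call that needs a temporary name creates a fresh generator, so the parent and the child no longer walk the same sequence of names.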
-
Iustin Pop authored
In case we add a node with “--no-ssh-key-check”, this should override any default yes/ask values in the system-wide (or user) ssh key check. Currently this only works in batch mode, whereas in non-batch we only override a 'no'. The patch fixes SshRunner such that in non-batch mode we enforce the value of StrictHostKeyChecking in all cases. Bug found and initial investigation by Theo Van Dinter. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
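The intended behaviour can be sketched as a function that always emits an explicit StrictHostKeyChecking value. `BatchMode` and `StrictHostKeyChecking` are standard OpenSSH options; the function name and its parameters are assumptions for this illustration, not the SshRunner code itself.

```python
def build_ssh_options(batch, strict_host_check):
    """Build ssh -o options that always state StrictHostKeyChecking.

    Passing the option in every case means a system-wide or per-user
    default of "yes" or "ask" cannot override the caller's choice.
    """
    opts = ["-oBatchMode=%s" % ("yes" if batch else "no")]
    if batch:
        # Nobody can answer a prompt in batch mode: only yes or no.
        value = "yes" if strict_host_check else "no"
    else:
        # Interactive: let the user confirm unknown keys, unless the
        # check was explicitly disabled.
        value = "ask" if strict_host_check else "no"
    opts.append("-oStrictHostKeyChecking=%s" % value)
    return opts
```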
-
- Feb 17, 2010
-
-
Iustin Pop authored
This should have been done in the _ExpandNodeName patch. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
There's no such thing as OpProgrammerError (I found this as I wrote it in code in another place, and pylint complained). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
snap_disks can contain boolean values. They weren't handled correctly. The error message was “Error while executing backend function: Invalid object passed to FromDict: expected dict, got <type 'bool'>”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Currently we have lots of duplication of the error-checking (and proper exception raising) around node/instance name expansion. LUCreateInstance is the only place where we have abstracted this. This patch creates two functions (ExpandNodeName and ExpandInstanceName) that will either raise the proper exception or return the expanded name. This allows a lot of cleanup of duplicate code. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
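The pattern of "expand or raise" can be sketched with one generic helper. The `OpPrereqError` class below is a local stand-in, and `expand_name` with its `expand_fn` callback is a hypothetical shape chosen for the example, not the signature used in the patch.

```python
class OpPrereqError(Exception):
    """Local stand-in for the exception raised on failed prerequisites."""


def expand_name(expand_fn, name, kind):
    """Return the expanded name, or raise if the name is unknown.

    expand_fn maps a short name to its full name and returns None when
    no match exists; kind ("Node", "Instance") is used in the message.
    """
    full_name = expand_fn(name)
    if full_name is None:
        raise OpPrereqError("%s '%s' not known" % (kind, name))
    return full_name
```

Callers then replace a repeated "expand, check for None, raise" block with a single call such as `expand_name(lookup, "node1", "Node")`.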
-
- Feb 15, 2010
-
-
Michael Hanselmann authored
* origin/stable-2.1:
  Fix bug introduced in commit 413b7472
  Fix locking bug causing high CPU usage
  Fix confd procotol design description
  Implement instance rename QA tests
  Fix "gnt-instance rename" functionality
Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This patch extends commit 7ea7bcf6 by releasing all node locks in disk replace for the early release mode. The rationale behind this is:
  - LUCreateInstance already releases all node locks while waiting for disk synchronization, and does an instance startup later
  - WaitForSync only runs (for disk template 'drbd') 'lvs' and reads /proc/drbd on the primary node, which should be (modulo bugs in LVM) safe to run in parallel
In any case, the worst I could foresee is a node having N lvs commands run in parallel on it, while being a primary for disk storage. Based on create instance doing this safely, and the fact that burnin with more than two instances per node is safe, I think this can be applied. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
These are both cleanups and, in the case of _MassageProcData, switching from a weaker RE to a stronger one (we now need cs: in the line, previously any line starting with \d+: was accepted). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
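The effect of the stricter pattern can be illustrated as follows. Both regular expressions and the sample lines are assumptions written for this sketch; they are not copied from the patch or from real /proc/drbd output.

```python
import re

# Weaker pattern: any line starting with "<digits>:".
WEAK_RE = re.compile(r"^\s*\d+:")
# Stronger pattern: the line must also carry the "cs:" field.
STRONG_RE = re.compile(r"^\s*(\d+): cs:(\w+)")

# Hypothetical sample lines for illustration.
SAMPLE = [
    " 0: cs:Connected st:Primary/Secondary",
    "    ns:0 nr:0 dw:0",
    " 1: not a status line",
]


def device_lines(lines, pattern=STRONG_RE):
    """Return only the lines that the given pattern accepts."""
    return [line for line in lines if pattern.match(line)]
```

With the weaker pattern the third sample line would be treated as a device line; the stronger one keeps only the first.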
-
Iustin Pop authored
In case the old node is offline, we won't be able to talk to it to remove the storage, and in most cases the node is powered off/unreachable. In this case, it makes no sense to delay the storage release, so we automatically enable early_release mode, gaining parallelism during node evacuation. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 11, 2010
-
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This reverts commit 83d9f436. man is still unable to wrap some long lines, so we simply revert this patch (and filter out the specific message in autotools/check-man). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This should fix issue 68: some hooks should be run on more nodes than currently. GrowDisk runs on both nodes, remove runs the post hook on the instance's nodes, and failover and migrate run the post hook on the source node too. Thanks to Maxence for the initial investigation and patch. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Per issue 71, the migrate and failover need special variables for keeping the nodes consistent during instance migrations. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
A long PREFIX variable (to configure) will result in a very long LOCALSTATEDIR, which when concatenated with lib/ganeti/ (and even more items under it) will go over the 80 char line length we enforce in the man checker. To work around this, we change two things:
  - use a specific REPLACE_VARS_MAN which adds breaking points after each slash in paths
  - replace some <filename> entries with <literallayout> so that docbook generates a non-fill block around them (only a few cases need this after the breaking points are added)
Note that with normal prefixes (e.g. / or /usr/local) this won't happen. The patch also fixes the wording in the watcher man page. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-