- Feb 10, 2010
-
-
Michael Hanselmann authored
One fix is necessary in gnt-cluster.sgml. Also adding “DELETE_ON_ERROR” target to remove output file if an error occurred while building it (in this case the manpage). This was reported by Iustin Pop in issue 87 and proposed check method taken from Lintian. http://code.google.com/p/ganeti/issues/detail?id=87 Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 09, 2010
-
-
Iustin Pop authored
This patch adds an early_release parameter in the OpReplaceDisks and OpEvacuateNode opcodes, allowing earlier release of storage and more importantly of internal Ganeti locks. The behaviour of the early release is that any locks and storage on all secondary nodes are released early. This is valid for change secondary (where we remove the storage on the old secondary, and release the locks on the old and new secondary) and replace on secondary (where we remove the old storage and release the lock on the secondary node). Using this, on a three node setup:
- instance1 on nodes A:B
- instance2 on nodes C:B
It is possible to run in parallel a replace-disks -s (on secondary) for instances 1 and 2. Replace on primary will remove the storage, but not the locks, as we use the primary node later in the LU to check consistency. It is debatable whether to also remove the locks on the primary node, and thus make replace-disks keep zero locks during the sync. While this would allow greatly enhanced parallelism, let's first see how removal of secondary locks works. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Feb 08, 2010
-
-
Iustin Pop authored
* stable-2.1:
  TLReplaceDisks: Delay iallocator when evacuating node
  Implement debug level across OS-related RPC calls
  Second try to fix LUVerifyCluster
  LUVerifyCluster: Fix bug with offline nodes
  utils: Fix retry delay calculator
  Bump RPC protocol version to 30
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
When evacuating nodes, the iallocator was run for all instances without taking planned changes into consideration. This patch delays part of CheckPrereq and running the iallocator for node evacuation. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 03, 2010
-
-
Iustin Pop authored
This doesn't implement the full functionality (we need to add the debug level to the opcodes too), but at least it won't require changing the RPC calls during the 2.1 series. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
My previous patch, commit 785d142e, fixed the case where a node is marked offline. With this patch it'll also handle other failures correctly.
* Hooks Results
  - ERROR: node node2.example.com: Communication failure in hooks execution: Connection failed (111: Connection refused)
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
[…]
* Other Notes
  - NOTICE: 1 offline node(s) found.
* Hooks Results
Failure: command execution error: iteration over non-sequence
Commit a0c9776a introduced an error simulation mode to LUVerifyCluster. Due to a small mistake, offline nodes weren't skipped when checking the results of verification hooks and iterating over None raises an “iteration over non-sequence” error. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
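The failure mode described in this commit can be illustrated with a minimal sketch. The function name and the data layout (a mapping from node name to a list of hook results, or None when nothing was returned) are assumptions made for illustration and are not taken from LUVerifyCluster.

```python
def collect_hook_failures(hook_results, offline_nodes):
    """Return (node, script) pairs for failed hooks, skipping offline nodes.

    hook_results maps node name -> list of (script, ok) tuples, or None
    when no result is available (hypothetical data layout).
    """
    failures = []
    for node, results in sorted(hook_results.items()):
        if node in offline_nodes or results is None:
            # Nothing to iterate over; looping over None would raise TypeError
            continue
        for script, ok in results:
            if not ok:
                failures.append((node, script))
    return failures


results = {
    "node1.example.com": [("10-check", True), ("20-verify", False)],
    "node2.example.com": None,  # offline node: no hook results
}
failures = collect_hook_failures(results, offline_nodes={"node2.example.com"})
```

Without the skip, the inner loop would try to iterate over None for the offline node and abort the whole verification.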
-
Michael Hanselmann authored
Before this patch, it would always sleep for at least the time specified as the upper limit. Now it actually limits the sleep time. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
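As an illustration of a delay that is actually capped by its upper limit, here is a minimal sketch. The class name and its parameters are hypothetical and are not taken from Ganeti's utils module.

```python
class CappedDelay:
    """Hypothetical delay calculator: grows by a factor, never exceeds limit."""

    def __init__(self, start, factor, limit):
        self._next = start
        self._factor = factor
        self._limit = limit

    def __call__(self):
        # Cap the current delay at the upper limit before returning it
        current = min(self._next, self._limit)
        self._next = current * self._factor
        return current


delay = CappedDelay(start=1.0, factor=2.0, limit=5.0)
delays = [delay() for _ in range(5)]
```

With these values the delay grows from the start value and then stays at the limit instead of exceeding it.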
-
- Feb 01, 2010
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
* origin/stable-2.1:
  Bump version to 2.1.0~rc5
  Fix missing bridge for xen instances
  Fix flipping MC flag bug
  ganeti-watcher: ensure confd is running as well
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
* devel-2.0: Three small typos in man pages Conflicts: man/gnt-cluster.sgml (trivial) Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
… instead of disk size, which is not as reliable. This actually simplifies the code; but it still leaves the possibility of stack overflows if the disk data structure is corrupted. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
The credit goes again to Lintian. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 29, 2010
-
-
Alessandro Cincaglini authored
Xen instances nic definitions miss the target bridge. This bug was introduced in commit 503b97a9. Signed-off-by:
Alessandro Cincaglini <alessandro.ciancaglini@gmail.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Signed-off-by:
Guido Trotter <ultrotter@google.com>
-
- Jan 28, 2010
-
-
Guido Trotter authored
Currently, unofflining or undraining an already functional master candidate node can cause it to demote itself. To avoid that, we only trigger the self-promotion check if the node is not currently a candidate. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Ganeti-confd should be running on all 2.1 nodes. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This patch adds a configure-time parameter that will set the defaults used by all programs, and command-line parameters in the daemons that allow overriding it. Syslog 'yes' enables syslog in addition to file-based logging, 'only' enables syslog and disables file-based logging. The log entries will be of the form:
Jan 27 08:45:04 node2 ganeti-noded[14504]: INFO 172.24.227.5:50850 PUT /jobqueue_update HTTP/1.0 200
Jan 27 08:45:05 node2 ganeti-noded[14505]: INFO 172.24.227.5:50853 PUT /lv_list HTTP/1.0 200
and (for a multi-threaded program):
Jan 27 08:51:48 node1 ganeti-masterd[15491]: (MainThread) INFO ganeti-masterd daemon startup
Jan 27 08:51:49 node1 ganeti-masterd[15491]: (MainThread) INFO Inspecting job queue
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 27, 2010
-
-
Iustin Pop authored
In case the queue dir cannot be created/initialized, currently ganeti-noded exits. This means that a read-only filesystem or a permission error breaks all node daemon functionality, including powercycle. This is not good for the usual failure case for nodes. To work around this, we don't require successful initialization at node daemon startup; if we can't init the queue dir/lock, we retry at every RPC call requiring a job queue lock, and if we still can't acquire the lock, we raise an exception (which is caught in HandleRequest and transformed into an RPC failure). This allows the node daemon to start in the face of queue issues, and the master node to power-cycle it. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
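The retry-on-use approach described in this commit can be sketched as follows. The class, the method name and the flaky initializer are hypothetical stand-ins and do not reproduce the node daemon's code; the sketch only shows initialization being deferred from startup to the first call that needs the lock.

```python
class LazyQueueLock:
    """Hypothetical holder that initializes the queue lock on first use."""

    def __init__(self, init_func):
        self._init_func = init_func
        self._lock = None

    def Get(self):
        if self._lock is None:
            # Retry initialization on every call until it succeeds;
            # a failure propagates to the caller as an exception
            self._lock = self._init_func()
        return self._lock


calls = []


def flaky_init():
    """Stand-in for queue setup: fails once, then succeeds."""
    calls.append(1)
    if len(calls) == 1:
        raise IOError("Read-only file system")
    return "queue-lock"


holder = LazyQueueLock(flaky_init)  # construction never fails
try:
    holder.Get()
    first_error = None
except IOError as err:
    first_error = err  # a caller could turn this into an RPC failure
second_result = holder.Get()
```

Because construction cannot fail, a daemon built this way can still start and serve requests that do not need the queue lock.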
-
Iustin Pop authored
If the open of the lock file fails (due to whatever reason), 'self' won't have the 'fd' attribute, and thus we fail in Close/__del__, which will ruin proper error reporting: IOError: [Errno 30] Read-only file system: '/var/lib/ganeti/queue/lock' Exception exceptions.AttributeError: "'FileLock' object has no attribute 'fd'" in <bound method FileLock.__del__ of <ganeti.utils.FileLock object at 0x2aaaad0bebd0>> ignored Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
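A minimal sketch of one way to avoid the secondary AttributeError is shown below. The class is a hypothetical stand-in, not Ganeti's utils.FileLock, and the approach (assigning the attribute before the open call) is an assumption about how such a problem can be handled rather than a description of the actual patch.

```python
import os
import tempfile


class SafeFileLock:
    """Hypothetical lock-file wrapper; not Ganeti's utils.FileLock."""

    def __init__(self, path):
        # Set the attribute first so Close() works even if open() raises
        self.fd = None
        self.fd = open(path, "w")

    def Close(self):
        if self.fd is not None:
            self.fd.close()
            self.fd = None

    def __del__(self):
        self.Close()


# A path inside a directory that does not exist, so open() must fail
bad_path = os.path.join(tempfile.mkdtemp(), "missing", "lock")
try:
    SafeFileLock(bad_path)
    error = None
except IOError as err:
    error = err  # only the original I/O error is reported
```

Since `fd` always exists, `__del__` on the partially constructed object no longer raises and the original I/O error is reported cleanly.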
-
Iustin Pop authored
In some versions of bash, here-docs and here-strings use temporary files, which means daemon-util needs a writable temporary filesystem. Since echo is a bash builtin anyway, it's simple to switch to it and remove this dependency. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
The variable is called “IMPORT_INDEX”, not “IMPORT_IDX”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
This patch adds missing @type information for all public methods, modifies one to conform to the rest, and removes some information from @param when it's been expressed in @type. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Balazs Lecz authored
According to http://docs.python.org/reference/datamodel.html#slots
* The action of a __slots__ declaration is limited to the class where it is defined. As a result, subclasses will have a __dict__ unless they also define __slots__ (which must only contain names of any /additional/ slots).
* If a class defines a slot also defined in a base class, the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class). This renders the meaning of the program undefined. In the future, a check may be added to prevent this.
Signed-off-by:
Balazs Lecz <leczb@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Signed-off-by:
Iustin Pop <iustin@google.com>
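The first rule quoted above can be demonstrated with a short example. The class and attribute names are made up for illustration and are unrelated to Ganeti's object classes.

```python
class Base(object):
    __slots__ = ["name"]


class WithoutSlots(Base):
    # No __slots__ here, so instances get a __dict__ again
    pass


class WithSlots(Base):
    # Only the additional slot is listed; "name" comes from Base
    __slots__ = ["size"]


loose = WithoutSlots()
loose.anything = 1  # allowed: stored in the instance __dict__

strict = WithSlots()
strict.name = "disk0"
strict.size = 1024
try:
    strict.anything = 1
    rejected = False
except AttributeError:
    rejected = True  # no __dict__, so unknown attributes are refused
```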
-
- Jan 26, 2010
-
-
Iustin Pop authored
* devel-2.0:
  Fix the mocks.py for 2.0 unittests
  LURemoveNode safety in face of wrong node list
  Fix an unsafe formatting bug
  Ensure all int/float conversions are handled right
Conflicts:
  lib/backend.py - trivial merge
  lib/cmdlib.py - merge, and took 2.0's version of LURemoveNode BuildHooksEnv
  lib/mcpu.py - kept ours
  lib/objects.py - trivial merge
  lib/utils.py - trivial merge
  scripts/gnt-backup - kept ours
  scripts/gnt-instance - trivial merge
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
The recent change to use LogWarning with multiple arguments in mcpu.py/HooksMaster broke the (simple) mock we have in the tests. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 25, 2010
-
-
Iustin Pop authored
Ideally we want to/will have per-device DRBD controls of disk/metadata flushes. In the meantime, we want at least a disable of the barrier functionality for cases where one has battery-backed caches. Background: DRBD has four mechanisms for handling ordered disk writes. From the drbdsetup man-page, these are: barrier, flush, drain and none. DRBD prior to 8.2 only has drain and none. This patch makes all 8.x versions of DRBD disable all methods, and revert to none, in case one fully trusts batteries (either UPS for the whole system or battery for NVRAM). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
LURemoveNode runs under the BGL, which means we're guaranteed that the list of nodes as retrieved in CheckPrereq is still valid in BuildHooksEnv. However, we can make Ganeti handle failures in case the locking is broken (or the node list has been modified otherwise) easily, which is better than crashing hard in such a case. This will also fix issue 79, even though that is due to an out-of-tree patch. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This might fix issue 84; in any case, the current situation is that we have a potentially unsafe formatting, which should be fixed. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
int()/float() can raise either ValueError (in case of int("a")), or TypeError (in case of int(None)). We had many bugs over time due to this, and a recent one was just diagnosed, so we go over the codebase and replace all 'except ValueError' with 'except (TypeError, ValueError)' that protect such conversions (there were no 'except TypeError' cases that needed a ValueError added). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
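The pattern this commit applies across the codebase looks roughly like the sketch below. The helper name and its default argument are hypothetical; only the exception tuple comes from the commit message.

```python
def to_int(value, default=None):
    """Hypothetical helper: convert to int, returning default on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # int("a") raises ValueError, int(None) raises TypeError
        return default
```

Catching only ValueError would let `to_int(None)` propagate a TypeError, which is the class of bug the commit describes.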
-
- Jan 22, 2010
-
-
Michael Hanselmann authored
* stable-2.1:
  Bump version to 2.1.0~rc4
  KVM: fix pylint warning
  KVM: be more resilient on broken migration answers
  Add unittests for cli.GenerateTable
  cli: Fix bug when not using headers
  daemon-util: Fix quoting issue
  Bump version to 2.1.0~rc3
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
This patch updates the man page of gnt-instance to include the newly added tags filtering. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Signed-off-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Specify string format arguments as logging function parameters Signed-off-by:
Guido Trotter <ultrotter@google.com>
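The convention named in this commit, passing format arguments to the logging call instead of formatting the string up front, can be shown with the standard logging module. The logger name, the in-memory handler and the messages are illustrative only.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("format-args-example")  # hypothetical name
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

node = "node1.example.com"

# Arguments are passed separately; the logging module merges them into
# the message only when a record is actually formatted
logger.info("Node %s is offline", node)

# DEBUG is disabled here, so this message is never formatted at all
logger.debug("Node %s details: %s", node, {"offline": True})

record = handler.records[0]
```

The record keeps the template and the arguments apart, so no string formatting happens for messages that are filtered out by level.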
-
Guido Trotter authored
Before, when doing kvm live migrations, we used to accept an "unknown status" but reject anything that didn't match our regexp. Since we've seen "info migrate" return a completely empty answer, we'll be more tolerant of completely unknown results (while still logging them) and at the same time we'll limit the number of them which we're willing to accept in a row. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
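The tolerance scheme described in this commit (accept unparsable answers, but only a limited number in a row) can be sketched as below. The regular expression, the status strings, the limit and the function name are all assumptions for illustration; the values used in Ganeti's KVM code may differ.

```python
import re

# Hypothetical pattern and limit; the real values in Ganeti may differ
STATUS_RE = re.compile(r"^Migration status: (\w+)", re.M)
MAX_BAD_ANSWERS = 3


def wait_for_migration(answers):
    """Consume monitor answers until migration finishes or too many are bad."""
    bad_in_a_row = 0
    for answer in answers:
        match = STATUS_RE.search(answer)
        if not match:
            # Tolerate an unparsable (even empty) answer, but count it
            bad_in_a_row += 1
            if bad_in_a_row >= MAX_BAD_ANSWERS:
                return "too many bad answers"
            continue
        bad_in_a_row = 0  # a valid answer resets the streak
        status = match.group(1)
        if status != "active":
            return status
    return "no final status"


ok = wait_for_migration(
    ["Migration status: active", "", "Migration status: completed"])
broken = wait_for_migration(["", "", ""])
```

Resetting the counter on every valid answer is what makes the limit apply to consecutive bad answers rather than to the total.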
-
- Jan 21, 2010
-
-
René Nussbaumer authored
This change introduces startup, shutdown, reboot and reinstall using instance or node tags as selection criteria. Signed-off-by:
René Nussbaumer <rn@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 20, 2010
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Commit 9fe72672 added code to not write spaces at the end of each line. Unfortunately it didn't work properly when not printing headers—there would still be spaces. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 19, 2010
-
-
Michael Hanselmann authored
This patch fixes a quoting issue in daemon-util:
$ EXTRA_MASTERD_ARGS=--no-voting /etc/init.d/ganeti restart
[…]
* ganeti-masterd... /…/ganeti/daemon-util: line 65: local: `--no-voting': not a valid identifier
The reason was that the generated variables were not quoted properly and the troublesome line expanded to “local args=$MASTERD_ARGS $EXTRA_MASTERD_ARGS” instead of the correct “local args="$MASTERD_ARGS $EXTRA_MASTERD_ARGS"”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-