- Jun 23, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Guido Trotter authored
The test fails because, under high load, the parent can run before the child has had a chance to call os._exit(), so the child is still running when the parent performs its check. The fix closes this race by waiting to receive a SIGCHLD (but not calling wait()) before testing the pid. Reviewed-by: imsnah
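The wait-for-SIGCHLD idea can be sketched as follows. This is a minimal illustration for POSIX systems, not the code from the commit: the function name is hypothetical, and the use of `signal.sigwait` with a blocked signal is an assumption about one way to wait without reaping the child.

```python
import os
import signal


def fork_and_wait_for_exit_signal():
    """Fork a child that exits at once; return its pid once SIGCHLD arrived.

    Hypothetical helper: the child is NOT reaped here, so the caller can
    still inspect the (now exited) pid before calling wait().
    """
    # Install a no-op handler so SIGCHLD is not discarded by its default
    # disposition, then block it so it stays pending until sigwait().
    old_handler = signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately, skipping cleanup handlers
        # Parent: sleep until the kernel reports that a child changed state.
        signal.sigwait({signal.SIGCHLD})
        return pid
    finally:
        # Only the parent reaches this point; restore the previous state.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(signal.SIGCHLD, old_handler)
```

Because the signal is blocked before the fork, it cannot be lost even if the child exits before the parent starts waiting.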
-
Michael Hanselmann authored
In cfgupgrade, we need to extract parts of version numbers and build new ones. Reviewed-by: iustinp
-
- Jun 19, 2008
-
-
Michael Hanselmann authored
This change allows us to use cleaner dependencies between directories. Large parts of the build system have been rewritten and may contain bugs. Reviewed-by: iustinp
-
- Jun 18, 2008
-
-
Iustin Pop authored
The path to the filename for drbd8 proc data is not correctly computed when using distcheck. The patch duplicates it from the other drbd tests. Reviewed-by: ultrotter
-
Iustin Pop authored
Currently, we compute the status of a drbd8 device in GetSyncStatus and return only the values that we need (and that fit in the framework of GetSyncStatus). However, the full status details are useful (and needed) in other places, so the patch attempts to improve this situation. We abstract the status of a device into a separate class that knows how to parse contents from /proc/drbd and set easily accessible attributes. We then simplify GetSyncStatus to use this class and return the values that it needs, and add a separate method that returns the full status object. The move to a separate class cleans up the old sync-progress computation from GetSyncStatus a little, but it still relies on many regexes. The patch also adds unittests for a few statuses, and modifies one BaseDRBD call to accept a custom filename instead of '/proc/drbd' to ease unittests. Reviewed-by: imsnah
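A status class of this kind might look roughly like the sketch below. The class name, attribute names, and the sample line are all assumptions made for illustration; the sample mimics the general shape of a DRBD 8 per-device line, and real /proc/drbd output may differ.

```python
import re

# Assumed sample of a per-device line; not taken from the commit's test data.
SAMPLE_LINE = " 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---"


class DeviceStatus:
    """Hypothetical holder that parses one status line into attributes."""

    _LINE_RE = re.compile(
        r"\s*\d+:\s+cs:(\S+)\s+st:([^/\s]+)/(\S+)\s+ds:([^/\s]+)/(\S+)")

    def __init__(self, line):
        match = self._LINE_RE.match(line)
        if match is None:
            raise ValueError("unrecognized status line: %r" % line)
        (self.connection_state, self.local_role, self.remote_role,
         self.local_disk, self.remote_disk) = match.groups()
        # Convenience flags so callers do not have to compare strings.
        self.is_connected = self.connection_state == "Connected"
        self.is_primary = self.local_role == "Primary"
        self.is_disk_uptodate = self.local_disk == "UpToDate"
```

Passing the line (or file contents) in as an argument, rather than reading '/proc/drbd' directly, is what makes such a class easy to unit test.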
-
- May 07, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
- May 01, 2008
-
-
Guido Trotter authored
Reviewed-by: iustinp
-
- Apr 28, 2008
-
-
Manuel Franceschini authored
This patch changes the code executed when testing the signal handling of RunCmd. Since sh does not always point to bash (e.g. on Ubuntu, where it points to /bin/dash), this test might fail because the returned exit code is different, so the received signal is not correctly detected. Additionally, fix the docstring of testSignal. Reviewed-by: iustinp
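How a caller can tell "killed by a signal" apart from "exited with a code" can be sketched with the standard subprocess module. This is only an illustration: the helper name is hypothetical, and the command used here is an assumption, since the message does not say which command the test runs.

```python
import signal
import subprocess


def run_and_report_signal(shell_command):
    """Run a command through sh; return the signal that killed sh, or None.

    Hypothetical helper. subprocess reports death-by-signal as a negative
    return code, while a normal exit yields a non-negative one.
    """
    result = subprocess.run(["sh", "-c", shell_command])
    if result.returncode < 0:
        return -result.returncode
    return None


# The shell sends SIGTERM to itself, so the signal is reported directly.
killed_by = run_and_report_signal("kill -15 $$")
```

Having the shell signal its own pid avoids depending on how a particular sh implementation translates a child's death into its own exit status.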
-
- Mar 18, 2008
-
-
Michael Hanselmann authored
The whole Ganeti cluster has a single SSH key. Its fingerprint is written to Ganeti's known_hosts file, together with an alias. This allows us to always use that alias instead of the real hostname, making management of the known_hosts file much easier. This patch does not handle an upgrade from an earlier version. Reviewed-by: ultrotter
-
- Mar 11, 2008
-
-
Iustin Pop authored
This patch modifies TcpPing and its callers to make the source address selection optional. Usually, the kernel knows better which source address to use; only in some cases do we want to enforce a given source address, so it makes sense to make this optional. Reviewed-by: ultrotter
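An optional source address can be sketched with the standard socket module as below. The function name and signature are hypothetical and are not claimed to match Ganeti's TcpPing.

```python
import socket


def tcp_ping(target, port, timeout=10, source=None):
    """Return True if a TCP connection to (target, port) can be opened.

    Hypothetical sketch: when source is None the kernel picks the source
    address; otherwise the socket is bound to it before connecting.
    """
    source_address = (source, 0) if source is not None else None
    try:
        sock = socket.create_connection(
            (target, port), timeout=timeout, source_address=source_address)
    except OSError:
        return False
    sock.close()
    return True
```

A caller that needs a specific outgoing address would pass it as `source`, and everyone else would simply omit the argument.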
-
- Mar 04, 2008
-
-
Guido Trotter authored
A LockSet is mostly useful when it has some locks in it. On the other hand there are cases in which it must function even when empty. For example if a cluster has no instances in it there's no reason why locking all of them shouldn't work anyway. This patch adds test code for that situation and implements the necessary fixes to make it work. Reviewed-by: imsnah
-
Guido Trotter authored
This check that no operation had been performed before release() was missing in the test code. Adding it. Reviewed-by: imsnah
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Guido Trotter authored
This new functionality makes it possible to acquire a whole set by passing "None" to the acquire() function as the list of elements. This first prevents new additions to the set, and then acquires all the current elements. The list of all elements acquired is returned at the end. Deletions can still happen during the acquire process, and we deal with them by just skipping the deleted elements: it's effectively as if they were deleted before we called the function. Once we've finished, though, we hold all the elements, so no more deletes can be performed before we release them. Any call to release() will then first release the "set-level" lock if we're holding it, and then all or some of the locks we have. Some new tests check that this feature works as intended. Reviewed-by: imsnah
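The whole-set acquisition described above can be sketched as follows. This is a heavily simplified, hypothetical stand-in: it uses exclusive locks only, tracks a single owner instead of per-thread ownership, and its names are not the LockSet API.

```python
import threading


class WholeSetLocks:
    """Hypothetical, simplified lock set (exclusive locks, single owner)."""

    def __init__(self, names):
        self._set_lock = threading.Lock()  # guards membership of the set
        self._locks = {name: threading.Lock() for name in names}
        self._owned = []
        self._holds_set_lock = False

    def add(self, name):
        # Blocks while someone holds the whole set.
        with self._set_lock:
            self._locks[name] = threading.Lock()

    def acquire(self, names=None):
        if names is None:
            # Whole-set mode: stop additions first, then take what exists now.
            self._set_lock.acquire()
            self._holds_set_lock = True
            names = list(self._locks)
        acquired = []
        for name in sorted(names):
            lock = self._locks.get(name)
            if lock is None:
                continue  # deleted in the meantime: skip it
            lock.acquire()
            acquired.append(name)
        self._owned = acquired
        return acquired

    def release(self):
        # Release the set-level lock first, then the element locks.
        if self._holds_set_lock:
            self._holds_set_lock = False
            self._set_lock.release()
        for name in self._owned:
            self._locks[name].release()
        self._owned = []
```

For example, `WholeSetLocks(["node2", "node1"]).acquire(None)` takes the set-level lock and then both element locks in sorted order.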
-
Guido Trotter authored
LockSet's remove() function used to return a list of locks we failed to remove. Rather than doing this, we'll return a list of removed locks, so it's more similar to how acquire() behaves. This patch also fixes the relevant unit tests. Reviewed-by: imsnah
-
Guido Trotter authored
In a LockSet, acquire() returned True on success. This code changes that to return a set containing the names of the elements acquired. This is still a true value if we acquired any lock, but is slightly more useful (because, if needed, one has access to this data without querying for it). The only change in behavior happens when acquiring no locks, which should not normally happen because it has no practical use. The patch also changes some tests to check that the new format is respected. Reviewed-by: imsnah
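The truthiness point can be illustrated with plain Python sets. The function below is a hypothetical stand-in that only mimics the return value; it does no locking.

```python
def acquire_names(available, wanted):
    """Hypothetical stand-in: return the set of names that were 'acquired'."""
    return {name for name in wanted if name in available}


some = acquire_names({"a", "b"}, ["a"])  # non-empty set: still a true value
none = acquire_names({"a", "b"}, [])     # empty set: now a false value
```

Code written as `if lockset.acquire(...):` therefore keeps working as long as at least one name was requested.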
-
Guido Trotter authored
Includes some locking-related constants and explanations on how the LockManager should be used, the class itself and its test cases. The class includes:
  - a basic constructor
  - functions to acquire and release lists of locks at the same level
  - functions to add and remove lists of locks at modifiable levels
  - dynamic checks against out-of-order acquisitions and other illegal ops
Its testing library checks that the LockManager behaves correctly and that the external assumptions it relies on are respected. Reviewed-by: imsnah
-
- Feb 28, 2008
-
-
Guido Trotter authored
A LockSet represents locking for a set of resources of the same type. A thread can acquire multiple resources at the same time, and release some or all of them, but cannot acquire more resources incrementally at different times without releasing all of them in between. Internally a LockSet uses a SharedLock for each resource to be able to grant both exclusive and shared acquisition. It also supports safe addition and removal of resources at runtime. Acquisitions are ordered alphabetically to guarantee that they are deadlock-free. A lot of assumptions about how the code interacts are made in order to guarantee both safety and speed; in order to document all of them the code features pretty lengthy comments. The test suite tries to catch most common interactions but cannot really test tight race conditions, for which we still need to rely on human checking. This is the second basic building block for the Ganeti Lock Manager. Instance and Node locks will be put in LockSets to manage their acquisition and release. Reviewed-by: imsnah
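Why sorted acquisition avoids deadlock can be sketched as below. The class is a hypothetical, exclusive-only simplification (plain `threading.Lock` objects rather than SharedLock), and returning a set of names is an assumption made for the demo.

```python
import threading


class OrderedLockSet:
    """Hypothetical sketch: always take locks in sorted name order."""

    def __init__(self, names):
        self._locks = {name: threading.Lock() for name in names}

    def acquire(self, names):
        ordered = sorted(set(names))
        for name in ordered:
            self._locks[name].acquire()
        return set(ordered)

    def release(self, names):
        for name in names:
            self._locks[name].release()


def worker(lockset, names, done):
    held = lockset.acquire(names)
    lockset.release(held)
    done.append(sorted(held))


def run_demo():
    """Two threads ask for the same locks in opposite order; both finish."""
    lockset = OrderedLockSet(["inst1", "inst2"])
    done = []
    threads = [
        threading.Thread(target=worker, args=(lockset, order, done))
        for order in (["inst1", "inst2"], ["inst2", "inst1"])
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done
```

Since every thread takes `inst1` before `inst2` regardless of the order requested, no thread can hold one lock while waiting for a lock that sorts earlier.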
-
- Feb 26, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
- Feb 21, 2008
-
-
Guido Trotter authored
  - Check that even a shared acquire() fails on a deleted lock
  - Check that delete() fails on a lock you share (must own it or nothing)
These are assumptions I build on in future code, so better check for them. Currently no code change is necessary for them to be valid. Reviewed-by: iustinp
-
- Feb 20, 2008
-
-
Guido Trotter authored
The _doItDelete helper code was supposed to be used to dispatch threads that deleted the SharedLock. It actually just acquired it exclusively. This remained unnoticed as the helper thread is just used to test interaction, not the delete code by itself, and delete requires an exclusive acquire anyway. Reviewed-by: imsnah
-
- Feb 19, 2008
-
-
Guido Trotter authored
This new operation lets a lock be cleanly deleted. The lock will be exclusively held before deletion, and after it pending and future acquires will raise an exception. Other SharedLock operations are modified to deal with delete() and to avoid code duplication. This patch also adds unit testing for the new function and its interaction with the other lock features. The helper threads are slightly modified to handle and report the condition of a deleted lock. As a bonus, an unrelated unit test about not supporting non-blocking mode yet has been added as well. This feature will be used by the LockSet in order to support deadlock-free deletion of resources. This in turn will be useful to gracefully handle the removal of instances and nodes from the cluster, dealing with the fact that other operations may be pending on them. Reviewed-by: iustinp
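The delete semantics can be sketched with a condition variable as below. This is a hypothetical, exclusive-only simplification (no shared mode, no owner tracking); the class and exception names are made up for illustration and are not the SharedLock API.

```python
import threading


class LockDeletedError(Exception):
    """Hypothetical error raised when using a deleted lock."""


class DeletableLock:
    """Hypothetical exclusive lock that can be cleanly deleted."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False
        self._deleted = False

    def _wait_until_free(self):
        # Caller must hold self._cond. Wakes up on release or on deletion.
        while self._held and not self._deleted:
            self._cond.wait()
        if self._deleted:
            raise LockDeletedError("lock was deleted")

    def acquire(self):
        with self._cond:
            self._wait_until_free()
            self._held = True

    def release(self):
        with self._cond:
            self._held = False
            self._cond.notify_all()

    def delete(self):
        with self._cond:
            # Deletion needs exclusive access, just like acquire().
            self._wait_until_free()
            self._deleted = True
            # Wake pending acquirers so they can raise LockDeletedError.
            self._cond.notify_all()
```

After `delete()` returns, any thread that was waiting in `acquire()` and any later caller gets `LockDeletedError` instead of blocking forever.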
-
- Feb 18, 2008
-
-
Guido Trotter authored
Use the actual class name rather than a spaced version of it. Reviewed-by: iustinp
-
- Feb 08, 2008
-
-
Guido Trotter authored
Adding a locking.py file for the ganeti locking library. Its first component is the implementation of a non-recursive blocking shared lock complete with a testing library. Reviewed-by: imsnah, iustinp
-
- Jan 18, 2008
-
-
Iustin Pop authored
In revision 459 I introduced a bug in the make dist rule: the archive includes *all* of the test/data directory, including the .svn directory if it exists. This patch fixes that problem and adds a distcheck hook that tests for such errors in the future (files/directories matching the .svn and .git patterns). It also fixes a typo in the NEWS file. Reviewed-by: imsnah
-
- Jan 07, 2008
-
-
Iustin Pop authored
This patch fixes the ‘make distcheck’ breakage caused by missing test data in the archive and missing handling of the builddir != srcdir case. Reviewed-by: schreiberal
-
Iustin Pop authored
This patch changes the bdev.DRBD8._GetDevInfo to take a string instead of a minor, separates the `drbdsetup show` invocation into a new separate method (bdev.DRBD8._GetShowData) and modifies the rest of the DRBD8 class to make the appropriate calls. It also adds a unittest script and data files for testing various cases of device output. Reviewed-by: imsnah
-
- Dec 03, 2007
-
-
Alexander Schreiber authored
Reviewed-by: imsnah
-
- Nov 20, 2007
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
- Nov 16, 2007
-
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
- Nov 14, 2007
-
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Michael Hanselmann authored
Reviewed-by: schreiberal, ultrotter
-
- Nov 13, 2007
-
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
- Nov 12, 2007
-
-
Michael Hanselmann authored
  - Combine hostname and aliases on one line
  - Fix bug with wrongfully removed newline characters
  - Use wrapper for SetEtcHostsEntry in cmdlib
Reviewed-by: iustin
-
Michael Hanselmann authored
-
Michael Hanselmann authored
Reviewed-by: TODO
-
- Nov 09, 2007
-
-
Iustin Pop authored
Currently, an unreachable node (or one that returns an undetermined failure) in the hooks pre-phase will abort the current operation. This is not good, as a down node could prevent many operations on the cluster. This patch changes an RPC-level failure (but not a hook execution failure) into a warning. It also modifies the related test cases. This fixes issue 11. Reviewed-by: ultrotter
-