- Mar 19, 2008
-
-
Iustin Pop authored
Currently, to deal with a tmpfs /var/run, we create the BDEV_CACHE_DIR in the init script. However, that does not cover all cases, and the init script is not the proper place for it: for example, dealing with uninitialized clusters and the master node is more complicated. Therefore, this patch:
- makes ganeti-noded create the directory automatically
- makes ganeti-noded error out if it can't create the directory, or if the path already exists but is not a directory
- removes the creation from the init.d script
Reviewed-by: ultrotter
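The behaviour described for ganeti-noded can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function name `ensure_cache_dir`, the default mode, and the exception type are assumptions made for the example.

```python
import os
import tempfile


def ensure_cache_dir(path, mode=0o700):
    """Create `path` if missing; fail if it exists but is not a directory.

    Hypothetical helper: the real daemon may use different names,
    permissions and error reporting.
    """
    try:
        os.mkdir(path, mode)
    except FileExistsError:
        # Something is already there: accept it only if it is a directory.
        if not os.path.isdir(path):
            raise RuntimeError("%s exists but is not a directory" % path)
    # Any other OSError (e.g. missing parent, no permission) propagates,
    # which corresponds to "error out if it can't create it".
    return path


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    print(ensure_cache_dir(os.path.join(base, "bdev-cache")))
```

Doing the check at daemon startup means it runs on every node regardless of how the daemon was launched, which is the gap the init-script approach left open.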
-
- Mar 18, 2008
-
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Michael Hanselmann authored
This replaces very old code. Reviewed-by: ultrotter
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Michael Hanselmann authored
This allows callers to allocate a pseudo-TTY easily. Reviewed-by: ultrotter
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Michael Hanselmann authored
This renames some functions and does some minor codestyle cleanup. Reviewed-by: ultrotter
-
Michael Hanselmann authored
The whole Ganeti cluster has a single SSH key. Its fingerprint is written to Ganeti's known_hosts file, together with an alias. This allows us to always use that alias instead of the real hostname, making management of the known_hosts file much easier. This patch does not handle an upgrade from an earlier version. Reviewed-by: ultrotter
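One way to use a single alias for every node is OpenSSH's `HostKeyAlias` option, which makes ssh look up the alias in known_hosts instead of the real hostname. The sketch below assumes that mechanism; the commit message does not say which options the patch passes, and the function names, alias, and file path are hypothetical.

```python
def known_hosts_line(alias, key_type, key_base64):
    """Format one known_hosts entry that is stored under the alias."""
    return "%s %s %s\n" % (alias, key_type, key_base64)


def build_ssh_cmd(hostname, command, alias, known_hosts_file):
    """Build an ssh argument list that verifies the host key via the alias.

    HostKeyAlias, UserKnownHostsFile and StrictHostKeyChecking are
    standard ssh_config options; how Ganeti combines them is an assumption.
    """
    return [
        "ssh",
        "-o", "HostKeyAlias=%s" % alias,
        "-o", "UserKnownHostsFile=%s" % known_hosts_file,
        "-o", "StrictHostKeyChecking=yes",
        hostname,
        command,
    ]


if __name__ == "__main__":
    print(build_ssh_cmd("node1.example.com", "uptime",
                        "cluster.example.com", "/tmp/known_hosts"))
```

Because every node shares the same key, one known_hosts line under the alias covers the whole cluster, and adding or removing nodes does not touch the file.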
-
Guido Trotter authored
Reviewed-by: imsnah
-
Guido Trotter authored
Previously if a shared thread was notified, together with the rest, and was not fast enough in waking up and acquiring the lock, another one could release it, decide there were no more sharers, and let an exclusive one in instead. With this patch we make sure all the shared holders which were waiting have passed, before declaring it's time to make an exclusive one pass. This also allows us to reintroduce a slight variation of the assertion removed in r665, which makes our code safer. Reviewed-by: imsnah
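The race described above can be avoided by counting waiting sharers as holders at the moment the exclusive holder releases, rather than when each of them wakes up. The following is a simplified sketch of that idea, not the SharedLock code from the patch; the class name and the `_admitted` counter are assumptions, and it makes no attempt to prevent writer starvation.

```python
import threading


class FairSharedLock:
    """Toy shared/exclusive lock where woken sharers cannot be overtaken."""

    def __init__(self):
        self._cond = threading.Condition()
        self._exclusive = False  # True while an exclusive holder is active
        self._sharers = 0        # shared holders, including admitted ones
        self._shr_wait = 0       # sharers blocked behind an exclusive holder
        self._admitted = 0       # admitted at release time, not yet awake

    def acquire_shared(self):
        with self._cond:
            if self._exclusive:
                self._shr_wait += 1
                while self._admitted == 0:
                    self._cond.wait()
                # Already counted in _sharers by the releasing thread.
                self._admitted -= 1
            else:
                self._sharers += 1

    def acquire_exclusive(self):
        with self._cond:
            # Admitted-but-sleeping sharers keep _sharers above zero,
            # so an exclusive acquirer cannot slip in ahead of them.
            while self._exclusive or self._sharers > 0:
                self._cond.wait()
            self._exclusive = True

    def release(self):
        with self._cond:
            if self._exclusive:
                self._exclusive = False
                # Hand the lock to every waiting sharer in one step.
                self._sharers += self._shr_wait
                self._admitted = self._shr_wait
                self._shr_wait = 0
            else:
                self._sharers -= 1
            self._cond.notify_all()
```

With this bookkeeping, a fast sharer releasing early only lowers `_sharers` by one; the slower sharers are still counted, so the lock is never seen as free while some of them are pending a wakeup.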
-
- Mar 11, 2008
-
-
Guido Trotter authored
The -s option when changing secondary node on a drbd template is implied, and thus optional. Specify this in the manpage. Reviewed-by: iustinp
-
Iustin Pop authored
Make the cluster init fail if the IP to which the cluster name resolved is already reachable by the master node. This is not a foolproof solution, but it allows a cheap method of detecting simple mistakes. It will also disallow using the master node name as cluster name (which is a good thing). The only drawbacks that I see are:
- you are not allowed to do this, which might come in handy in cluster upgrades; but since we support rename, this is mitigated
- cluster init takes longer now (plus the timeout value, set to 5 seconds), but since this is a one-off operation, it should be ok
Reviewed-by: ultrotter
-
Iustin Pop authored
This patch modifies TcpPing and its callers to make the source address selection optional. Usually the kernel knows best which source address to use; only in some cases do we want to enforce a given source address, so it makes sense to make this optional. Reviewed-by: ultrotter
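A TCP reachability check with an optional source address could look like the sketch below. It is not the actual TcpPing implementation: the name `tcp_ping`, the argument order, and the default timeout are assumptions chosen for the example.

```python
import socket


def tcp_ping(target, port, timeout=5.0, source=None):
    """Return True if a TCP connection to (target, port) succeeds.

    If `source` is given, the socket is bound to that address first;
    otherwise the kernel picks the source address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source is not None:
            # Port 0 lets the kernel choose an ephemeral source port.
            sock.bind((source, 0))
        sock.connect((target, port))
        return True
    except OSError:
        # Covers refusal, unreachable hosts, timeouts and bind failures.
        return False
    finally:
        sock.close()
```

Callers that do not care about the source simply omit the argument, while a caller that needs to test reachability from one specific interface passes it explicitly.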
-
- Mar 06, 2008
-
-
Guido Trotter authored
The "quick" online help just reported the option to change secondary node. Add the ones to just replace the disk locally on-primary or on-secondary. It is of course impossible to express in one line everything needed to use this command, but at least now the most common options are spelled out immediately. Reviewed-by: iustinp, imsnah
-
- Mar 05, 2008
-
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
- Mar 04, 2008
-
-
Guido Trotter authored
A LockSet is mostly useful when it has some locks in it. On the other hand there are cases in which it must function even when empty. For example if a cluster has no instances in it there's no reason why locking all of them shouldn't work anyway. This patch adds test code for that situation and implements the necessary fixes to make it work. Reviewed-by: imsnah
-
Guido Trotter authored
This check that no operation had been performed before release() was missing in the test code. Adding it. Reviewed-by: imsnah
-
Guido Trotter authored
Reviewed-by: imsnah
-
Guido Trotter authored
r644 contained some cleanup code for LockSet. Among other things it removed a syntax error, which allowed an assertion that previously wasn't really checked to trigger. It turns out that even though the spirit of that assertion was correct, its actual implementation was wrong. While it's true that no sharers must be waiting if an exclusive holder is not present, it might happen that when all the sharers wake up, one of them releases the lock before some other has even had a chance to run. In this case __shr_wait would still be greater than 0, even if the sharer is not actually waiting, just pending a wakeup to proceed. Thus, this patch removes the assertion in question. Reviewed-by: imsnah
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Guido Trotter authored
This new functionality makes it possible to acquire a whole set, by passing "None" to the acquire() function as the list of elements. This will prevent new additions to the set, and then acquire all the current elements. The list of all elements acquired will be returned at the end. Deletions can still happen during the acquire process and we'll deal with them by just skipping the deleted elements: it's effectively as if they were deleted before we called the function. After we've finished, though, we hold all the elements, so no more deletes can be performed before we release them. Any call to release() will then first of all release the "set-level" lock if we're holding it, and then all or some of the locks we have. Some new tests check that this feature works as intended. Reviewed-by: imsnah
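The whole-set acquisition can be illustrated with a much-reduced model. This sketch uses plain `threading.Lock` objects instead of SharedLocks, tracks ownership per object rather than per thread, and releases everything at once; the class name `MiniLockSet` and its attributes are hypothetical.

```python
import threading


class MiniLockSet:
    """Reduced model of a lock set supporting acquire(None)."""

    def __init__(self, names=()):
        self._set_lock = threading.Lock()  # guards additions to the set
        self._locks = {name: threading.Lock() for name in names}
        self._owned = set()
        self._holds_set_lock = False

    def add(self, name):
        # Blocks while another caller holds the whole set.
        with self._set_lock:
            self._locks[name] = threading.Lock()

    def acquire(self, names=None):
        if names is None:
            # Whole-set mode: stop additions, then take every element.
            self._set_lock.acquire()
            self._holds_set_lock = True
            names = list(self._locks)
        else:
            missing = [n for n in names if n not in self._locks]
            if missing:
                raise KeyError(missing)
        acquired = set()
        for name in sorted(names):
            lock = self._locks.get(name)
            if lock is None:
                # Removed meanwhile: as if deleted before the call.
                continue
            lock.acquire()
            acquired.add(name)
        self._owned |= acquired
        return acquired

    def release(self):
        # Set-level lock first, then the element locks.
        if self._holds_set_lock:
            self._holds_set_lock = False
            self._set_lock.release()
        for name in self._owned:
            self._locks[name].release()
        self._owned = set()
```

Note that acquiring the whole of an empty set simply returns an empty set, so a caller locking "all instances" on a cluster without instances needs no special case.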
-
Guido Trotter authored
This patch adds a try/except block around most of the acquire() code (everything after the initial condition checks). Since the except: clause contains just a 'raise', nothing really changes except the indentation of the code. This is done in a separate commit to isolate and make clearer what the real code changes in the upcoming patch are. Reviewed-by: imsnah
-
Guido Trotter authored
Previously the private version of the __names function directly returned a set. We'll keep this in the public interface but change the private version to return a list, in order to be able to sort() its result and then loop on it, even though we'll need to do this with the usual care, since some keys may disappear in between. Reviewed-by: imsnah
-
Guido Trotter authored
Lockset's remove() function used to return a list of locks we failed to remove. Rather than doing this we'll return a list of removed locks, so it's more similar to how acquire() behaves. This patch also fixes the relevant unit tests. Reviewed-by: imsnah
-
Guido Trotter authored
In a LockSet, acquire() returned True on success. This patch changes it to return a set containing the names of the elements acquired. This is still a true value if we acquired any lock, but is slightly more useful (because if needed one has access to this data without querying for it). The only behavioral change occurs when acquiring no locks, which should not normally happen because it has no practical use. The patch also changes some tests to check that the new format is respected. Reviewed-by: imsnah
-
Guido Trotter authored
This patch does not change the functionality of a LockSet. Rather than wrapping the whole for loop in a try, we wrap each of its steps. This opens the way to handling a single failure differently. Reviewed-by: imsnah
-
Guido Trotter authored
Includes some locking-related constants and explanations on how the LockManager should be used, the class itself and its test cases. The class includes:
- a basic constructor
- functions to acquire and release lists of locks at the same level
- functions to add and remove lists of locks at modifiable levels
- dynamic checks against out-of-order acquisitions and other illegal ops
Its testing library checks that the LockManager behaves correctly and that the external assumptions it relies on are respected.
Reviewed-by: imsnah
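The dynamic check against out-of-order acquisitions can be illustrated with a bookkeeping-only model. This sketch holds no real locks and is single-owner; the class name `MiniLockManager`, the numeric levels, and the exception types are assumptions, not the constants or API introduced by the patch.

```python
class MiniLockManager:
    """Bookkeeping-only model of level-ordered lock acquisition."""

    def __init__(self, levels):
        # levels: mapping of numeric level -> iterable of lock names.
        self._names = {level: set(names) for level, names in levels.items()}
        self._held = {}  # level -> set of names currently held

    def acquire(self, level, names):
        # Illegal: taking a level at or below one that is already held.
        blocking = [held for held in self._held if held >= level]
        if blocking:
            raise RuntimeError(
                "out-of-order acquisition: level %s requested while "
                "holding level %s" % (level, max(blocking)))
        unknown = set(names) - self._names[level]
        if unknown:
            raise KeyError(sorted(unknown))
        self._held[level] = set(names)
        return set(names)

    def release(self, level):
        self._held.pop(level, None)


if __name__ == "__main__":
    manager = MiniLockManager({0: ["cluster"], 1: ["node1"], 2: ["inst1"]})
    print(manager.acquire(1, ["node1"]))
```

Enforcing one fixed order across levels is what rules out two callers each holding one level while waiting for the other.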
-
- Feb 29, 2008
-
-
Iustin Pop authored
Currently the cluster destroy doesn't remove the master role, which means that the IP address of the cluster remains assigned to the master node. This patch fixes this and also a docstring in backend.StopMaster(). Reviewed-by: imsnah
-
Iustin Pop authored
Reviewed-by: imsnah
-
Iustin Pop authored
This one-liner fixes the cluster rename operation. As a side note, we should have a QA test for this too. Reviewed-by: imsnah
-
- Feb 28, 2008
-
-
Guido Trotter authored
This patch makes acquire() first look up all the locks in the dict and then try to acquire them later. The advantage is that if a lock name is wrong from the start, we won't need to first queue and acquire other locks to find this out. Of course, since there is no locking between the two steps, a delete() could still happen in between, but SharedLocks are safe in this regard and will just make the .acquire() operation fail if this unfortunate condition happens. Since the right way to check that an instance/node exists, and to make sure it won't stop existing after that, is acquiring its lock, this improves the common case (checking for an incorrect name) while not penalizing correctness, or performance as would happen if we kept a lock for the whole process. Reviewed-by: iustinp
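The two-step approach can be sketched as a standalone function. It uses plain `threading.Lock` objects, so the case of a lock being deleted between the two steps is not modeled; the function name `acquire_named` is hypothetical.

```python
import threading


def acquire_named(locks, names):
    """Resolve every name first, then acquire; return the acquired names.

    `locks` maps names to threading.Lock objects.
    """
    # Step 1: look everything up. A wrong name fails here, before any
    # lock has been queued for or taken.
    resolved = []
    for name in sorted(names):
        if name not in locks:
            raise KeyError("unknown lock name: %s" % name)
        resolved.append((name, locks[name]))
    # Step 2: acquire in the already-sorted order.
    for _name, lock in resolved:
        lock.acquire()
    return {name for name, _lock in resolved}


if __name__ == "__main__":
    table = {"inst1": threading.Lock(), "inst2": threading.Lock()}
    print(sorted(acquire_named(table, ["inst2", "inst1"])))
```

A misspelled name therefore costs one dictionary lookup instead of a wait on unrelated locks followed by a rollback.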
-
Guido Trotter authored
A LockSet represents locking for a set of resources of the same type. A thread can acquire multiple resources at the same time, and release some or all of them, but cannot acquire more resources incrementally at different times without releasing all of them in between. Internally a LockSet uses a SharedLock for each resource to be able to grant both exclusive and shared acquisition. It also supports safe addition and removal of resources at runtime. Acquisitions are ordered alphabetically to guarantee that they are deadlock-free. A lot of assumptions about how the code interacts are made in order to guarantee both safety and speed; in order to document all of them the code features pretty lengthy comments. The test suite tries to catch the most common interactions but cannot really test tight race conditions, for which we still need to rely on human checking. This is the second basic building block for the Ganeti Lock Manager. Instance and Node locks will be put in LockSets to manage their acquisition and release. Reviewed-by: imsnah
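Why alphabetical ordering prevents deadlock can be shown with a small experiment: two threads ask for the same pair of locks in opposite orders, yet both always take them in sorted order and so never wait on each other in a cycle. This is a sketch with plain `threading.Lock` objects and hypothetical helper names, not the LockSet code.

```python
import threading


def acquire_sorted(locks, names):
    """Acquire the named locks in alphabetical order; return that order."""
    ordered = sorted(set(names))
    for name in ordered:
        locks[name].acquire()
    return ordered


def release_all(locks, ordered):
    """Release locks taken by acquire_sorted, in reverse order."""
    for name in reversed(ordered):
        locks[name].release()


def run_demo(rounds=200):
    """Run two workers with opposite request orders; return the counter."""
    locks = {"a": threading.Lock(), "b": threading.Lock()}
    counter = [0]

    def worker(names):
        for _ in range(rounds):
            held = acquire_sorted(locks, names)
            counter[0] += 1  # protected: both locks are held here
            release_all(locks, held)

    threads = [
        threading.Thread(target=worker, args=(["a", "b"],)),
        threading.Thread(target=worker, args=(["b", "a"],)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter[0]
```

If each worker instead acquired in the order it was given, one could hold "a" while the other holds "b", and neither would ever proceed.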
-
Guido Trotter authored
Some options were missing in the gnt-cluster init man page. This patch adds them, removes an empty line, and clarifies a bit more some requirements. Reviewed-by: schreiberal
-
Guido Trotter authored
Even if the target instance is down or we are not checking for IP conflicts, changing an instance name to one which is already in the cluster is doomed to fail, because in a lot of places (including the minds of most users/admins) instance names are assumed to be unique. Reviewed-by: imsnah
-
Alexander Schreiber authored
Reviewed-by: imsnah
-
- Feb 27, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
- Feb 26, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-