Commits · 0214b0c0859cef8c17216d912d4e2b1271b812d2 · itminedu / snf-ganeti

Mar 19, 2008

Make ganeti-noded create BDEV_CACHE_DIR automatically · 0214b0c0

Iustin Pop authored 17 years ago

Currently, in order to deal with a tmpfs /var/run, we create the
BDEV_CACHE_DIR in the init script. However, that does not cover all
cases, and it is not the proper place to deal with it: for example,
handling uninitialized clusters and the master node is more complicated.

Therefore, this patch does:
  - make ganeti-noded create the directory automatically
  - make ganeti-noded error out if it can't create it or it's already
    there but not a directory
  - remove the creation from the init.d script

Reviewed-by: ultrotter

0214b0c0
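
The behaviour described in this commit (create the directory, fail if that is impossible or if the path exists but is not a directory) could look roughly like the sketch below. The helper name `ensure_dir`, the default mode and the use of `RuntimeError` are assumptions made for illustration; they are not taken from the ganeti-noded source.

```python
import errno
import os
import tempfile


def ensure_dir(path, mode=0o700):
    """Create `path` as a directory, or raise if that cannot be done.

    Hypothetical helper: name, mode and exception type are illustrative.
    """
    try:
        os.mkdir(path, mode)
    except OSError as err:
        # An existing path is acceptable for now; anything else is fatal.
        if err.errno != errno.EEXIST:
            raise RuntimeError("Cannot create %s: %s" % (path, err))
    # The path exists at this point, but it may be a file or a socket.
    if not os.path.isdir(path):
        raise RuntimeError("%s exists but is not a directory" % path)
```

Calling the helper at daemon startup, rather than in the init script, means the check also runs when the daemon is started by other means.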

Mar 18, 2008

Use constants for “ssh” and “scp” binaries instead of magic values · fff33d70
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
fff33d70
Use new ssh.WriteKnownHostsFile function · f408b346
Michael Hanselmann authored 17 years ago
```
This replaces very old code.

Reviewed-by: ultrotter
```
f408b346
Use new cluster alias in known_hosts file · 1ff08570
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
1ff08570
Use new “tty” parameter on SshRunner.BuildCmd for “gnt-instance console” · b047857b
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
b047857b
Add “tty” parameter to SshRunner.BuildCmd · 8f07f831
Michael Hanselmann authored 17 years ago
```
This allows callers to allocate a pseudo-TTY easily.

Reviewed-by: ultrotter
```
8f07f831
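
As a rough illustration of the “tty” parameter, the sketch below builds an ssh argument list and adds OpenSSH's `-t` option when a pseudo-TTY is requested. The function name `build_ssh_cmd` and its signature are hypothetical; the real SshRunner.BuildCmd takes other parameters that are not shown here.

```python
def build_ssh_cmd(hostname, user, command, tty=False):
    """Return an ssh argv list (hypothetical, simplified helper)."""
    argv = ["ssh"]
    if tty:
        # -t asks ssh to allocate a pseudo-TTY on the remote side,
        # which interactive commands such as a console need.
        argv.append("-t")
    argv.append("%s@%s" % (user, hostname))
    argv.append(command)
    return argv
```

A caller that needs an interactive session would pass `tty=True`; batch callers keep the default.
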
Order SSH options alphabetically · bf3d57b8
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
bf3d57b8

Move SSH functions into a class · c92b310a

Michael Hanselmann authored 17 years ago

This renames some functions and does some minor codestyle cleanup.

Reviewed-by: ultrotter

c92b310a

Add function to write cluster SSH key to known_hosts file · 75a5f456

Michael Hanselmann authored 17 years ago

The whole Ganeti cluster has a single SSH key. Its fingerprint is
written to Ganeti's known_hosts file, together with an alias. This
allows us to always use that alias instead of the real hostname,
making management of the known_hosts file much easier.

This patch does not handle an upgrade from an earlier version.

Reviewed-by: ultrotter

75a5f456
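
The alias idea can be sketched as follows: one known_hosts entry is written under the alias, and every ssh invocation is told to look up that alias instead of the real hostname. `HostKeyAlias` and `UserKnownHostsFile` are standard OpenSSH options; the function names, the example alias and the file path are assumptions for illustration only.

```python
def known_hosts_line(alias, key_type, public_key):
    """Format one known_hosts entry: host pattern, key type, base64 key."""
    return "%s %s %s\n" % (alias, key_type, public_key)


def ssh_alias_options(alias, known_hosts_path):
    """ssh options that make host key checks use the alias entry."""
    return [
        "-o", "HostKeyAlias=%s" % alias,
        "-o", "UserKnownHostsFile=%s" % known_hosts_path,
    ]
```

With this layout the known_hosts file holds a single entry no matter how many nodes the cluster has.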

Locking: remove an empty space at End Of Line · 21a6c826
Guido Trotter authored 17 years ago
```
Reviewed-by: imsnah
```
21a6c826

Increase SharedLock fairness · 4d686df8

Guido Trotter authored 17 years ago

Previously if a shared thread was notified, together with the rest, and was not
fast enough in waking up and acquiring the lock, another one could release it,
decide there were no more sharers, and let an exclusive one in instead. With
this patch we make sure all the shared holders which were waiting have passed,
before declaring it's time to make an exclusive one pass.

This also allows us to reintroduce a slight variation of the assertion removed
in r665, which makes our code safer.

Reviewed-by: imsnah

4d686df8

Mar 11, 2008

Specify better gnt-instance(8) replace-disks · 6536dfa1

Guido Trotter authored 17 years ago

The -s option when changing secondary node on a drbd template is implied, and
thus optional. Specify this in the manpage.

Reviewed-by: iustinp

6536dfa1

Disable cluster init with a reachable IP · 411f8ad0

Iustin Pop authored 17 years ago

Make the cluster init fail if the IP to which the cluster name resolved
is already reachable by the master node. This is not a foolproof
solution, but it allows a cheap method of detecting simple mistakes.

It will also disallow using the master node name as cluster name (which
is a good thing).

The only drawbacks that I see are:
  - you can no longer init a cluster on an already reachable IP, which
    might come in handy in cluster upgrades; but since we support
    rename, this is mitigated
  - cluster init takes longer now (+the timeout value, set to 5
    seconds), but since this is a one-off operation, it should be ok

Reviewed-by: ultrotter

411f8ad0

Modify utils.TcpPing to make source address optional · b15d625f

Iustin Pop authored 17 years ago

This patch modifies TcpPing and its callers to make the source address
selection optional. Usually the kernel knows best which source address
to use; only in some cases do we want to enforce a given source
address, so it makes sense to make this optional.

Reviewed-by: ultrotter

b15d625f
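
A minimal sketch of a TCP reachability check with an optional source address is shown below. It is not the actual utils.TcpPing: the name `tcp_ping`, the parameter order and the IPv4-only socket are assumptions. When `source` is None the socket is left unbound, so the kernel chooses the source address.

```python
import socket


def tcp_ping(target, port, timeout=5.0, source=None):
    """Return True if a TCP connection to (target, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        if source is not None:
            # Enforce the source address; port 0 lets the kernel pick a port.
            sock.bind((source, 0))
        try:
            sock.connect((target, port))
        except OSError:
            # Covers timeouts, refused connections and unreachable hosts.
            return False
        return True
    finally:
        sock.close()
```

Errors from `bind()` are deliberately not swallowed in this sketch, since an unusable source address is a configuration problem rather than an unreachable target.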

Mar 06, 2008

Fix gnt-instance replace-disks online help · 457697bc

Guido Trotter authored 17 years ago

The "quick" online help just reported the option to change secondary node. Add
the ones to just replace the disk locally on-primary or on-secondary. It is of
course impossible to express in one line everything needed to use this command,
but at least now the most common options are spelled out immediately.

Reviewed-by: iustinp, imsnah

457697bc

Mar 05, 2008

Replace custom file writing code with utils.WriteFile · 41a57aab
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
41a57aab
Codestyle fixes: adding a few empty lines · 7c0d6283
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
7c0d6283

Mar 04, 2008

LockSet: handle empty case · b2dabfd6

Guido Trotter authored 17 years ago

A LockSet is mostly useful when it has some locks in it. On the other hand
there are cases in which it must function even when empty. For example if a
cluster has no instances in it there's no reason why locking all of them
shouldn't work anyway. This patch adds test code for that situation and
implements the necessary fixes to make it work.

Reviewed-by: imsnah

b2dabfd6

LockSet: add missing check code · b5c0e9d9

Guido Trotter authored 17 years ago

This check that no operation had been performed before release() was missing in
the test code. Adding it.

Reviewed-by: imsnah

b5c0e9d9

LockSet: collapse two try/except into one · ea3f80bf
Guido Trotter authored 17 years ago
```
Reviewed-by: imsnah
```
ea3f80bf

SharedLock: remove wrong assertion in code · 9a39f854

Guido Trotter authored 17 years ago

r644 contained some cleanup code for LockSet. Among other things it removed a
syntax error that allowed an assertion that previously wasn't really checked to
trigger. It turns out that even though the spirit of that assertion was correct
its actual implementation was wrong.

While it's true that no sharers must be waiting if an exclusive holder is not
present, it might happen that when all the sharers wake up, one of them releases
the lock before some other has even had a chance to run. In this case,
__shr_wait would still be greater than 0, even if the sharer is not actually
waiting, just pending a wakeup to proceed.

Thus, removing the assertion in question.

Reviewed-by: imsnah

9a39f854

Codestyle updates for locking code · cdb08f44
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
cdb08f44

LockSet: make acquire() able to get the whole set · 3b7ed473

Guido Trotter authored 17 years ago

This new functionality makes it possible to acquire a whole set, by passing
"None" to the acquire() function as the list of elements. This will avoid new
additions to the set, and then acquire all the current elements. The list of
all elements acquired will be returned at the end.

Deletions can still happen during the acquire process and we'll deal with it by
just skipping the deleted elements: it's effectively as if they were deleted
before we called the function. After we've finished though we hold all the
elements, so no more deletes can be performed before we release them.

Any call to release() will then first of all release the "set-level" lock if
we're holding it, and then all or some of the locks we have.

Some new tests check that this feature works as intended.

Reviewed-by: imsnah

3b7ed473
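
The "acquire the whole set" idea can be illustrated with the simplified sketch below: a set-level lock blocks additions, after which every current element is locked in sorted order. `TinyLockSet` is a hypothetical, single-owner toy that uses plain `threading.Lock` objects; it does not model shared acquisition, concurrent deletions or per-thread ownership, all of which the real LockSet has to handle.

```python
import threading


class TinyLockSet(object):
    """Single-owner toy model; not Ganeti's LockSet."""

    def __init__(self, names):
        self._set_lock = threading.Lock()  # guards additions to the set
        self._locks = dict((name, threading.Lock()) for name in names)
        self._owned = []
        self._owns_set_lock = False

    def add(self, name):
        # Blocks for as long as somebody holds the whole set.
        with self._set_lock:
            self._locks[name] = threading.Lock()

    def acquire(self, names=None):
        if names is None:
            # Whole-set mode: stop additions first, then take every element.
            self._set_lock.acquire()
            self._owns_set_lock = True
            names = list(self._locks)
        for name in sorted(names):
            self._locks[name].acquire()
            self._owned.append(name)
        return set(self._owned)

    def release(self):
        # Release the set-level lock first, then the element locks.
        if self._owns_set_lock:
            self._owns_set_lock = False
            self._set_lock.release()
        while self._owned:
            self._locks[self._owned.pop()].release()
```

While the whole set is held, a concurrent `add()` simply waits; it proceeds as soon as `release()` drops the set-level lock.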

LockSet: encapsulate acquire() in try-except · 806e20fd

Guido Trotter authored 17 years ago

This patch adds a try/except area around most of the acquire() code (everything
after the initial condition checks). Since the except: clause contains just a
'raise' nothing really changes except the indentation of the code.

This is done in a separate commit to isolate and make clearer what the real
code changes in the upcoming patch are.

Reviewed-by: imsnah

806e20fd

Make LockSet.__names() return a list, not a set · 0cf257c5

Guido Trotter authored 17 years ago

Previously the private version of the __names function returned a set directly.
We'll keep this in the public interface but change the private version to a
list in order to be able to sort() its result and then loop on it, even though
we'll need to do this with the usual care that some keys may disappear in
between.

Reviewed-by: imsnah

0cf257c5

LockSet: improve remove() api · 3f404fc5

Guido Trotter authored 17 years ago

LockSet's remove() function used to return a list of locks we failed to remove.
Rather than doing this we'll return a list of removed locks, so it's more
similar to how acquire() behaves. This patch also fixes the relevant unit tests.

Reviewed-by: imsnah

3f404fc5

LockSet: make acquire() return the set of names · 0cc00929

Guido Trotter authored 17 years ago

In a LockSet acquire() returned True on success. This code changes that to
return a set containing the names of the elements acquired. This is still a
true value if we acquired any lock but is slightly more useful (because if
needed one has access to this data without querying for it). The only
behavioral change occurs when no locks are acquired, a usage which should not
normally happen because it has no practical use.

The patch also changes some tests to check that the new format is respected.

Reviewed-by: imsnah

0cc00929

LockSet: invert try/for nesting in acquire() · 8b68f394

Guido Trotter authored 17 years ago

This patch changes nothing in the functionality of a LockSet. Rather than
wrapping the whole for loop in a try block, we wrap each of its steps. This
opens the way to handling a single failure differently.

Reviewed-by: imsnah

8b68f394

Initial GanetiLockManager implementation · 7ee7c0c7

Guido Trotter authored 17 years ago

Includes some locking-related constants and explanations on how the
LockManager should be used, the class itself and its test cases.

The class includes:
  - a basic constructor
  - functions to acquire and release lists of locks at the same level
  - functions to add and remove lists of locks at modifiable levels
  - dynamic checks against out-of-order acquisitions and other illegal ops

Its testing library checks that the LockManager behaves correctly and that the
external assumptions it relies on are respected.

Reviewed-by: imsnah

7ee7c0c7

Feb 29, 2008

Fix master role stop on cluster destroy · c9064964

Iustin Pop authored 17 years ago

Currently the cluster destroy doesn't remove the master role, which
means that the IP address of the cluster remains assigned to the master
node.

This patch fixes this and also a docstring in backend.StopMaster().

Reviewed-by: imsnah

c9064964

Implement QA tests for gnt-cluster rename · caea3b32
Iustin Pop authored 17 years ago
```
Reviewed-by: imsnah
```
caea3b32

Fix cluster rename operation · 488b540d

Iustin Pop authored 17 years ago

This one-liner fixes the cluster rename operation. As a side note, we
should have a QA test for this too.

Reviewed-by: imsnah

488b540d

Feb 28, 2008

LockSet: make acquire() fail faster on wrong locks · e6c200d6

Guido Trotter authored 17 years ago

This patch makes acquire() first look up all the locks in the dict and then try
to acquire them later. The advantage is that if a lock name is wrong from the
start, we won't need to first queue and acquire other locks to
find this out.

Of course since there is no locking between the two steps a delete() could
still happen in between, but SharedLocks are safe in this regard and will just
make the .acquire() operation fail if this unfortunate condition happens.

Since the right way to check if an instance/node exists and make sure it won't
stop existing after that is acquiring its lock, this improves the common case
(checking for an incorrect name) while not penalizing correctness, or
performance as would happen if we kept a lock for the whole process.

Reviewed-by: iustinp

e6c200d6
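
The two-step approach (look up every lock first, acquire afterwards) might be sketched like this. It uses a plain dict of `threading.Lock` objects rather than SharedLocks, and the function name `acquire_all` is hypothetical.

```python
import threading


def acquire_all(locks, names):
    """Acquire the locks for `names`; fail early on an unknown name."""
    # Step 1: look up every lock. An unknown name raises KeyError here,
    # before any lock has been taken or queued for.
    wanted = [(name, locks[name]) for name in sorted(names)]
    # Step 2: acquire them; undo everything if something goes wrong.
    acquired = []
    try:
        for name, lock in wanted:
            lock.acquire()
            acquired.append((name, lock))
    except BaseException:
        for _, lock in reversed(acquired):
            lock.release()
        raise
    return set(name for name, _ in acquired)
```

A misspelled name therefore costs one dictionary lookup instead of a wait on unrelated locks.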

LockSet implementation and unit tests · aaae9bc0

Guido Trotter authored 17 years ago

A LockSet represents locking for a set of resources of the same type. A thread
can acquire multiple resources at the same time, and release some or all of
them, but cannot acquire more resources incrementally at different times
without releasing all of them in between.

Internally a LockSet uses a SharedLock for each resource to be able to grant
both exclusive and shared acquisition. It also supports safe addition and
removal of resources at runtime. Acquisitions are ordered alphabetically in
order to guarantee that they are deadlock-free. A lot of assumptions about how
the code interacts are made in order to guarantee both safety and speed; in
order to document all of them the code features pretty lengthy comments.

The test suite tries to catch most common interactions but cannot really test
tight race conditions, for which we still need to rely on human checking.

This is the second basic building block for the Ganeti Lock Manager. Instance
and Node locks will be put in LockSets to manage their acquisition and release.

Reviewed-by: imsnah

aaae9bc0
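
Why alphabetical ordering avoids deadlocks can be seen in a small sketch: if every thread takes the locks it needs in the same global order, no two threads can each hold a lock the other is waiting for. The helper below is hypothetical and uses plain `threading.Lock` objects instead of SharedLocks.

```python
import threading


def with_sorted_locks(locks, names, action):
    """Run `action()` while holding the locks for `names`.

    Locks are always taken in sorted name order, regardless of the order
    in which the caller listed them.
    """
    ordered = sorted(names)
    for name in ordered:
        locks[name].acquire()
    try:
        return action()
    finally:
        for name in reversed(ordered):
            locks[name].release()
```

Two threads asking for `["a", "b"]` and `["b", "a"]` both end up locking "a" first, so one of them simply waits for the other to finish.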

Fix the gnt-cluster init man page · f3b100e1

Guido Trotter authored 17 years ago

Some options were missing in the gnt-cluster init man page.  This patch adds
them, removes an empty line, and clarifies a bit more some requirements.

Reviewed-by: schreiberal

f3b100e1

Don't allow renaming to an existing instance · 7bde3275

Guido Trotter authored 17 years ago

Even if the target instance is down or we are not checking for IP conflicts
changing an instance name to a new one which is already in the cluster is
doomed to fail, because in a lot of places (among which figures the mind of
most users/admins) instance names are assumed to be unique.

Reviewed-by: imsnah

7bde3275

Clarify online help for xc-instance reinstall. · 5336d63d
Alexander Schreiber authored 17 years ago
```
Reviewed-by: imsnah
```
5336d63d

Feb 27, 2008

Use constants.ETC_HOSTS instead of string for /etc/hosts · 107711b0
Michael Hanselmann authored 17 years ago
```
Reviewed-by: iustinp
```
107711b0
Distribute lib/locking.py · 7324ad4c
Michael Hanselmann authored 17 years ago
```
Reviewed-by: ultrotter
```
7324ad4c

Feb 26, 2008

Split GanetiUnitTest into testutils.py · c9c4f19e
Michael Hanselmann authored 17 years ago
```
Reviewed-by: iustinp
```
c9c4f19e