Commits · bd028152c6323f4d3107d123cbf220ca37c42465 · itminedu / snf-ganeti

May 29, 2008

Documentation: cleanup of local/remote_raid1 · bd028152

Iustin Pop authored 16 years ago

Since we have removed support for local and remote raid1, update the man
pages and guides to reflect the new situation.

Reviewed-by: imsnah

bd028152

May 24, 2008

Distribute dumb-allocator in examples · 447b2066

Guido Trotter authored 16 years ago

When creating the ganeti tarball the dumb allocator was left out.
Shipping it alongside the other examples.

Reviewed-by: iustinp

447b2066

May 15, 2008

Update command line help and manpages with mandatory options · bdb7d4e8

Michael Hanselmann authored 16 years ago

Reviewed-by: ultrotter

bdb7d4e8

document cluster verify --no-nsplus1-mem option · 3cf7c9fa

Guido Trotter authored 16 years ago

Add this recently added option to the gnt-cluster man page before
releasing 1.2.4.

Reviewed-by: imsnah

3cf7c9fa

Fix drbd show parser to handle valueless keywords · 63012024

Guido Trotter authored 16 years ago

It turns out in some cases there can exist keywords without an
associated value exported by drbdsetup show. This patch makes the value
part optional in our parser, so that if it's not present the parsing
result will contain an array with just the keyword in it. This is not a
problem since we check all keyword names before accessing their values,
so we won't mistakenly try to access the value of a valueless keyword.

Reviewed-by: iustinp

63012024
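
The optional-value idea in the commit above can be sketched as follows. This is a minimal illustration only: the line format ("keyword value;" or "keyword;"), the regular expression and the function names are assumptions made for the example, not the parser Ganeti actually uses.

```python
import re

# Assumed line format for this sketch: "keyword value;" or just "keyword;".
_LINE_RE = re.compile(r"^\s*([\w-]+)(?:\s+(.+?))?\s*;\s*$")


def parse_show_line(line):
    """Return [keyword, value], or [keyword] when no value is present."""
    match = _LINE_RE.match(line)
    if match is None:
        raise ValueError("unparsable line: %r" % line)
    keyword, value = match.groups()
    return [keyword] if value is None else [keyword, value]


def lookup(parsed_lines, name):
    """Check the keyword name first; only then read the value, if any."""
    for entry in parsed_lines:
        if entry[0] == name:
            return entry[1] if len(entry) > 1 else None
    return None
```

Because callers match on the keyword name before touching the value, a one-element result never causes an out-of-range access.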

Split drbd command creation and execution · 333411a7

Guido Trotter authored 16 years ago

Make _AssembleDisk more similar to _AssembleNet by splitting the
generation of the drbdsetup command and its execution. While not
changing anything this makes it easier to manipulate the command just in
certain cases, which in the future we'll need to do.

Reviewed-by: iustinp

333411a7

May 13, 2008

Small style fixes · 8d59409f

Iustin Pop authored 16 years ago

[Trunk version]

Reviewed-by: imsnah

8d59409f

Implement node daemon connectivity tests · 9d4bfc96

Iustin Pop authored 16 years ago

This patch adds to gnt-cluster verify a set of inter-node TCP
communication checks on the node daemon port, for both the primary and
(if defined) secondary networks.

The output looks like (4-node cluster, one with the secondary interface
down):
* Verifying node node1.example.com
  - ERROR: tcp communication with node 'node3.example.com': failure using the secondary interface(s)
* Verifying node node2.example.com
  - ERROR: tcp communication with node 'node3.example.com': failure using the secondary interface(s)
* Verifying node node3.example.com
  - ERROR: tcp communication with node 'node1.example.com': failure using the secondary interface(s)
  - ERROR: tcp communication with node 'node2.example.com': failure using the secondary interface(s)
  - ERROR: tcp communication with node 'node4.example.com': failure using the secondary interface(s)
* Verifying node node4.example.com
  - ERROR: tcp communication with node 'node3.example.com': failure using the secondary interface(s)

Reviewed-by: imsnah

9d4bfc96
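
A rough sketch of the kind of check described above, assuming a plain TCP connect is enough to decide reachability. The function names, the shape of the peers mapping and the "primary"/"secondary" labels are hypothetical and chosen only for this example.

```python
import socket


def tcp_ping(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_peers(peers, port):
    """Check every peer on its primary and (if defined) secondary address.

    peers maps a node name to (primary_ip, secondary_ip or None).
    Returns a list of (node name, interface label) pairs that failed.
    """
    failures = []
    for name, (primary_ip, secondary_ip) in sorted(peers.items()):
        if not tcp_ping(primary_ip, port):
            failures.append((name, "primary"))
        if secondary_ip is not None and not tcp_ping(secondary_ip, port):
            failures.append((name, "secondary"))
    return failures
```

Each node would run such a check against all other nodes and report the failures back, which matches the per-node error lines in the sample output.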

Forward-port changes made to readd in 1.2 · 102b115b

Michael Hanselmann authored 16 years ago

qa_node.py: Fix typo in message
cmdlib.py: Don't add readded node to node list
ganeti-qa.py: Make sure readd isn't done for master node

Reviewed-by: iustinp

102b115b

CLI: retry: remove command opts/args in "gnt-X" · 4e713df6

Iustin Pop authored 16 years ago

This new version of the patch removes only the listing of the usage in
the "gnt-X" list, but keeps the strings in since we'll want to enhance
and use them in "gnt-X $cmd --help".

Reviewed-by: ultrotter

4e713df6

Revert "CLI: remove command opts/args in "gnt-X"" · 9a033156

Iustin Pop authored 16 years ago

This reverts commit 976.

Reviewed-by: ultrotter

9a033156

CLI: remove command opts/args in "gnt-X" · 57d0151e

Iustin Pop authored 16 years ago

[Forward-port of the 1.2 branch patch]

This patch removes all the parameters and options from the output of
"gnt-X" (i.e. the subcommand list for the command). This is done to make
the output uniform: currently only some parameters are shown, and they
are not always consistent (e.g. required versus important parameters).

Reviewed-by: ultrotter

57d0151e

Watcher: do not activate disks for started instances · eee1fa2d

Iustin Pop authored 16 years ago

Currently the watcher first runs the instance startup and then the
boot-id method of disk reactivation. However, regardless of whether a
node has rebooted or not, if we just started an instance, there's no
need for its disks to be activated again, since the instance start has
already done that (if it is at all possible).

The patch modifies the watcher to remember all started instances and not
run activate-disks for them.

Reviewed-by: ultrotter

eee1fa2d

Watcher: do not activate disks for admin_down · 0c0f834d

Iustin Pop authored 16 years ago

Currently the watcher does activate disks (via bootid mechanisms) even
for admin_down instances.  This patch logs and skips over these
instances.

Reviewed-by: ultrotter

0c0f834d
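
The two watcher changes above can be sketched together as follows. This is a simplified model, not the watcher's code: the instance dictionaries, their keys ("should_run" standing in for the admin state) and the two callbacks are assumptions made for illustration.

```python
def watcher_pass(instances, rebooted_nodes, start_instance, activate_disks):
    """Run one simplified watcher pass and return the started instances.

    instances: list of dicts with 'name', 'node', 'should_run', 'running'.
    rebooted_nodes: set of node names whose boot ID has changed.
    """
    started = set()
    for inst in instances:
        if inst["should_run"] and not inst["running"]:
            start_instance(inst["name"])
            started.add(inst["name"])

    for inst in instances:
        if inst["node"] not in rebooted_nodes:
            continue
        if inst["name"] in started:
            continue  # starting the instance already activated its disks
        if not inst["should_run"]:
            continue  # admin_down: skip disk activation
        activate_disks(inst["name"])
    return started
```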

Reduce chance of ssh failures in verify cluster · b544cfe0

Iustin Pop authored 16 years ago

The cluster verify builds a sorted list of nodes and passes that to all
the nodes (in parallel) for ssh checks. This means that for a cluster
with N nodes, there will be approximately N simultaneous connections to
the first node, then to the second node, etc. This, coupled with the
ssh daemon's “MaxStartups” parameter, can create false alarms about ssh
connectivity.

This patch randomizes the node list in the backend (so each node should
have its own order of ssh-ing to the other nodes), which should reduce
the chance of these alarms.

Reviewed-by: ultrotter

b544cfe0
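
A minimal sketch of the randomization described above, using the standard library shuffle. The function name and the optional rng parameter are assumptions for the example.

```python
import random


def ssh_check_order(node_names, rng=random):
    """Return a randomly ordered copy of the node list for ssh checks."""
    order = list(node_names)  # copy, so the caller's sorted list is untouched
    rng.shuffle(order)
    return order
```

Since every node shuffles independently, the connections are spread over all targets instead of hitting the first node of the sorted list at the same moment.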

May 12, 2008

bdev: always log command output if it failed · 6c896e2f

Iustin Pop authored 16 years ago

Currently many error handling code paths in bdev.py log only
result.fail_reason (i.e. exit code or signal that killed the command)
but not its output. This makes debugging very hard.

The patch changes all places where we only log fail_reason to also log
result.output.

Reviewed-by: ultrotter

6c896e2f
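
As an illustration of the logging change above, the sketch below logs both the failure reason and the captured output of a failed command. It uses the standard subprocess module rather than Ganeti's own command runner, and the function name is hypothetical.

```python
import logging
import subprocess


def run_logged(args):
    """Run a command; on failure log the exit code and its output."""
    proc = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stderr together with stdout
        text=True,
    )
    if proc.returncode != 0:
        logging.error(
            "command %s failed (exit code %s), output: %s",
            args, proc.returncode, proc.stdout,
        )
    return proc
```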

May 10, 2008

DRBD: Fix another bug in diskless activation · ab6cc81c

Iustin Pop authored 17 years ago

DRBD8 requires that we pass ‘--create-device’ to the first command that
wants to activate a new DRBD minor. We do this currently when we run the
“drbdsetup ... disk” command which we run before the network setup.

But if the LVs are missing, we skip the ‘disk’ subcommand and run only
the ‘net’ one, so it might be that the activation fails because the
minor we selected was never created in the first place.

The patch adds the required parameter to the DRBD8._AssembleNet() call.
Since it's a no-op for existing minors, it should not create any
problems (tested and works both with configured and unconfigured
minors).

Reviewed-by: ultrotter

ab6cc81c

May 09, 2008

Remove utils.CheckDaemonAlive and use “xm info” instead · e3e66f02

Michael Hanselmann authored 17 years ago

There are a couple of reasons for doing so:
- /proc is not OS independent, it's only supported by Linux (there are
  emulations on other systems, but those might differ from the way
  Linux represents data).
- Checking a daemon's state doesn't necessarily mean it's usable.
  Connecting to the socket using “xm info” is much safer.
- Reduce code size.

Reviewed-by: iustinp

e3e66f02

May 08, 2008

Improve DRBD8.Open's docstring a bit more · f860ff4e

Guido Trotter authored 17 years ago

Reviewed-by: iustinp

f860ff4e

Fix comment typo in bdev.py · 7b62772e

Guido Trotter authored 17 years ago

Reviewed-by: iustinp

7b62772e

Fix DRBD8 diskless assembling · bf25af3b

Iustin Pop authored 17 years ago

The algorithm for attaching to existing DRBD devices is not trivial. It
has four alternatives, and there is a bug in the last one when we have
diskless devices.

In the last case (local disk info matches but remote/network
configuration doesn't match), we disconnect from the network and
reattach with the correct info. We do this because a correct local
device has higher priority than the remote device.

However, the test we use (self._MatchesLocal) can succeed in two cases:
  - we have a disk and it's the same as the one attached
  - we don't have a disk and the drbd is in diskless mode

But this creates problems for the fourth case: when we already have
one diskless DRBD, activating the next one will do:
  - _MatchesLocal? yes, because both config data and system have no
    disks (with the effect that all diskless devices are identical)
  - _MatchesRemote? no, because this disk is configured to its current
    remote peer, not to our new one

The fix is trivial, although the algorithm is not: we only allow
overriding the network configuration when the disk information matches
and we are not diskless, by adding the <"local_dev" in info> test.

Reviewed-by: ultrotter

bf25af3b
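
The condition for the fourth case can be sketched as below. Only the "local_dev" key comes from the commit message; the function name and the two boolean arguments (standing in for the results of _MatchesLocal and _MatchesRemote) are assumptions for this example.

```python
def may_override_network(info, matches_local, matches_remote):
    """Decide whether to disconnect and reattach the network configuration.

    info: dict describing the current device state; in this sketch the
    key "local_dev" is present only when a local disk is attached.
    """
    return matches_local and not matches_remote and "local_dev" in info
```

With the extra membership test, two diskless devices no longer look identical: a diskless device whose peer differs is left alone instead of being taken over.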

May 07, 2008

Add unittest for constants · eeb1d86a

Michael Hanselmann authored 17 years ago

Reviewed-by: iustinp

eeb1d86a

Use new ssconf function to check configuration version · 243cdbcc

Michael Hanselmann authored 17 years ago

Upgrades will be handled in future patches.

Reviewed-by: iustinp

243cdbcc

May 06, 2008

Use dict instead of if/elif map for hypervisor classes · a9369c6e

Michael Hanselmann authored 17 years ago

Reviewed-by: iustinp

a9369c6e

Rename hypervisor code to lowercase filenames · a2d32034

Michael Hanselmann authored 17 years ago

Reviewed-by: iustinp

a2d32034

May 05, 2008

Generate devel/upload during build time from template · 94f3875d

Michael Hanselmann authored 17 years ago

- Use variable with prefix instead of grep and sed
- Always run with /bin/bash

Reviewed-by: ultrotter

94f3875d

Export the number of cpus to iallocator scripts · 4337cf1b

Iustin Pop authored 17 years ago

Now that we have the number of cpus available from the hypervisors, we
can export this to the iallocator scripts.

Reviewed-by: ultrotter

4337cf1b

Minor doc/help update · 872c949f

Iustin Pop authored 17 years ago

This shortens the help output in gnt-node so that the output looks
nicer, and improves the manual page for gnt-instance with the new
'status' field.

Reviewed-by: ultrotter

872c949f

Improve the gnt-* list field selection · 48c4dfa8

Iustin Pop authored 17 years ago

This patch allows the '-o' option of the list subcommands to add fields
to the default list, instead of replacing it, when the field list is
prefixed with '+'.

The patch also moves the listing (in the help output) of the default
field list from hardcoded to built at runtime from the actual list.

Reviewed-by: ultrotter

48c4dfa8
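
The '+' prefix behaviour can be sketched as follows. The function names and the field names used in the usage note are placeholders, not the actual Ganeti code or field list.

```python
def select_fields(option, default_fields):
    """Resolve the value of '-o' against the default field list."""
    if option is None:
        return list(default_fields)
    if option.startswith("+"):
        # '+a,b' extends the defaults instead of replacing them
        return list(default_fields) + option[1:].split(",")
    return option.split(",")


def default_fields_help(default_fields):
    """Build the help text from the actual list rather than hardcoding it."""
    return "Default fields: %s" % ",".join(default_fields)
```

For example, with defaults of name and dfree, passing "+ctotal" selects name, dfree and ctotal, while passing "ctotal" selects only ctotal.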

Add node cpu count to gnt-node list · e8a4c138

Iustin Pop authored 17 years ago

This patch adds the backend and frontend changes needed for being able
to list the cpu count.

Reviewed-by: ultrotter

e8a4c138

Wrap exception in _DistributeConfig code · 9ff994da

Guido Trotter authored 17 years ago

nodelist.remove(X) could potentially raise a ValueError (even if the chances
that the current node is not in the list are pretty slim, and its absence
should raise a red flag anyway). If this happens, let things go on, as that's
what the code which previously distributed the config did.

Reviewed-by: iustinp

9ff994da

Simplify target generation in DistributeConfig · 41362e70

Guido Trotter authored 17 years ago

Currently we get the list of nodes and, for each one, extract all its info,
just to exclude it if the name matches ours. Since the list of nodes is a list
of names, just use .remove() to exclude ourselves from it, and use that list
directly.

Reviewed-by: iustinp

41362e70
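
Taken together, the two _DistributeConfig commits above amount to something like the sketch below. The function and parameter names are hypothetical.

```python
def distribution_targets(node_names, my_name):
    """Return the nodes to copy the config to: everyone except ourselves."""
    targets = list(node_names)
    try:
        targets.remove(my_name)
    except ValueError:
        # Unexpected: we are not in the node list. Carry on regardless,
        # as the previous distribution code did.
        pass
    return targets
```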

May 02, 2008

ssconf: update the SetKey docstring · 8498462b

Guido Trotter authored 17 years ago

SetKey is used in a few other cases besides adding new nodes. Update
the docstring to reflect this, so we don't mislead people reading it.

Reviewed-by: iustinp

8498462b

Delete hypervisor.py · 310bbdde

Guido Trotter authored 17 years ago

This completes the changes in r898 by actually getting rid of the old unused
hypervisor.py code which was left in the code tree.

Reviewed-by: iustinp

310bbdde

May 01, 2008

ganeti-masterd: Some docstrings work · ce862cd5

Guido Trotter authored 17 years ago

- Add a docstring to IOServer's constructor
- Add argument description to PoolWorker's and JobRunner's ones

Reviewed-by: iustinp

ce862cd5

locking: remove obsolete comment · dcf315e2

Guido Trotter authored 17 years ago

Reviewed-by: iustinp

dcf315e2

Apr 30, 2008

Remove deprecated disk templates from doc · 808753d4

Manuel Franceschini authored 17 years ago

Since local_raid1 and remote_raid1 are deprecated they are removed
from the docs. This patch removes some old documentation sections
and bumps the documented version from 1.2 to 1.3.

Reviewed-by: iustinp

808753d4

hooks.sgml: Add cluster-verify hooks information · 470e7e06

Guido Trotter authored 17 years ago

Reviewed-by: iustinp

470e7e06

Add cluster-verify hooks · d8fff41c

Guido Trotter authored 17 years ago

Only post-hooks are run on cluster verify, and then their output is sent back
to the LU, which upon failure displays it to the user and changes the result of
the execution to a failure.

Reviewed-by: iustinp

d8fff41c

Add a LU Hooks notification function · 1fce5219

Guido Trotter authored 17 years ago

Previously LUs could be failed by pre-hooks, and post-hooks just had effects by
themselves. This patch allows a LU to define the HooksCallBack function if it
wants to know about its hooks' results and alter its results in response.

The ChainOpCode execution path contains some commented-out hooks code, which
this patch modifies to run the HooksCallBack function, so this is not
forgotten if it ever gets uncommented.

Reviewed-by: iustinp

1fce5219
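
The callback mechanism from the two hooks commits above might look roughly like this. Only the idea of a per-LU callback comes from the commit messages; the class names, the callback signature and the shape of the hook results are assumptions made for the sketch.

```python
class BaseLU:
    """Minimal stand-in for a logical unit (LU)."""

    def run(self):
        return True

    def hooks_callback(self, phase, hook_results, lu_result):
        """Default behaviour: ignore hook results, keep the LU result."""
        return lu_result


class VerifyLU(BaseLU):
    """An LU that lets failed post-hooks turn its result into a failure."""

    def hooks_callback(self, phase, hook_results, lu_result):
        # hook_results: list of (script name, succeeded, output) tuples
        if phase != "post":
            return lu_result
        for script, succeeded, output in hook_results:
            if not succeeded:
                print("ERROR: hook %s failed: %s" % (script, output))
                lu_result = False
        return lu_result


def execute(lu, run_post_hooks):
    """Run the LU, then give it a chance to react to its post-hooks."""
    result = lu.run()
    return lu.hooks_callback("post", run_post_hooks(), result)
```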