- May 29, 2008
-
-
Iustin Pop authored
Since we have removed support for local and remote raid1, update the man pages and guides to reflect the new situation. Reviewed-by: imsnah
-
- May 24, 2008
-
-
Guido Trotter authored
When creating the ganeti tarball the dumb allocator was left out. Ship it alongside the other examples. Reviewed-by: iustinp
-
- May 15, 2008
-
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Guido Trotter authored
Add this recently added option to the gnt-cluster man page before releasing 1.2.4. Reviewed-by: imsnah
-
Guido Trotter authored
It turns out in some cases there can exist keywords without an associated value exported by drbdsetup show. This patch makes the value part optional in our parser, so that if it's not present the parsing result will contain an array with just the keyword in it. This is not a problem since we check all keyword names before accessing their values, so we won't mistakenly try to access the value of a valueless keyword. Reviewed-by: iustinp
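The optional-value behaviour described above can be illustrated with a minimal sketch. This is not the parser from the patch: the `keyword [value];` line format, the regular expression, the function names and the sample keywords are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified statement format: "keyword [value];"
# The value part is optional, mirroring the change described above.
_STMT_RE = re.compile(r"^\s*([\w-]+)(?:\s+(.+?))?\s*;\s*$")


def parse_statement(line):
    """Return [keyword, value], or just [keyword] when no value is present."""
    match = _STMT_RE.match(line)
    if match is None:
        raise ValueError("unparsable statement: %r" % line)
    keyword, value = match.groups()
    if value is None:
        return [keyword]
    return [keyword, value]


def get_value(statements, wanted):
    """Check the keyword name first, and only then access the value."""
    for stmt in statements:
        if stmt[0] == wanted and len(stmt) > 1:
            return stmt[1]
    return None
```

Because callers look up a keyword by name before reading its value, a one-element result for a valueless keyword is never indexed past its end.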
-
Guido Trotter authored
Make _AssembleDisk more similar to _AssembleNet by splitting the generation of the drbdsetup command and its execution. While not changing anything this makes it easier to manipulate the command just in certain cases, which in the future we'll need to do. Reviewed-by: iustinp
-
- May 13, 2008
-
-
Iustin Pop authored
[Trunk version] Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds checks to gnt-cluster verify for inter-node TCP communication on the node daemon port, for both the primary and (if defined) secondary networks. The output looks like this (4-node cluster, one node with the secondary interface down):
  * Verifying node node1.example.com
    - ERROR: tcp communication with node 'node3.example.com': failure using the secondary interface(s)
  * Verifying node node2.example.com
    - ERROR: tcp communication with node 'node3.example.com': failure using the secondary interface(s)
  * Verifying node node3.example.com
    - ERROR: tcp communication with node 'node1.example.com': failure using the secondary interface(s)
    - ERROR: tcp communication with node 'node2.example.com': failure using the secondary interface(s)
    - ERROR: tcp communication with node 'node4.example.com': failure using the secondary interface(s)
  * Verifying node node4.example.com
    - ERROR: tcp communication with node 'node3.example.com': failure using the secondary interface(s)
Reviewed-by: imsnah
-
Michael Hanselmann authored
  qa_node.py: Fix typo in message
  cmdlib.py: Don't add readded node to node list
  ganeti-qa.py: Make sure readd isn't done for master node
Reviewed-by: iustinp
-
Iustin Pop authored
This new version of the patch removes only the listing of the usage in the "gnt-X" list, but keeps the strings in since we'll want to enhance and use them in "gnt-X $cmd --help". Reviewed-by: ultrotter
-
Iustin Pop authored
This reverts commit 976. Reviewed-by: ultrotter
-
Iustin Pop authored
[Forward-port of the 1.2 branch patch] This patch removes all the parameters and options from the output of "gnt-X" (i.e. the subcommand list for the command). This is done to make the output uniform: currently only some parameters are shown, and they are not always consistent (e.g. required versus important parameters). Reviewed-by: ultrotter
-
Iustin Pop authored
Currently the watcher first runs the instance startup and then the boot-id method of disk reactivation. However, regardless of whether a node has rebooted, if we have just started an instance there is no need to activate its disks again, since starting the instance has already done that (if it is at all possible). The patch modifies the watcher to remember all started instances and not run activate-disks for them. Reviewed-by: ultrotter
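A minimal sketch of the "remember what was started" idea, under stated assumptions: the dictionary layout of an instance, the `run_watcher` name and the injected `start_instance` / `activate_disks` callables are hypothetical and do not reflect the watcher's real interface.

```python
def run_watcher(instances, start_instance, activate_disks, rebooted_nodes):
    """Start stopped instances, then reactivate disks on rebooted nodes.

    Instances started in the first pass are remembered and skipped in the
    second pass, since starting them already activated their disks.
    """
    started = set()

    # First pass: instance startup.
    for inst in instances:
        if inst["should_run"] and not inst["running"]:
            start_instance(inst["name"])
            started.add(inst["name"])

    # Second pass: boot-id based disk reactivation.
    for inst in instances:
        if inst["node"] not in rebooted_nodes:
            continue
        if inst["name"] in started:
            continue  # disks were just activated by the start above
        activate_disks(inst["name"])

    return started
```

Passing the two actions in as callables keeps the sketch independent of any real command and makes the skip logic easy to check in isolation.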
-
Iustin Pop authored
Currently the watcher does activate disks (via bootid mechanisms) even for admin_down instances. This patch logs and skips over these instances. Reviewed-by: ultrotter
-
Iustin Pop authored
The cluster verify builds a sorted list of nodes and passes that to all the nodes (in parallel) for ssh checks. This means that for a cluster with N nodes, there will be approximately N simultaneous connections to the first node, then to the second node, etc. This, coupled with the ssh daemon's “MaxStartups” parameter, can create false alarms about ssh connectivity. This patch randomizes the node list in the backend (so each node should have its own order of ssh-ing to the other nodes), which should reduce the chance of these alarms. Reviewed-by: ultrotter
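The randomization can be sketched in a few lines. The function name and the optional `rng` parameter are assumptions for illustration; the actual backend code may differ.

```python
import random


def shuffled_targets(node_list, rng=random):
    """Return a shuffled copy of node_list, leaving the input untouched.

    Each node calls this on its own, so the nodes do not all contact the
    same peer at the same moment.
    """
    targets = list(node_list)
    rng.shuffle(targets)
    return targets
```

Shuffling a copy rather than the caller's list keeps the sorted list intact for anything else that still relies on its order.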
-
- May 12, 2008
-
-
Iustin Pop authored
Currently many error handling code paths in bdev.py log only result.fail_reason (i.e. exit code or signal that killed the command) but not its output. This makes debugging very hard. The patch changes all places where we only log fail_reason to also log result.output. Reviewed-by: ultrotter
-
- May 10, 2008
-
-
Iustin Pop authored
DRBD8 requires that we pass ‘--create-device’ to the first command that wants to activate a new DRBD minor. We do this currently when we run the “drbdsetup ... disk” command which we run before the network setup. But if the LVs are missing, we skip the ‘disk’ subcommand and run only the ‘net’ one, so it might be that the activation fails because the minor we selected was never created in the first place. The patch adds the required parameter to the DRBD8._AssembleNet() call. Since it's a no-op for existing minors, it should not create any problems (tested and works both with configured and unconfigured minors). Reviewed-by: ultrotter
-
- May 09, 2008
-
-
Michael Hanselmann authored
There are a couple of reasons for doing so:
  - /proc is not OS independent, it's only supported by Linux (there are emulations on other systems, but those might differ from the way Linux represents data).
  - Checking a daemon's state doesn't necessarily mean it's usable. Connecting to the socket using “xm info” is much safer.
  - Reduce code size.
Reviewed-by: iustinp
-
- May 08, 2008
-
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Iustin Pop authored
The algorithm for attaching to existing DRBD devices is not trivial. It has four alternatives, and there is a bug in the last one when we have diskless devices. In the last case (local disk info matches but the remote/network configuration doesn't), we disconnect from the network and reattach with the correct info. We do this because a correct local device has higher priority than the remote device. However, the test we use (self._MatchesLocal) can succeed in two cases:
  - we have a disk and it's the same as the one attached
  - we don't have a disk and the drbd is in diskless mode
This creates problems for the fourth case: when we already have one diskless DRBD, activating the next one will do:
  - _MatchesLocal? yes, because both config data and system have no disks (with the effect that all diskless devices are identical)
  - _MatchesRemote? no, because this disk is configured to its current remote peer, not to our new one
The fix is trivial, although the algorithm is not: we only allow overriding the network configuration when the disk information matches and we are not diskless, by adding the '"local_dev" in info' test. Reviewed-by: ultrotter
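The condition guarding the fourth case can be written as a small predicate. This is a sketch only: the function name and the boolean parameters are hypothetical stand-ins for the results of _MatchesLocal and _MatchesRemote, and `info` is assumed to be a dictionary that contains a "local_dev" key only when a local disk is attached.

```python
def may_override_network(info, matches_local, matches_remote):
    """Decide whether to disconnect and reattach with new network info.

    A diskless device trivially "matches" any diskless configuration, so
    the local match only counts when a local disk is really present.
    """
    return matches_local and not matches_remote and "local_dev" in info
```

With the extra membership test, an already configured diskless device is no longer hijacked when a second diskless device is activated.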
-
- May 07, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
Upgrades will be handled in future patches. Reviewed-by: iustinp
-
- May 06, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
- May 05, 2008
-
-
Michael Hanselmann authored
  - Use variable with prefix instead of grep and sed
  - Always run with /bin/bash
Reviewed-by: ultrotter
-
Iustin Pop authored
Now that we have the number of cpus available from the hypervisors, we can export this to the iallocator scripts. Reviewed-by: ultrotter
-
Iustin Pop authored
This shortens the help output in gnt-node so that the output looks nicer, and improves the manual page for gnt-instance with the new 'status' field. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch allows the '-o' option of the list subcommands to extend the default field list instead of replacing it: prefixing the field list with '+' adds the given fields to the defaults. The patch also changes the listing of the default fields in the help output from a hardcoded string to one built at runtime from the actual list. Reviewed-by: ultrotter
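A minimal sketch of the '+' prefix handling, assuming a comma-separated field list. The function names and the field names used with them are illustrative, not taken from the patch.

```python
def select_fields(option_value, default_fields):
    """Resolve the fields to display from the value given to '-o'."""
    if option_value is None:
        return list(default_fields)
    if option_value.startswith("+"):
        # '+' prefix: extend the defaults instead of replacing them.
        return list(default_fields) + option_value[1:].split(",")
    return option_value.split(",")


def default_fields_help(default_fields):
    """Build the help text from the actual default list at runtime."""
    return "default fields: " + ",".join(default_fields)
```

Deriving the help text from the same list used for selection means the two cannot drift apart when the defaults change.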
-
Iustin Pop authored
This patch adds the backend and frontend changes needed for being able to list the cpu count. Reviewed-by: ultrotter
-
Guido Trotter authored
nodelist.remove(X) could potentially raise a ValueError (even if the chances that the current node is not in the list are pretty slim, and its absence should raise a red flag anyway). If this happens, let things go on, as that's what the code which previously distributed the config did. Reviewed-by: iustinp
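The tolerant removal can be sketched as follows. The function and parameter names are assumptions; only the use of list.remove() and the handling of ValueError come from the description above.

```python
def other_nodes(node_names, my_name):
    """Return the node names without our own, tolerating its absence."""
    others = list(node_names)
    try:
        others.remove(my_name)
    except ValueError:
        # Our own name was not in the list: unexpected, but carry on
        # and distribute to every node we do know about.
        pass
    return others
```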
-
Guido Trotter authored
Currently we get the list of nodes and, for each one, extract all its info, only to exclude it if its name matches ours. Since the list of nodes is a list of names, just use .remove() to exclude ourselves from it, and use that list directly. Reviewed-by: iustinp
-
- May 02, 2008
-
-
Guido Trotter authored
SetKey is used in a few other cases besides adding new nodes. Update the docstring to reflect this, so we don't mislead people reading it. Reviewed-by: iustinp
-
Guido Trotter authored
This completes the changes in r898 by actually getting rid of the old unused hypervisor.py code which was left in the code tree. Reviewed-by: iustinp
-
- May 01, 2008
-
-
Guido Trotter authored
  - Add a docstring to IOServer's constructor
  - Add argument description to PoolWorker's and JobRunner's ones
Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
- Apr 30, 2008
-
-
Manuel Franceschini authored
Since local_raid1 and remote_raid1 are deprecated they are removed from the docs. This patch removes some old documentation sections and bumps the documented version from 1.2 to 1.3. Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
Only post-hooks are run on cluster verify, and then their output is sent back to the LU, which upon failure displays it to the user and changes the result of the execution to a failure. Reviewed-by: iustinp
-
Guido Trotter authored
Previously LUs could be failed by pre-hooks, while post-hooks only had effects of their own. This patch allows an LU to define the HooksCallBack function if it wants to know about its hooks' results and alter its own result in response. The ChainOpCode execution path contains some commented-out hooks code, which this patch modifies to run the HooksCallBack function, so that this is not forgotten if that code is ever uncommented. Reviewed-by: iustinp
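The callback mechanism might look roughly like the sketch below. Only the HooksCallBack name comes from the commit message; its signature, the phase constants, the class names, the `run_lu` helper and the shape of the hook results (a mapping from node name to success flag) are assumptions made for illustration.

```python
PRE, POST = "pre", "post"  # hypothetical phase markers


class LogicalUnit:
    def Exec(self):
        raise NotImplementedError

    def HooksCallBack(self, phase, hook_results, lu_result):
        """Default: ignore the hook results and keep the LU result."""
        return lu_result


class VerifyLikeLU(LogicalUnit):
    """An LU that turns a failed post-hook into a failed result."""

    def Exec(self):
        return True

    def HooksCallBack(self, phase, hook_results, lu_result):
        if phase != POST:
            return lu_result
        failed = sorted(node for node, ok in hook_results.items() if not ok)
        if failed:
            return False
        return lu_result


def run_lu(lu, run_hooks):
    """Execute the LU, run its post-hooks, and let the LU adjust its result."""
    result = lu.Exec()
    post_results = run_hooks(POST)
    return lu.HooksCallBack(POST, post_results, result)
```

An LU that does not care about hook output simply inherits the default callback, so existing LUs keep their behaviour.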
-