- Nov 24, 2011
-
-
Michael Hanselmann authored
* stable-2.5: ConfigWriter: Fix epydoc error Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-
Michael Hanselmann authored
The parameter is called “mods”, not “modes”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com> (cherry picked from commit 1730d4a1)
-
Michael Hanselmann authored
* devel-2.4: LUGroupAssignNodes: Fix node membership corruption Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
* stable-2.5: LUGroupAssignNodes: Fix node membership corruption Fix pylint warning on unreachable code LUNodeEvacuate: Disallow migrating all instances at once LUNodeEvacuate: Locking fixes Fix error when removing node Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Note: This bug only manifests itself in Ganeti 2.5, but since the problematic code also exists in 2.4, I decided to fix it there. If a node was assigned to a new group using “gnt-group assign-nodes” the node object's group would be changed, but not the duplicate member list in the group object. The latter is an optimization to require fewer locks for other operations. The per-group member list is only kept in memory and not written to disk. Ganeti 2.5 starts to make use of the data kept in the per-group member list and consequently fails when it is out of date. The following commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was confirmed using additional logging): $ gnt-group add foo $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name) $ gnt-cluster verify # Fails with KeyError This patch moves the code modifying node and group objects into “config.ConfigWriter” to do the complete operation under the config lock, and also to avoid making use of side-effects of modifying objects without calling “ConfigWriter.Update”. A unittest is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit 218f4c3d)
-
Michael Hanselmann authored
Note: This bug only manifests itself in Ganeti 2.5, but since the problematic code also exists in 2.4, I decided to fix it there. If a node was assigned to a new group using “gnt-group assign-nodes” the node object's group would be changed, but not the duplicate member list in the group object. The latter is an optimization to require fewer locks for other operations. The per-group member list is only kept in memory and not written to disk. Ganeti 2.5 starts to make use of the data kept in the per-group member list and consequently fails when it is out of date. The following commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was confirmed using additional logging): $ gnt-group add foo $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name) $ gnt-cluster verify # Fails with KeyError This patch moves the code modifying node and group objects into “config.ConfigWriter” to do the complete operation under the config lock, and also to avoid making use of side-effects of modifying objects without calling “ConfigWriter.Update”. A unittest is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Commit c50452c3 added an exception when all instances should be evacuated off a node, but did so in a way which made pylint complain about unreachable code. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 23, 2011
-
-
Michael Hanselmann authored
There is a design issue in the iallocator interface which prevents us from doing this. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Until now the iallocator constants for node evacuation (IALLOCATOR_NEVAC_*) were also used for the opcode. However, it turned out this was due to a misunderstanding and is incorrect. This patch adds new constants (with the same values) and changes the affected places. Fortunately the RAPI client already used good names, so no changes are necessary. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Signed-off-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
When evacuating a node, only an assertion without informative text was used to check if the necessary node locks had been acquired. This was on top of evaluating the list of nodes without having a node group lock, so this was changed as well. Also update some exception messages to include “retry the operation”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
ConfigWriter.GetAllInstancesInfo returns a dictionary, not a list. Removing a node would fail with “too many values to unpack”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Nov 18, 2011
-
-
Michael Hanselmann authored
* stable-2.5: htools: rework message display construction hbal: handle empty node groups Document OpNodeMigrate's result for RAPI Fail if node/group evacuation can't evacuate instances LUInstanceRename: Compare name with name LUClusterRepairDiskSizes: Acquire instance locks in exclusive mode Update NEWS for 2.5.0~rc4 Bump version to 2.5.0~rc4 jqueue: Allow zero jobs to be submitted at once hail: don't select the primary as new secondary hail: add an extra safety check in relocate Bump version to 2.5.0~rc3 Conflicts: configure.ac: Trivial Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
* devel-2.4: Ensure unused ports return to the free port pool Re-wrap a paragraph to eliminate a sphinx warning Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Nov 17, 2011
-
-
Michael Hanselmann authored
After iallocator ran we can release any unused node locks. Since they must be in exclusive mode this should improve parallelization during instance creation. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 16, 2011
-
-
Iustin Pop authored
While diagnosing some (unrelated) memory usage in htools, I've stumbled upon some very bad behaviour in checkData: mapAccum is non-strict, and the tuple we use also, so that results in the list of list of messages being very bad space-wise (hundreds of MB of memory for a simulated cluster with thousands of nodes, all with errors). The new, explicit reuse of the old message list has a linear memory behaviour. The only downside is that messages are listed in the reverse order (which I'll fix on master). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This patch changes an internal assert (which can only be triggered when a node group is empty) into properly handling this case (and returning empty node/instance lists). While we could handle this in the backend (Cluster.splitNodeGroup) this would actually mean than we change the behaviour for a cluster with just two node groups, once of which is empty (where today we don't require a node group argument). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Nov 15, 2011
-
-
Michael Hanselmann authored
- Commit b7a1c816 changed the LU to generate jobs - Mention documented results in NEWS Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 14, 2011
-
-
Vangelis Koukis authored
Ensure ports previously allocated by calling ConfigWriter's AllocatePort() are returned to the pool of free ports when no longer needed: * Return the network_port of an instance when it is removed * Return the port used by a DRBD-based disk when it is removed Signed-off-by:
Vangelis Koukis <vkoukis@grnet.gr> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This just makes sure that the paragraph doesn't contains lines that start with :, which make Sphinx (1.0.7) complain. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Nov 10, 2011
-
-
Guido Trotter authored
These are triggered by our "stay-compatible" approach. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Nov 08, 2011
-
-
Michael Hanselmann authored
If an instance can't be evacuated, only a message would be printed. With this change the operation always aborts. Newly added unittests check for this behaviour. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 04, 2011
-
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-
Michael Hanselmann authored
… instead of object with name. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Instances are modified if their disk size doesn't match. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 03, 2011
-
-
Michael Hanselmann authored
Mention that instances can be passed on the CLI when “--help” is used. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-
- Nov 02, 2011
-
-
Andrea Spadaccini authored
Move the contents of the PATH environment variable for hooks to constants, and use its value in the code and in the hooks documentation. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Nov 01, 2011
-
-
Andrea Spadaccini authored
A failed gnt-cluster (de)activate-master-ip would not produce any output to the user. This patch adds code that checks for the results of the RPCs and raise an exception if appropriate. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Andrea Spadaccini authored
Reviewed-by:
Michael Hanselmann <hansmi@google.com> Signed-off-by:
Andrea Spadaccini <spadaccio@google.com>
-
Andrea Spadaccini authored
Reviewed-by:
Michael Hanselmann <hansmi@google.com> Signed-off-by:
Andrea Spadaccini <spadaccio@google.com>
-
Andrea Spadaccini authored
Add the RunLocalHooks decorator, that allows the execution of hooks locally. Also, add a RunLocalHooks method to HooksRunner, to adapt the signature of HooksRunner.RunHooks to the one expected by HooksMaster, and also to check that the hooks are being executed locally. Reviewed-by:
Michael Hanselmann <hansmi@google.com> Signed-off-by:
Andrea Spadaccini <spadaccio@google.com>
-
Andrea Spadaccini authored
- remove any dependence on Logical Units from the HooksMaster; - add a new function parameter to the constructor, a function that is expected to convert the results of the hooks execution in a format understood by the HooksMaster; - add a factory method that builds a HooksMaster from a LU, keeping the interface of the old constructor of HooksMaster; - remove usage of Processor.hmclass from external classes, introducing the Processor.BuildHooksMaster method; - update unit tests. Reviewed-by:
Michael Hanselmann <hansmi@google.com> Signed-off-by:
Andrea Spadaccini <spadaccio@google.com>
-
- Oct 31, 2011
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Stephen Shirley <diamond@google.com>
-
- Oct 27, 2011
-
-
Michael Hanselmann authored
I forgot this in the previous patch. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
* stable-2.4: Update NEWS and increase to 2.4.5 Conflicts: configure.ac: Trivial Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
René Nussbaumer authored
Conflicts: configure.ac (trivial) Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
If cmdlib.LUNodeMigrate was called for a node without primary instances it would try to submit an empty list of jobs. This was never visible via CLI as there we check the list of primary instances first. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Andrea Spadaccini authored
In FakeHooksRpcSuccess, the data parameter of the RpcResult constructor was not enclosed in a tuple. While this does not make the test fail, it must be fixed. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Oct 26, 2011
-
-
Iustin Pop authored
This just adds the primary node of the instance as 'non-allocable' during the choosing of the new secondary. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> (cherry picked from commit 7073b3a8) Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-