- Jan 26, 2012
-
-
Bernardo Dal Seno authored
Some Python scripts in /usr/lib/ganeti/ were installed with the wrong permissions (their 'x' bit was cleared). This patch fixes that behavior and renames the variable 'dist_tools_PYTHON' to 'python_scripts'. The affected scripts were listed in the 'dist_tools_PYTHON' variable, but because they have no .py extension in their names, Automake treated them as data files and did not set the 'x' bit. The scripts are now processed by the rules created for the 'dist_tools_SCRIPTS' variable, which do not depend on file name extensions. Signed-off-by:
Bernardo Dal Seno <bdalseno@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit cc120286)
-
Bernardo Dal Seno authored
Permissions for the directories created during install depended on the umask of the user running the script. The umask is now reset inside the script to remove this dependency. Signed-off-by:
Bernardo Dal Seno <bdalseno@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit 0f796800)
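The umask dependency described in this commit can be illustrated with a minimal Python sketch. This is not the install script from the patch (which may not be written in Python at all); the helper name `make_dir_with_fixed_umask` and the umask value 0o022 are assumptions chosen for illustration, and the sketch assumes a POSIX system.

```python
import os
import stat
import tempfile


def make_dir_with_fixed_umask(path, mode=0o755):
    """Create `path` so that its mode does not depend on the caller's umask.

    Hypothetical helper: the umask is reset for the duration of the call
    and the caller's value is restored afterwards.
    """
    old_umask = os.umask(0o022)  # assumed value, not taken from the patch
    try:
        os.makedirs(path, mode=mode)
    finally:
        os.umask(old_umask)
    # Return the permission bits actually set on the new directory.
    return stat.S_IMODE(os.stat(path).st_mode)


def demo():
    """Simulate a user with a restrictive umask running the installer."""
    previous = os.umask(0o077)
    try:
        with tempfile.TemporaryDirectory() as base:
            return make_dir_with_fixed_umask(os.path.join(base, "lib"))
    finally:
        os.umask(previous)
```

Without the reset, the same `os.makedirs` call under a umask of 0o077 would produce a directory readable only by its owner.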
-
- Jan 25, 2012
-
-
Michael Hanselmann authored
This patch attempts to fix a number of issues with “gnt-cluster verify” in presence of multiple node groups and DRBD8 instances split over nodes in more than one group.
  - Look up instances in a group only by their primary node (otherwise split instances would be considered when verifying any of their node's groups)
  - When gathering additional nodes for LV checks, just compare instance's node's groups with the currently verified group instead of comparing against the primary node's group
  - Exclude nodes in other groups when calculating N+1 errors and checking logical volumes
Not directly related, but a small error text is also clarified. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
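The first two rules in this commit can be sketched in Python as follows. The data model (`Node`, `Instance`) and both function names are hypothetical and much simpler than Ganeti's real objects; the sketch only shows the selection logic under those assumptions.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for the real configuration objects.
Node = namedtuple("Node", ["name", "group"])
Instance = namedtuple("Instance", ["name", "primary", "secondaries"])


def instances_for_group(instances, nodes, group):
    """Select instances by the group of their primary node only."""
    return [inst.name for inst in instances
            if nodes[inst.primary].group == group]


def extra_lv_nodes(instances, nodes, group):
    """Nodes outside `group` that hold disks of the group's instances.

    Each of the instance's nodes is compared with the group being
    verified, not with the primary node's group.
    """
    extra = set()
    for inst in instances:
        if nodes[inst.primary].group != group:
            continue
        for node_name in inst.secondaries:
            if nodes[node_name].group != group:
                extra.add(node_name)
    return sorted(extra)
```

With an instance whose primary is in one group and whose secondary is in another, the instance is verified only with its primary's group, while the secondary is still contacted for the logical volume check.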
-
- Jan 20, 2012
-
-
Guido Trotter authored
Cleanup just updates the config with the correct location of the instance, or informs of its down status, but never starts it. As such there's no point in checking for enough free memory. Actually this check could prevent a perfectly safe cleanup operation if a node is busy. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 09, 2012
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
* devel-2.4:
  Add UnescapeAndSplit unittest for multi-escapes
  Fix a bug in command line option parsing code
  ConfigWriter: Fix epydoc error
  LUGroupAssignNodes: Fix node membership corruption
  Ensure unused ports return to the free port pool
  Re-wrap a paragraph to eliminate a sphinx warning
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 06, 2012
-
-
Guido Trotter authored
This of course was working for all the rcs, but broke with 1.0 itself. In addition:
  - split between running kvm --version and parsing its output
  - unittest parsing for various known --help outputs
  - updated NEWS file
  - happy 2012 wishes
  - the hope to finish this patch before it's time to say happy easter :)
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
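Separating the step that runs the binary from the step that parses its output makes the parser testable against captured output strings. The sketch below shows only the parsing half. The commit does not say why parsing broke with 1.0; one plausible cause, assumed here, is a pattern that required a three-part version number. The regular expression, the function name `parse_kvm_version` and the sample strings in the usage note are illustrative, not taken from Ganeti.

```python
import re

# Assumed pattern: the third version component is optional, so both
# "0.15.1" and "1.0" match.
_VERSION_RE = re.compile(r"\bversion\s+(\d+)\.(\d+)(?:\.(\d+))?")


def parse_kvm_version(text):
    """Return (major, minor, patch) parsed from captured output text."""
    match = _VERSION_RE.search(text)
    if match is None:
        raise ValueError("cannot find a version number in %r" % text)
    major, minor, patch = match.groups()
    # A missing third component is treated as zero.
    return (int(major), int(minor), int(patch or 0))
```

Because the function takes a plain string, a unit test can feed it any number of known outputs, for example `parse_kvm_version("QEMU emulator version 1.0 (qemu-kvm-1.0)")`, without a KVM binary being installed.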
-
- Dec 21, 2011
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
When an opcode is about to be processed its dependencies are evaluated using “_JobDependencyManager.CheckAndRegister”. Due to its nature that function requires a lock on the manager's internal structures. All of this happens while the job queue lock is held in shared mode (required for the job processor). When a job has been processed any pending dependencies are re-added to the job workerpool. Before this patch that would require the manager's lock and then, for adding the jobs, the job queue lock. Since this is in reverse order it will lead to deadlocks. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
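The lock-ordering problem can be sketched in Python. This is a simplified model, not the code from the patch: the class names are hypothetical, plain `threading.Lock` objects stand in for Ganeti's shared/exclusive locks, and the fix shown (never holding the manager's lock while acquiring the queue lock) is one common way to remove such an inversion, assumed here because the commit text does not spell out the change.

```python
import threading


class JobQueue:
    """Hypothetical queue holding jobs that are ready to run."""

    def __init__(self):
        self.lock = threading.Lock()
        self.pending = []


class DependencyManager:
    """Hypothetical manager tracking which jobs wait for which job."""

    def __init__(self, queue):
        self._lock = threading.Lock()
        self._waiting = {}  # finished job id -> ids of jobs waiting on it
        self._queue = queue

    def check_and_register(self, job_id, depends_on):
        # Called while the queue lock is already held, so the order on
        # this path is: queue lock first, manager lock second.
        with self._lock:
            self._waiting.setdefault(depends_on, []).append(job_id)

    def notify_finished(self, job_id):
        # Collect the waiting jobs under the manager lock only ...
        with self._lock:
            ready = self._waiting.pop(job_id, [])
        # ... and release it before taking the queue lock, so the two
        # locks are never held in the reverse order.
        with self._queue.lock:
            self._queue.pending.extend(ready)
        return ready
```

If `notify_finished` instead took the queue lock while still holding the manager's lock, one thread could hold the queue lock and wait for the manager while another holds the manager and waits for the queue.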
-
- Nov 30, 2011
-
-
Iustin Pop authored
This would have caught the bug in the first place. Argh, hand-generated test cases! Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Nikos Skalkotos authored
Fix bug affecting command line options of "keyval" type. Although escaping commands with \ is supported, it is not applied to the input recursively. Signed-off-by:
Nikos Skalkotos <skalkoto@grnet.gr> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
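Splitting a string on a separator while honouring backslash escapes can be sketched as a single pass over the input. The function `split_escaped` below is a hypothetical illustration, not Ganeti's own helper, and its exact escaping rules (a backslash protects a following separator or backslash, and is kept literally otherwise) are assumptions.

```python
def split_escaped(text, sep=","):
    """Split `text` on `sep`, treating a backslash as an escape character.

    Assumed rules: "\\" + sep yields a literal separator, two backslashes
    yield one literal backslash, and any other backslash is kept as-is.
    """
    parts = []
    current = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, None)
            if nxt is None:
                # Trailing backslash with nothing to escape.
                current.append("\\")
            elif nxt == sep or nxt == "\\":
                current.append(nxt)
            else:
                current.append("\\")
                current.append(nxt)
        elif ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```

Because every escape is consumed exactly once as the input is scanned, an escaped backslash directly followed by a separator still splits at that separator, which is the multi-escape case a pairwise search-and-replace tends to get wrong.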
-
- Nov 24, 2011
-
-
Michael Hanselmann authored
The parameter is called “mods”, not “modes”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com> (cherry picked from commit 1730d4a1)
-
Michael Hanselmann authored
The parameter is called “mods”, not “modes”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-
Michael Hanselmann authored
Note: This bug only manifests itself in Ganeti 2.5, but since the problematic code also exists in 2.4, I decided to fix it there.
If a node was assigned to a new group using “gnt-group assign-nodes” the node object's group would be changed, but not the duplicate member list in the group object. The latter is an optimization to require fewer locks for other operations. The per-group member list is only kept in memory and not written to disk. Ganeti 2.5 starts to make use of the data kept in the per-group member list and consequently fails when it is out of date.
The following commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was confirmed using additional logging):
  $ gnt-group add foo
  $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
  $ gnt-cluster verify # Fails with KeyError
This patch moves the code modifying node and group objects into “config.ConfigWriter” to do the complete operation under the config lock, and also to avoid making use of side-effects of modifying objects without calling “ConfigWriter.Update”. A unittest is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit 218f4c3d)
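The idea of updating the node's group and the per-group member list together, under one lock, can be sketched as follows. The `Config` class and its attributes are hypothetical and far simpler than “config.ConfigWriter”; the sketch only shows the invariant being maintained.

```python
import threading


class Config:
    """Hypothetical configuration holding two views of group membership."""

    def __init__(self, node_groups, groups):
        self._lock = threading.Lock()
        self.node_group = dict(node_groups)       # node name -> group name
        self.members = {g: set() for g in groups}  # group name -> node names
        for node, group in self.node_group.items():
            self.members[group].add(node)

    def assign_nodes(self, changes):
        """Move nodes to new groups; `changes` is a list of (node, group)."""
        changes = list(changes)
        with self._lock:
            # Validate everything first so a bad entry changes nothing.
            for node, new_group in changes:
                if node not in self.node_group:
                    raise KeyError("unknown node %s" % node)
                if new_group not in self.members:
                    raise KeyError("unknown group %s" % new_group)
            # Update both views in the same critical section.
            for node, new_group in changes:
                old_group = self.node_group[node]
                self.members[old_group].discard(node)
                self.members[new_group].add(node)
                self.node_group[node] = new_group
```

Doing the whole operation in one place means no caller can change a node's group while leaving the duplicate member list stale.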
-
Michael Hanselmann authored
Note: This bug only manifests itself in Ganeti 2.5, but since the problematic code also exists in 2.4, I decided to fix it there.
If a node was assigned to a new group using “gnt-group assign-nodes” the node object's group would be changed, but not the duplicate member list in the group object. The latter is an optimization to require fewer locks for other operations. The per-group member list is only kept in memory and not written to disk. Ganeti 2.5 starts to make use of the data kept in the per-group member list and consequently fails when it is out of date.
The following commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was confirmed using additional logging):
  $ gnt-group add foo
  $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
  $ gnt-cluster verify # Fails with KeyError
This patch moves the code modifying node and group objects into “config.ConfigWriter” to do the complete operation under the config lock, and also to avoid making use of side-effects of modifying objects without calling “ConfigWriter.Update”. A unittest is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Commit c50452c3 added an exception when all instances should be evacuated off a node, but did so in a way which made pylint complain about unreachable code. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 23, 2011
-
-
Michael Hanselmann authored
There is a design issue in the iallocator interface which prevents us from doing this. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
When evacuating a node, only an assertion without informative text was used to check if the necessary node locks had been acquired. This was on top of evaluating the list of nodes without having a node group lock, so this was changed as well. Also update some exception messages to include “retry the operation”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
ConfigWriter.GetAllInstancesInfo returns a dictionary, not a list. Removing a node would fail with “too many values to unpack”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
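The failure mode is easy to reproduce in plain Python: iterating over a dictionary yields only its keys, so unpacking each item into two names fails. The function names and the shape of the instance records below are hypothetical; only the dict-versus-list distinction comes from the commit.

```python
def instances_on_node_buggy(all_instances, node):
    # Bug: iterating a dict yields keys (strings here), which cannot be
    # unpacked into a (name, instance) pair.
    return [name for (name, inst) in all_instances
            if inst["primary"] == node]


def instances_on_node(all_instances, node):
    # Fix: iterate over the (key, value) pairs explicitly.
    return [name for (name, inst) in all_instances.items()
            if inst["primary"] == node]
```

Called with a dictionary keyed by instance name, the first variant raises `ValueError` as soon as it meets a key longer than two characters, while the second returns the matching names.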
-
- Nov 16, 2011
-
-
Iustin Pop authored
While diagnosing some (unrelated) memory usage in htools, I stumbled upon some very bad behaviour in checkData: mapAccum is non-strict, and so is the tuple we use, so the list of lists of messages ends up being very expensive space-wise (hundreds of MB of memory for a simulated cluster with thousands of nodes, all with errors). The new, explicit reuse of the old message list has a linear memory behaviour. The only downside is that messages are listed in the reverse order (which I'll fix on master). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This patch changes an internal assert (which can only be triggered when a node group is empty) into properly handling this case (and returning empty node/instance lists). While we could handle this in the backend (Cluster.splitNodeGroup), this would actually mean that we change the behaviour for a cluster with just two node groups, one of which is empty (where today we don't require a node group argument). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Nov 15, 2011
-
-
Michael Hanselmann authored
  - Commit b7a1c816 changed the LU to generate jobs
  - Mention documented results in NEWS
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 14, 2011
-
-
Vangelis Koukis authored
Ensure ports previously allocated by calling ConfigWriter's AllocatePort() are returned to the pool of free ports when no longer needed:
  * Return the network_port of an instance when it is removed
  * Return the port used by a DRBD-based disk when it is removed
Signed-off-by:
Vangelis Koukis <vkoukis@grnet.gr> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
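The allocate-and-return cycle can be sketched with a small pool class. `PortPool` and its methods are hypothetical and are not ConfigWriter's API; the port range used in the usage note is also just an example.

```python
class PortPool:
    """Hypothetical pool of TCP/UDP ports handed out one at a time."""

    def __init__(self, first, last):
        self._free = set(range(first, last + 1))
        self._used = set()

    def allocate(self):
        """Hand out the lowest free port."""
        if not self._free:
            raise RuntimeError("port pool exhausted")
        port = min(self._free)
        self._free.remove(port)
        self._used.add(port)
        return port

    def release(self, port):
        """Return a previously allocated port to the pool."""
        if port not in self._used:
            raise ValueError("port %d was not allocated" % port)
        self._used.remove(port)
        self._free.add(port)
```

If removing an instance or a DRBD disk never calls `release`, a pool such as `PortPool(11000, 11001)` runs dry after two allocations even though nothing is using the ports any more.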
-
Iustin Pop authored
This just makes sure that the paragraph doesn't contain lines that start with :, which make Sphinx (1.0.7) complain. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Nov 08, 2011
-
-
Michael Hanselmann authored
If an instance can't be evacuated, only a message would be printed. With this change the operation always aborts. Newly added unittests check for this behaviour. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 04, 2011
-
-
Michael Hanselmann authored
… instead of object with name. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Instances are modified if their disk size doesn't match. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 27, 2011
-
-
Michael Hanselmann authored
I forgot this in the previous patch. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
* stable-2.4:
  Update NEWS and increase to 2.4.5
Conflicts:
  configure.ac: Trivial
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
If cmdlib.LUNodeMigrate was called for a node without primary instances it would try to submit an empty list of jobs. This was never visible via CLI as there we check the list of primary instances first. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 26, 2011
-
-
Iustin Pop authored
This just adds the primary node of the instance as 'non-allocable' during the choosing of the new secondary. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> (cherry picked from commit 7073b3a8) Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
If we select the primary as new secondary, better to fail than return wrong data to Ganeti. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> (cherry picked from commit f25508be) Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 21, 2011
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 20, 2011
-
-
René Nussbaumer authored
On a master failover some of the archive dirs might have wrong permissions in the non-root model. This is because noded still runs as root and the job queue is synced through it. This patch fixes this behaviour by setting the permissions accordingly. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 19, 2011
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
If an instance actually had a missing disk, the type check would fail. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 18, 2011
-
-
Michael Hanselmann authored
Commit e1f23243 changed the LU and opcode for node evacuation to receive a “mode” parameter (among other things). Commit de40437a changed the RAPI code accordingly, but did so for an earlier version of the first patch. Obviously this couldn't work, so here's the fix. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-