- Apr 16, 2012
-
-
Michael Hanselmann authored
Before this patch, a node evacuation submitted with high priority would only compute the solution at that priority, but the actual evacuation ran at normal priority. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Apr 11, 2012
-
-
Iustin Pop authored
Sorry, didn't catch this before… Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit 54b010ca) Signed-off-by:
Michael Hanselmann <hansmi@google.com>
-
Dimitris Aragiorgis authored
Commit 3b3b1bca does not entirely fix the bug introduced in commit f396ad8c. It fixes consistency of config data in permanent storage, but does not ensure consistency of the data held in masterd's runtime memory. The duplicate-port bug is still triggered when LUInstanceRemove() invokes _RemoveDisks() and this returns False (when the call_blockdev_remove RPC fails). The DRBD ports get returned to the pool, but execution is aborted and RemoveInstance() is never invoked. Because port handling is not done with TemporaryReservationManager, ensure that ports are released only if the disk-related config data is deleted. In _RemoveDisks(), release ports only if all RPCs succeed. Extend _RemoveDisks() with an ignore_failures argument, passed by _RemoveInstance(), to handle the ports appropriately. Signed-off-by:
Dimitris Aragiorgis <dimara@grnet.gr> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
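The port handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual Ganeti code: `remove_disks`, the `remove_blockdev` and `release_port` callables, and the dict-based disk representation are all hypothetical stand-ins, and releasing the ports when `ignore_failures` is set is an assumed reading of the commit message.

```python
def remove_disks(disks, remove_blockdev, release_port, ignore_failures=False):
    """Remove all disks; release their ports only when that is safe.

    Hypothetical sketch: `remove_blockdev(disk)` stands in for the removal
    RPC and returns True on success; `release_port(port)` stands in for
    returning a port to the pool.
    """
    all_ok = True
    for disk in disks:
        if not remove_blockdev(disk):
            all_ok = False

    # Assumption: ports go back to the pool only if every removal worked,
    # or if the caller will delete the disk config regardless of failures.
    if all_ok or ignore_failures:
        for disk in disks:
            port = disk.get("port")
            if port is not None:
                release_port(port)

    return all_ok
```

With this ordering, a failed removal leaves the ports reserved, so they cannot be handed out a second time while the disks are still in the config.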
-
Dimitris Aragiorgis authored
Commit f396ad8c returns the TCP port used by a DRBD disk to the TCP/UDP port pool using AddTcpUdpPort(). However, AddTcpUdpPort() writes the config on every invocation, using _WriteConfig(). This causes two problems:
* VerifyConfig() logs critical errors after the DRBD disk removal and until the actual instance removal.
* If the code following AddTcpUdpPort() fails, the port has already been returned to the pool, which leads to duplicate ports (inconsistent config).
AddTcpUdpPort() is invoked in three cases:
* during InstanceRemove(), through _RemoveDisks();
* during InstanceSetParams(), in case of disk removal;
* during InstanceSetParams(), through _ConvertDrbdToPlain().
This commit fixes the problem by removing the _WriteConfig() call from AddTcpUdpPort(), delegating it to Update() via the TemporaryReservationManager, and ensuring AddTcpUdpPort() precedes Update(). Signed-off-by:
Dimitris Aragiorgis <dimara@grnet.gr> [iustin@google.com: small comments adjustments] Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit 3b3b1bca)
-
- Mar 30, 2012
-
-
Iustin Pop authored
Sorry, didn't catch this before… Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Mar 29, 2012
-
-
Dimitris Aragiorgis authored
Commit f396ad8c returns the TCP port used by a DRBD disk to the TCP/UDP port pool using AddTcpUdpPort(). However, AddTcpUdpPort() writes the config on every invocation, using _WriteConfig(). This causes two problems:
* VerifyConfig() logs critical errors after the DRBD disk removal and until the actual instance removal.
* If the code following AddTcpUdpPort() fails, the port has already been returned to the pool, which leads to duplicate ports (inconsistent config).
AddTcpUdpPort() is invoked in three cases:
* during InstanceRemove(), through _RemoveDisks();
* during InstanceSetParams(), in case of disk removal;
* during InstanceSetParams(), through _ConvertDrbdToPlain().
This commit fixes the problem by removing the _WriteConfig() call from AddTcpUdpPort(), delegating it to Update() via the TemporaryReservationManager, and ensuring AddTcpUdpPort() precedes Update(). Signed-off-by:
Dimitris Aragiorgis <dimara@grnet.gr> [iustin@google.com: small comments adjustments] Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
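The idea of this fix, changing the port pool in memory and leaving the write to a later update step, might look like the sketch below. `ConfigSketch` and its methods are hypothetical names used only for illustration; they are not Ganeti's ConfigWriter API, and the write is simulated with a counter.

```python
class ConfigSketch:
    """Hypothetical stand-in for a config holder with a port pool."""

    def __init__(self):
        self.port_pool = set()
        self.writes = 0  # counts simulated writes to permanent storage

    def add_tcp_udp_port(self, port):
        # Only the in-memory pool changes here; nothing is written yet.
        self.port_pool.add(port)

    def update(self):
        # The single place where the (simulated) config write happens.
        self.writes += 1


cfg = ConfigSketch()
cfg.add_tcp_udp_port(11000)  # port returned to the pool, no write yet
cfg.update()                 # the caller persists everything in one step
```

Because the write happens once, after the rest of the change has been prepared, a failure before `update()` does not leave a half-updated config on disk.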
-
- Mar 28, 2012
-
-
Bernardo Dal Seno authored
Fixed a typo so that LUOobCommand now acquires the BGL in shared mode, as intended. Signed-off-by:
Bernardo Dal Seno <bdalseno@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Also move the version check into prereq, to abort before altering cluster state if the versions mismatch. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
This patch moves the “call_version” to a new RPC client definition and then adds a new runner using the DNS resolver for getting the host address. The standard “BootstrapRunner”, where the call was before, tries to resolve node names using ssconf first, which doesn't work properly when re-adding a node with a new primary IP address. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Mar 23, 2012
-
-
Iustin Pop authored
Fix a typo introduced in commit c85b15c1, which breaks epydoc. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
René Nussbaumer authored
There are other ways to leave the cluster in a broken state than just the version check. However they are not very trivial to fix in 2.5. So leave it up to 2.6 for a nicer fix. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> (cherry picked from commit e2ea8de1)
-
René Nussbaumer authored
There are other ways to leave the cluster in a broken state than just the version check. However they are not very trivial to fix in 2.5. So leave it up to 2.6 for a nicer fix. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Mar 22, 2012
-
-
Michael Hanselmann authored
This requires acquiring the node group locks in shared mode. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
The “cur_group_uuid” parameter is optional to prepare for using the factorized code from LUInstanceQueryData. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
While debugging another issue we realized that LUClusterQuery forks. This turned out to be the “platform.architecture” function from the Python library. It uses the “file” command to determine the architecture of the Python binary. This patch adds two new functions to the “runtime” module to get this information once per process instead of doing it every single time LUClusterQuery is used. Forking is a no-go in a multi-threaded environment anyway. A future change will also have to change the terminology in “gnt-cluster info”: it reports the binary architecture simply as “architecture”, when it's actually the binaries' architecture. Kernel and userland can be different. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
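Computing the architecture once per process, as this commit describes, can be sketched as follows. The function name `get_binary_architecture` and the module-level cache are assumptions made for illustration; the commit only says that two new functions were added to the “runtime” module. `platform.architecture()` is the standard-library call mentioned in the message.

```python
import platform

# Module-level cache, filled on first use (hypothetical sketch).
_binary_arch = None


def get_binary_architecture():
    """Return platform.architecture() for this process, computed only once."""
    global _binary_arch
    if _binary_arch is None:
        # This call may invoke the external "file" command, so do it once.
        _binary_arch = platform.architecture()
    return _binary_arch
```

In a multi-threaded daemon the first call would ideally happen at startup, before any worker threads exist, so that no fork occurs later.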
-
Michael Hanselmann authored
Instead of acquiring the BGL in exclusive mode (which blocks all other operations), we acquire all locks for groups, nodes and instances in shared mode before verifying the configuration. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
- Mar 20, 2012
-
-
Alexander Schreiber authored
Trivial fix for a typo in message output of LUInstanceSetParams Signed-off-by:
Alexander Schreiber <als@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Mar 19, 2012
-
-
Iustin Pop authored
If a specific list of groups has been requested, then the code used that list without transforming it to a (frozen)set first, which results in the error “unsupported operand type(s) for &: 'list' and 'frozenset'”. The trivial fix is to do the conversion in the 'then' branch. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
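The failure and its fix can be reproduced with plain Python. The function and variable names below are hypothetical; only the list/frozenset behaviour is taken from the commit message.

```python
def select_groups(requested, all_groups):
    """Return the requested groups that actually exist (illustrative sketch)."""
    if requested:
        # The fix: convert the requested list before using the & operator.
        wanted = frozenset(requested)
    else:
        wanted = all_groups
    return wanted & all_groups


known = frozenset(["group1", "group2", "group3"])
selected = select_groups(["group1", "unknown"], known)

# Without the conversion, the intersection raises TypeError:
try:
    ["group1"] & known
    raised = False
except TypeError:
    raised = True
```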
-
- Mar 15, 2012
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
hail now expects correctly that relocate_from is of equal length with the number of required nodes (fixme: there's a lot of not well documented behaviour here… not nice for any other potential IAllocators). As such, we _need_ to pass just the instance's primary node. Additionally, update the iallocator doc to correctly specify what this list (`relocate_from`) contains. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
This adapts the Ganeti side to export the spindle_usage Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 27, 2012
-
-
Iustin Pop authored
Since we run the post-hooks explicitly in the Exec() function (via _RunPostHook) after we removed the target node from the config, we get the following warning in the logs every time we remove a node: “WARNING Node 'node2', which is about to be removed, was not found in the list of all nodes”. The patch just removes the warning, as actually invalid configurations (for the pre hook) will be checked correctly elsewhere. Additionally, the docstrings for BuildHooksEnv and BuildHooksNodes are corrected/switched. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Feb 21, 2012
-
-
Iustin Pop authored
… to make even more obvious what's the difference between a declared lock level with an empty list of locks and no lock level. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
Strangely, these were not exported at all before. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 17, 2012
-
-
Michael Hanselmann authored
Forgot “enumerate”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
When modifications are made, disks may not have the same index anymore. Updating all disks fixes this. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
When adding an item the index given to the callback function would be incorrect under certain conditions. This patch also adds assertions and more tests. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Feb 15, 2012
-
-
Michael Hanselmann authored
There has been a lot of duplicated code in _GenerateDiskTemplate, and some cases of very similar, but not quite identical, duplicates. This patch merges them. Generating a disk's “logical_id” attribute is done via a lambda/function. Maybe the IDs could be pre-computed and stored in a list. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
This is in preparation to de-duplicating significant chunks of code in cmdlib._GenerateDiskTemplate. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Feb 14, 2012
-
-
Michael Hanselmann authored
The callback is expected to return a two-valued tuple. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 13, 2012
-
-
Michael Hanselmann authored
… instead of passing the list of changes as a parameter. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Unfortunately this got a bit messier than I intended, but then again it cleans up a lot of messy code with heaps of local variables (“this_nic_override”) and LU attributes (“nic_pnew”, “nic_pinst”). Most of these variables were indexed by a number, or by one of the constants.DDM_* constants. This patch moves the code for adding/modifying/removing a NIC/disk to dedicated, small functions. The previously added generic algorithm for applying changes to containers is then used to actually change the instance's network interfaces or disks based on the requested modifications. The LU now supports adding/removing disks/NICs in arbitrary positions. The computation of all network interface changes has been moved to CheckPrereq, so that its result can be used for hooks. For this to work without side-effects, the NIC objects need to be copied (only done if there are actual changes). The command line utility still needs to be updated. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
While preparing this patch series I identified at least three different implementations of the algorithm for adding/removing/changing NICs/disks. These two functions and corresponding unittests provide a generic implementation with added support for adding/removing arbitrary disks or NICs. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
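A generic routine for applying add/modify/remove changes to a container of NICs or disks could look roughly like the sketch below. This is not the implementation from the patch: the function name, the `(op, index, value)` tuple layout, and the use of `-1` for "last position" are assumptions chosen for illustration.

```python
def apply_container_mods(container, mods):
    """Apply (op, index, value) modifications to a list in place.

    Hypothetical sketch: op is "add", "modify" or "remove"; an index of -1
    means "at the end" for add and "the last item" for modify/remove.
    """
    for op, idx, value in mods:
        if op == "add":
            if idx == -1:
                container.append(value)
            elif 0 <= idx <= len(container):
                container.insert(idx, value)
            else:
                raise IndexError("cannot add at index %s" % idx)
        elif op in ("modify", "remove"):
            pos = len(container) - 1 if idx == -1 else idx
            if not 0 <= pos < len(container):
                raise IndexError("no item at index %s" % idx)
            if op == "modify":
                container[pos] = value
            else:
                del container[pos]
        else:
            raise ValueError("unknown operation %r" % op)
    return container
```

Keeping the index handling in one place is what allows the same code to serve both NICs and disks, with only the per-item create/modify/remove logic differing.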
-
Michael Hanselmann authored
Disk changes aren't allowed at the same time as a disk template change. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 01, 2012
-
-
Michael Hanselmann authored
“INSTANCE_DOWN” is still being used. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
With this patch marking an instance already marked offline (or online) as offline/online again becomes a no-op. Also removed the unused INSTANCE_UP variable. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 31, 2012
-
-
Michael Hanselmann authored
Instead of having two separate parameters, a single boolean parameter is used. Unfortunately we need a third state to say “no change”, so the value can be None, True or False (similar to other parameters). There are no user interface changes. New QA tests are added, too. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
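The three-state parameter described here (None, True or False) can be handled as in the following sketch. The function and argument names are hypothetical, since the commit message does not name the parameter.

```python
def resolve_flag(current, requested):
    """Return the new flag value for a tri-state request (illustrative).

    requested is None  -> "no change", keep the current value
    requested is True  -> set the flag
    requested is False -> clear the flag
    """
    if requested is None:
        return current
    return bool(requested)
```

Testing with `is None` rather than plain truthiness matters here, because False is a valid request and must not be confused with "no change".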
-
- Jan 27, 2012
-
-
Guido Trotter authored
Since instances can be started, failed over and migrated with less than their maximum memory, the N+1 check will use the minimum memory for verification. Note that this accounts only for resizing the instances being moved, and not the ones already on the node, as Ganeti will not currently resize other instances on the target node automatically when trying to start/failover/migrate an instance. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
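As a rough illustration of the check described above, the sketch below sums the minimum memory of the instances that would move to a node and compares it with the node's free memory. The function name and the `minmem`/`maxmem` dictionary keys are assumptions for this example and are not taken from the Ganeti code.

```python
def fits_after_failover(free_memory, incoming_instances):
    """Check whether a node can host the incoming instances (sketch).

    Only the instances being moved are counted, each at its minimum
    memory; instances already on the node are not resized.
    """
    needed = sum(inst["minmem"] for inst in incoming_instances)
    return needed <= free_memory


incoming = [
    {"name": "inst1", "minmem": 512, "maxmem": 2048},
    {"name": "inst2", "minmem": 256, "maxmem": 1024},
]
```

With these hypothetical numbers the instances need 768 units at minimum memory, so a node with 1024 free passes the check even though the maximum memory total (3072) would not fit.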
-