- Sep 04, 2012
-
-
René Nussbaumer authored
There was an issue with the recent ipolicy introduction which lead to a bug in gnt-debug iallocator. It was not providing the spindle_use field and therefore it wont let you create a valid iallocator request. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Sep 03, 2012
-
-
Iustin Pop authored
To help developing Ganeti on newer distributions, let's try to fix pylint warnings/errors. I'm using pylint from current Debian wheezy: pylint 0.25.1, astng 0.23.1, common 0.58.0, and we have 3 things that needs fixing. First, a really wide "except", with the silencing in the wrong place. I'm not sure why this doesn't have "except Exception", so let's add it. However, pylint still complains about "Catching too general exception", even though we do want to catch both system and our exception, so let's add a silence for W0703. It's true that we shouldn't catch KeyboardInterrupt and friends, but that should be cleaned up on the master branch. Second, pylint complains about "redefining name builtin tuple", because we do some pattern matching in the except blocks in netutils. This seems to be a false positive, but let's clean the code around this. And finally, type inference again goes bad, so let's silence E1103 with its "boolean doesn't have 'get' method". After this, I can run "make lint", and by extension "make commit-check" on Debian Wheezy, yay! We might be able to bump our required pylint versions to something not ancient… Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Aug 22, 2012
-
-
Constantinos Venetsanopoulos authored
Currently, hail fails with FailDisk when trying to add an instance of type: 'file', 'sharedfile' and 'rbd'. This is due to a "0" or None value in the corresponding dict inside _ComputeDiskSize, which results in a "O" or non Int value of the exported 'disk_space_total' parameter. This in turn makes hail fail, when trying to process the value: - with "Unable to read Int" if value is None (file) - with FailDisk if value is 0 (sharedfile, rbd) The latter happens because the 0 value doesn't match the instance's IPolicy, since it is lower than the minimum disk size. The second problem still exists when using adoption with 'plain' and 'blockdev' template and will be addressed in another commit. Signed-off-by:
Constantinos Venetsanopoulos <cven@grnet.gr> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 15, 2012
-
-
Iustin Pop authored
Due to an oversight, the pause/resume sync RPC calls in _WipeDisks lack the verification of the overall RPC status, and directly iterate over the payload. The code actually doing the wipe does verify correctly the results. This can result in jobs failing with a hard to diagnose: OpExecError ['NoneType' object is not iterable] instead of proper "RPC failed" message. This patch adds a hard check on the pause call, but for the resume call it just logs a warning if the RPC failed; the rationale being that if we can't contact the node for pausing the sync, it's likely wiping will fail too, but after the wipe has been done, we can continue. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Jul 26, 2012
-
-
Iustin Pop authored
In case LUClusterVerifyGroup is run on a group which doesn't contain the master node, the following could happen: - master node is selected due to the explicit check - if the order of nodes in the 'absent_nodes' list is such that the master node is the first in it, then we'll select (again) the master node - passing duplicate nodes to RPC calls will break due to RPC internals; this should be fixed separately, but in the meantime we just refrain from passing such duplicates This patch should not change the semantics of the code, since it wasn't guaranteed even before that we find a vm_capable node. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Iustin Pop authored
Commit 904b3bfe tried to fix the deletion of custom ndparams from group, but instead broke both modification and deletion: because we run ForceDictType on self.op.ndparams instead of the updated new_ndparams, we can neither delete nor set properly spindle_count (since it won't be coerced to int). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jul 19, 2012
-
-
René Nussbaumer authored
On node groups we don't have the std field. However, the InstancePolicy object always verifies that the std value is within a given range. As we fill it up with defaults if not set (as it happens to be on node groups) and the min value is higher than the default std value (taken from constants.py) we fail. We overcome this situation by simply let the function know if we want to verify the std value at all. If we don't want to verify std, we just set it to a compliant value (min_v) and continue. We also slightly adapt the error message provided, as we don't have std values on groups. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 13, 2012
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Agata Murawska <agatamurawska@google.com>
-
- Jul 11, 2012
-
-
Agata Murawska authored
When we delete DRBD disks from some instance, we do not want to get errors due to nodes other than that instance's primary being offline. Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Jul 07, 2012
-
-
Iustin Pop authored
Currently, this is not allowed, so one can't run a replace-disks; this breaks any non-invasive method of recovering the redundancy of the instance if its disks are already stopped (but it still works if the disks on the primary are active). So let's fix this inconsistency. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Jul 05, 2012
-
-
Iustin Pop authored
Currently, _RedistributeAncillaryFiles computes two lists: the list of online nodes (for all files redistribution), and the list of vm_capable nodes, for hypervisor-specific files. However, the vm_capable list includes offline nodes too, leading to warning messages: WARNING: Copy of file /etc/xen/xend-config.sxp to node node13.example.com failed: Node is marked offline We fix this by trivially intersecting the vm_capable list with the online one. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
René Nussbaumer authored
This error does not show up until we exceed the pool of master candidates and have nodes which are not master candidates. The background is that we check for master-ip-setup script on master candidates and expect them not to be on the other nodes. However, we distribute a default master-ip-script which break this assumption. Furthermore, there's no reason why the file should just exists on the master candidates. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jun 27, 2012
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Agata Murawska <agatamurawska@google.com>
-
René Nussbaumer authored
This annotates the disks for the blockdev_remove where it is appropriate. It leaves out 2 cases were we can't reliably annotate disk parameters due to lack of knowledge what we should annotate. Those cases affects only lvs used for drbd, so it doesn't affect the bug reported by Constantinos. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Agata Murawska <agatamurawska@google.com>
-
René Nussbaumer authored
This is also related to the bug reported by Constantinos, as we've only one getmirrorstatus_multi call in whole cmdlib, we just annotate them while we are building the disk list. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Agata Murawska <agatamurawska@google.com>
-
René Nussbaumer authored
Not annotating them works for DRBD but not for RBD as reported by Constantinos. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Agata Murawska <agatamurawska@google.com>
-
- Jun 20, 2012
-
-
Iustin Pop authored
_PrepareNicModification returns the invalid type, which triggers an assert resulting in a mysterious error: Failure: command execution error: Without any explanation. We fix this by removing the return value from _PrepareNicModification, and instead returning the expected type (since it differs per create/modification) from the (existing) wrappers for this function. We don't need to return actual changes from this function as _ApplyNicMods is the function that computes/returns the formatted changes. Signed-off-by:
Iustin Pop <iustin@google.com> Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jun 15, 2012
-
-
René Nussbaumer authored
This prevents from setting for example drbd options on the plain disk template. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jun 14, 2012
-
-
Iustin Pop authored
As reported on the devel mailing list by Christos Stavrakakis, creation of plain instances is broken when the --no-wait-for-sync flag is passed, because in that case WaitForSync is not called, hence SetDiskID is not called at all, resulting in a None physical_id being passed to backend. We fix that by explicitly calling SetDiskID, which will cover the pause/resume and os_add RPC calls. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jun 08, 2012
-
-
Iustin Pop authored
This has been reported internally 3-4 times already, and the current version (from 8b437a6e) is still not good enough, it seems. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jun 01, 2012
-
-
Iustin Pop authored
Commit 2e04d454 introduced the new offline state for the instance state, but being a big monolithic patch it sneaked in something that doesn't make sense. The checks for extra instances (either wrongly up or just unknown) are done purely on a name-basis, not on objects, so the types there are wrong. Furthermore, they have no relation to the admin state of the instance, so we just drop the entire if block. We keep the increment of the offline instance count, but move it to a different loop over instances. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- May 22, 2012
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This is adapted from the design doc. Also fixes a typo in cmdlib.py. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- May 15, 2012
-
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This function did the opposite: was computing which old instance violated the specs but no longer do it now. new - old is the expected behaviour. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Currently, we only get: instance3: ['disk-size value 512 is not in range [1024, 1048576]' which doesn't explain which disk we are talking about. This patch extends the verification functions to take an additional parameter that qualifies the disk: instance3: ['disk-size/0 value 512 is not in range [1024, 1048576]' Future patch will make the formatting of the list better. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- May 14, 2012
-
-
Michael Hanselmann authored
If an instance is imported with a different name, network settings may have to be changed. Since import scripts may not already to the right thing, we decided to run the rename script. The same technique is already used for inter-cluster instance moves. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
This was a pretty non-obvious bug. A cluster looks sane after gnt-cluster init, however on a daemon restart the diskparameters had the default filled in. The same applies to gnt-group add. This is due to the nature that UpgradeConfig() from NodeGroups did just populate them with defaults if something was set on it. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- May 11, 2012
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- May 10, 2012
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Similiar to blockdev_create we sometimes do find on children. This fixes those cases. However, this is not very nice. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
This is due to the nature of bdev. We spread some logic into cmdlib and deal for example with it's children recursively. This makes it hard to annotate the disk parameters in a generic way as we don't always deal with the top most disk. But the disk parameters are depending on the top device not on the children. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
The following (blockdev) RPC calls are not converted yet (as they are not straight forward or need more research): * bdev_sizes * blockdev_remove * blockdev_shutdown * blockdev_removechildren * blockdev_close * blockdev_getsize * drbd_disconnect_net * blockdev_rename (has already a special encoder, needs further research if needed at all) * blockdev_getmirrorstatus (not sure if we have everywhere a clear link to the instance the disk belongs) * blockdev_getmirrorstatus_multi (same here, further research) Then special cases where we take care later in the patch series: * blockdev_create (special cased) * blockdev_find (special cased, like blockdev_create) Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Apr 26, 2012
-
-
Iustin Pop authored
This also improves the log messages for the (default) relative mode ("by %s to %s"). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-