Commits · 091232220479c4b35c69d19d53a729c9874d8fda · itminedu / snf-ganeti

Sep 04, 2012

René Nussbaumer authored 12 years ago


There was an issue with the recent ipolicy introduction which led to a
bug in gnt-debug iallocator. It was not providing the spindle_use field
and therefore wouldn't let you create a valid iallocator request.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

09123222

Sep 03, 2012

Fix warnings/errors with newer pylint · 8ad0da1e

Iustin Pop authored 12 years ago


To help develop Ganeti on newer distributions, let's try to fix
pylint warnings/errors. I'm using pylint from current Debian wheezy:
pylint 0.25.1, astng 0.23.1, common 0.58.0, and we have 3 things that
need fixing.

First, a really wide "except", with the silencing in the wrong
place. I'm not sure why this doesn't have "except Exception", so let's
add it. However, pylint still complains about "Catching too general
exception", even though we do want to catch both system exceptions and
our own, so let's add a silence for W0703. It's true that we
shouldn't catch KeyboardInterrupt and friends, but that should be
cleaned up on the master branch.
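
As a minimal sketch of this first fix (the helper name `run_safely` is hypothetical and not taken from the Ganeti tree, and the syntax is modern Python rather than what the codebase used in 2012), a deliberately broad handler with the silence placed on the `except` line itself might look like this:

```python
import logging


def run_safely(fn, *args):
    """Call fn(*args); return (True, result) or (False, error message)."""
    try:
        return (True, fn(*args))
    except Exception as err:  # pylint: disable=W0703
        # Deliberately broad: system errors and application errors alike
        # end up here. KeyboardInterrupt and SystemExit derive from
        # BaseException, so they still propagate, unlike with a bare "except:".
        logging.error("Call failed: %s", err)
        return (False, str(err))
```

With `except Exception` instead of a bare `except`, an interrupt from the keyboard is no longer swallowed by this handler.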

Second, pylint complains about "redefining name builtin tuple",
because we do some pattern matching in the except blocks in
netutils. This seems to be a false positive, but let's clean the code
around this.

And finally, type inference again goes bad, so let's silence E1103
with its "boolean doesn't have 'get' method".

After this, I can run "make lint", and by extension "make
commit-check" on Debian Wheezy, yay! We might be able to bump our
required pylint versions to something not ancient…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

8ad0da1e

Aug 22, 2012

Fix computation of disk sizes in _ComputeDiskSize · 6a3166cb

Constantinos Venetsanopoulos authored 12 years ago


Currently, hail fails with FailDisk when trying to add an instance
of type: 'file', 'sharedfile' and 'rbd'.

This is due to a "0" or None value in the corresponding dict inside
_ComputeDiskSize, which results in a "0" or non-Int value of the
exported 'disk_space_total' parameter. This in turn makes hail fail
when trying to process the value:

 - with "Unable to read Int" if value is None (file)
 - with FailDisk if value is 0 (sharedfile, rbd)

The latter happens because the 0 value doesn't match the instance's
IPolicy, since it is lower than the minimum disk size.

The second problem still exists when using adoption with 'plain'
and 'blockdev' template and will be addressed in another commit.

Signed-off-by: Constantinos Venetsanopoulos <cven@grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

6a3166cb

Aug 15, 2012

Add verification of RPC results in _WipeDisks · f08e5132

Iustin Pop authored 12 years ago


Due to an oversight, the pause/resume sync RPC calls in _WipeDisks
lack the verification of the overall RPC status, and directly iterate
over the payload. The code actually doing the wipe does verify
the results correctly. This can result in jobs failing with a
hard-to-diagnose:

OpExecError ['NoneType' object is not iterable]

instead of a proper "RPC failed" message.

This patch adds a hard check on the pause call, but for the resume
call it just logs a warning if the RPC failed; the rationale being
that if we can't contact the node for pausing the sync, it's likely
wiping will fail too, but after the wipe has been done, we can
continue.
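
The hard-check/soft-check split described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: `RpcResult`, `wipe_with_paused_sync` and the three callables are hypothetical stand-ins, and only the `fail_msg`-before-`payload` ordering is the point.

```python
import logging


class OpExecError(Exception):
    """Stand-in for the error type named in the message above."""


class RpcResult(object):
    """Hypothetical RPC result: fail_msg is set when the call itself failed."""

    def __init__(self, payload=None, fail_msg=None):
        self.payload = payload
        self.fail_msg = fail_msg


def wipe_with_paused_sync(pause_rpc, wipe, resume_rpc, node):
    """Pause sync, wipe, resume sync; all three callables are assumptions."""
    result = pause_rpc()
    if result.fail_msg:
        # Hard check: if the node cannot be reached now, wiping will fail too.
        raise OpExecError("Failed to pause disk sync on node %s: %s" %
                          (node, result.fail_msg))
    # Only after the overall status is verified is the payload iterated.
    for idx, success in enumerate(result.payload):
        if not success:
            logging.warning("Pausing sync of disk %d on %s failed", idx, node)

    wipe()

    result = resume_rpc()
    if result.fail_msg:
        # Soft check: the wipe is already done, so log and continue.
        logging.warning("Failed to resume disk sync on node %s: %s",
                        node, result.fail_msg)
        return False
    return True
```

Without the `fail_msg` checks, a failed call would leave `payload` as None and the loop would raise the "'NoneType' object is not iterable" error quoted above.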

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

f08e5132

Jul 26, 2012

Fix issue in LUClusterVerifyGroup with multi-group clusters · 350506c6

Iustin Pop authored 12 years ago


In case LUClusterVerifyGroup is run on a group which doesn't contain
the master node, the following could happen:

- master node is selected due to the explicit check
- if the order of nodes in the 'absent_nodes' list is such that the
  master node is the first in it, then we'll select (again) the master
  node
- passing duplicate nodes to RPC calls will break due to RPC
  internals; this should be fixed separately, but in the meantime we
  just refrain from passing such duplicates

This patch should not change the semantics of the code, since it
wasn't guaranteed even before that we find a vm_capable node.
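
A minimal sketch of the idea, assuming a simplified selection that takes the master plus the first node from the absent list (the function name and the selection rule are illustrative assumptions, not the real LU code):

```python
def select_extra_nodes(master, absent_nodes):
    """Return the master plus the first absent node that is not a duplicate."""
    selected = [master]
    for node in absent_nodes:
        if node not in selected:
            # Skip the master if it happens to come first in the list.
            selected.append(node)
            break
    return selected
```

For example, `select_extra_nodes("master", ["master", "node2"])` yields the master once, followed by `node2`, so no duplicate reaches the RPC layer.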

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

350506c6

Fix node group modification of node parameters · 4bf27dab

Iustin Pop authored 12 years ago


Commit 904b3bfe tried to fix the deletion of custom ndparams from a
group, but instead broke both modification and deletion: because we
run ForceDictType on self.op.ndparams instead of the updated
new_ndparams, we can neither delete nor properly set spindle_count
(since it won't be coerced to int).
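
The ordering problem can be illustrated with a small sketch. The helpers below are simplified, hypothetical stand-ins (in particular `force_types` is not the real ForceDictType, and the `"default"` marker for deletion is an assumption): types are coerced on the merged dict, after deletions have been applied.

```python
DEFAULT = "default"  # assumed marker meaning "drop the custom value"


def updated_params(old, changes):
    """Merge changes into a copy of old; DEFAULT removes the key."""
    new = dict(old)
    for key, value in changes.items():
        if value == DEFAULT:
            new.pop(key, None)
        else:
            new[key] = value
    return new


def force_types(params, key_types):
    """Coerce values in place; a simplified stand-in for ForceDictType."""
    for key, value in list(params.items()):
        params[key] = key_types[key](value)
    return params
```

Calling `force_types` on the raw request would choke on the `"default"` marker and would leave the merged dict uncoerced; calling it on the result of `updated_params` handles both setting and deleting.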

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

4bf27dab

Jul 19, 2012

Fix setting ipolicy on node groups · 8b057218

René Nussbaumer authored 12 years ago


On node groups we don't have the std field. However, the InstancePolicy
object always verifies that the std value is within a given range. As we
fill it up with defaults if not set (as it happens to be on node groups)
and the min value is higher than the default std value (taken from
constants.py) we fail.

We overcome this situation by simply letting the function know whether
we want to verify the std value at all. If we don't want to verify std,
we just set it to a compliant value (min_v) and continue.

We also slightly adapt the error message provided, as we don't have std
values on groups.
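
A rough sketch of that approach, with a hypothetical function name, parameter names and error wording (only `min_v` appears in the message above):

```python
def check_ispec_order(name, min_v, std_v, max_v, check_std=True):
    """Raise ValueError unless min <= std <= max (std optional)."""
    if check_std:
        err = ("Invalid specification of min/max/std values for %s: %s/%s/%s" %
               (name, min_v, max_v, std_v))
    else:
        # Groups carry no std value: use a compliant one and adapt the message.
        err = ("Invalid specification of min/max values for %s: %s/%s" %
               (name, min_v, max_v))
        std_v = min_v
    if min_v > std_v or std_v > max_v:
        raise ValueError(err)
```

With `check_std=False`, a default std below the group's minimum no longer triggers a failure, while an inverted min/max pair is still rejected.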

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

8b057218

Jul 13, 2012

Allow reinstall even when secondaries are offline · 96c3d5d4

René Nussbaumer authored 12 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

96c3d5d4

Jul 11, 2012

Ignore offline node errors when removing disks · 03e5cdd5

Agata Murawska authored 12 years ago


When we delete DRBD disks from some instance, we do not want to get
errors due to nodes other than that instance's primary being offline.

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

03e5cdd5

Jul 07, 2012

Allow instance disk activation with offline secondaries · d908ba61

Iustin Pop authored 12 years ago


Currently, this is not allowed, so one can't run a replace-disks; this
breaks any non-invasive method of recovering the redundancy of the
instance if its disks are already stopped (but it still works if the
disks on the primary are active). So let's fix this inconsistency.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

d908ba61

Jul 05, 2012

Fix redistribution of files w.r.t. offline nodes · cc706abc

Iustin Pop authored 12 years ago


Currently, _RedistributeAncillaryFiles computes two lists: the list of
online nodes (for all files redistribution), and the list of
vm_capable nodes, for hypervisor-specific files. However, the
vm_capable list includes offline nodes too, leading to warning
messages:

  WARNING: Copy of file /etc/xen/xend-config.sxp to node node13.example.com failed: Node is marked offline

We fix this by trivially intersecting the vm_capable list with the
online one.
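
In sketch form, assuming nodes are given as `(name, offline, vm_capable)` tuples (this shape and the function name are assumptions for illustration):

```python
def redistribution_targets(nodes):
    """Return (online node names, online vm_capable node names)."""
    online = [name for (name, offline, _) in nodes if not offline]
    online_set = frozenset(online)
    # Intersect with the online list so offline vm_capable nodes are skipped.
    vm_nodes = [name for (name, _, vm_capable) in nodes
                if vm_capable and name in online_set]
    return online, vm_nodes
```

An offline node that is vm_capable then appears in neither list, so no copy is attempted and no warning is logged for it.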

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

cc706abc

Fix cluster verify error on master-ip-setup script · 770461fe

René Nussbaumer authored 12 years ago


This error does not show up until we exceed the pool of master
candidates and have nodes which are not master candidates.

The background is that we check for the master-ip-setup script on master
candidates and expect it not to be on the other nodes. However, we
distribute a default master-ip-script, which breaks this assumption.
Furthermore, there's no reason why the file should exist only on the
master candidates.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

770461fe

Jun 27, 2012

Annotate disks upon blockdev_shutdown · 55de1d68

René Nussbaumer authored 12 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

55de1d68

Annotate disks on blockdev_remove · 4504bfcb

René Nussbaumer authored 12 years ago


This annotates the disks for blockdev_remove where it is
appropriate. It leaves out 2 cases where we can't reliably annotate disk
parameters because we don't know what we should annotate. Those cases
affect only LVs used for DRBD, so they don't affect the bug reported by
Constantinos.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

4504bfcb

Annotate disk params on blockdev_getmirrorstatus_multi · b5cbddd9

René Nussbaumer authored 12 years ago


This is also related to the bug reported by Constantinos.
As we have only one getmirrorstatus_multi call in the whole of cmdlib,
we just annotate the disks while we are building the disk list.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

b5cbddd9

Annotate disk parameters on blockdev_getmirrorstatus · 70817cee

René Nussbaumer authored 12 years ago


Not annotating them works for DRBD but not for RBD as reported by
Constantinos.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

70817cee

Jun 20, 2012

Fix bug in instance net changes · 80b898f9

Iustin Pop authored 12 years ago


_PrepareNicModification returns an invalid type, which triggers an
assert resulting in a mysterious error:

Failure: command execution error:

with no further explanation. We fix this by removing the return value from
_PrepareNicModification, and instead returning the expected type
(since it differs per create/modification) from the (existing)
wrappers for this function. We don't need to return actual changes
from this function as _ApplyNicMods is the function that
computes/returns the formatted changes.

Signed-off-by: Iustin Pop <iustin@google.com>
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

80b898f9

Jun 15, 2012

Verify the options on diskparameters · e4a4391d

René Nussbaumer authored 12 years ago


This prevents setting, for example, DRBD options on the plain disk
template.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e4a4391d

Jun 14, 2012

Fix creation of plain instances with --no-wait-for-sync · d8960502

Iustin Pop authored 12 years ago


As reported on the devel mailing list by Christos Stavrakakis,
creation of plain instances is broken when the --no-wait-for-sync flag
is passed, because in that case WaitForSync is not called, hence
SetDiskID is not called at all, resulting in a None physical_id being
passed to backend.

We fix that by explicitly calling SetDiskID, which will cover the
pause/resume and os_add RPC calls.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

d8960502

Jun 08, 2012

Improve error message for auto-promote/node modify · b59092f7

Iustin Pop authored 12 years ago


This has been reported internally 3-4 times already, and the current
version (from 8b437a6e) is still not good enough, it seems.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

b59092f7

Jun 01, 2012

Fix a type issue and bad logic in cluster verification · e375fb61

Iustin Pop authored 13 years ago


Commit 2e04d454 introduced the new offline state for the instance
state, but being a big monolithic patch it sneaked in something that
doesn't make sense.

The checks for extra instances (either wrongly up or just unknown) are
done purely on a name-basis, not on objects, so the types there are
wrong. Furthermore, they have no relation to the admin state of the
instance, so we just drop the entire if block. We keep the increment
of the offline instance count, but move it to a different loop over
instances.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e375fb61

May 22, 2012

Make it possible to reset vcpu/spindle ratio to default · cd415612

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

cd415612

Add man page documentation for cpu_mask hv parameter · ff39194f

Iustin Pop authored 13 years ago


This is adapted from the design doc.

Also fixes a typo in cmdlib.py.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

ff39194f

May 15, 2012

Beautify a couple of error messages · ea0f78c8

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

ea0f78c8

Fix _ComputeNewInstanceViolations logic · 0fd5547a

Iustin Pop authored 13 years ago


This function did the opposite of what was intended: it computed which
old instances violated the specs but no longer do. new - old is the
expected behaviour.
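
As a minimal illustration of the intended direction of the difference (the function name and the use of plain name sets are assumptions, not the real signature):

```python
def new_violations(old_violating, new_violating):
    """Instances that violate the new policy but did not violate the old one."""
    return frozenset(new_violating) - frozenset(old_violating)
```

Swapping the operands would instead return instances that used to violate the policy and no longer do, which is the bug described above.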

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

0fd5547a

Beautify disk ipolicy violations in cluster-verify · 0c2e59ac

Iustin Pop authored 13 years ago


Currently, we only get:

  instance3: ['disk-size value 512 is not in range [1024, 1048576]'

which doesn't explain which disk we are talking about. This patch
extends the verification functions to take an additional parameter
that qualifies the disk:

  instance3: ['disk-size/0 value 512 is not in range [1024, 1048576]'

A future patch will make the formatting of the list better.
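
A sketch of how such a qualifier could be threaded into the message (the function name and signature are hypothetical; only the message format is taken from the output above):

```python
def check_min_max(name, qualifier, value, min_v, max_v):
    """Return an error string if value is out of range, else None."""
    if min_v <= value <= max_v:
        return None
    # Compare against None explicitly: disk index 0 is a valid qualifier.
    fqn = name if qualifier is None else "%s/%s" % (name, qualifier)
    return "%s value %s is not in range [%s, %s]" % (fqn, value, min_v, max_v)
```

Passing the disk index as the qualifier produces `disk-size/0` in the message, while callers without a qualifier keep the old wording.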

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

0c2e59ac

May 14, 2012

LUInstanceCreate: Run rename script on instance import · e78a6817

Michael Hanselmann authored 13 years ago


If an instance is imported with a different name, network settings may have to
be changed. Since import scripts may not already do the right thing, we decided
to run the rename script. The same technique is already used for inter-cluster
instance moves.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

e78a6817

gnt-group add: Fix diskparam fill · 7228ca91

René Nussbaumer authored 13 years ago


This was a pretty non-obvious bug. A cluster looks sane after
gnt-cluster init; however, on a daemon restart the diskparameters had the
defaults filled in. The same applies to gnt-group add. This is because
UpgradeConfig() from NodeGroups just populated them with defaults if
something was set on it.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

7228ca91

gnt-group modify: Fix an update issue with diskparams · b3230b32

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

b3230b32

May 11, 2012

query: Expose diskparameters through query · 2c758845

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

2c758845

gnt-cluster info: Print and format disk parameters · f9bbf32b

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

f9bbf32b

May 10, 2012

apidoc: Fix some typos and errors introduced by my previous patches · af9fb4cc

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

af9fb4cc

LUGroup*: Fix inheritance of disk parameters · b3f0d718

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

b3f0d718

Special case blockdev_find · 769b0bde

René Nussbaumer authored 13 years ago


Similar to blockdev_create, we sometimes do a find on children. This fixes
those cases. However, this is not very nice.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

769b0bde

Special case blockdev_create · fc5bb0fe

René Nussbaumer authored 13 years ago


This is due to the nature of bdev. We spread some logic into cmdlib and
deal, for example, with its children recursively. This makes it hard to
annotate the disk parameters in a generic way, as we don't always deal
with the topmost disk. But the disk parameters depend on the top
device, not on the children.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

fc5bb0fe

cmdlib: Adding annotation helper for special cases · 887b52e8

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

887b52e8

cmdlib: Remove all diskparams calculations not required anymore · 99ccf8b9

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

99ccf8b9

cmdlib: Adapt the rpc calls · 62bfbc7d

René Nussbaumer authored 13 years ago


The following (blockdev) RPC calls are not converted yet (as they are
not straightforward or need more research):

* bdev_sizes
* blockdev_remove
* blockdev_shutdown
* blockdev_removechildren
* blockdev_close
* blockdev_getsize
* drbd_disconnect_net
* blockdev_rename (already has a special encoder, needs further research
  on whether it is needed at all)
* blockdev_getmirrorstatus (not sure if we have a clear link everywhere
  to the instance the disk belongs to)
* blockdev_getmirrorstatus_multi (same here, further research)

Then there are special cases that we take care of later in the patch series:

* blockdev_create (special cased)
* blockdev_find (special cased, like blockdev_create)

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

62bfbc7d

rpc: Adding helper to annotate disk params · cd46491f

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

cd46491f

Apr 26, 2012

Add 'absolute' grow-disk mode at OpCode/LU level · e7f99087

Iustin Pop authored 13 years ago


This also improves the log messages for the (default) relative mode
("by %s to %s").

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e7f99087