Commits · 9b99be280663a7281f6269ece7ba2f31add385ef · itminedu / snf-ganeti

Oct 11, 2012

verify-disks: Explicitly state nothing has to be done · 9b99be28

Michael Hanselmann authored 12 years ago


Example output:
$ gnt-cluster verify-disks
Submitted jobs 4327
Waiting for job 4327 ...
No disks need to be activated.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Oct 05, 2012

Better list of replace-disks arguments + typos fixed · 50c1e351

Bernardo Dal Seno authored 12 years ago


The man page and the built-in help for gnt-instance replace-disks were
inconsistent. Also fixed some typos in man pages.

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

jqueue: Look at archived jobs when watching · e4cf42d4

Michael Hanselmann authored 12 years ago


First: This enables the use of “gnt-job watch $id” for archived jobs.

Now, the reason for actually making this work is that during
sufficiently large group or node evacuations jobs are archived before
the client gets to poll for their output. This led to situations where
the jobs would finish successfully, but the client reported an error
because it couldn't see the job anymore.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
(cherry picked from commit 04569469)
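
A minimal sketch of the lookup order this commit describes, not the actual jqueue code: try the live queue first, then fall back to the archive. The function name and the dictionary-based stores are hypothetical stand-ins; the real queue keeps jobs on disk.

```python
def load_job(job_id, live_jobs, archived_jobs):
    """Return the job from the live queue, else from the archive, else None."""
    job = live_jobs.get(job_id)
    if job is None:
        # Without this fallback, a job archived before the client polls
        # looks as if it had vanished, even though it finished fine.
        job = archived_jobs.get(job_id)
    return job


# Hypothetical data: job 4327 was already archived when the client asks.
live = {4328: "running"}
archive = {4327: "success"}
status = load_job(4327, live, archive)
```

With only the live queue consulted, the lookup for job 4327 would return nothing and the client would report an error for a job that actually succeeded.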

Oct 03, 2012

Show old primary/secondary node on disk replacement · f0f8d060

Michael Hanselmann authored 12 years ago


People unfamiliar with Ganeti's internals might be confused by the
different hostnames showing up later in the process.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

gnt-instance reinstall: Don't always exit with success · 64be07b1

Michael Hanselmann authored 12 years ago


If one or more jobs failed the exit status should be set accordingly.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
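
A minimal sketch of the behaviour this commit asks for, assuming each job reports a simple success flag. The names `exit_status_for`, `EXIT_SUCCESS` and `EXIT_FAILURE` are hypothetical and are not taken from the Ganeti code.

```python
EXIT_SUCCESS = 0
EXIT_FAILURE = 1


def exit_status_for(job_results):
    """Map a list of (job_id, succeeded) pairs to a process exit code."""
    failed = [job_id for (job_id, succeeded) in job_results if not succeeded]
    # Any single failed job makes the whole command fail.
    return EXIT_FAILURE if failed else EXIT_SUCCESS


status = exit_status_for([(101, True), (102, False)])
```

A caller would pass the result to `sys.exit()` so that shell scripts wrapping the command can detect partial failures.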

LUClusterVerify: Ignore /proc/drbd if DRBD is disabled · 2ef3383e

Michael Hanselmann authored 12 years ago


This fixes issue 190. The problem was that the check for DRBD was
enabled if LVM storage is used and didn't depend at all on whether DRBD
is enabled.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit 3d8ae327)

Sep 27, 2012

Always_failover doesn't require --allow-failover anymore · 320a5dae

Bernardo Dal Seno authored 12 years ago


If an administrator sets always_failover, it means that there is no need
for another explicit approval to fail over instead of migrating.

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit b5f0b5cc)

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Sep 12, 2012

rpc: Remove duplicated logic, fix unittests · 0e2b7c58

Michael Hanselmann authored 12 years ago


Commit 5fce6a89 changed RpcRunner._InstDict to add the disk parameters
on all encoded instances. It didn't remove a special case in
“_InstDictOspDp”. Update and fix unittests as well.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Annotate disk params on instance_start · 5fce6a89

Constantinos Venetsanopoulos authored 12 years ago


We call _GatherAndLinkBlockDevs during the process, which in turn
calls _RecursiveFindBD. This needs disk parameters to work.

See also commit b8291e00.

This was reported by Ansgar and Damien.

Signed-off-by: Constantinos Venetsanopoulos <cven@grnet.gr>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

cmdlib: Handle locking.ALL_SET correctly when copying locks · ef86bf28

Michael Hanselmann authored 12 years ago


When locks are copied “locking.ALL_SET” must be handled separately
(ALL_SET has the value None). Reported by Constantinos Venetsanopoulos
who saw failover for RBD-based instances not working.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
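
A minimal sketch of why the sentinel needs its own branch, not the actual cmdlib code. The only fact taken from the commit is that ALL_SET has the value None; the helper name `copy_lock_names` is hypothetical.

```python
ALL_SET = None  # per the commit message, locking.ALL_SET is None


def copy_lock_names(names):
    """Copy a collection of lock names, passing the ALL_SET sentinel through."""
    if names is ALL_SET:
        # list(None) would raise TypeError, so the sentinel is kept as-is.
        return ALL_SET
    return list(names)


original = ["node1", "node2"]
copied = copy_lock_names(original)
everything = copy_lock_names(ALL_SET)
```

The copy is a new list, so later changes to it do not leak back into the original, while "all locks" stays "all locks".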

Sep 04, 2012

Fix gnt-debug iallocator · 09123222

René Nussbaumer authored 12 years ago


There was an issue with the recent ipolicy introduction which led to a
bug in gnt-debug iallocator. It was not providing the spindle_use field
and therefore wouldn't let you create a valid iallocator request.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Sep 03, 2012

Fix warnings/errors with newer pylint · 8ad0da1e

Iustin Pop authored 12 years ago


To help developing Ganeti on newer distributions, let's try to fix
pylint warnings/errors. I'm using pylint from current Debian wheezy:
pylint 0.25.1, astng 0.23.1, common 0.58.0, and we have 3 things that
need fixing.

First, a really wide "except", with the silencing in the wrong
place. I'm not sure why this doesn't have "except Exception", so let's
add it. However, pylint still complains about "Catching too general
exception", even though we do want to catch both system exceptions and
our own, so let's add a silence for W0703. It's true that we
shouldn't catch KeyboardInterrupt and friends, but that should be
cleaned up on the master branch.

Second, pylint complains about "redefining name builtin tuple",
because we do some pattern matching in the except blocks in
netutils. This seems to be a false positive, but let's clean the code
around this.

And finally, type inference again goes bad, so let's silence E1103
with its "boolean doesn't have 'get' method".

After this, I can run "make lint", and by extension "make
commit-check" on Debian Wheezy, yay! We might be able to bump our
required pylint versions to something not ancient…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
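
A minimal sketch of the first fix described in this commit, not the code that was changed: an explicit `except Exception` with the W0703 silence placed on the `except` line itself. The helper `run_safely` is hypothetical.

```python
def run_safely(fn):
    """Call fn() and return (ok, value_or_message) instead of raising."""
    try:
        return (True, fn())
    except Exception as err:  # pylint: disable=W0703
        # Deliberately broad: both system errors and application errors
        # are reported to the caller rather than propagated.
        return (False, str(err))


ok_result = run_safely(lambda: 42)
bad_result = run_safely(lambda: 1 // 0)
```

Since `KeyboardInterrupt` and `SystemExit` derive from `BaseException`, `except Exception` lets them propagate, unlike a bare `except:`.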

Fix decorator uses which crash newer pylint · fc3f75dd

Iustin Pop authored 12 years ago


Pylint version:

  pylint 0.25.1,
  astng 0.23.1, common 0.58.0

crashes when passing the fully-qualified decorator name with:

  File "/usr/lib/pymodules/python2.7/pylint/checkers/base.py", line 161, in visit_function
    if not redefined_by_decorator(node):
  File "/usr/lib/pymodules/python2.7/pylint/checkers/base.py", line 116, in redefined_by_decorator
    decorator.expr.name == node.name):
AttributeError: 'Getattr' object has no attribute 'name'

I found out that simply using a shortened name will 'fix' this issue,
so let's do this to allow running newer pylint versions.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Aug 22, 2012

Fix computation of disk sizes in _ComputeDiskSize · 6a3166cb

Constantinos Venetsanopoulos authored 12 years ago


Currently, hail fails with FailDisk when trying to add an instance
of type: 'file', 'sharedfile' and 'rbd'.

This is due to a "0" or None value in the corresponding dict inside
_ComputeDiskSize, which results in a "0" or non-Int value of the
exported 'disk_space_total' parameter. This in turn makes hail fail,
when trying to process the value:

 - with "Unable to read Int" if value is None (file)
 - with FailDisk if value is 0 (sharedfile, rbd)

The latter happens because the 0 value doesn't match the instance's
IPolicy, since it is lower than the minimum disk size.

The second problem still exists when using adoption with 'plain'
and 'blockdev' template and will be addressed in another commit.

Signed-off-by: Constantinos Venetsanopoulos <cven@grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Aug 15, 2012

Add verification of RPC results in _WipeDisks · f08e5132

Iustin Pop authored 12 years ago


Due to an oversight, the pause/resume sync RPC calls in _WipeDisks
lack the verification of the overall RPC status, and directly iterate
over the payload. The code actually doing the wipe does verify
the results correctly. This can result in jobs failing with a
hard-to-diagnose:

OpExecError ['NoneType' object is not iterable]

instead of a proper "RPC failed" message.

This patch adds a hard check on the pause call, but for the resume
call it just logs a warning if the RPC failed; the rationale being
that if we can't contact the node for pausing the sync, it's likely
wiping will fail too, but after the wipe has been done, we can
continue.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
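
A minimal sketch of the two policies this commit describes, not the actual _WipeDisks code: a failed pause call aborts, a failed resume call only logs a warning. `RpcResult`, `check_pause` and `check_resume` are hypothetical stand-ins; only the OpExecError name comes from the commit message.

```python
import logging


class OpExecError(Exception):
    """Stand-in for the error type named in the commit message."""


class RpcResult(object):
    """Hypothetical RPC result; fail_msg is set when the whole call failed."""

    def __init__(self, payload=None, fail_msg=None):
        self.payload = payload
        self.fail_msg = fail_msg


def check_pause(result, node):
    """Hard check: without a reachable node the wipe would likely fail too."""
    if result.fail_msg:
        raise OpExecError("Failed to pause disk sync on node %s: %s"
                          % (node, result.fail_msg))
    return list(result.payload)


def check_resume(result, node):
    """Soft check: the wipe is already done, so only warn and carry on."""
    if result.fail_msg:
        logging.warning("Failed to resume disk sync on node %s: %s",
                        node, result.fail_msg)
        return []
    return list(result.payload)
```

Checking `fail_msg` before touching `payload` is what replaces the "'NoneType' object is not iterable" error with a readable message.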

Aug 10, 2012

Fix double use of PRIORITY_OPT in gnt-node migrate · 7db596df

Iustin Pop authored 12 years ago


This breaks the command, as optparse considers that an error.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
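
A small reproduction of the failure mode, using only the standard optparse module: listing the same option twice makes the parser constructor raise an error. The option definition here is an illustrative stand-in for the real PRIORITY_OPT, whose exact arguments are not shown in the commit.

```python
import optparse

# Illustrative stand-in for the real option object.
PRIORITY_OPT = optparse.make_option("--priority", dest="priority",
                                    default=None, help="Job priority")


def build_parser(options):
    """Build a parser from a list of option objects."""
    return optparse.OptionParser(option_list=options)


try:
    build_parser([PRIORITY_OPT, PRIORITY_OPT])
    conflict = None
except optparse.OptionConflictError as err:
    # With the default conflict handler, a repeated option string is an error.
    conflict = err
```

Listing the option once builds the parser without complaint, which is the whole fix.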

Jul 27, 2012

Fix 'explicitely' common typo · 2ed0e208

Iustin Pop authored 12 years ago


It seems that 'explicitely' is wrong, and that the right form is
'explicitly'. This is just fixing the typo plus adjusting affected
paragraphs.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Jul 26, 2012

Fix issue in LUClusterVerifyGroup with multi-group clusters · 350506c6

Iustin Pop authored 12 years ago


In case LUClusterVerifyGroup is run on a group which doesn't contain
the master node, the following could happen:

- master node is selected due to the explicit check
- if the order of nodes in the 'absent_nodes' list is such that the
  master node is the first in it, then we'll select (again) the master
  node
- passing duplicate nodes to RPC calls will break due to RPC
  internals; this should be fixed separately, but in the meantime we
  just refrain from passing such duplicates

This patch should not change the semantics of the code, since it
wasn't guaranteed even before that we find a vm_capable node.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
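
A minimal sketch of avoiding the duplicate, assuming the selection works roughly as the list in this commit describes: the master is always included, then one node is picked from the absent ones. The function name and the exact selection rule are hypothetical; the real LUClusterVerifyGroup logic may differ.

```python
def select_additional_nodes(master, absent_nodes):
    """Pick the master plus the first absent node that is not already chosen."""
    selected = [master]
    for node in absent_nodes:
        if node not in selected:
            selected.append(node)
            break
    return selected


# The master happens to be first in the absent list.
chosen = select_additional_nodes("master", ["master", "node7", "node9"])
```

Skipping already-selected names keeps each node in the RPC target list at most once, whatever the order of the input.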

Fix node group modification of node parameters · 4bf27dab

Iustin Pop authored 12 years ago


Commit 904b3bfe tried to fix the deletion of custom ndparams from
group, but instead broke both modification and deletion: because we
run ForceDictType on self.op.ndparams instead of the updated
new_ndparams, we can neither delete nor properly set spindle_count
(since it won't be coerced to int).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
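
A minimal sketch of the ordering issue in this commit, not the actual cmdlib code: type coercion has to run on the merged parameters, not on the raw request. `merge_params` and `force_int_values` are hypothetical stand-ins (the latter for ForceDictType), and the "default" marker for deletion is an assumption.

```python
VALUE_DEFAULT = "default"  # assumed marker meaning "drop the custom value"


def merge_params(old, changes):
    """Apply requested changes to a copy of the existing parameters."""
    new = dict(old)
    for key, value in changes.items():
        if value == VALUE_DEFAULT:
            new.pop(key, None)
        else:
            new[key] = value
    return new


def force_int_values(params):
    """Coerce every value to int in place (simplified type enforcement)."""
    for key in params:
        params[key] = int(params[key])


def update_group_ndparams(old, changes):
    new_ndparams = merge_params(old, changes)
    # Coerce the merged result: the deletion marker is already gone and
    # string values from the command line become integers.
    force_int_values(new_ndparams)
    return new_ndparams


modified = update_group_ndparams({"spindle_count": 1}, {"spindle_count": "3"})
deleted = update_group_ndparams({"spindle_count": 1},
                                {"spindle_count": VALUE_DEFAULT})
```

Coercing `changes` instead would fail on the deletion marker and would leave the stored value as the string "3".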

Jul 24, 2012

Fix boot=on flag for CDROMs · 24be50e0

Iustin Pop authored 12 years ago


This generalises commit 4304964a to cdroms too, since they have
somewhat the same logic. We just abstract the needs_boot_flag into a
separate variable, and then reuse it in the cdrom section.

Note that the logic of what 'if=' type to pass to KVM was very
convoluted, and (I think) incorrect; I went and cleaned it to be more
consistent.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

KVM: only pass boot flag once · 2b846304

Iustin Pop authored 12 years ago


This addresses issue 230: passing two methods of booting to KVM can,
depending on the KVM version, confuse it.

Note that commit 4304964a introduced a partial fix for this (but only
for disks, and keyed on KVM versions). However, it didn't fix cdrom
booting, which still fails with the same error, so let's fix it more
generically; we still leave the per-disk check since that is about
-boot c versus -drive …,boot=on rather than two boot methods.

Patch is based on the one submitted by Vladimir Mencl, many thanks!

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Jul 19, 2012

Fix setting ipolicy on node groups · 8b057218

René Nussbaumer authored 12 years ago


On node groups we don't have the std field. However, the InstancePolicy
object always verifies that the std value is within a given range. As we
fill it up with defaults if not set (as it happens to be on node groups)
and the min value is higher than the default std value (taken from
constants.py) we fail.

We overcome this situation by simply letting the function know whether
we want to verify the std value at all. If we don't want to verify std,
we just set it to a compliant value (min_v) and continue.

We also slightly adapt the error message provided, as we don't have std
values on groups.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
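
A minimal sketch of the approach in this commit, not the actual InstancePolicy code: a flag tells the range check whether std matters, and if it does not, std is set to min_v so the check passes. The function name, signature and error text are hypothetical.

```python
def check_ispec_range(name, min_v, std_v, max_v, check_std=True):
    """Verify min <= std <= max, optionally ignoring the std value."""
    if not check_std:
        # Node groups have no std value; use a value that always complies.
        std_v = min_v
    if not (min_v <= std_v <= max_v):
        if check_std:
            raise ValueError("Invalid specification of min/max/std values"
                             " for %s: %s/%s/%s" % (name, min_v, max_v, std_v))
        raise ValueError("Invalid specification of min/max values"
                         " for %s: %s/%s" % (name, min_v, max_v))
    return (min_v, std_v, max_v)


# A group policy whose minimum is above the cluster-wide default std.
group_spec = check_ispec_range("disk-size", 2048, 1024, 4096, check_std=False)
```

The same input with `check_std=True` is rejected, which is the failure the commit describes for node groups.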

Fix --no-headers for the new list-drbd command · 2da31181

Iustin Pop authored 12 years ago


Sorry, I forgot that with GenerateTable this needs to be handled
manually. Fixed now and tested in both ways.

(But to be honest, this should be abstracted in GenerateTable, instead
of the 'if' test in all its callers.)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
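
A minimal sketch of the caller-side 'if' test mentioned in this commit. `generate_table` is a hypothetical, much simplified stand-in for GenerateTable, assuming that passing no headers suppresses the header row; the field names are illustrative.

```python
def generate_table(headers, fields, rows):
    """Format rows as tab-separated lines; headers=None omits the header row."""
    lines = []
    if headers is not None:
        lines.append("\t".join(headers[field] for field in fields))
    for row in rows:
        lines.append("\t".join(str(row[field]) for field in fields))
    return lines


def list_drbd(rows, no_headers):
    fields = ["node", "minor"]
    # The manual handling every caller has to repeat.
    if no_headers:
        headers = None
    else:
        headers = {"node": "Node", "minor": "Minor"}
    return generate_table(headers, fields, rows)


data = [{"node": "node1", "minor": 0}]
with_headers = list_drbd(data, no_headers=False)
without_headers = list_drbd(data, no_headers=True)
```

Moving that test into the table helper, as the commit suggests, would remove the repeated branch from each caller.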

Add a new gnt-node command list-drbd · 7acbda7b

Iustin Pop authored 12 years ago


This uses confd to query the DRBD minors, which is very special; no
other command currently does so.

Since the backend is only implemented in the Haskell version of confd,
we have checks that 1) confd is enabled, and 2) hs confd is also
enabled. If by mistake people do manage to query Python confd, the
error message will be clean:

  Query gave non-ok status '2': not implemented

So nothing breaks in an "ugly" way.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Add a new unused confd query · 792f8e55

Iustin Pop authored 12 years ago


This is not implemented currently.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Fix a docstring in bdev's DRBD8 class · 5c755a4d

Iustin Pop authored 12 years ago


It seems this was not updated since the move to static minors…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Jul 18, 2012

Ensure that disk.params is always defined (and a dict) · 5dbee5ea

Iustin Pop authored 12 years ago


Commit cce46164 fixed upgrading from other 2.6 versions, but
accidentally broke upgrading from 2.5 (disk.params was left as None,
which breaks FillDict). The simplest way to handle params is to always
set them to an empty dict (disregarding what they currently contain).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
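
A minimal sketch of the fix described in this commit, not the actual objects.Disk code: the upgrade step always resets params to an empty dict. `Disk` and `fill_dict` here are simplified, hypothetical stand-ins for the real class and for FillDict.

```python
def fill_dict(defaults, custom):
    """Merge custom values over defaults; breaks if custom is None."""
    result = dict(defaults)
    result.update(custom)  # dict.update(None) raises TypeError
    return result


class Disk(object):
    def __init__(self, params=None):
        self.params = params

    def upgrade_config(self):
        # Always start from an empty dict, disregarding previous content.
        self.params = {}


from_2_5 = Disk(params=None)           # shape left behind by an older config
from_2_6 = Disk(params={"stale": 1})   # shape with unexpected values
for disk in (from_2_5, from_2_6):
    disk.upgrade_config()
filled = fill_dict({"resync-rate": 1024}, from_2_5.params)
```

Resetting unconditionally covers both upgrade paths with one line, at the cost of discarding whatever was stored before.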

Another small consistency fix with if branches · 1b5b1c49

René Nussbaumer authored 12 years ago


While looking at the testability of this piece of code, I found another
consistency problem. We have two if branches instead of one, with
elif's.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Fix inconsistency in the LUXI protocol w.r.t. args · 734a2a7c

René Nussbaumer authored 12 years ago


This inconsistency was found during rebalancing. Hbal failed because
Ganeti couldn't load the opcode. After digging through the cause, an
inconsistency with the "args" field in the LUXI protocol was triggered
by the TemplateHaskell side where it's done uniformly.

For SubmitJob and SubmitManyJobs we treat args as one argument,
containing the job definition. In every other LUXI call args is actually
a list of arguments. This patch fixes this inconsistency.

This change is NOT backwards compatible.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
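
A minimal sketch of the uniform convention this commit describes: "args" is always a list of arguments, so a job submission carries a one-element list holding the job definition. The helper `build_request`, the "method" key and the opcode contents are assumptions for illustration, not the verified wire format.

```python
import json


def build_request(method, *args):
    """Serialize a call so that "args" is always a list of arguments."""
    return json.dumps({"method": method, "args": list(args)})


# Illustrative job definition: a list of opcode dictionaries.
job_definition = [{"OP_ID": "OP_TEST_DELAY", "duration": 1}]

submit = json.loads(build_request("SubmitJob", job_definition))
query = json.loads(build_request("QueryJobs", [4327], ["status"]))
```

Under the old convention the job definition itself was the value of "args", so a receiver that treats every call the same way would unpack it as several arguments and fail to load the opcode.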

Jul 17, 2012

Fix UpgradeConfig of Disk object regarding disk params · cce46164

René Nussbaumer authored 12 years ago


This bug was found during disk parameter debugging. While looking at the
config some values seem present on the disk parameters, but that's not
expected behaviour. This patch fixes this, and also fixes the "broken"
configs automatically upon masterd restart.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Jul 13, 2012

Allow reinstall even when secondaries are offline · 96c3d5d4

René Nussbaumer authored 12 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Jul 11, 2012

Ignore offline node errors when removing disks · 03e5cdd5

Agata Murawska authored 12 years ago


When we delete DRBD disks from some instance, we do not want to get
errors due to nodes other than that instance's primary being offline.

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
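
A minimal sketch of the filtering rule in this commit, not the actual disk removal code: errors from offline nodes are dropped unless the node is the instance's primary. The function name and the shape of `results` are hypothetical.

```python
def removal_errors(results, primary, offline_nodes):
    """Return the errors worth reporting.

    results maps node name -> error message, or None on success.
    """
    errors = []
    for node in sorted(results):
        message = results[node]
        if message is None:
            continue
        if node != primary and node in offline_nodes:
            # An offline secondary cannot answer; that is expected.
            continue
        errors.append("%s: %s" % (node, message))
    return errors


reported = removal_errors(
    {"node1": None, "node2": "Node is marked offline"},
    primary="node1",
    offline_nodes={"node2"},
)
```

An error from the primary is still reported even when that node is offline, since the commit only exempts the other nodes.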

Jul 07, 2012

Allow instance disk activation with offline secondaries · d908ba61

Iustin Pop authored 12 years ago


Currently, this is not allowed, so one can't run a replace-disks; this
breaks any non-invasive method of recovering the redundancy of the
instance if its disks are already stopped (but it still works if the
disks on the primary are active). So let's fix this inconsistency.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Jul 06, 2012

RAPI regression beparams/memory fix · 28a45bfc

René Nussbaumer authored 12 years ago


For compatibility with the old Ganeti version, we want to keep the
beparams/memory field around for another release. This patch fixes this
regression.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Jul 05, 2012

Fix redistribution of files w.r.t. offline nodes · cc706abc

Iustin Pop authored 12 years ago


Currently, _RedistributeAncillaryFiles computes two lists: the list of
online nodes (for all files redistribution), and the list of
vm_capable nodes, for hypervisor-specific files. However, the
vm_capable list includes offline nodes too, leading to warning
messages:

  WARNING: Copy of file /etc/xen/xend-config.sxp to node node13.example.com failed: Node is marked offline

We fix this by trivially intersecting the vm_capable list with the
online one.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
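
A minimal sketch of the intersection this commit describes, not the actual _RedistributeAncillaryFiles code. The node dictionaries and the function name are hypothetical; only the idea of restricting the vm_capable list to online nodes comes from the commit.

```python
def redistribution_targets(nodes):
    """Return (online nodes, online vm_capable nodes) by name."""
    online = [node["name"] for node in nodes if not node["offline"]]
    vm_capable = [node["name"] for node in nodes if node["vm_capable"]]
    # Intersect with the online list so offline nodes never get a copy attempt.
    vm_online = [name for name in vm_capable if name in online]
    return online, vm_online


cluster = [
    {"name": "node1.example.com", "offline": False, "vm_capable": True},
    {"name": "node2.example.com", "offline": False, "vm_capable": False},
    {"name": "node13.example.com", "offline": True, "vm_capable": True},
]
all_files_targets, hv_files_targets = redistribution_targets(cluster)
```

With the intersection in place, the offline node13 no longer appears among the targets for hypervisor-specific files.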

Fix cluster verify error on master-ip-setup script · 770461fe

René Nussbaumer authored 12 years ago


This error does not show up until we exceed the pool of master
candidates and have nodes which are not master candidates.

The background is that we check for the master-ip-setup script on master
candidates and expect it not to be on the other nodes. However, we
distribute a default master-ip-script which breaks this assumption.
Furthermore, there's no reason why the file should exist only on the
master candidates.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Jun 29, 2012

Allow param `modify' during gnt-instance modify · f0d22861

Constantinos Venetsanopoulos authored 12 years ago


With the new gnt-instance modify syntax for addition and removal of
disks/NICs on arbitrary indexes, we hit an assertion if the user
passes `modify' as one of the disk's parameters. E.g::

 gnt-instance modify --disk 2:modify,size=3G instance1
 gnt-instance modify --disk 3:add,size=1G,modify instance2

This patch fixes the bug, by allowing `modify' to be passed as a
parameter (as happens with `add' and `remove'), as long as it is
not done alongside `add' or `remove'. If so, it is treated in the
same way as if none of modify/add/remove is passed --> modify.

Signed-off-by: Constantinos Venetsanopoulos <cven@grnet.gr>
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>
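
A minimal sketch of the rule this commit describes, not the actual gnt-instance parsing code: at most one of add/remove/modify may appear among the parameters, and when none is given the action defaults to modify. The function name and the dictionary representation of the parsed parameters are hypothetical.

```python
ACTIONS = ("add", "remove", "modify")


def split_action(params):
    """Return (action, remaining parameters) for one --disk/--net entry."""
    remaining = dict(params)
    given = [name for name in ACTIONS if name in remaining]
    if len(given) > 1:
        raise ValueError("Only one of %s may be given, got %s"
                         % ("/".join(ACTIONS), ", ".join(given)))
    action = given[0] if given else "modify"
    remaining.pop(action, None)
    return action, remaining


explicit = split_action({"modify": None, "size": "3G"})
implicit = split_action({"size": "3G"})
```

Both calls yield the same result, which matches the commit's statement that an explicit `modify' is treated like passing none of the three.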

Jun 28, 2012

Annotate disk params on instance_os_add · b8291e00

René Nussbaumer authored 12 years ago


We call _OpenRealBD during the process and this needs disk parameters to
work. This was reported by Constantinos.

The fix is very ugly though.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Jun 27, 2012

Annotate disks upon blockdev_shutdown · 55de1d68

René Nussbaumer authored 12 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

Annotate disks on blockdev_remove · 4504bfcb

René Nussbaumer authored 12 years ago


This annotates the disks for the blockdev_remove where it is
appropriate. It leaves out 2 cases where we can't reliably annotate disk
parameters due to lack of knowledge of what we should annotate. Those cases
affect only LVs used for DRBD, so it doesn't affect the bug reported by
Constantinos.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>
