Commits · 2911f46c23b6ea1815828537c5b3622109c83b45 · itminedu / snf-ganeti

Jun 11, 2012

Add the keymap directory to the list of runtime KVM dirs · 2911f46c

Iustin Pop authored 13 years ago


Commit 4f580fef added the keymap support, but missed that this
directory needs to be ensured/created at hypervisor init time.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

2911f46c

May 11, 2012

Fix gnt-group --help display · 3ad56046

Iustin Pop authored 13 years ago


Copy-paste mismatch :)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
(cherry picked from commit 36c70d4d)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

3ad56046

Fix hardcoded Xen kernel path · 3ac3de7a

Iustin Pop authored 13 years ago


We already have a ./configure-time variable for this, but it seems to
be actually unused.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit 3c4afa2e)

Signed-off-by: Iustin Pop <iustin@google.com>
(trivial patch, let's cherry-pick it)
Reviewed-by: Michael Hanselmann <hansmi@google.com>

3ac3de7a

Fix grow-disk handling of invalid units · b53874cb

Iustin Pop authored 13 years ago


The reason why grow-disk was doing:

$ gnt-instance grow-disk instance3 0 -64
Unhandled Ganeti error: Invalid format

is because it does its own ParseUnit call, and doesn't transform the
resulting failure into a nicer message.
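
A minimal sketch of the idea, in Python: catch the low-level parse
failure and re-raise it as an error the CLI can present nicely. The
names parse_unit, GenericError and PrereqError are hypothetical
stand-ins, not Ganeti's actual functions or classes.

```python
class GenericError(Exception):
    """Stand-in for a low-level error the CLI does not format."""


class PrereqError(Exception):
    """Stand-in for an error type the CLI reports in a friendly way."""


def parse_unit(text):
    """Hypothetical parser: accept a non-negative integer size."""
    if not text.isdigit():
        raise GenericError("Invalid format")
    return int(text)


def parse_grow_amount(text):
    """Translate the parse failure into a user-facing error."""
    try:
        return parse_unit(text)
    except GenericError as err:
        raise PrereqError("Invalid disk size '%s': %s" % (text, err)) from err
```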

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
(cherry picked from commit c8bde61e)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

b53874cb

Accept both PUT and POST in noded · 37c4d509

Iustin Pop authored 13 years ago


This is a partial cherry-pick from
7530364d on master:

Currently, noded requires PUT, even though the semantics of the RPC
calls do not match a PUT. We change the code to accept both PUT and
POST, with the intention to remove the PUT support in a later version.

Additionally, we add a message to the HttpBadRequest exception to make
clear the failure mode (not seeing any error message was what made me
send this patch…). This was the only description-less use of this
exception, by the way.
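
As an illustration only, the method check could look like the Python
sketch below. HttpBadRequest here is a local stand-in class and
check_request_method is a hypothetical helper; the real noded code is
likely structured differently.

```python
class HttpBadRequest(Exception):
    """Local stand-in for the HTTP 400 exception mentioned above."""


# Both methods are accepted during the transition period.
ACCEPTED_METHODS = ("PUT", "POST")


def check_request_method(method):
    """Return the normalised method, or fail with a descriptive message."""
    normalised = method.upper()
    if normalised not in ACCEPTED_METHODS:
        raise HttpBadRequest(
            "Only the %s methods are supported, got %r"
            % (" and ".join(ACCEPTED_METHODS), method))
    return normalised
```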

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
(cherry picked from commit 7530364d)

What was not cherry-picked is the rpc change (to switch to POST). The
reason I want to backport this to devel-2.5 is that when upgrading to
2.6, having noded accept both makes for an easier upgrade path.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit 5d0566de)

Signed-off-by: Iustin Pop <iustin@google.com>

Yet another cherry-pick (must go deeper!); since we might not make a
new release from the devel-2.5 branch, let's add this to stable-2.5.

Reviewed-by: Michael Hanselmann <hansmi@google.com>

37c4d509

Update synopsis for “gnt-cluster repair-disk-sizes” · 0fc1764f

Michael Hanselmann authored 13 years ago


Mention that instances can be passed on the CLI when “--help” is used.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
(cherry picked from commit eb5ac108)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

0fc1764f

Workaround changed LVM behaviour · 4c5dd3ff

Iustin Pop authored 13 years ago


The vgreduce command has changed behaviour from when we initially
wrote the code (2.02.02 versus 2.02.66, 4 years delta):

- if there are LVs which will be impacted, it requires --force
- without --force it refuses to proceed, but still returns exit code 0

We handle this by looking to see if it returns "Wrote out consistent
volume group" (behaviour unchanged), or if it complains about
"--force"; in the case it didn't complete, we retry the operation.

We also slightly improve the checking of "vgs", as it used to fail
silently and we didn't detect it.
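
The vgreduce handling described above can be sketched roughly as
follows. This is an assumption-laden outline, not the actual code: the
"run" callable, the function name and the choice to retry by adding
--force are hypothetical, and --removemissing is assumed to be the
vgreduce mode in use.

```python
SUCCESS_MARKER = "Wrote out consistent volume group"


def reduce_vg(run, vg_name):
    """Run vgreduce and decide success from its output, not its exit code.

    "run" is a hypothetical callable that takes an argument list and
    returns an (exit_code, output) pair.
    """
    code, output = run(["vgreduce", "--removemissing", vg_name])
    if code == 0 and SUCCESS_MARKER in output:
        return True
    if "--force" in output:
        # The tool asked for --force and did nothing: retry once with it.
        code, output = run(["vgreduce", "--removemissing", "--force", vg_name])
        return code == 0 and SUCCESS_MARKER in output
    return False
```

Passing the runner in as a parameter keeps the decision logic testable
against canned output files, in the spirit of the data files mentioned
below.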

New tests for this function should test, I believe, all the expected
variations; at the least we now have data files with the expected
output.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit 048eeb2b)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

4c5dd3ff

May 09, 2012

Add a default PATH variable to OS scripts env · 9a6ade06

Iustin Pop authored 13 years ago


In commit 896a03f6 I cleaned up the environment for OS scripts,
however I think that was a bit too extreme - it breaks our own
instance-debootstrap hooks, because for example dpkg (called from the
grub script) requires PATH to be set.

Instead of requiring every OS to define a path, let's set a default
PATH for the OS scripts, which should cover most common uses. A more
specialised PATH can be set, if needed, in the OS scripts.
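
A small Python sketch of the approach: the environment is built from
scratch rather than inherited, and a default PATH is always present.
The PATH value and the helper name are assumptions chosen for
illustration, not the values used in the code.

```python
# Assumed default; the actual constant may list different directories.
DEFAULT_OS_SCRIPT_PATH = "/sbin:/bin:/usr/sbin:/usr/bin"


def build_os_script_env(instance_vars):
    """Build a clean environment for an OS script.

    Nothing is inherited from the calling process; only the default PATH
    and the explicitly passed variables are included.
    """
    env = {"PATH": DEFAULT_OS_SCRIPT_PATH}
    env.update(instance_vars)
    return env
```

The resulting dictionary can be handed to subprocess.run(..., env=env)
so that tools such as dpkg find their helpers.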

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

9a6ade06

Move hooks PATH environment variable to constants · aa7b59ac

Andrea Spadaccini authored 13 years ago


Move the contents of the PATH environment variable for hooks to
constants, and use its value in the code and in the hooks documentation.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit fe5ca2bb)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

aa7b59ac

Fix exception re-raising in Python Luxi clients · 98dfcaff

Iustin Pop authored 13 years ago


Commit e687ec01 (present in 2.5 since the 2.5 beta 3) did consistency
fixes across the code-base. Unfortunately this was done without enough
checks on the actual meaning of one of the fixes, which means error
re-raising in lib/errors.py is broken.

The problem is that:

  raise cls, args

is different than:

  raise cls(args)

And our unit-tests didn't catch this (this patch updates the tests).

This breakage is usually trivial, like wrong error messages:

  $ gnt-instance remove no-such-instance
  Failure: prerequisites not met for this operation:
  ("Instance 'no-such-instance' not known", 'unknown_entity')

versus:

  $ gnt-instance remove no-such-instance
  Failure: prerequisites not met for this operation:
  error type: unknown_entity, error details:
  Instance 'no-such-instance' not known

or:

  $ gnt-instance add … no-such-instance
  Failure: prerequisites not met for this operation:
  ('The given name (no-such-instance) does not resolve: Name or service not known', 'resolver_error')

versus:

  $ gnt-instance add … no-such-instance
  Failure: prerequisites not met for this operation:
  error type: resolver_error, error details:
  The given name (no-such-instance) does not resolve: Name or service not known

But in some cases where we rely on a certain data representation
(e.g. HooksAbort), this actually breaks because we try to iterate over
the wrong type:

  File "/usr/lib/python2.6/dist-packages/ganeti/cli.py", line 1907, in FormatError
     for node, script, out in err.args[0]:
  ValueError: need more than 1 value to unpack
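
The difference can be reproduced in Python 3, where "raise cls, args"
no longer exists but its effect for a tuple (unpacking into the
constructor) can be written as cls(*args). ErrorSketch and the sample
hook data are hypothetical stand-ins.

```python
class ErrorSketch(Exception):
    """Hypothetical stand-in for an exception class from lib/errors.py."""


def rebuild_unpacked(cls, args):
    # Equivalent of Python 2's "raise cls, args" when args is a tuple:
    # the tuple is unpacked into the constructor arguments.
    return cls(*args)


def rebuild_wrapped(cls, args):
    # Equivalent of "raise cls(args)": the whole tuple is one argument.
    return cls(args)


# One (node, script, output) triple, as a hooks failure might carry it.
hook_args = ([("node1", "pre-script", "failed")],)

good = rebuild_unpacked(ErrorSketch, hook_args)
bad = rebuild_wrapped(ErrorSketch, hook_args)

# good.args[0] is the list of triples, so this loop works.
for node, script, out in good.args[0]:
    print(node, script, out)

# bad.args[0] is the outer tuple; its single item is a one-element list,
# which cannot be unpacked into three names.
try:
    for node, script, out in bad.args[0]:
        print(node, script, out)
except ValueError as err:
    print("unpack failed:", err)
```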

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

98dfcaff

May 07, 2012

Fix LVM volume listing with newer LVM · a1f38213

Iustin Pop authored 13 years ago


Per commit 0304f0ec, newer LVM has extended the lv_attr field. However,
that commit was incomplete as we examine this attribute in another
place in the code.

As user alperhome pointed out, the _LVSLINE_REGEX in lib/backend.py
also needs fixing. I've used the same change as in the above commit:
accept at minimum 6 characters, but allow for more.
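
For illustration, a pattern with this "six or more" rule might look
like the Python sketch below. The field layout (vg|name|size|attr) and
the pattern itself are assumptions; the real _LVSLINE_REGEX may differ.

```python
import re

# Hypothetical pattern: the last field is the lv_attr string, which must
# have at least six characters but may be longer on newer LVM versions.
LVS_LINE_SKETCH = re.compile(
    r"^\s*([^|]+)\|([^|]+)\|([0-9.]+)\|([^|\s]{6,})\|?\s*$")


def parse_lvs_line(line):
    """Return (vg, name, size, attr), or None if the line does not match."""
    match = LVS_LINE_SKETCH.match(line)
    if match is None:
        return None
    vg, name, size, attr = match.groups()
    return vg, name, float(size), attr
```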

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

a1f38213

Apr 11, 2012

Fix extra whitespace · 612f7fd4

Iustin Pop authored 13 years ago


Sorry, didn't catch this before…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit 54b010ca)

Signed-off-by: Michael Hanselmann <hansmi@google.com>

612f7fd4

Further fixes concerning drbd port release · 42f25b0b

Dimitris Aragiorgis authored 13 years ago


Commit 3b3b1bca does not entirely fix the bug introduced in commit
f396ad8c. It fixes consistency of config data in permanent storage, but
does not ensure consistency in data held in runtime memory of masterd.

The bug of duplicate ports is still triggered when LUInstanceRemove()
invokes _RemoveDisks() and this returns False (in case the
call_blockdev_remove RPC fails). The DRBD ports get returned to the
pool, but execution is aborted and RemoveInstance() is never invoked.

Because port handling is not done with TemporaryReservationManager,
ensure that ports are released only if the disk-related config data is
deleted.

In _RemoveDisks() release ports only if all RPCs succeed.

Extend _RemoveDisks() to include ignore_failures argument passed by
_RemoveInstance() to handle the ports appropriately.
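
A rough Python sketch of the intended behaviour, under the assumption
that ignore_failures means the instance will be removed from the
config regardless. The function signature, the callables and the disk
representation are hypothetical.

```python
def remove_disks(disks, remove_blockdev, release_port, ignore_failures=False):
    """Remove all disks; release their ports only when it is safe.

    remove_blockdev(disk) is a hypothetical callable returning True on
    success; release_port(port) returns a port to the pool.
    """
    all_ok = True
    for disk in disks:
        if not remove_blockdev(disk):
            all_ok = False

    # Ports go back to the pool only if every removal succeeded, or if
    # the caller will delete the disk config data despite the failures.
    if all_ok or ignore_failures:
        for disk in disks:
            release_port(disk["port"])
    return all_ok
```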

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

42f25b0b

Fix a bug concerning TCP port release · 2522b7c4

Dimitris Aragiorgis authored 13 years ago


Commit f396ad8c returns the TCP port used by DRBD disk back to the
TCP/UDP port pool using AddTcpUdpPort().

However, AddTcpUdpPort() writes the config on every invocation,
using _WriteConfig(). This causes two problems:

 * it causes critical errors logged by VerifyConfig(), after the DRBD
   disk removal, and until the actual instance removal.
 * if the code following AddTcpUdpPort() fails, the port has already
   been returned to the pool, which causes the port to have duplicates
   (inconsistent config).

AddTcpUdpPort() is invoked in three cases:

 * during InstanceRemove() through _RemoveDisks().
 * during InstanceSetParams() in case of disk removal.
 * during InstanceSetParams() through _ConvertDrbdToPlain().

This commit fixes the problem by removing the _WriteConfig() call from
AddTcpUdpPort(), delegating it to Update() via the
TemporaryReservationManager, and ensuring AddTcpUdpPort() precedes
Update().
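
The deferred write can be pictured with the simplified Python model
below. It is only a sketch: the class, its attributes and the way
pending ports are tracked are assumptions, and the real
TemporaryReservationManager is more involved.

```python
class ConfigSketch:
    """Toy model of a config with a port pool and deferred writes."""

    def __init__(self):
        self.port_pool = set()
        self._pending_ports = set()
        self.writes = 0

    def add_tcp_udp_port(self, port):
        # Only record the port in memory; no config write happens here.
        self._pending_ports.add(port)

    def update(self):
        # Commit the pending ports and write the config exactly once.
        self.port_pool |= self._pending_ports
        self._pending_ports.clear()
        self.writes += 1
```

If the code between add_tcp_udp_port() and update() fails, the pool is
left untouched and nothing has been written.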

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
[iustin@google.com: small comments adjustments]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit 3b3b1bca)

2522b7c4

Mar 30, 2012

Fix extra whitespace · 54b010ca

Iustin Pop authored 13 years ago


Sorry, didn't catch this before…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

54b010ca

Mar 29, 2012

Fix a bug concerning TCP port release · 3b3b1bca

Dimitris Aragiorgis authored 13 years ago


Commit f396ad8c returns the TCP port used by DRBD disk back to the
TCP/UDP port pool using AddTcpUdpPort().

However, AddTcpUdpPort() writes the config on every invocation,
using _WriteConfig(). This causes two problems:

 * it causes critical errors logged by VerifyConfig(), after the DRBD
   disk removal, and until the actual instance removal.
 * if the code following AddTcpUdpPort() fails, the port has already
   been returned to the pool, which causes the port to have duplicates
   (inconsistent config).

AddTcpUdpPort() is invoked in three cases:

 * during InstanceRemove() through _RemoveDisks().
 * during InstanceSetParams() in case of disk removal.
 * during InstanceSetParams() through _ConvertDrbdToPlain().

This commit fixes the problem by removing the _WriteConfig() call from
AddTcpUdpPort(), delegating it to Update() via the
TemporaryReservationManager, and ensuring AddTcpUdpPort() precedes
Update().

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
[iustin@google.com: small comments adjustments]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

3b3b1bca

Mar 28, 2012

LUOobCommand: acquire BGL in shared mode · 6977943c

Bernardo Dal Seno authored 13 years ago


Fixed a typo so that now LUOobCommand acquires the BGL in shared mode, as
intended.

Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

6977943c

Mar 23, 2012

LUNodeAdd: Verify version in Prereq · 2d453213

René Nussbaumer authored 13 years ago


There are other ways to leave the cluster in a broken state than just
the version check. However, they are not trivial to fix in 2.5, so we
leave a nicer fix for 2.6.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
(cherry picked from commit e2ea8de1)

2d453213

Fix LV status parsing to accept newer LVM · 0304f0ec

Iustin Pop authored 13 years ago


LVM version 2.02.93 (or at least, some time after .88) has extended the
lv_attr field with two more flags; we only care about the first digit,
so let's change the "!= 6" check to "< 6".

Thanks to Robin H Johnson <robbat2@gentoo.org> for finding this issue.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

0304f0ec

Mar 22, 2012

Revert "Stop acquiring BGL for LUXI queries" · 6fe4baf0

Michael Hanselmann authored 13 years ago


This reverts commit 0fa753ba.

Turns out there are more queries acquiring locks than we'd like. This
patch goes to version 2.6 and a separate patch fixes the immediate
issues in LUClusterVerifyConfig.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

6fe4baf0

LUClusterVerifyConfig: Share BGL, acquire all locks in shared mode · a5485ffc

Michael Hanselmann authored 13 years ago


Instead of acquiring the BGL in exclusive mode (which blocks all other
operations), we acquire all locks for groups, nodes and instances in
shared mode before verifying the configuration.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

a5485ffc

Mar 21, 2012

KVM: don't add -nographic using spice · 596b2459

Guido Trotter authored 13 years ago


This fixes issue 222.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

596b2459

Mar 20, 2012

Stop acquiring BGL for LUXI queries · 0fa753ba

Michael Hanselmann authored 13 years ago


Short description: This fixes an issue whereby masterd would become
unresponsive on the LUXI socket, leading to client timeouts. While made
worse in 2.5, the underlying issue was already present in 2.4.

Longer description: Until now all LUXI queries would acquire the BGL
(big Ganeti lock) in shared mode. With the exception of OpNodeAdd and
OpNodeRemove, this was also the case for all opcodes before version 2.5.
In 2.5 we split OpClusterVerify into multiple opcodes, one of which
(OpClusterVerifyConfig) now acquires the BGL in exclusive mode. Whether
or not doing so is good is a separate discussion: OpNodeAdd and
OpNodeRemove, as of this writing, still require an exclusive BGL.
OpClusterVerifyConfig is run more often than OpNodeAdd or OpNodeRemove
in normal clusters, which is why we only recognized this issue in 2.5.

What would happen is that once OpClusterVerifyConfig tried to acquire
its exclusive BGL while it was actually held by other opcodes (e.g.
OpInstanceReplaceDisks), the locking code would not grant shared
acquires for the BGL, even when the exclusive acquire is removed from
the queue for a short amount of time after a timeout. This is necessary
to prevent lock starvation.

In this situation further LUXI queries requiring the BGL in shared mode,
e.g. OpClusterQuery, would block and the client would eventually time out.
Over time they fill the client request workerpool's queue and at that
point even requests not requiring the BGL stop working. Once the
long-running operation(s) holding the BGL in shared mode finished,
OpClusterVerifyConfig gets it in exclusive mode and everything returns
to normal. LUXI recovers very soon too.
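
The blocking behaviour described above can be modelled with a small,
non-blocking Python sketch. It is a deliberately simplified toy (no
threads, no timeouts, hypothetical class and method names), meant only
to show why a pending exclusive request stops new shared acquires.

```python
class SharedLockModel:
    """Toy model of a lock with shared and exclusive modes."""

    def __init__(self):
        self.shared_holders = 0
        self.exclusive_held = False
        self.exclusive_pending = False

    def try_acquire_shared(self):
        # New shared acquires are refused while an exclusive request is
        # held or waiting; this is what prevents starvation of the
        # exclusive requester.
        if self.exclusive_held or self.exclusive_pending:
            return False
        self.shared_holders += 1
        return True

    def release_shared(self):
        self.shared_holders -= 1

    def try_acquire_exclusive(self):
        if self.shared_holders or self.exclusive_held:
            self.exclusive_pending = True
            return False
        self.exclusive_pending = False
        self.exclusive_held = True
        return True

    def release_exclusive(self):
        self.exclusive_held = False
```

In this model a long-running shared holder plus one waiting exclusive
request is enough to make every later try_acquire_shared() fail until
the holder releases and the exclusive request has been served.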

I'd like to thank Bernardo Dal Seno for his contribution to this bugfix.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

0fa753ba

Mar 19, 2012

Fix type error in LUInstanceChangeGroup · 666e013f

Iustin Pop authored 13 years ago


If a specific list of groups has been requested, then the code used
that, without transforming it to a (frozen)set first, which results
in:

 unsupported operand type(s) for &: 'list' and 'frozenset'

Trivial fix is to do that in the 'then' branch.
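
The failure and the fix are easy to reproduce in plain Python; the
group names below are made up for the example.

```python
requested = ["group1", "group2"]           # list, as passed by the user
available = frozenset(["group2", "group3"])

# A list does not support "&" with a frozenset, so this raises TypeError.
try:
    requested & available
    type_error_seen = False
except TypeError:
    type_error_seen = True

# Converting the list first makes the intersection well defined.
common = frozenset(requested) & available
```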

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

666e013f

Jan 31, 2012

Fix type check for OpQuery.filter · 545d0362

Michael Hanselmann authored 13 years ago

Just using ht.TListOf as a type check doesn't work correctly. The
function must be called with the expected item type. In this specific
case TListOf was always called with the filter as a value, and the
result of that call evaluated to truth. Since filters can be quite
complex there's no check yet, and therefore just “TList” is used.
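
The pitfall can be shown with simplified re-creations of the checkers
in Python. These are not the real ht module functions, only minimal
look-alikes; TString is included as an assumed item check.

```python
def TList(value):
    """Check that the value is a list."""
    return isinstance(value, list)


def TString(value):
    """Check that the value is a string (assumed item check)."""
    return isinstance(value, str)


def TListOf(item_check):
    """Return a checker for lists whose items all pass item_check."""
    def check(value):
        return TList(value) and all(item_check(item) for item in value)
    return check


# Misuse: TListOf itself is used as the checker. The value is taken as
# "item_check" and a function object comes back, which is always truthy.
misused_result = TListOf("definitely not a list")

# Correct use: build the checker first, then apply it to the value.
proper_result = TListOf(TString)("definitely not a list")
```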

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

545d0362

Jan 26, 2012

Fix explanation of gnt-node evacuate --primaries-only · f1dff7ec

Iustin Pop authored 13 years ago


Furthermore, correct the --help display on evacuate.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

f1dff7ec

Jan 25, 2012

Fix cluster verification issues on multi-group clusters · 2c2f257d

Michael Hanselmann authored 13 years ago


This patch attempts to fix a number of issues with “gnt-cluster verify”
in presence of multiple node groups and DRBD8 instances split over nodes
in more than one group.

- Look up instances in a group only by their primary node (otherwise
  split instances would be considered when verifying any of their node's
  groups)
- When gathering additional nodes for LV checks, just compare instance's
  node's groups with the currently verified group instead of comparing
  against the primary node's group
- Exclude nodes in other groups when calculating N+1 errors and checking
  logical volumes

Not directly related, but a small error text is also clarified.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

2c2f257d

Jan 20, 2012

Migrate: don't check for free memory on cleanup · 6b826dfa

Guido Trotter authored 13 years ago

Cleanup just updates the config with the correct location of the
instance, or informs of its down status, but never starts it. As such
there's no point in checking for enough free memory. Actually this check
could prevent a perfectly safe cleanup operation if a node is busy.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

6b826dfa

Jan 06, 2012

KVM: support version reported by 1.0 · 585c8187

Guido Trotter authored 13 years ago


This of course was working for all the rcs, but broke with 1.0 itself.

In addition:
  - split between running kvm --version and parsing its output
  - unittest parsing for various known --help outputs
  - updated NEWS file
  - happy 2012 wishes
  - the hope to finish this patch before it's time to say happy easter
    :)
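
Separating "run the binary" from "parse the text" makes the parser
easy to unit-test. The Python sketch below assumes the breakage came
from a version string with only two components; the pattern, the
function name and the idea of defaulting the third component to 0 are
assumptions, not the actual implementation.

```python
import re

# Accept both "X.Y" and "X.Y.Z" after the word "version".
_VERSION_SKETCH_RE = re.compile(r"\bversion\s+(\d+)\.(\d+)(?:\.(\d+))?")


def parse_kvm_version(text):
    """Extract a (major, minor, patch) tuple from already-captured output."""
    match = _VERSION_SKETCH_RE.search(text)
    if match is None:
        raise ValueError("Cannot find a version number in the output")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))
```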

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

585c8187

Dec 21, 2011

jqueue: Fix epylint errors introduced in 37d76f1e · 1316ebc2

Michael Hanselmann authored 13 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

1316ebc2

jqueue: Fix deadlock between job queue and dependency manager · 37d76f1e

Michael Hanselmann authored 13 years ago


When an opcode is about to be processed its dependencies are
evaluated using “_JobDependencyManager.CheckAndRegister”. Due
to its nature that function requires a lock on the manager's
internal structures. All of this happens while the job queue
lock is held in shared mode (required for the job processor).

When a job has been processed any pending dependencies are re-added
to the job workerpool. Before this patch that would require
the manager's lock and then, for adding the jobs, the job queue
lock. Since this is in reverse order it will lead to deadlocks.
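
One common way to avoid this kind of lock-order inversion is to never
hold the two locks in the reverse order: collect the work under the
first lock, release it, and only then take the second. The Python
sketch below illustrates that pattern with hypothetical names; it is
not necessarily how the patch itself resolves the issue.

```python
import threading

queue_lock = threading.Lock()      # stands in for the job queue lock
manager_lock = threading.Lock()    # stands in for the manager's lock

# Finished job -> jobs waiting on it (sample data for the sketch).
waiting = {"job1": ["job2", "job3"]}
workerpool = []


def notify_finished(job_id):
    """Re-add dependent jobs without holding both locks at once."""
    # Step 1: only the manager lock is held while collecting dependents.
    with manager_lock:
        dependents = waiting.pop(job_id, [])
    # Step 2: the manager lock is released before the queue lock is
    # taken, so the "manager then queue" ordering never occurs.
    with queue_lock:
        workerpool.extend(dependents)
    return dependents
```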

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

37d76f1e

Nov 30, 2011

Fix a bug in command line option parsing code · 997f690f

Nikos Skalkotos authored 13 years ago


Fix bug affecting command line options of "keyval" type. Although
escaping commands with \ is supported, it is not applied to the
input recursively.
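
As a loose illustration, a splitter that honours every backslash
escape in the input can be written as a single scan. The Python sketch
assumes that the separator between key=value pairs is a comma and that
a backslash escapes the following character; the function name and
these semantics are assumptions, not the actual Ganeti helper.

```python
def unescape_and_split(text, sep=","):
    """Split text on unescaped separators, dropping the escape character."""
    parts = []
    current = []
    i = 0
    while i < len(text):
        char = text[i]
        if char == "\\" and i + 1 < len(text):
            # Keep the escaped character literally and skip the backslash.
            current.append(text[i + 1])
            i += 2
        elif char == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(char)
            i += 1
    parts.append("".join(current))
    return parts
```

Because every escape is handled as it is met, a field containing
several escaped separators is treated the same as one containing a
single escape.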

Signed-off-by: Nikos Skalkotos <skalkoto@grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

997f690f

Nov 24, 2011

ConfigWriter: Fix epydoc error · 1d4930b9

Michael Hanselmann authored 13 years ago


The parameter is called “mods”, not “modes”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
(cherry picked from commit 1730d4a1)

1d4930b9

ConfigWriter: Fix epydoc error · 1730d4a1

Michael Hanselmann authored 13 years ago


The parameter is called “mods”, not “modes”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

1730d4a1

LUGroupAssignNodes: Fix node membership corruption · 54c31fd3

Michael Hanselmann authored 13 years ago


Note: This bug only manifests itself in Ganeti 2.5, but since the
problematic code also exists in 2.4, I decided to fix it there.

If a node was assigned to a new group using “gnt-group assign-nodes” the
node object's group would be changed, but not the duplicate member list
in the group object. The latter is an optimization to require fewer
locks for other operations. The per-group member list is only kept in
memory and not written to disk.

Ganeti 2.5 starts to make use of the data kept in the per-group member
list and consequently fails when it is out of date. The following
commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was
confirmed using additional logging):

  $ gnt-group add foo
  $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
  $ gnt-cluster verify  # Fails with KeyError

This patch moves the code modifying node and group objects into
“config.ConfigWriter” to do the complete operation under the config
lock, and also to avoid making use of side-effects of modifying objects
without calling “ConfigWriter.Update”. A unittest is included.
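
The principle (change the node's group and both member lists in one
step, under one lock) can be sketched in Python as below. The class,
method and attribute names are hypothetical and much simpler than
"config.ConfigWriter".

```python
import threading


class GroupConfigSketch:
    """Toy config keeping node->group and group->members in sync."""

    def __init__(self, node_groups, group_names):
        self._lock = threading.Lock()
        self.node_group = dict(node_groups)
        self.group_members = {name: set() for name in group_names}
        for node, group in self.node_group.items():
            self.group_members[group].add(node)

    def assign_group_nodes(self, changes):
        """Apply a list of (node, new_group) pairs atomically."""
        with self._lock:
            for node, new_group in changes:
                old_group = self.node_group[node]
                # Update the duplicate member lists together with the
                # node's own group so they can never disagree.
                self.group_members[old_group].discard(node)
                self.group_members[new_group].add(node)
                self.node_group[node] = new_group
```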

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit 218f4c3d)

54c31fd3

LUGroupAssignNodes: Fix node membership corruption · 218f4c3d

Michael Hanselmann authored 13 years ago


Note: This bug only manifests itself in Ganeti 2.5, but since the
problematic code also exists in 2.4, I decided to fix it there.

If a node was assigned to a new group using “gnt-group assign-nodes” the
node object's group would be changed, but not the duplicate member list
in the group object. The latter is an optimization to require fewer
locks for other operations. The per-group member list is only kept in
memory and not written to disk.

Ganeti 2.5 starts to make use of the data kept in the per-group member
list and consequently fails when it is out of date. The following
commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was
confirmed using additional logging):

  $ gnt-group add foo
  $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
  $ gnt-cluster verify  # Fails with KeyError

This patch moves the code modifying node and group objects into
“config.ConfigWriter” to do the complete operation under the config
lock, and also to avoid making use of side-effects of modifying objects
without calling “ConfigWriter.Update”. A unittest is included.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

218f4c3d

Fix pylint warning on unreachable code · 9c4f4dd6

Michael Hanselmann authored 13 years ago


Commit c50452c3 added an exception when all instances should be
evacuated off a node, but did so in a way which made pylint complain
about unreachable code.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

9c4f4dd6

Nov 23, 2011

LUNodeEvacuate: Disallow migrating all instances at once · c50452c3

Michael Hanselmann authored 13 years ago


There is a design issue in the iallocator interface which prevents us
from doing this.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

c50452c3

LUNodeEvacuate: Locking fixes · 50722bfd

Michael Hanselmann authored 13 years ago


When evacuating a node, only an assertion without informative text was
used to check if the necessary node locks had been acquired. This was on
top of evaluating the list of nodes without having a node group lock, so
this was changed as well.

Also update some exception messages to include “retry the operation”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

50722bfd

Fix error when removing node · d05326fc

Michael Hanselmann authored 13 years ago


ConfigWriter.GetAllInstancesInfo returns a dictionary, not a list.
Removing a node would fail with “too many values to unpack”.
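
The same mistake is easy to reproduce in plain Python: iterating over
a dictionary yields its keys, so unpacking each key into two names
fails. The instance data below is made up for the example.

```python
instances = {
    "instance1.example.com": {"primary_node": "node1"},
}

# Iterating the dict directly yields key strings; unpacking a long
# string into two names raises ValueError.
try:
    for name, info in instances:
        pass
    unpack_failed = False
except ValueError:
    unpack_failed = True

# Iterating over items() yields the (name, info) pairs that were meant.
on_node1 = [name for name, info in instances.items()
            if info["primary_node"] == "node1"]
```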

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

d05326fc