Commit 7734de0a authored by Iustin Pop

Revert breakage introduced in e4e9b806

Commit e4e9b806 introduced two problems in backend.InstanceShutdown():

- first, it reduced the check interval significantly (especially for the
  first few checks); very few production VMs shut down within one second,
  and while this does not break anything, it creates unnecessary load on
  the hypervisor
- second, a wrong test added to the while condition (“not tried_once”)
  means that we sleep only once for an instance, and after that we
  immediately kill it forcefully
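
The effect of the second problem can be reproduced in isolation. The
following is a minimal sketch, not the Ganeti code: simulate_shutdown,
its parameters and the fake clock are hypothetical names introduced for
illustration, and the time spent in the hypervisor calls is ignored.

```python
def simulate_shutdown(buggy, timeout=120.0, sleep_time=1.0, stops_after=30.0):
    """Simulate the wait loop with a fake clock (hypothetical helper).

    The simulated instance disappears from the instance list once
    `stops_after` seconds have passed. Returns (polls, forced).
    """
    now = 0.0                # fake clock standing in for time.time()
    end = now + timeout
    tried_once = False
    polls = 0
    # buggy=True mimics "while not tried_once and time.time() < end"
    # buggy=False mimics "while time.time() < end"
    while not (buggy and tried_once) and now < end:
        tried_once = True    # stand-in for the hypervisor stop request
        now += sleep_time    # stand-in for time.sleep(sleep_time)
        polls += 1
        if now >= stops_after:   # stand-in for the instance list check
            break
    else:
        # the loop ended without a break: the shutdown would be forced
        return polls, True
    return polls, False


# An instance that needs 30 seconds to shut down cleanly:
print(simulate_shutdown(buggy=True, sleep_time=1.0))   # forced after one poll
print(simulate_shutdown(buggy=False, sleep_time=5.0))  # clean stop, several polls
```

With the extra test the loop body runs exactly once, so the else branch
is reached for every instance that is still listed after the first
sleep, regardless of the timeout.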

These two together mean that any instance which is not lucky enough to
finish in roughly 1-1.5 seconds (the time it takes to sleep and verify
the instance list again) will have this happen:

2009-10-21 23:33:46,034:  pid=16634 INFO Called for inst9 w. False/False
2009-10-21 23:33:47,440:  pid=16634 ERROR Shutdown of 'inst9' unsuccessful, forcing
2009-10-21 23:33:47,440:  pid=16634 INFO Called for inst9 w. True/False

The “Called…” lines are logs from the hypervisor shutdown function. This
means, of course, that at restart time the filesystem needs recovery:

[12775866.644682] EXT3-fs: INFO: recovery required on readonly filesystem.
[12775866.644689] EXT3-fs: write access will be enabled during recovery.
[12775868.533674] kjournald starting.  Commit interval 5 seconds
[12775868.533697] EXT3-fs: sda1: orphan cleanup on readonly fs
[12775868.551797] EXT3-fs: sda1: 12 orphan inodes deleted
[12775868.551803] EXT3-fs: recovery complete.
[12775868.586275] EXT3-fs: mounted filesystem with ordered data mode.

This patch reverts the broken test and changes the sleep to a fixed
duration of five seconds, since it makes no sense to check that often
for shutdown (and after ~20 seconds the increasing interval reaches a
stable value of about five seconds anyway).
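
The "~20 seconds" figure can be checked from the removed backoff rule
(start at 1 second, multiply by 1.2 while below 5). This is a small
sketch of that arithmetic only; backoff_schedule and its parameter names
are hypothetical and not part of the patch.

```python
def backoff_schedule(initial=1.0, factor=1.2, cap=5.0):
    """Return the sleep intervals used until the interval stops growing."""
    sleeps = []
    sleep_time = initial
    while True:
        sleeps.append(sleep_time)
        if sleep_time >= cap:
            # mirrors "if sleep_time < 5" failing: no further growth
            break
        sleep_time *= factor
    return sleeps


schedule = backoff_schedule()
ramp_up = sum(schedule[:-1])   # time slept before the stable interval is used

print(len(schedule))           # 10 intervals
print(round(schedule[-1], 2))  # 5.16, the stable interval
print(round(ramp_up, 1))       # 20.8 seconds of ramp-up
```

The first nine sleeps add up to about 20.8 seconds, after which every
sleep lasts about 5.16 seconds, so a fixed five-second interval matches
the steady-state behaviour of the old rule.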

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
parent 0cf11e68
@@ -994,10 +994,10 @@ def InstanceShutdown(instance, timeout):
   start = time.time()
   end = start + timeout
-  sleep_time = 1
+  sleep_time = 5
   tried_once = False
-  while not tried_once and time.time() < end:
+  while time.time() < end:
     try:
       hyper.StopInstance(instance, retry=tried_once)
     except errors.HypervisorError, err:
@@ -1006,10 +1006,6 @@ def InstanceShutdown(instance, timeout):
     time.sleep(sleep_time)
     if instance.name not in hyper.ListInstances():
       break
-    if sleep_time < 5:
-      # 1.2 behaves particularly good for our case:
-      # it gives us 10 increasing steps and caps just slightly above 5 seconds
-      sleep_time *= 1.2
   else:
     # the shutdown did not succeed
     logging.error("Shutdown of '%s' unsuccessful, forcing", iname)