1. 22 Oct, 2009 4 commits
    • Try to reduce wrong errors in InstanceShutdown · 3782acd7
      Iustin Pop authored
      
      
      In backend.InstanceShutdown(), there is a race condition between
      checking that the instance exists and trying to shut it down, which
      sometimes results in error messages like:
      
      Tue Oct 20 20:08:30 2009 - WARNING: Could not shutdown instance: Failed
      to force stop instance instance9: Failed to stop instance instance9:
      exited with exit code 1, Error: Domain 'instance9' does not exist.
      
      To fix this, we ignore any hypervisor StopInstance() errors if the
      instance doesn't exist anymore, since our purpose (to make the instance
      go away) is already accomplished.

      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
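
The fix described in this commit can be sketched roughly as follows. This is a minimal illustration and not the actual patch: `FakeHypervisor`, `HypervisorError`, and `stop_ignoring_vanished` are hypothetical stand-ins, and the method signatures are simplified assumptions.

```python
class HypervisorError(Exception):
    """Stand-in for the hypervisor error type (hypothetical here)."""


class FakeHypervisor:
    """Toy hypervisor used only to illustrate the race; not Ganeti code."""

    def __init__(self, running):
        self.running = set(running)

    def ListInstances(self):
        return sorted(self.running)

    def StopInstance(self, name):
        if name not in self.running:
            raise HypervisorError("Domain '%s' does not exist." % name)
        self.running.discard(name)


def stop_ignoring_vanished(hyper, name):
    """Stop an instance, tolerating the case where it is already gone.

    Returns True if the instance is no longer running afterwards.
    """
    try:
        hyper.StopInstance(name)
    except HypervisorError:
        # The instance may have exited between the existence check and
        # the stop call. If it is gone, the goal is already met, so the
        # error is only re-raised when the instance is still listed.
        if name in hyper.ListInstances():
            raise
    return name not in hyper.ListInstances()
```

The key design point is that the error is swallowed only after re-checking the instance list, so genuine stop failures still propagate.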
    • Revert breakage introduced in e4e9b806 · 7734de0a
      Iustin Pop authored
      Commit e4e9b806 introduced two problems in
      backend.InstanceShutdown():
      
      - first, it reduced the check interval significantly (especially for the
        first few checks); very few production VMs shut down in one second,
        and while this does not break anything, it creates unnecessary
        load on the hypervisor
      - second, a wrong test added to the while condition (“not tried_once”)
        means that we only sleep once for an instance, and after that we
        immediately kill it forcefully
      
      Together, these two problems mean that any instance which is not lucky
      enough to finish in roughly 1-1.5 seconds (the time it takes to sleep
      and re-check the instance list) will have this happen:
      
      2009-10-21 23:33:46,034:  pid=16634 INFO Called for inst9 w. False/False
      2009-10-21 23:33:47,440:  pid=16634 ERROR Shutdown of 'inst9' unsuccessful, forcing
      2009-10-21 23:33:47,440:  pid=16634 INFO Called for inst9 w. True/False
      
      The “Called…” lines are logs from the hypervisor shutdown function. This
      of course means that at restart time:
      
      [12775866.644682] EXT3-fs: INFO: recovery required on readonly filesystem.
      [12775866.644689] EXT3-fs: write access will be enabled during recovery.
      [12775868.533674] kjournald starting.  Commit interval 5 seconds
      [12775868.533697] EXT3-fs: sda1: orphan cleanup on readonly fs
      [12775868.551797] EXT3-fs: sda1: 12 orphan inodes deleted
      [12775868.551803] EXT3-fs: recovery complete.
      [12775868.586275] EXT3-fs: mounted filesystem with ordered data mode.
      
      This patch reverts the broken test and changes the sleep to a fixed
      duration of five seconds, since it makes no sense to check that often
      for shutdown (and after ~20 seconds we reach a stable value of five
      seconds anyway).

      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
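
The intended behaviour after this revert (request a clean stop, poll at a fixed interval, and force only once the timeout expires) might look roughly like the sketch below. The function name, the callbacks, and the 120-second default timeout are assumptions made for illustration; only the five-second interval comes from the commit message.

```python
import time


def shutdown_with_timeout(is_running, request_stop, force_stop,
                          timeout=120.0, interval=5.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Ask for a clean stop, poll at a fixed interval, force on timeout.

    Returns "clean" if the instance went away by itself, or "forced"
    if force_stop() had to be called. All names are hypothetical.
    """
    deadline = clock() + timeout
    request_stop()
    while is_running():
        if clock() >= deadline:
            # Only now do we give up on a clean shutdown.
            force_stop()
            return "forced"
        # Fixed sleep: no short initial checks, no "sleep only once" test.
        sleep(interval)
    return "clean"
```

Passing `sleep` and `clock` as parameters is just a convenience so the loop can be exercised without real delays.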
    • Xen: Ignore the retry argument in stop instance · 0cf11e68
      Iustin Pop authored
      Commit 4ad45119 changed the KVM hypervisor to send multiple shutdown
      requests to the monitor, but it didn't change this for the Xen
      hypervisor. We simply remove the return-on-retry model, since we do
      want to send multiple shutdown signals for both Xen and KVM (even if
      the behaviour is not perfect, they should behave the same).

      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
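
As a rough illustration of what ignoring the retry argument means, the sketch below builds the same stop request whether or not the call is a retry. The helper name `xen_stop_command` is hypothetical, and the use of `xm shutdown` / `xm destroy` is an assumption about the toolstack, not a quote from the patch.

```python
def xen_stop_command(name, force=False, retry=False):
    """Build the command for one stop request (hypothetical helper).

    `retry` is accepted for interface compatibility but deliberately
    ignored, so a repeated call sends the same request again instead
    of returning early.
    """
    if force:
        return ["xm", "destroy", name]
    return ["xm", "shutdown", name]
```

With this shape, a caller that polls and retries will re-send the shutdown request on every attempt, which matches the behaviour described for KVM.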
  2. 21 Oct, 2009 1 commit
  3. 20 Oct, 2009 4 commits
  4. 19 Oct, 2009 1 commit
  5. 16 Oct, 2009 5 commits
  6. 15 Oct, 2009 5 commits
  7. 13 Oct, 2009 15 commits
  8. 12 Oct, 2009 5 commits