  1. 05 Nov, 2009 1 commit
    • Add new “daemon-util” script to start/stop Ganeti daemons · f154a7a3
      Michael Hanselmann authored
      Until now, Ganeti started and stopped its own daemons using custom functions.
      To start a daemon, it simply executed the binary; to stop it again, it sent the
      appropriate signal. Init scripts had to pay attention to the PID file and
      other things.
      With this patch, a new script is added (“daemon-util”, installed in
      $prefix/lib/ganeti/), centralizing the starting and stopping of daemons. The
      provided example init script is adjusted to use this new script. Ganeti's code
      no longer calls its own init script.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  2. 04 Nov, 2009 1 commit
  3. 03 Nov, 2009 4 commits
  4. 02 Nov, 2009 2 commits
  5. 22 Oct, 2009 3 commits
    • Adding '--no-ssh-init' option to 'gnt-cluster init'. · b989b9d9
      Ken Wehr authored
      Allows the initialization of a cluster without the creation or distribution
      of SSH key pairs. Includes changes for LeaveCluster and RPC.
      Signed-off-by: Ken Wehr <ksw@google.com>
      Signed-off-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
    • Try to reduce wrong errors in InstanceShutdown · 3782acd7
      Iustin Pop authored
      In backend.InstanceShutdown(), there is a race condition between
      checking that the instance exists and trying to shut it down, which
      sometimes results in error messages like:
      Tue Oct 20 20:08:30 2009 - WARNING: Could not shutdown instance: Failed
      to force stop instance instance9: Failed to stop instance instance9:
      exited with exit code 1, Error: Domain 'instance9' does not exist.
      To fix this, we ignore any hypervisor StopInstance() errors if the
      instance doesn't exist anymore, since our purpose (to make the instance
      go away) is already accomplished.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
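The idea behind this fix can be illustrated with a minimal sketch. This is not the actual Ganeti code: `FakeHypervisor`, `HypervisorError`, and `shutdown_instance` are hypothetical names invented for the illustration, and the fake hypervisor deliberately simulates the race by making the instance vanish just before the stop call fails.

```python
class HypervisorError(Exception):
    """Stand-in (hypothetical) for the error raised when a stop request fails."""


class FakeHypervisor:
    """Hypothetical in-memory hypervisor, used only for this illustration."""

    def __init__(self, running):
        self._running = set(running)

    def list_instances(self):
        return sorted(self._running)

    def stop_instance(self, name):
        # Simulate the race: the instance disappears on its own right
        # before the stop request reaches the hypervisor.
        self._running.discard(name)
        raise HypervisorError("Domain '%s' does not exist." % name)


def shutdown_instance(hyper, name):
    """Stop an instance, ignoring stop errors if it is already gone."""
    if name not in hyper.list_instances():
        return True  # nothing to do
    try:
        hyper.stop_instance(name)
    except HypervisorError:
        # Re-check the instance list: if the instance no longer exists,
        # the goal (making it go away) is already accomplished.
        if name not in hyper.list_instances():
            return True
        raise
    return True
```

The error is only swallowed after a second look at the instance list, so a stop failure on an instance that is still running would still propagate.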
    • Revert breakage introduced in e4e9b806 · 7734de0a
      Iustin Pop authored
      Commit e4e9b806 introduced two problems in backend.InstanceShutdown():
      - first, it reduced the check interval significantly (especially for the
        first few checks); there are very few production VMs that shut down in
        one second, and while not breaking anything this creates unnecessary
        load for the hypervisor
      - second, a wrong test added to the while condition (“not tried_once”)
        means that we only sleep once for an instance, and after that we
        immediately kill it forcefully
      These two together mean that any instance which is not lucky enough to
      finish in roughly 1-1.5 seconds (the time it takes to sleep and re-verify
      the instance list) will have this happen:
      2009-10-21 23:33:46,034:  pid=16634 INFO Called for inst9 w. False/False
      2009-10-21 23:33:47,440:  pid=16634 ERROR Shutdown of 'inst9' unsuccessful, forcing
      2009-10-21 23:33:47,440:  pid=16634 INFO Called for inst9 w. True/False
      The “Called…” are logs from the hypervisor shutdown function. This means
      of course that at restart time:
      [12775866.644682] EXT3-fs: INFO: recovery required on readonly filesystem.
      [12775866.644689] EXT3-fs: write access will be enabled during recovery.
      [12775868.533674] kjournald starting.  Commit interval 5 seconds
      [12775868.533697] EXT3-fs: sda1: orphan cleanup on readonly fs
      [12775868.551797] EXT3-fs: sda1: 12 orphan inodes deleted
      [12775868.551803] EXT3-fs: recovery complete.
      [12775868.586275] EXT3-fs: mounted filesystem with ordered data mode.
      This patch reverts the broken test and changes the sleep to a fixed
      duration of five seconds, since it makes no sense to check that often
      for shutdown (and after ~20 seconds we reach a stable value of
      five seconds anyway).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
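As a rough sketch of the behaviour this commit restores, the loop below requests a clean shutdown, polls at a fixed five-second interval, and only forces the stop once the timeout has expired. The function name, its parameters, and the timeout value are assumptions made for illustration; they are not taken from backend.InstanceShutdown() itself.

```python
import time


def shutdown_with_timeout(is_running, stop, timeout, interval=5.0,
                          sleep=time.sleep):
    """Ask for a clean shutdown, poll at a fixed interval, then force.

    All names here are hypothetical: `is_running` is a callable that
    returns True while the instance exists, and `stop` is a callable
    that accepts a `force` keyword argument.
    """
    stop(force=False)
    waited = 0.0
    # Keep sleeping for as long as the instance is alive and the timeout
    # has not expired (the broken version only ever slept once).
    while waited < timeout and is_running():
        sleep(interval)
        waited += interval
    if is_running():
        stop(force=True)
        return "forced"
    return "clean"
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in normal use the default `time.sleep` applies.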
  6. 20 Oct, 2009 1 commit
  7. 13 Oct, 2009 1 commit
  8. 12 Oct, 2009 1 commit
  9. 09 Oct, 2009 2 commits
  10. 05 Oct, 2009 7 commits
  11. 25 Sep, 2009 1 commit
  12. 14 Sep, 2009 1 commit
    • Treat virtual LVs as inexistent · 33f2a81a
      Iustin Pop authored
      Currently, “gnt-cluster verify” and “gnt-cluster verify-disks” use the
      list of LVs as returned by backend.GetVolumeList to determine whether an
      LV exists or not. However, LVs can also be ‘virtual’, which is handled
      correctly (i.e. as missing) by the bdev code, but not by this function.
      This patch changes GetVolumeList to simply skip virtual LVs; this makes
      cluster verify and verify-disks report these correctly as missing. The
      only downside is that a user could get confused (lvs reports the volume
      as existing, but ganeti as missing). However, this is better than simply
      considering virtual LVs as “good”.
      No other code besides these two gnt-cluster operations uses the
      GetVolumeList function, so we don't change the behaviour of the rest of
      the code (e.g. replace-disks, instance info, etc.).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
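The skipping described above might look roughly like the following sketch. It assumes the volume list comes from separator-delimited `lvs` output with the fields VG name, LV name, size, and attributes, and that a virtual LV is recognised by a leading `v` in the attribute string (the LVM volume-type flag). The function name, the field layout, and the sample data are made up for illustration and are not the actual GetVolumeList implementation.

```python
def parse_volume_list(lvs_output, sep="|"):
    """Map "vg/lv" to (size, attributes), skipping virtual LVs.

    Hypothetical helper: the four-field layout is an assumption.
    """
    volumes = {}
    for line in lvs_output.splitlines():
        line = line.strip()
        if not line:
            continue
        vg_name, lv_name, size, attr = [f.strip() for f in line.split(sep)]
        if attr.startswith("v"):
            # Virtual LV: treat it as missing, so that verification
            # reports it instead of counting it as a good volume.
            continue
        volumes["%s/%s" % (vg_name, lv_name)] = (float(size), attr)
    return volumes


# Made-up sample data: one regular LV and one virtual LV.
SAMPLE = "xenvg|disk0|1024.00|-wi-ao\nxenvg|ghost|512.00|vwi-a-\n"
RESULT = parse_volume_list(SAMPLE)
```

With this sample, only `xenvg/disk0` ends up in the result, so a caller comparing expected volumes against it would flag `xenvg/ghost` as missing.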
  13. 03 Sep, 2009 1 commit
  14. 24 Aug, 2009 1 commit
  15. 05 Aug, 2009 3 commits
  16. 04 Aug, 2009 1 commit
  17. 29 Jul, 2009 1 commit
  18. 24 Jul, 2009 2 commits
  19. 20 Jul, 2009 2 commits
  20. 19 Jul, 2009 1 commit
  21. 08 Jul, 2009 1 commit
  22. 07 Jul, 2009 2 commits