1. 26 Jan, 2011 1 commit
  2. 22 Oct, 2010 2 commits
    • ConfigWriter: prevent using a foreign config · eb180fe2
      Iustin Pop authored
      If the configuration file doesn't denote this node as master, we prevent
      startup. This would have detected our previous race condition more
      easily, hence we add it as a permanent check.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
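The startup check described in this commit can be pictured with a minimal sketch. The names `ConfigurationError` and `verify_master_ownership` are hypothetical and are not taken from the Ganeti source; the sketch only assumes that the configuration records a master hostname that can be compared with the local node's name.

```python
class ConfigurationError(Exception):
    """Raised when the configuration cannot be used on this node."""


def verify_master_ownership(config_master, my_hostname):
    """Refuse to continue if the config names another node as master.

    Both arguments are plain hostname strings. This helper is hypothetical
    and only illustrates the kind of check the commit message describes.
    """
    if config_master != my_hostname:
        raise ConfigurationError(
            "configuration names %r as master, but this node is %r; "
            "refusing to start" % (config_master, my_hostname))
    return True
```

A daemon that runs such a check before serving requests fails early on a node that has been demoted, which is the situation the race condition in the related commit produced.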
    • Fix bootstrap.MasterFailover race with watcher · 21004460
      Iustin Pop authored
      This fixes a recently diagnosed race condition between master failover
      and the watcher.
      Currently, the master failover first stops the master daemon, checks
      that the IP is no longer reachable, and then distributes the updated
      configuration. Between the stop and the distribution, it can happen that
      the watcher starts the master daemon on the old node again, since ssconf
      still points the master to it (and all nodes vote so).
      In even stranger cases, the master daemon starts but, before it manages
      to open the configuration file, the file is updated, which means the
      master will respond to QueryClusterInfo with another node as the real master.
      This patch reorders the actions during master failover:
      - first, we redistribute a fixed config; this means the old master will
        refuse to update its own config file and ssconf, and that most jobs
        that change state will fail to finish
      - we then immediately kill it; after this step, the watcher will be
        unable to start it, since the master will refuse startup
      - and only then we check for IP reachability, etc.
      I've tested the new version against a concurrent launch of the watcher;
      while my tests are not exhaustive, one of two things can happen: the
      watcher sees the daemons as dead and tries to restart them, which also
      fails; or it simply gets an error while reading from the master daemon.
      Both outcomes should be OK.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
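The reordering in this commit can be illustrated with a toy model. Everything below (`ToyCluster`, `failover`, the `watcher` callback) is hypothetical and greatly simplified; it only assumes that a master daemon refuses to start when the configuration names a different node, as the commit message states. The final start on the new node is part of the toy model, not a claim about the real function.

```python
class ToyCluster:
    """Tiny stand-in for a cluster: one config value and one daemon slot."""

    def __init__(self, master):
        self.config_master = master  # what the distributed config says
        self.running_on = master     # node running the master daemon, or None

    def start_master(self, node):
        """Start the daemon on node unless the config names another master."""
        if self.config_master != node:
            return False
        self.running_on = node
        return True

    def stop_master(self):
        self.running_on = None


def failover(cluster, new_master, watcher=None):
    """Apply the failover steps in the order the commit describes."""
    old_master = cluster.running_on
    # 1. Redistribute the fixed configuration first.
    cluster.config_master = new_master
    # 2. Then stop the old master daemon.
    cluster.stop_master()
    # A watcher firing at this point can no longer revive the old master.
    if watcher is not None:
        watcher(cluster, old_master)
    # 3. Only now would IP reachability checks and later steps run;
    #    the toy model simply starts the daemon on the new node.
    return cluster.start_master(new_master)
```

With the old ordering (stop first, distribute later), a watcher call between the two steps would succeed in restarting the daemon on the old node, because the configuration would still name it as master.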
  3. 12 Oct, 2010 1 commit
  4. 15 Sep, 2010 2 commits
  5. 25 Aug, 2010 1 commit
  6. 23 Aug, 2010 1 commit
  7. 20 Aug, 2010 2 commits
  8. 19 Aug, 2010 2 commits
  9. 18 Aug, 2010 4 commits
  10. 21 Jul, 2010 1 commit
  11. 09 Jul, 2010 1 commit
  12. 08 Jul, 2010 1 commit
  13. 07 Jul, 2010 1 commit
  14. 06 Jul, 2010 2 commits
  15. 30 Jun, 2010 1 commit
  16. 09 Jun, 2010 1 commit
  17. 16 Apr, 2010 1 commit
  18. 15 Apr, 2010 2 commits
    • Fix cluster behaviour with disabled file storage · 0e3baaf3
      Iustin Pop authored
      There are a few issues with disabled file storage:
      - cluster initialization is broken by default, as it uses the 'no'
        setting which is not a valid path
      - some other parts of the code require the file storage dir to be a
        valid path; we work around this by skipping such code paths when it is
        disabled
      A side effect is that we abstract the storage type checks into a
      separate function and add validation in RepairNodeStorage (previously a
      luxi client which didn't use cli.py and submitted an invalid type would
      get "storage units of type 'foo' can not be repaired").
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
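A separate storage type check of the kind this commit mentions might look roughly like the sketch below. The function name, the `PrereqError` class, the set of type names, and the `file_storage_enabled` flag are all assumptions made for illustration; they are not copied from the Ganeti code.

```python
# Illustrative set of storage type names; treat these as assumptions.
VALID_STORAGE_TYPES = frozenset(["file", "lvm-pv", "lvm-vg"])


class PrereqError(Exception):
    """Stand-in for an error raised while validating a request."""


def check_storage_type(storage_type, file_storage_enabled=True):
    """Validate a storage type in one place, before any work is done."""
    if storage_type not in VALID_STORAGE_TYPES:
        raise PrereqError("Unknown storage type: %s" % storage_type)
    if storage_type == "file" and not file_storage_enabled:
        raise PrereqError("File storage is disabled on this cluster")
    return storage_type
```

Calling one shared function from every entry point means a client that bypasses the command-line layer gets the same early validation error rather than a later, less specific one.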
    • Fix cfgupgrade with non-default DATA_DIR · aeefe835
      Iustin Pop authored
      Commit 43575108 added bootstrap.GenerateclusterCrypto and commit
       changed cfgupgrade to use it. However, this lost the
      functionality of upgrading in non-default DATA_DIR.
      To fix this, we enhance bootstrap.GenerateclusterCrypto to accept custom
      file paths for the three files it modifies. If more files will be needed
      in the future, we could just pass in modified DATA_DIR, but for now it
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
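The approach of accepting custom file paths with defaults under the data directory can be sketched as follows. The function name, the three file names, and the default directory are placeholders invented for this example; the commit message does not say which files are involved, and the random bytes written here merely stand in for real key material.

```python
import os

DEFAULT_DATA_DIR = "/var/lib/example"  # placeholder, not the real default


def _default(name):
    """Build a default path below the placeholder data directory."""
    return os.path.join(DEFAULT_DATA_DIR, name)


def generate_crypto_files(file_a=_default("a.pem"),
                          file_b=_default("b.pem"),
                          file_c=_default("c.key")):
    """Write fresh random material to three files and return their paths.

    Callers working on a non-default data directory pass explicit paths;
    everyone else gets the defaults.
    """
    paths = [file_a, file_b, file_c]
    for path in paths:
        with open(path, "wb") as handle:
            handle.write(os.urandom(32))
    return paths
```

An upgrade tool operating on a copy of the data directory would call the function with paths inside that copy, leaving the live files untouched.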
  19. 08 Apr, 2010 1 commit
  20. 17 Mar, 2010 1 commit
  21. 15 Mar, 2010 2 commits
  22. 12 Mar, 2010 1 commit
  23. 08 Mar, 2010 1 commit
  24. 19 Feb, 2010 1 commit
  25. 01 Feb, 2010 1 commit
  26. 13 Jan, 2010 1 commit
  27. 05 Nov, 2009 1 commit
    • Add new “daemon-util” script to start/stop Ganeti daemons · f154a7a3
      Michael Hanselmann authored
      Until now, Ganeti started and stopped its own daemons using custom functions.
      To start a daemon, it was simply executed; to stop it again, it was sent the
      appropriate signals. Init scripts had to pay attention to the PID file and
      other things.
      With this patch, a new script is added (“daemon-util”, installed in
      $prefix/lib/ganeti/), centralizing the starting and stopping of daemons. The
      provided example init script is adjusted to use this new script. Ganeti's code
      no longer calls its own init script.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  28. 04 Nov, 2009 1 commit
    • Introduce a wrapper for hostname resolving · 104f4ca1
      Iustin Pop authored
      Currently, the CheckPrereq methods of a few LUs use utils.HostInfo, which
      raises a resolver error in case of failure. This deviates from the
      convention that CheckPrereq should raise an OpPrereqError if the error
      occurs in the 'pre' phase (so that it can be retried).
      This patch adds a new error code (resolver_error) and a wrapper over
      utils.HostInfo that just converts the ResolverError into
      OpPrereqError(…, errors.ECODE_RESOLVER). It then uses this wrapper in
      cmdlib, bootstrap and some scripts.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
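The wrapper pattern described in this commit, converting a resolver failure into a prerequisite error with a dedicated error code, can be sketched as below. The classes and functions here are simplified stand-ins written for this example, not the real `utils.HostInfo` or `errors` module; only the error code string `resolver_error` comes from the commit message.

```python
import socket

ECODE_RESOLVER = "resolver_error"  # error code name given in the commit message


class ResolverError(Exception):
    """Stand-in for the low-level name resolution failure."""


class OpPrereqError(Exception):
    """Stand-in for a prerequisite error that carries an error code."""

    def __init__(self, message, code):
        super().__init__(message)
        self.code = code


def host_info(name):
    """Resolve a hostname, raising ResolverError on failure."""
    try:
        return socket.gethostbyname(name)
    except OSError as err:  # socket.gaierror and socket.herror derive from OSError
        raise ResolverError(str(err))


def get_host_info(name, lookup=host_info):
    """Wrapper that reports resolver failures as prerequisite errors."""
    try:
        return lookup(name)
    except ResolverError as err:
        raise OpPrereqError(
            "The given name (%s) does not resolve: %s" % (name, err),
            ECODE_RESOLVER)
```

The `lookup` parameter is an addition for this sketch so the wrapper can be exercised without network access; callers would normally rely on the default.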
  29. 03 Nov, 2009 2 commits