1. 13 Jan, 2012 2 commits
      Further fixes to instance policy validation · 57dc299a
      Iustin Pop authored
      
      
      As a follow-up to "Remove extraneous check in policy creation", there
      are more places where we build an ipolicy, and then manually check
      its validity. This is very bad style, as it duplicates the
      verification code across many places.
      
      This patch removes all such explicit checks (except for one in
      cmdlib.py which is correct), and instead does a bit more validation in
      the builder functions or in the actual dedicated verification
      functions. It also fixes cluster init which used the wrong,
      non-completed ipolicy (this was not detected before as we did call
      check on it, but otherwise we ignored it), and fixes a too-strong
      assert (due to the call chain, we first create the ipolicy from
      cmdline params, and only then we fill it).
      
      Finally, it removes an extraneous logging.info which I forgot from
      debugging.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
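The idea of validating inside the builder, rather than at every call site, can be sketched roughly as follows. This is a minimal illustration only: `PolicyError`, `DEFAULT_POLICY`, `verify_policy`, `build_policy`, and the policy keys and values are hypothetical names chosen for the example, not the actual Ganeti code, and the sketch merges the "create" and "fill" steps that the commit message describes as separate.

```python
import copy


class PolicyError(Exception):
    """Raised for an invalid policy (hypothetical error type)."""


# Hypothetical defaults; the real keys and values are defined by Ganeti.
DEFAULT_POLICY = {
    "min": {"memory": 128, "disk": 1024},
    "max": {"memory": 32768, "disk": 1048576},
}


def verify_policy(policy):
    """Single dedicated check: every minimum must not exceed its maximum."""
    for param, lower in policy["min"].items():
        upper = policy["max"][param]
        if lower > upper:
            raise PolicyError("%s: min %d exceeds max %d" % (param, lower, upper))


def build_policy(overrides):
    """Build a complete policy and validate it before returning it."""
    policy = copy.deepcopy(DEFAULT_POLICY)
    for key, values in overrides.items():
        if key not in policy:
            raise PolicyError("unknown policy key: %s" % key)
        unknown = set(values) - set(policy[key])
        if unknown:
            raise PolicyError("unknown parameters: %s" % ", ".join(sorted(unknown)))
        policy[key].update(values)
    # Callers never need to re-check: the builder only returns valid policies.
    verify_policy(policy)
    return policy
```

With this shape, a caller simply uses `build_policy({"min": {"memory": 256}})` and handles `PolicyError`, without repeating the verification logic.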
      Add new disk_templates parameter to instance policy · 2cc673a3
      Iustin Pop authored
      
      
      This is a slightly more complex patch, as it requires changing the
      assumption that all keys in the policy dict point to values that are
      themselves dicts. Right now we introduce an assumption that any
      non-dicts are lists; we'll see in the future if this holds or whether
      we need more complex type checking (manual, yay Python).
      
      The patch also does some trivial style changes.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
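The "dict or list" assumption described above could look something like the sketch below when filling a partial policy from defaults. The function name `fill_policy` and the sample keys and values are assumptions made for illustration; only the `disk_templates` parameter name comes from the commit itself.

```python
def fill_policy(defaults, custom):
    """Return a full policy: dict values are merged, list values are replaced.

    Assumes, as the commit message does, that any value that is not a dict
    is a list.
    """
    filled = {}
    for key, default_value in defaults.items():
        if isinstance(default_value, dict):
            # Dict-valued keys: start from the defaults, overlay custom entries.
            merged = dict(default_value)
            merged.update(custom.get(key, {}))
            filled[key] = merged
        else:
            # List-valued keys: a custom list replaces the default wholesale.
            filled[key] = list(custom.get(key, default_value))
    return filled


# Illustrative data only.
defaults = {"min": {"memory": 128}, "disk_templates": ["plain", "drbd"]}
custom = {"disk_templates": ["plain"]}
print(fill_policy(defaults, custom))
```

Replacing a list wholesale, rather than merging it, is one possible design choice; the commit message does not say which behaviour the real code uses.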
  2. 16 Dec, 2011 1 commit
  3. 08 Dec, 2011 1 commit
  4. 01 Dec, 2011 1 commit
  5. 22 Nov, 2011 1 commit
  6. 14 Nov, 2011 2 commits
  7. 03 Nov, 2011 3 commits
  8. 02 Nov, 2011 2 commits
  9. 26 Oct, 2011 1 commit
  10. 18 Oct, 2011 2 commits
  11. 05 Oct, 2011 2 commits
  12. 30 Sep, 2011 2 commits
  13. 20 Sep, 2011 2 commits
  14. 30 Aug, 2011 1 commit
  15. 25 Aug, 2011 1 commit
  16. 23 Jun, 2011 1 commit
      remove bootstrap._InitSharedFileStorage · 0376655e
      Guido Trotter authored
      
      
      This function is a copy of bootstrap._InitFileStorage with the following
      differences:
        - check constants.ENABLE_SHARED_FILE_STORAGE and not
          constants.ENABLE_FILE_STORAGE
        - use different local variable names
        - one different error string
      
      Thus:
        - move the constant check outside of the function call
        - change error string so it's clear where the error is
        - call the same function twice
      Signed-off-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
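The refactoring described above (one helper, called twice, with the feature check moved to the caller) might be sketched as follows. All names here (`StorageError`, `init_storage_dir`, `init_cluster_storage`, and the boolean flags standing in for the `constants.ENABLE_*` values) are hypothetical and do not reproduce the actual `bootstrap` module.

```python
import os


class StorageError(Exception):
    """Raised when a storage directory cannot be used (hypothetical)."""


def init_storage_dir(path, description):
    """Validate and create one storage directory.

    `description` goes into error messages so it is clear which
    directory the error refers to.
    """
    if not os.path.isabs(path):
        raise StorageError("The %s directory must be an absolute path, got '%s'"
                           % (description, path))
    if not os.path.exists(path):
        os.makedirs(path, 0o750)
    elif not os.path.isdir(path):
        raise StorageError("The %s path '%s' is not a directory"
                           % (description, path))
    return os.path.normpath(path)


def init_cluster_storage(file_dir, shared_dir, enable_file, enable_shared):
    """Call the same helper twice; the enable checks live in the caller."""
    file_result = init_storage_dir(file_dir, "file storage") if enable_file else ""
    shared_result = (init_storage_dir(shared_dir, "shared file storage")
                     if enable_shared else "")
    return file_result, shared_result
```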
  17. 31 Mar, 2011 1 commit
  18. 01 Mar, 2011 2 commits
  19. 28 Feb, 2011 1 commit
  20. 26 Jan, 2011 2 commits
  21. 04 Jan, 2011 1 commit
  22. 08 Dec, 2010 1 commit
  23. 01 Dec, 2010 1 commit
  24. 29 Nov, 2010 3 commits
  25. 22 Oct, 2010 2 commits
      ConfigWriter: prevent using a foreign config · eb180fe2
      Iustin Pop authored
      
      
      If the configuration file doesn't denote this node as master, we prevent
      startup. This would have detected our previous race condition more
      easily, hence we add it as a permanent check.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
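A startup guard of this kind could be as small as the sketch below. The `ConfigurationError` type, the function name, and the `["cluster"]["master_node"]` layout of the loaded configuration are assumptions for the example, not the real `ConfigWriter` interface.

```python
class ConfigurationError(Exception):
    """Raised when the configuration must not be used (hypothetical)."""


def check_config_is_ours(config_data, my_hostname):
    """Refuse to continue if the config names another node as master."""
    # Assumed layout: the master's name is stored under cluster/master_node.
    master = config_data["cluster"]["master_node"]
    if master != my_hostname:
        raise ConfigurationError(
            "Configuration lists '%s' as master, but this node is '%s'"
            % (master, my_hostname))
    return config_data
```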
      Fix bootstrap.MasterFailover race with watcher · 21004460
      Iustin Pop authored
      
      
      This fixes a recently diagnosed race condition between master failover
      and the watcher.
      
      Currently, the master failover first stops the master daemon, checks
      that the IP is no longer reachable, and then distributes the updated
      configuration. Between the stop and the distribution, it can happen that
      the watcher starts the master daemon on the old node again, since ssconf
      still points the master to it (and all nodes vote so).
      
      In even weirder cases, the master daemon starts and, before it manages
      to open the configuration file, the file is updated, which means the
      master will respond to QueryClusterInfo with another node as the real
      master.
      
      This patch reorders the actions during master failover:
      
      - first, we redistribute a fixed config; this means the old master will
        refuse to update its own config file and ssconf, and that most jobs
        that change state will fail to finish
      - we then immediately kill it; after this step, the watcher will be
        unable to start it, since the master will refuse startup
      - and only then we check for IP reachability, etc.
      
      I've tested the new version against a concurrent launch of the watcher;
      while my tests are not very exhaustive, two things can happen: the
      watcher sees the daemons as dead and tries to restart them, which also
      fails; or it simply gets an error while reading from the master daemon.
      Both of these should be OK.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
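Why the new ordering closes the race can be seen in a toy model like the one below. It is only a simulation under simplifying assumptions: the `Node` class, its attributes, and the `failover` function are invented for illustration, and the IP reachability check is left as a comment.

```python
class Node:
    """Toy node: holds its copy of the config and a daemon flag."""

    def __init__(self, name, master_name):
        self.name = name
        self.config_master = master_name  # master named in this node's config
        self.daemon_running = False

    def start_master_daemon(self):
        """Start only if this node's config names it as master."""
        if self.config_master != self.name:
            return False  # startup refused: the config belongs to another master
        self.daemon_running = True
        return True


def failover(nodes, old_master, new_master):
    """Reordered failover: distribute the config first, then stop the old master."""
    # 1. Redistribute the fixed configuration to every node.
    for node in nodes:
        node.config_master = new_master.name
    # 2. Immediately stop the old master daemon.
    old_master.daemon_running = False
    # 3. Only now would IP reachability and similar checks run (omitted here).
    return new_master.start_master_daemon()


old = Node("node1", "node1")
new = Node("node2", "node1")
old.daemon_running = True
failover([old, new], old, new)
# A watcher run on the old node after step 1 can no longer revive the daemon.
print(old.start_master_daemon())
```

In the old ordering (stop first, distribute later) the same watcher call would have succeeded in the window between the two steps, which is the race the commit describes.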
  26. 12 Oct, 2010 1 commit