1. 11 Mar, 2010 2 commits
  2. 08 Mar, 2010 1 commit
  3. 11 Feb, 2010 1 commit
  4. 09 Feb, 2010 1 commit
    • Add an early release lock/storage for disk replace · 7ea7bcf6
      Iustin Pop authored
      This patch adds an early_release parameter in the OpReplaceDisks and
      OpEvacuateNode opcodes, allowing earlier release of storage and more
      importantly of internal Ganeti locks.
      
      The behaviour of the early release is that any locks and storage on all
      secondary nodes are released early. This is valid for change secondary
      (where we remove the storage on the old secondary, and release the locks
      on the old and new secondary) and replace on secondary (where we remove
      the old storage and release the lock on the secondary node).
      
      Using this, on a three node setup:
      
      - instance1 on nodes A:B
      - instance2 on nodes C:B
      
      It is possible to run in parallel a replace-disks -s (on secondary) for
      instances 1 and 2.
      
      Replace on primary will remove the storage, but not the locks, as we use
      the primary node later in the LU to check consistency.
      
      It is debatable whether to also release the locks on the primary node,
      thus making replace-disks hold zero locks during the sync. While
      this would allow greatly enhanced parallelism, let's first see how
      removal of secondary locks works.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      7ea7bcf6
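The three-node example in the commit message above can be illustrated with a toy model. This is only a sketch of the reasoning, not Ganeti's locking code: the function names and mode strings are hypothetical, and it assumes replace on primary keeps all of its node locks, which is one reading of the message.

```python
# Toy model of node locks held during the disk sync phase.
# All names here are hypothetical; this is not Ganeti's locking API.

EARLY_RELEASE_MODES = ("change_secondary", "replace_on_secondary")


def locks_during_sync(primary, secondaries, mode, early_release):
    """Return the set of node locks still held while the disks sync."""
    held = {primary} | set(secondaries)
    if early_release and mode in EARLY_RELEASE_MODES:
        # Secondary node locks are dropped before waiting for the sync.
        held -= set(secondaries)
    return held


def can_sync_in_parallel(held_a, held_b):
    """Two operations can overlap if they hold no node lock in common."""
    return not (held_a & held_b)


# instance1 on nodes A:B, instance2 on nodes C:B
inst1 = locks_during_sync("A", ["B"], "replace_on_secondary", early_release=True)
inst2 = locks_during_sync("C", ["B"], "replace_on_secondary", early_release=True)
print(can_sync_in_parallel(inst1, inst2))
```

Without early release both operations would still hold the lock on node B during the sync, so in this model they could not overlap.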
  5. 04 Jan, 2010 4 commits
  6. 16 Dec, 2009 1 commit
  7. 25 Nov, 2009 1 commit
  8. 07 Oct, 2009 1 commit
  9. 05 Oct, 2009 1 commit
  10. 29 Sep, 2009 1 commit
  11. 17 Sep, 2009 1 commit
  12. 28 Aug, 2009 1 commit
  13. 24 Aug, 2009 3 commits
  14. 17 Aug, 2009 1 commit
  15. 21 Jul, 2009 3 commits
  16. 19 Jul, 2009 1 commit
  17. 16 Feb, 2009 1 commit
    • Burnin: fix rename · 2e39ab98
      Iustin Pop authored
      In rename, the stop operation must target different names in the first
      and second phases, so we create two different opcodes for this purpose
      (instead of using the same one twice, which doesn't work).
      
      Reviewed-by: imsnah
      2e39ab98
  18. 10 Feb, 2009 1 commit
  19. 04 Feb, 2009 1 commit
    • Implement lockless query operations · ec79568d
      Iustin Pop authored
      This patch adds the framework for, and enables, lockless OpQueryInstances.
      This means that instances may be shown in ERROR_up or ERROR_down state even
      though this is not an error (just an in-progress job).
      
      The framework is implemented as follows:
        - the OpQueryInstances, OpQueryNodes and OpQueryExports opcodes take
          an additional “use_locking” flag which will denote whether to lock
          or not; this patch only implements this for LUQueryInstances
        - the luxi query functions take an additional argument use_locking
          which is passed to the master daemon, and then passed to the above
          opcodes
        - cli.py exports a new SYNC_OPT command line option which sets
          this flag to True
        - except for gnt-instance list, which uses this option, and for
          name-only queries (e.g. QueryNodes(fields=["names"])), all other
          callers set this flag to True
        - RAPI also sets the flag to True
      
      The patch was tested with a continuous (0.2s sleep in-between)
      gnt-instance list during a burnin, and no problems were observed.
      
      Reviewed-by: ultrotter
      ec79568d
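The idea of an optional use_locking flag can be sketched as follows. This is a minimal illustration under stated assumptions, not the Ganeti implementation: the class, its methods and the status strings are hypothetical, and a single threading.Lock stands in for Ganeti's lock manager.

```python
import threading


class InstanceStatusTable:
    """Hypothetical status table whose queries can skip locking."""

    def __init__(self):
        self.lock = threading.Lock()
        self._status = {}

    def update(self, name, status):
        # Writers always take the lock.
        with self.lock:
            self._status[name] = status

    def query(self, names, use_locking=True):
        """Return (name, status) pairs, taking the lock only if asked to."""
        if use_locking:
            with self.lock:
                return self._read(names)
        # Lockless path: never blocks behind a running job, but may
        # observe a transient state.
        return self._read(names)

    def _read(self, names):
        return [(name, self._status.get(name, "unknown")) for name in names]
```

A lockless query returns immediately even while a writer holds the lock, which is why it can report an in-progress state as if it were an error.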
  20. 23 Jan, 2009 2 commits
    • Make iallocator work with offline nodes · 1325da74
      Iustin Pop authored
      This patch changes the iallocator framework to work with offline nodes
      and to properly export them to plugins. It does this by only exporting
      the static configuration data for those nodes, and not attempting to
      parse the runtime data.
      
      The patch also fixes bugs in iallocator related to the RpcResult
      conversion, changes the should_run to admin_up attribute name (as per
      the internals change), and adds “-I” as a short option for
      “--iallocator” in gnt-instance, gnt-backup and burnin.
      
      Reviewed-by: ultrotter
      1325da74
    • Rework the execution model in burnin · c723c163
      Iustin Pop authored
      This patch changes (significantly) the execution model in burnin:
        - for all runs, (almost) all instance mods in a single Burn* procedure
          are done as part of a job; so for example add disk, stop, remove
          disk, start are no longer done as separate jobs but as a single job
          consisting of four opcodes
        - for parallel runs, all Burn* procedures except the rename (which
          uses a single target name) run in parallel; before, only the
          creation was done in parallel
        - due to the single-job execution and also parallel execution, the
          log messages no longer happen synchronously with the execution,
          so they are informational rather than an actual execution log
      
      The end result is that burnin now properly tests multi-opcode jobs and
      also tests all opcodes (except rename) for parallel execution.
      
      Note: On a test cluster, parallelization reduces burnin time from 23m to
      15m.
      
      Reviewed-by: ultrotter
      c723c163
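The execution model described above (several opcodes grouped into one job, and jobs for different instances run in parallel) can be sketched like this. It is a simplified illustration only: opcodes are modeled as plain tuples, the function names are hypothetical, and a thread pool stands in for the master daemon's job queue.

```python
from concurrent.futures import ThreadPoolExecutor


def make_disk_job(instance):
    """Build one job of four opcodes instead of four separate jobs."""
    return [
        ("add_disk", instance),
        ("stop", instance),
        ("remove_disk", instance),
        ("start", instance),
    ]


def run_job(job, execute):
    """Opcodes inside a job run strictly in order."""
    return [execute(op, instance) for op, instance in job]


def run_jobs(jobs, execute, parallel):
    """Run jobs one after another, or all at once when parallel is set."""
    if not parallel:
        return [run_job(job, execute) for job in jobs]
    with ThreadPoolExecutor(max_workers=len(jobs) or 1) as pool:
        # pool.map keeps the results in the same order as the jobs.
        return list(pool.map(lambda job: run_job(job, execute), jobs))
```

In this sketch `execute` is whatever callable actually performs an opcode; because the opcodes of different jobs interleave in parallel mode, any log lines it emits would not form a linear execution log.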
  21. 20 Jan, 2009 1 commit
    • Fix burnin problems when using http checks · 5dc626fd
      Iustin Pop authored
      The urllib2 module has very bad error handling. This patch switches to
      urllib, which is simpler, and derives a custom class from FancyURLopener.
      With this patch, burnin no longer keeps sockets in CLOSE_WAIT state.
      
      Reviewed-by: ultrotter
      5dc626fd
  22. 16 Jan, 2009 2 commits
    • burnin: only call self.GrowDisks() if needed · aa089b65
      Iustin Pop authored
      If --disk-grow 0[,0..] is passed, we should not call GrowDisks, as it
      prints confusing log lines.
      
      Reviewed-by: imsnah
      aa089b65
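The guard described above amounts to checking whether any requested growth amount is non-zero. A minimal sketch, assuming the option value is a comma-separated list of integers; both helper names are hypothetical and are not taken from burnin itself:

```python
def parse_disk_grow(value):
    """Parse a value such as "0,128" into a list of growth amounts."""
    return [int(part) for part in value.split(",")]


def should_grow_disks(disk_growth):
    """Only grow disks when at least one disk has a positive amount."""
    return any(amount > 0 for amount in disk_growth)


for option in ("0", "0,0", "0,128"):
    print(option, should_grow_disks(parse_disk_grow(option)))
```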
    • burnin: add option to not remove instances · 320eda24
      Iustin Pop authored
      This patch adds a burnin option to keep instances at the end, so that
      debugging after a burnin failure is easier.
      
      Also, we reorder the command line parsing and client query so that one
      can use ./tools/burnin --help even on non-ganeti machines.
      
      Reviewed-by: ultrotter
      320eda24
  23. 14 Jan, 2009 1 commit
  24. 13 Jan, 2009 4 commits
    • Forward port of the burnin migration · 99bdd139
      Iustin Pop authored
      This is again a copy of the latest 1.2 burnin code related to migration.
      
      Reviewed-by: ultrotter
      99bdd139
    • burnin: redo the output formatting · 836d59d7
      Iustin Pop authored
      Since we added many more tests in burnin, the output became almost
      unreadable. This patch changes the output to an indented one, so that
      the different phases and operations of burnin are more easily
      understood.
      
      Reviewed-by: ultrotter
      836d59d7
    • burnin: move start_stop at the end · eb61f8d3
      Iustin Pop authored
      Traditionally the start/stop test was the last, so move it back to there
      (added as last option in commit 854).
      
      Reviewed-by: amishchenko
      eb61f8d3
    • burnin: introduce instance alive checks · 5178f1bc
      Iustin Pop authored
      This patch adds instance alive checks after most start operations. The
      check is done in a custom way:
        - the instance is expected to have an http server up and running
        - and it should serve the '/hostname.txt' resource containing the
          hostname of the instance
      
      This allows checking that:
        - creation is working OK
        - start after failover (and in the future migrate) is ok
        - rename works correctly
      
      By default, the check is disabled since one needs a custom OS for this
      check.
      
      The patch also fixes a wrong variable name from a previous burnin patch.
      
      Reviewed-by: ultrotter
      5178f1bc
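The alive check described above can be sketched as follows. This is not the code from the patch: it uses Python 3's urllib.request rather than the modules available at the time, the function names and the timeout are assumptions, and the fetch step is passed in as a parameter so the comparison logic can be exercised without a running instance.

```python
import urllib.request


def fetch_hostname(instance, timeout=5.0):
    """Fetch /hostname.txt from the instance's http server."""
    url = "http://%s/hostname.txt" % instance
    # The context manager closes the socket once the body is read.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace").strip()


def instance_is_alive(instance, fetch=fetch_hostname):
    """Alive means the served hostname matches the expected name."""
    try:
        return fetch(instance) == instance
    except OSError:
        # Covers connection errors, timeouts and urllib's URLError.
        return False
```

Comparing the served hostname against the expected name is what lets the same check also confirm that a rename took effect, since a renamed instance must answer with its new name.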
  25. 12 Jan, 2009 1 commit
  26. 09 Jan, 2009 2 commits