  1. Feb 09, 2010
    • Add an early release lock/storage for disk replace · 7ea7bcf6
      Iustin Pop authored
      
      This patch adds an early_release parameter in the OpReplaceDisks and
      OpEvacuateNode opcodes, allowing earlier release of storage and more
      importantly of internal Ganeti locks.
      
      The behaviour of the early release is that any locks and storage on all
      secondary nodes are released early. This is valid for change secondary
      (where we remove the storage on the old secondary, and release the locks
      on the old and new secondary) and replace on secondary (where we remove
      the old storage and release the lock on the secondary node).
      
      Using this, on a three node setup:
      
      - instance1 on nodes A:B
      - instance2 on nodes C:B
      
      It is possible to run in parallel a replace-disks -s (on secondary) for
      instances 1 and 2.
      
      Replace on primary will remove the storage, but not the locks, as we use
      the primary node later in the LU to check consistency.
      
      It is debatable whether to also release the locks on the primary node,
      so that replace-disks keeps zero locks during the sync. While this
      would allow greatly enhanced parallelism, let's first see how the
      release of secondary locks works.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      7ea7bcf6
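
The early-release rules described in the commit above can be sketched as a small model. This is only an illustration under stated assumptions: the mode constants and both function names are hypothetical and are not Ganeti's actual identifiers, and the model tracks node locks only, not storage.

```python
# Hypothetical mode names for illustration; not Ganeti's actual constants.
REPLACE_ON_PRIMARY = "replace-on-primary"
REPLACE_ON_SECONDARY = "replace-on-secondary"
CHANGE_SECONDARY = "change-secondary"


def early_released_locks(mode, secondary, new_secondary=None):
    """Return the node locks that early release drops, per the commit message."""
    if mode == REPLACE_ON_PRIMARY:
        # The primary node is still needed later for the consistency check.
        return set()
    if mode == REPLACE_ON_SECONDARY:
        return {secondary}
    if mode == CHANGE_SECONDARY:
        if new_secondary is None:
            raise ValueError("change-secondary needs a new secondary node")
        return {secondary, new_secondary}
    raise ValueError("unknown mode: %r" % (mode,))


def locks_held_during_sync(mode, primary, secondary,
                           new_secondary=None, early_release=False):
    """Return the node locks still held while the disks are syncing."""
    held = {primary, secondary}
    if new_secondary is not None:
        held.add(new_secondary)
    if early_release:
        held -= early_released_locks(mode, secondary, new_secondary)
    return held


# Three-node setup from the commit message: instance1 on A:B, instance2 on C:B.
inst1 = locks_held_during_sync(REPLACE_ON_SECONDARY, "A", "B", early_release=True)
inst2 = locks_held_during_sync(REPLACE_ON_SECONDARY, "C", "B", early_release=True)
print(inst1.isdisjoint(inst2))  # True: the two syncs no longer contend for node B
```

Without `early_release`, both instances would keep the lock on node B for the whole sync, which is what serialises the two operations.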
  2. Jan 04, 2010
  3. Dec 16, 2009
  4. Nov 25, 2009
  5. Oct 16, 2009
  6. Oct 07, 2009
  7. Oct 05, 2009
  8. Sep 29, 2009
  9. Sep 17, 2009
  10. Aug 28, 2009
  11. Aug 24, 2009
  12. Aug 17, 2009
  13. Jul 24, 2009
  14. Jul 21, 2009
  15. Jul 19, 2009
  16. Jul 07, 2009
  17. Jun 26, 2009
  18. Jun 08, 2009
  19. Mar 06, 2009
    • Fix serial_no field on instances · 6f285030
      Iustin Pop authored
      The instance objects did not get a serial_no field. This patch adds a
      new constant for the field name and uses it for all three cases
      (cluster, nodes, instances).
      
      Reviewed-by: imsnah
      6f285030
  20. Mar 04, 2009
    • Complete the cfgupgrade script for 2.0 migrations · ac4d25b6
      Iustin Pop authored
      This patch makes the cfgupgrade script handle:
        - instance changes
        - disk changes
        - further cluster fixes
      It also adds configuration checks at the end, in non-dry-run mode.
      
      Reviewed-by: ultrotter
      ac4d25b6
    • First run at cfgupgrade for 2.0 upgrades · a421fdeb
      Iustin Pop authored
      This patch makes cfgupgrade work on an empty cluster (i.e. no instances),
      up to the point that the config file can be converted from 1.2 to 2.0.
      This is not yet complete, though.
      
      Reviewed-by: ultrotter
      a421fdeb
  21. Feb 16, 2009
    • Burnin: fix rename · 2e39ab98
      Iustin Pop authored
      In rename, we must stop different names in the first and second phases,
      so we create two different opcodes for this purpose (instead of using
      the same one twice, which doesn't work).
      
      Reviewed-by: imsnah
      2e39ab98
  22. Feb 10, 2009
  23. Feb 04, 2009
    • Implement lockless query operations · ec79568d
      Iustin Pop authored
      This patch adds the framework for, and enables, lockless OpQueryInstances. This
      means that instances may be shown in ERROR_up or ERROR_down state, even though
      this is not an error (but just an in-progress job).
      
      The framework is implemented as follows:
        - the OpQueryInstances, OpQueryNodes and OpQueryExports opcodes take
          an additional “use_locking” flag which will denote whether to lock
          or not; this patch only implements this for LUQueryInstances
        - the luxi query functions take an additional argument use_locking
          which is passed to the master daemon, and then passed to the above
          opcodes
        - cli.py exports a new SYNC_OPT command line option which sets
          this flag to true
        - except for gnt-instance list, which uses this option, and for
          name-only queries (e.g. QueryNodes(fields=["names"])), all other
          callers set this flag to True
        - RAPI also sets the flag to True
      
      The patch was tested with a continuous (0.2s sleep in-between)
      gnt-instance list during a burnin, and no problems were observed.
      
      Reviewed-by: ultrotter
      ec79568d
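
To illustrate why a lockless query can report ERROR_up or ERROR_down for a healthy instance, here is a minimal sketch. It assumes a hypothetical startup job that first records the desired state and only then starts the instance. The `Instance` class, `startup_steps` and `instance_status` are illustrative names, not Ganeti code; apart from ERROR_up and ERROR_down, the status strings are assumptions.

```python
def instance_status(admin_up, running):
    """Derive a display status from the configured and the observed state."""
    if admin_up:
        return "running" if running else "ERROR_down"
    return "ERROR_up" if running else "ADMIN_down"


class Instance:
    """Hypothetical instance record with configured and observed state."""

    def __init__(self):
        self.admin_up = False  # what the configuration says
        self.running = False   # what the hypervisor reports


def startup_steps(inst):
    """Hypothetical startup job, pausing after each step."""
    inst.admin_up = True   # step 1: record the desired state
    yield "config updated"
    inst.running = True    # step 2: actually start the instance
    yield "instance started"


inst = Instance()
job = startup_steps(inst)

next(job)
# A lockless query does not wait for the job, so it can run between steps.
mid_job = instance_status(inst.admin_up, inst.running)

next(job)
after_job = instance_status(inst.admin_up, inst.running)

print(mid_job, after_job)  # ERROR_down running
```

A query that takes the instance lock would instead block until the job finished and only ever observe the final state.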
  24. Feb 03, 2009
    • lvmstrap: allow removable devices too · d1687c6f
      Iustin Pop authored
      For testing or just in case a device is exported by a bad driver with
      the 'removable' flag set, this patch adds a flag to lvmstrap that allows
      it to use these devices too.
      
      Reviewed-by: ultrotter
      d1687c6f
  25. Jan 23, 2009
    • Make iallocator work with offline nodes · 1325da74
      Iustin Pop authored
      This patch changes the iallocator framework to work with offline nodes
      and to export them properly to plugins. It does this by exporting only
      the static configuration data for those nodes, and not attempting to
      parse the runtime data.
      
      The patch also fixes bugs in iallocator related to the RpcResult
      conversion, changes the should_run to admin_up attribute name (as per
      the internals change), and adds “-I” as a short option for
      “--iallocator” in gnt-instance, gnt-backup and burnin.
      
      Reviewed-by: ultrotter
      1325da74
    • Rework the execution model in burnin · c723c163
      Iustin Pop authored
      This patch changes (significantly) the execution model in burnin:
        - for all runs, (almost) all instance mods in a single Burn* procedure
          are done as part of a job; so for example add disk, stop, remove
          disk, start are no longer done as separate jobs but as a single job
          consisting of four opcodes
        - for parallel runs, all Burn* procedures except the rename (which
          uses a single target name) run in parallel; before, only the
          creation was done in parallel
        - due to the single-job execution and also parallel execution, the
          logging messages no longer happen synchronously with the
          execution, so they are informational rather than an actual
          execution log
      
      The end result is that burnin now tests properly multi-opcode jobs and
      also tests all opcodes (except rename) for parallel execution.
      
      Note: On a test cluster, parallelization reduces burnin time from 23m to
      15m.
      
      Reviewed-by: ultrotter
      c723c163
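
The change from one job per operation to one multi-opcode job can be sketched as follows. This is a simplified model, assuming a job is just a list of opcodes and an opcode is a (name, instance) tuple; `build_jobs` and the opcode names are hypothetical and are not burnin's actual API.

```python
def build_jobs(instance, single_job=True):
    """Group the add-disk/stop/remove-disk/start sequence into jobs.

    With single_job=True (the new model) the four opcodes form one job;
    with single_job=False (the old model) each opcode is its own job.
    """
    opcodes = [
        ("add_disk", instance),
        ("stop", instance),
        ("remove_disk", instance),
        ("start", instance),
    ]
    if single_job:
        return [opcodes]
    return [[op] for op in opcodes]


new_model = build_jobs("instance1")
old_model = build_jobs("instance1", single_job=False)
print(len(new_model), len(old_model))  # 1 4
```

In the new model, submitting one such job per instance is what lets whole Burn* procedures run in parallel across instances.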
  26. Jan 20, 2009
    • Fix burnin problems when using http checks · 5dc626fd
      Iustin Pop authored
      The urllib2 module has very bad error handling. This patch switches to urllib,
      which is simpler, and derives a custom class from FancyURLopener. With this
      patch, burnin no longer keeps sockets in CLOSE_WAIT state.
      
      Reviewed-by: ultrotter
      5dc626fd