1. 05 Feb, 2008 1 commit
    • Reduce the chance of DRBD errors with stale primaries · fdbd668d
      Iustin Pop authored
      This patch is a first step in reducing the chance of causing DRBD
      activation failures when the primary node has imperfect data.
      This issue is seen more often with DRBD8, which has an 'outdate' state
      (which it can enter more often). But it can (and before this patch, usually
      will) happen with both 7 and 8 when the primary has data to sync.
      The error comes from the fact that, before this patch, we activate the
      primary DRBD device and immediately (i.e. as soon as we can run another
      shell command) we try to make it primary. This might fail - since the
      primary knows it has some data to catch up to - but we ignored this
      error condition. The failure became visible later, either as md failing to
      activate over read-only storage or as the instance failing to start.
      The patch has two parts: one affecting bdev.py, which changes failures
      in BlockDev.Open() from returning False to raising
      errors.BlockDeviceError; no one (except a generic method inside bdev.py)
      checked this return value, so we logged the failure but the master didn't
      know about it; now all classes raise errors from Open if they have a failure.
      The other part, affecting cmdlib.py, changes the activation sequence
        - activate on primary node as primary and secondary as secondary, in
          whatever order a function returns the nodes
      to the following:
        - activate all drives as secondaries, on both the primary and the
          secondary nodes of the instance
        - after that, on the primary node, re-activate the device stack as
      This gives DRBD a chance to connect and complete the handshake. As noted
      in the comments, this only increases the chances of a handshake/connect;
      it does not fix the problem entirely. However, it is a
      good first step and it passes all tests of starting with stale (either
      full or partial) primaries, with both drbd 7 and 8, and also passes a
      Note that the patch might make the device activation a little bit
      slower, but it is a reasonable trade-off.
      Reviewed-by: imsnah
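The two-pass activation described in fdbd668d can be sketched roughly as follows. This is a minimal illustration in Python, not the actual cmdlib.py or bdev.py code: `FakeDevice`, `activate_disks`, and the local `BlockDeviceError` class are hypothetical stand-ins, and the sketch only assumes what the commit message states (activate everything as secondary first, then promote on the primary node, with Open raising on failure).

```python
class BlockDeviceError(Exception):
    """Stand-in for errors.BlockDeviceError (assumed name from the message)."""


class FakeDevice:
    """Hypothetical device that records the role it was activated in."""

    def __init__(self, node, can_become_primary=True):
        self.node = node
        self.can_become_primary = can_become_primary
        self.role = None

    def assemble(self, as_primary):
        if as_primary:
            self.open()
        else:
            self.role = "secondary"

    def open(self):
        # A failure raises instead of returning False, so the caller
        # cannot silently ignore it.
        if not self.can_become_primary:
            raise BlockDeviceError("cannot make device on %s primary" % self.node)
        self.role = "primary"


def activate_disks(devices, primary_node):
    """Activate all devices as secondary, then promote on the primary node."""
    # Pass 1: every device comes up as secondary, on all nodes.
    for dev in devices:
        dev.assemble(as_primary=False)
    # Pass 2: only devices on the primary node are re-activated as primary.
    for dev in devices:
        if dev.node == primary_node:
            dev.assemble(as_primary=True)
```

With this ordering, a device that cannot be promoted raises from the second pass instead of failing later when the instance starts.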
  2. 04 Feb, 2008 2 commits
  3. 31 Jan, 2008 2 commits
  4. 30 Jan, 2008 3 commits
  5. 28 Jan, 2008 7 commits
    • tiny typo fix · f2e9e0e8
      Alexander Schreiber authored
      Reviewed-by: iustinp
    • Improve the documentation of query output fields · d8a4b51d
      Iustin Pop authored
      The gnt-node and gnt-instance list commands have a customizable list of
      output fields, but the list is not up to date (in the man page) and not
      easily understandable from the ‘--help’ output.
      This patch updates the man pages and adds the available fields and
      default fields in the ‘--help’ output, as part of the description.
        gnt-node list
      Lists the nodes in the cluster. The available fields are (see the man page for
      details): name, pinst_cnt, pinst_list, sinst_cnt, sinst_list, pip, sip,
      dtotal, dfree, mtotal, mnode, mfree, bootid. The default field list is (in
      order): name, dtotal, dfree, mtotal, mnode, mfree, pinst_cnt, sinst_cnt.
      Reviewed-by: imsnah,ultrotter
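As a rough illustration of how an available-field list and a default-field list interact, here is a small Python sketch. The field names are copied from the help text quoted in d8a4b51d; the `select_fields` function and the constant names are hypothetical and are not taken from the Ganeti sources.

```python
# Field names as quoted in the commit message above.
NODE_FIELDS = [
    "name", "pinst_cnt", "pinst_list", "sinst_cnt", "sinst_list", "pip",
    "sip", "dtotal", "dfree", "mtotal", "mnode", "mfree", "bootid",
]
NODE_DEFAULT_FIELDS = [
    "name", "dtotal", "dfree", "mtotal", "mnode", "mfree",
    "pinst_cnt", "sinst_cnt",
]


def select_fields(requested=None):
    """Return the fields to display (hypothetical helper).

    Falls back to the default list when nothing is requested and
    rejects names that are not in the available list.
    """
    if not requested:
        return list(NODE_DEFAULT_FIELDS)
    unknown = [f for f in requested if f not in NODE_FIELDS]
    if unknown:
        raise ValueError("unknown field(s): %s" % ", ".join(unknown))
    return list(requested)
```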
    • Fix a typo in a devel/upload comment · 4a160927
      Guido Trotter authored
      Files are uploaded to $prefix/sbin, not $prefix/bin
      Reviewed-by: iustinp
    • Add QA tests for gnt-instance modify · c0f74c55
      Iustin Pop authored
      This patch adds QA tests for most of the possible parameters in the
      instance modify operation (exception being the MAC), and modifies the
      sample QA file to run this test.
      It also tests the no-modification case, but that is a weak test: we only
      test that the exit code is one, not that the command gave a proper
      response ("... please give at least one parameter") as opposed to a
      Reviewed-by: imsnah
    • Add option for the number of VCPUs in instance listing · d6d415e8
      Iustin Pop authored
      Reviewed-by: ultrotter
    • Allow selection of hypervisor type in QA · b32f9859
      Iustin Pop authored
      This patch allows the selection of the hypervisor type for the QA
      process; this is useful when testing hypervisor-independent changes that
      don't require a Xen setup.
      The patch also fixes the OS name in the sample QA config file provided.
      Reviewed-by: imsnah
    • Fix "gnt-instance modify --initrd" · 2bc22872
      Iustin Pop authored
      The new QA tests for instance modify uncovered a bug in the modify
      initrd operation when setting the initrd to none.
      Reviewed-by: imsnah
  6. 27 Jan, 2008 1 commit
  7. 25 Jan, 2008 1 commit
  8. 22 Jan, 2008 1 commit
    • Change the install directory for the tools · 909a0e4d
      Iustin Pop authored
      Currently, the tools are installed under $prefix/share/ganeti. This
      prevents installing other things in a nice way under share/ganeti (like
      arch-independent OS definitions), therefore we want the tools to live
      under share/ganeti/tools.
      A second change is that, since these are programs, they are better placed
      under libdir than datadir - we might have to change them later to
      binaries in which case 'share' is definitely not the way to go.
      This patch therefore changes the install directory for the tools to
      Reviewed-by: imsnah
  9. 21 Jan, 2008 8 commits
    • Remove qa tests for gnt-instance start/stop · e0b62a26
      Guido Trotter authored
      Those tests were added in the wrong place. This patch removes them.  One day
      we'll implement proper command line regression testing and they should go in
      Reviewed-by: iustinp
    • Test start/stop aliases in qa · ce9fb89d
      Guido Trotter authored
      This tests both that those two aliases have not been removed and also that
      aliases handling hasn't been broken.
      Reviewed-by: iustinp
    • Add a few aliases for startup/shutdown · 536fda25
      Guido Trotter authored
      These names are widely used when thinking of these operations, and they save
      some typing too. Even though there is some thought to make start/stop the default
      operation name, I don't think this should happen for 1.2; for now adding it as an alias is
      Reviewed-by: iustinp
    • Add the first command alias · dbfd89dd
      Guido Trotter authored
      Alias activate_block_devs to activate-disks, for ganeti 1.1 compatibility.
      Reviewed-by: iustinp
    • Add support for command aliases · de47cf8f
      Guido Trotter authored
      By passing a new aliases dict to generic main, we can easily support aliases
      for compatibility reasons or simply usability.
      Reviewed-by: iustinp
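A minimal Python sketch of alias resolution through a separate aliases dict, in the spirit of de47cf8f. The `resolve_command` helper is hypothetical and does not reproduce the real generic main; the alias direction (start to startup, stop to shutdown) is inferred from the neighbouring commit messages.

```python
def resolve_command(name, commands, aliases=None):
    """Map a command name or alias to its handler (hypothetical helper).

    `commands` maps canonical names to handlers; `aliases` maps
    alternative names to canonical names.
    """
    aliases = aliases or {}
    if name in aliases:
        target = aliases[name]
        # An alias pointing at a missing command is a programming error.
        if target not in commands:
            raise KeyError("alias %r maps to unknown command %r" % (name, target))
        name = target
    if name not in commands:
        raise KeyError("unknown command %r" % name)
    return name, commands[name]
```

Keeping aliases in their own dict means the canonical command table stays unchanged, and an alias can be added or dropped without touching the handlers.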
    • Add tool to ease testing of unsubmitted patches · 6e5e91a1
      Guido Trotter authored
      The upload tool can be used to submit the current code to an arbitrary list of
      nodes. This helps developers in easily testing their changes before submitting
      Reviewed-by: iustinp
    • Check that we have a valid export list · 461f0538
      Guido Trotter authored
      Before iterating over the list of exports present on a node, check that what
      ganeti returned is actually a list. This solves the case when one of the nodes
      is down, and an error value is returned.
      This fixes issue 21
      Reviewed-by: imsnah
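The check described in 461f0538 amounts to a type test before iterating. A small Python sketch, assuming the per-node result is either a list of export names or some non-list error value (here `False`); the `collect_exports` function and the data shape are illustrative assumptions.

```python
def collect_exports(results):
    """Keep only nodes whose result is really a list (hypothetical helper).

    `results` maps node name to either a list of export names or an
    error value returned when the node could not be contacted.
    """
    valid = {}
    for node, exports in results.items():
        if not isinstance(exports, list):
            # Node is down or returned an error value: skip it instead
            # of trying to iterate over a non-list.
            continue
        valid[node] = list(exports)
    return valid
```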
    • Fix VG listing broken by r510 · d87ae7d2
      Iustin Pop authored
      LVM code sometimes adds an extra separator at the end of the field list.
      Make the code strip it if it exists.
      Reviewed-by: imsnah
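Stripping an optional trailing separator before splitting can be sketched as below. The separator character and the `split_fields` name are assumptions made for illustration; the commit message does not say which separator the code uses.

```python
def split_fields(line, sep="|"):
    """Split one output line into fields, tolerating a trailing separator."""
    line = line.strip()
    if line.endswith(sep):
        # Drop a single extra separator at the end of the field list.
        line = line[:-len(sep)]
    return line.split(sep)
```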
  10. 20 Jan, 2008 7 commits
    • Make backend._GetVGInfo check the validity of 'vgs' · f4d377e7
      Iustin Pop authored
      Currently, the function backend._GetVGInfo only checks for errors via
      the exit code of the 'vgs' command. However, there are other failure
      modes, so we also need to check for valid output before parsing.
      Furthermore, failed exit-code checks were reported via a 'raise
      LVMError'; however, this exception is not handled anywhere and so the
      remote caller will not get reasonable data.
      This patch does two main things:
        - change the calling protocol for this function to not raise an error,
          and instead return the same type of argument always (dict) with the
          requested keys but values changed into None; this allows in the
          parent rpc call node_info to have valid memory information but
          "error" value for disk space, if there's an error with disks
        - check the validity of the output so that in case we fail to parse
          it, we don't abort with a backtrace in the node daemon but instead
          return the default result value (containing errors), and log these
          cases in the node daemon log file
      We also bump the protocol version to 11.
      Reviewed-by: ultrotter
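The calling convention described in f4d377e7 (always return a dict with the same keys, with None values on any failure) might look like the following Python sketch. The key names, the three-field output format, and the `parse_vg_info` function are assumptions for illustration only, not the real backend._GetVGInfo.

```python
import logging


def parse_vg_info(exit_code, output, sep=":"):
    """Return VG data as a dict; values stay None on any failure."""
    result = {"vg_size": None, "vg_free": None, "pv_count": None}
    if exit_code != 0:
        logging.error("vgs failed with exit code %s", exit_code)
        return result
    fields = output.strip().rstrip(sep).split(sep)
    if len(fields) != 3:
        logging.error("cannot parse vgs output: %r", output)
        return result
    try:
        size, free, count = float(fields[0]), float(fields[1]), int(fields[2])
    except ValueError:
        # Log and return the default result instead of propagating
        # a backtrace to the caller.
        logging.error("invalid values in vgs output: %r", output)
        return result
    result.update(vg_size=size, vg_free=free, pv_count=count)
    return result
```

Because the keys are always present, a caller can merge this dict with other node information and still report valid memory data when only the disk query failed.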
    • Fix checking of node free disk in CreateInstance · 8d75db10
      Iustin Pop authored
      This patch does two things:
        - checks that the result values from call_node_info are valid integer
          values and aborts otherwise
        - skips disk space computation for the DT_DISKLESS case
      The most important point of the patch is the verification of results
      from the rpc call, as it prepares for a patch that allows failures to be
      better reported from the remote node.
      Reviewed-by: ultrotter
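The two checks listed in 8d75db10 can be sketched in Python as follows. `PrereqError`, `check_free_disk`, the `vg_free` key, and the string "diskless" standing in for DT_DISKLESS are all hypothetical names used for illustration; only the behaviour (validate integers, skip diskless) comes from the commit message.

```python
class PrereqError(Exception):
    """Stand-in for a prerequisite-check failure (hypothetical name)."""


def check_free_disk(node_info, required_mb, disk_template):
    """Abort unless every node reports enough free disk as an integer."""
    if disk_template == "diskless":
        # No disk space is needed, so skip the computation entirely.
        return
    for node, info in sorted(node_info.items()):
        vg_free = info.get("vg_free") if isinstance(info, dict) else None
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(vg_free, bool) or not isinstance(vg_free, int):
            raise PrereqError("cannot get free disk space on node %s" % node)
        if vg_free < required_mb:
            raise PrereqError(
                "not enough disk space on node %s: %d MiB free, %d MiB required"
                % (node, vg_free, required_mb))
```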
    • Abstract node memory checking into a separate function · d4f16fd9
      Iustin Pop authored
      The checking of a node's free memory (via rpc.call_node_info) is done in
      both start instance and failover. This patch abstracts this call,
      together with the appropriate error handling, into a separate function
      called _CheckNodeFreeMemory.
      The patch also has some related changes:
        - the check is done in prereq and not in exec for start instance
        - the redundant check in exec for failover has been removed
      Reviewed-by: ultrotter
    • Change a hardcoded path into its proper constant · 97628462
      Iustin Pop authored
      The function backend.UploadFile still uses "/etc/hosts" directly instead
      of the existing constant; this patch fixes this.
      Reviewed-by: ultrotter
    • Allow use of 'diskless' disk template in burnin · bd249e2f
      Iustin Pop authored
      Even if this doesn't have any practical use for actually creating
      instances, it can be used for very fast burnin and testing just the
      add/start/stop/remove functionality.
      This has also revealed a bug in export/import related to diskless
      instances, so its educational value is proven.
      Reviewed-by: ultrotter
    • Fix run directory for the fake hypervisor · 1ed70996
      Iustin Pop authored
      Currently the fake hypervisor has hardcoded ‘/var/run’ as a base
      directory for its store. This patch adds a constant RUN_DIR that is used
      for both the fake hypervisor and for BDEV_CACHE_DIR.
      Reviewed-by: ultrotter
    • Fix the init.d script · e71d6323
      Iustin Pop authored
      The script (which is geared towards Debian) is actually not fully
      compliant, as lintian generates a warning on it - the S runlevel is not
      a valid one in the "Stop" stanza. This patch removes "S" from the stop
      Reviewed-by: imsnah
  11. 18 Jan, 2008 2 commits
    • Fix the make dist rule · b6f2e47f
      Iustin Pop authored
      In revision 459 I introduced a bug in the make dist rule: the archive
      will include *all* of the test/data directory, including the
      .svn directory if it exists.
      This patch fixes that problem and adds a distcheck hook that tests for
      such errors in the future (files/directories matching the .svn and .git
      It also fixes a typo in the NEWS file.
      Reviewed-by: imsnah
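The kind of check a distcheck hook could perform is sketched below in Python. This is not the actual hook from b6f2e47f (which lives in the build system); `find_vcs_entries` is a hypothetical helper, and matching on the exact names `.svn` and `.git` is an assumption.

```python
import os

# Names treated as version-control leftovers (assumed exact matches).
VCS_NAMES = (".svn", ".git")


def find_vcs_entries(root):
    """Return sorted paths under `root` named like VCS metadata."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name in VCS_NAMES:
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

A hook built on this idea would fail the distribution check whenever the returned list is non-empty.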
    • Bump version numbers for the 1.2.1 release · 31b9055c
      Iustin Pop authored
      This is a merge to trunk of revision 494.
      Reviewed-by: imsnah
  12. 16 Jan, 2008 2 commits
  13. 14 Jan, 2008 3 commits
    • Make instance start/stop skippable at burnin time · d4844f0f
      Guido Trotter authored
      Even though burnin was born just to do that test, it now contains a lot more
      things one might try, so it makes sense to make instance start/stop optional
      This creates a burnin that at the bare minimum tests instance create and
      remove, if all the --no options are specified, but usually does a lot more.
      Reviewed-by: iustinp
    • Do instance export and import during burnin · bd5e77f9
      Guido Trotter authored
      Instances get exported to a remote node, then removed and imported back to
      their original nodes. This should be an idempotent operation from the instance
      point of view, and help make sure ImportExport is kept up to date.
      It will also help make burnin take a lot longer, which is nice for taking a nap.
      "...but I'm doing a cluster burnin...". Unfortunately this subfeature is a bit
      jeopardized by the fact that the new code can be skipped with the
      --no-importexport option, but nobody needs to know that, do they?
      Reviewed-by: iustinp
    • Allow burnin to take "-t plain" as an option · 4aa036ab
      Iustin Pop authored
      The burnin code deals with "-t plain", but the command line parser
      doesn't allow that as an option. This patch fixes this issue.
      Reviewed-by: ultrotter