  1. Feb 08, 2008
    • Shared Lock implementation and unit tests. · 162c1c1f
      Guido Trotter authored
      Adding a locking.py file for the ganeti locking library. Its first component is
      the implementation of a non-recursive blocking shared lock complete with a
      testing library.
      
      Reviewed-by: imsnah, iustinp
      162c1c1f
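A minimal sketch of what a non-recursive blocking shared lock can look like, assuming a condition-variable design. The class name `SharedLock` and its methods are hypothetical and are not taken from the locking.py added by this commit; this simple version also lets a steady stream of sharers starve an exclusive waiter.

```python
import threading


class SharedLock:
    """Illustrative non-recursive lock with shared and exclusive modes."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._sharers = 0        # holders in shared mode
        self._exclusive = False  # True while held exclusively

    def acquire(self, shared=False):
        """Block until the lock can be taken in the requested mode."""
        with self._cond:
            if shared:
                # Sharers only have to wait for an exclusive holder.
                while self._exclusive:
                    self._cond.wait()
                self._sharers += 1
            else:
                # An exclusive holder must wait for everybody else.
                while self._exclusive or self._sharers:
                    self._cond.wait()
                self._exclusive = True

    def release(self):
        """Release one hold and wake up all waiters."""
        with self._cond:
            if self._exclusive:
                self._exclusive = False
            elif self._sharers:
                self._sharers -= 1
            else:
                raise RuntimeError("release() of a lock that is not held")
            self._cond.notify_all()
```

Because the lock is non-recursive, a thread that calls `acquire()` twice in exclusive mode blocks itself; callers have to track their own holds.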
  2. Feb 05, 2008
    • Add a test opcode that sleeps for a given duration · 06009e27
      Iustin Pop authored
      This can be used for testing purposes.
      
      Reviewed-by: ultrotter,imsnah
      06009e27
    • Reduce the chance of DRBD errors with stale primaries · fdbd668d
      Iustin Pop authored
      This patch is a first step in reducing the chance of causing DRBD
      activation failures when the primary node's data is not fully up to date.
      
      This issue is seen more often with DRBD8, which has an 'outdate' state
      (one it can enter fairly often). But it can (and before this patch, usually
      will) happen with both 7 and 8 when the primary has data to sync.
      
      The error comes from the fact that, before this patch, we activate the
      primary DRBD device and immediately (i.e. as soon as we can run another
      shell command) we try to make it primary. This might fail - since the
      primary knows it has some data to catch up to - but we ignored this
      error condition. The failure was visible later, either as md failing to
      activate over read-only storage or as the instance failing to start.
      
      The patch has two parts. The first affects bdev.py and changes failures
      in BlockDev.Open() from returning False to raising
      errors.BlockDeviceError. No one (except a generic method inside bdev.py)
      checked this return value; we logged the failure but the master didn't
      know about it. Now all classes raise errors from Open if they have a failure.
      
      The other part, affecting cmdlib.py, changes the activation sequence
      from:
        - activate on primary node as primary and secondary as secondary, in
          whatever order a function returns the nodes
      to the following:
        - activate all drives as secondaries, on both the primary and the
          secondary nodes of the instance
        - after that, on the primary node, re-activate the device stack as
          primary
      
      This gives DRBD a chance to connect and complete the handshake. As noted
      in the comments, this only increases the chances of a handshake/connect;
      it does not entirely fix the problem. However, it is a
      good first step and it passes all tests of starting with stale (either
      full or partial) primaries, with both drbd 7 and 8, and also passes a
      burnin.
      
      Note that the patch might make the device activation a little bit
      slower, but it is a reasonable trade-off.
      
      Reviewed-by: imsnah
      fdbd668d
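The new activation order described above can be sketched as follows. This is an illustration under stated assumptions, not the code in cmdlib.py: `activate_disks` and the `assemble(node, disk, as_primary)` callback are hypothetical names, and `BlockDeviceError` is a local stand-in for errors.BlockDeviceError.

```python
class BlockDeviceError(Exception):
    """Local stand-in for errors.BlockDeviceError."""


def activate_disks(disks, primary, secondary, assemble):
    """Activate disks in two passes.

    `assemble(node, disk, as_primary)` is a hypothetical callback that
    returns True on success.
    """
    # Pass 1: bring every disk up as secondary on both nodes, so DRBD
    # gets a chance to connect and handshake before any promotion.
    for disk in disks:
        for node in (primary, secondary):
            if not assemble(node, disk, False):
                raise BlockDeviceError(
                    "cannot activate %s on %s as secondary" % (disk, node))

    # Pass 2: only now re-activate the stack as primary on the primary node.
    for disk in disks:
        if not assemble(primary, disk, True):
            raise BlockDeviceError(
                "cannot activate %s on %s as primary" % (disk, primary))
```

Raising instead of returning False mirrors the bdev.py half of the patch: a failed promotion can no longer be silently ignored by the caller.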
  3. Feb 04, 2008
  4. Jan 31, 2008
  5. Jan 30, 2008
    • Export bridge information too · 1cafd236
      Guido Trotter authored
      gnt-backup export used to export the ip and mac of each nic, but not which
      bridge it was connected to. Adding this information.
      
      Reviewed-by: iustinp
      
      1cafd236
  6. Jan 28, 2008
    • Improve the documentation of query output fields · d8a4b51d
      Iustin Pop authored
      The gnt-node and gnt-instance list commands have a customizable list of
      output fields, but the list is not up to date (in the man page) and not
      easily understandable from the ‘--help’ output.
      
      This patch updates the man pages and adds the available fields and
      default fields in the ‘--help’ output, as part of the description.
      
      Example:
      Usage
      =====
        gnt-node list
      
      Lists the nodes in the cluster. The available fields are (see the man page for
      details): name, pinst_cnt, pinst_list, sinst_cnt, sinst_list, pip, sip,
      dtotal, dfree, mtotal, mnode, mfree, bootid. The default field list is (in
      order): name, dtotal, dfree, mtotal, mnode, mfree, pinst_cnt, sinst_cnt.
      
      Reviewed-by: imsnah,ultrotter
      d8a4b51d
    • Add option for the number of VCPUs in instance listing · d6d415e8
      Iustin Pop authored
      Reviewed-by: ultrotter
      d6d415e8
    • Fix "gnt-instance modify --initrd" · 2bc22872
      Iustin Pop authored
      The new QA tests for instance modify uncovered a bug in the modify
      initrd operation when setting the initrd to none.
      
      Reviewed-by: imsnah
      2bc22872
  7. Jan 25, 2008
  8. Jan 21, 2008
    • Add support for command aliases · de47cf8f
      Guido Trotter authored
      By passing a new aliases dict to generic main, we can easily support aliases
      for compatibility reasons or simply for usability.
      
      Reviewed-by: iustinp
      de47cf8f
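A minimal sketch of alias resolution through a dict, assuming a command table keyed by name. The function `resolve_command` and the sample tables are hypothetical; the commit itself only states that an aliases dict is passed to generic main.

```python
def resolve_command(argv, commands, aliases=None):
    """Map argv[0] to a command handler, following an alias if one exists.

    `commands` maps command names to handlers; `aliases` maps an alias
    to the real command name. Both tables are illustrative.
    """
    aliases = aliases or {}
    name = aliases.get(argv[0], argv[0])
    if name not in commands:
        raise KeyError("unknown command: %s" % argv[0])
    return commands[name], argv[1:]
```

For example, with `aliases={"ls": "list"}` the call `resolve_command(["ls"], commands, aliases)` returns the handler registered under `"list"`.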
    • Fix VG listing broken by r510 · d87ae7d2
      Iustin Pop authored
      LVM code sometimes adds an extra separator at the end of the field list.
      Make the code strip it if it exists.
      
      Reviewed-by: imsnah
      d87ae7d2
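The fix amounts to dropping one optional trailing separator before splitting. A minimal sketch, assuming a colon separator; the helper name `split_fields` and the sample line are hypothetical.

```python
def split_fields(line, sep=":"):
    """Split one line of separator-delimited LVM output into fields,
    tolerating a single extra separator at the end of the line."""
    line = line.strip()
    if line.endswith(sep):
        line = line[:-len(sep)]
    return line.split(sep)
```

With this helper, `"xenvg:1024:512:"` and `"xenvg:1024:512"` both yield three fields.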
  9. Jan 20, 2008
    • Make backend._GetVGInfo check the validity of 'vgs' · f4d377e7
      Iustin Pop authored
      Currently, the function backend._GetVGInfo only checks for errors via
      the exit code of the 'vgs' command. However, there are other failure
      modes, so we also need to check for valid output before parsing.
      
      Furthermore, failures of the exit code check were reported via a 'raise
      LVMError'. This exception is not handled anywhere, so the remote caller
      will not get reasonable data.
      
      This patch does two main things:
        - change the calling protocol for this function to not raise an error,
          and instead return the same type of argument always (dict) with the
          requested keys but values changed into None; this allows in the
          parent rpc call node_info to have valid memory information but
          "error" value for disk space, if there's an error with disks
        - check the validity of the output so that in case we fail to parse
          it, we don't abort with a backtrace in the node daemon but instead
          return the default result value (containing errors), and log these
          cases in the node daemon log file
      
      We also bump the protocol version to 11.
      
      Reviewed-by: ultrotter
      f4d377e7
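The calling protocol described above (always return a dict, with None values on failure, and log instead of raising) can be sketched as follows. The function name `parse_vg_info`, the key names, and the three-field colon-separated layout are assumptions made for the example, not details taken from the patch.

```python
import logging


def parse_vg_info(exit_code, output, sep=":"):
    """Return VG information as a dict; every value is None on any failure.

    Assumes `output` is a single 'size:free:pv_count' line.
    """
    failed = {"vg_size": None, "vg_free": None, "pv_count": None}
    if exit_code != 0:
        logging.error("vgs failed with exit code %s", exit_code)
        return failed

    fields = output.strip().rstrip(sep).split(sep)
    if len(fields) != 3:
        logging.error("unexpected vgs output: %r", output)
        return failed

    try:
        size, free, pvs = float(fields[0]), float(fields[1]), int(fields[2])
    except ValueError:
        logging.error("cannot parse vgs output: %r", output)
        return failed

    return {"vg_size": int(round(size)),
            "vg_free": int(round(free)),
            "pv_count": pvs}
```

Returning the same shape on every path lets a caller such as node_info report valid memory data alongside an error value for disk space.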
    • Fix checking of node free disk in CreateInstance · 8d75db10
      Iustin Pop authored
      This patch does two things:
        - checks that the result values from call_node_info are valid integer
          values and aborts otherwise
        - skips disk space computation for the DT_DISKLESS case
      
      The most important point of the patch is the verification of results
      from the rpc call, as it prepares for a patch that allows failures to be
      better reported from the remote node.
      
      Reviewed-by: ultrotter
      8d75db10
    • Abstract node memory checking into a separate function · d4f16fd9
      Iustin Pop authored
      The checking of a node's free memory (via rpc.call_node_info) is done in
      both start instance and failover. This patch abstracts this call,
      together with the appropriate error handling, into a separate function
      called _CheckNodeFreeMemory.
      
      The patch also has some related changes:
        - the check is done in prereq and not in exec for start instance
        - the redundant check in exec for failover has been removed
      
      Reviewed-by: ultrotter
      d4f16fd9
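A minimal sketch of such a shared prerequisite check. Only the idea (query node info, validate it, compare against the request) comes from the commit message; the function signature, the `memory_free` key, the `get_node_info` callback standing in for rpc.call_node_info, and the local `OpPrereqError` class are assumptions.

```python
class OpPrereqError(Exception):
    """Local stand-in for a prerequisite-check failure."""


def check_node_free_memory(node, reason, requested_mb, get_node_info):
    """Raise OpPrereqError unless `node` has `requested_mb` MiB free.

    `get_node_info(node)` is a hypothetical callback returning a dict.
    """
    info = get_node_info(node)
    free_mem = info.get("memory_free") if isinstance(info, dict) else None
    if not isinstance(free_mem, int):
        raise OpPrereqError(
            "Cannot compute free memory on node %s (needed for %s)"
            % (node, reason))
    if requested_mb > free_mem:
        raise OpPrereqError(
            "Not enough memory on node %s for %s: needed %d MiB, "
            "available %d MiB" % (node, reason, requested_mb, free_mem))
```

Running the check in the prerequisite phase means the operation is rejected before any state is changed.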
    • Change a hardcoded path into its proper constant · 97628462
      Iustin Pop authored
      The function backend.UploadFile still uses "/etc/hosts" directly instead
      of the existing constant; this patch fixes this.
      
      Reviewed-by: ultrotter
      97628462
    • Fix run directory for the fake hypervisor · 1ed70996
      Iustin Pop authored
      Currently the fake hypervisor has hardcoded ‘/var/run’ as a base
      directory for its store. This patch adds a constant RUN_DIR that is used
      for both the fake hypervisor and for BDEV_CACHE_DIR.
      
      Reviewed-by: ultrotter
      1ed70996
  10. Jan 16, 2008
  11. Jan 14, 2008
    • Fix some misspellings · ba4b62cf
      Iustin Pop authored
      This patch fixes two name typos and a style issue (which makes pylint
      complain).
      
      Reviewed-by: ultrotter
      ba4b62cf
    • Fix CreateInstance new optional parameters · 40ed12dd
      Guido Trotter authored
      Some new parameters of the CreateInstance opcode are optional (namely
      kernel_path, initrd_path and hvm_boot_order) but their absence makes the code
      crash. Fix this by initializing them to a default value if they're not present.
      
      Reviewed-by: iustinp
      40ed12dd
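One way to initialize missing optional attributes, sketched under assumptions: the three parameter names come from the commit message, while the `OpCreateInstance` stand-in class, the `fill_optional` helper, and the choice of None as default are hypothetical.

```python
OPTIONAL_PARAMS = ("kernel_path", "initrd_path", "hvm_boot_order")


class OpCreateInstance:
    """Illustrative opcode object: keyword arguments become attributes."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


def fill_optional(op, default=None):
    """Give every missing optional parameter a default value, so later
    code can read the attribute without crashing."""
    for name in OPTIONAL_PARAMS:
        if not hasattr(op, name):
            setattr(op, name, default)
    return op
```

Values supplied by the caller are left untouched; only absent attributes are filled in.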
  12. Jan 11, 2008
  13. Jan 09, 2008
  14. Jan 08, 2008
  15. Jan 07, 2008
    • Improve verify-disks: broken/missing LV detection · b63ed789
      Iustin Pop authored
      This patch improves the ‘gnt-cluster verify-disks’ command by adding
      support for detecting broken volume groups and missing logical volume
      names.
      
      As such, we no longer try to activate disks for instances where
      activation is unlikely to succeed anyway, and instead report them.
      
      Reviewed-by: schreiberal
      b63ed789
    • Activate logical volumes at Assemble() time · 5574047a
      Iustin Pop authored
      This patch changes the Assemble() method for logical volumes from a no-op
      to running `lvchange -ay` on the logical volume; this ensures that if the
      logical volume is not active, we are able to activate and use it.
      
      Reviewed-by: imsnah
      5574047a
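A minimal sketch of activating a logical volume with `lvchange -ay`, assuming the volume is addressed as `vg/lv`. The function names and the injectable `run` parameter are hypothetical and exist only so the example can be exercised without LVM; the real Assemble() method is not shown in this log.

```python
import subprocess


def lv_activate_command(vg_name, lv_name):
    """Build the argv that activates one logical volume."""
    return ["lvchange", "-ay", "%s/%s" % (vg_name, lv_name)]


def assemble_lv(vg_name, lv_name, run=subprocess.run):
    """Activate the volume and report success as a boolean.

    `run` defaults to subprocess.run and can be swapped out in tests.
    """
    result = run(lv_activate_command(vg_name, lv_name),
                 capture_output=True, text=True)
    return result.returncode == 0
```

Activating a volume that is already active is harmless, so running this on every assemble is safe.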
    • Improve speed of activating block devs · be1ba2bd
      Iustin Pop authored
      This patch fixes the double attach operation in bdev.AttachOrAssemble,
      which was an indentation mistake in the first place.
      
      Reviewed-by: imsnah
      be1ba2bd
    • Add unittest for DRBD8 drbdsetup show parser · 3840729d
      Iustin Pop authored
      This patch changes the bdev.DRBD8._GetDevInfo to take a string instead
      of a minor, separates the `drbdsetup show` invocation into a new
      separate method (bdev.DRBD8._GetShowData) and modifies the rest of the
      DRBD8 class to make the appropriate calls.
      
      It also adds a unittest script and data files for testing various cases
      of device output.
      
      Reviewed-by: imsnah
      3840729d
  16. Dec 27, 2007
  17. Dec 20, 2007
  18. Dec 19, 2007
    • Make utils.RunCmd() deal with interleaved stdout/stderr · 9c233417
      Iustin Pop authored
      Currently, RunCmd is written with the assumption that programs will have
      a small stderr output, therefore we read the child's stdout (which can
      be big, so we don't want to block the child) and then the stderr (which
      is small, so it shouldn't block).
      
      However, with the ‘gnt-cluster verify-disks’ command, we ourselves
      generate heavy stderr, therefore we break the ganeti-watcher which runs
      the verify-disks via utils.RunCmd.
      
      This patch turns the RunCmd command into a poll-based one, which means
      any kind of interleaved output by a child on stdout/stderr will be
      handled correctly. Of course, since the output is buffered in memory,
      there are other ways to break RunCmd(). But at least this should fix the
      common case.
      
      Reviewed-by: hansmi
      9c233417
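A minimal sketch of a poll-based command runner, assuming a POSIX system where `select.poll` is available. The function `run_cmd` and its return shape are hypothetical and are not the utils.RunCmd implementation; like the patch, it buffers all output in memory.

```python
import os
import select
import subprocess


def run_cmd(argv):
    """Run argv and return (exit_code, stdout_bytes, stderr_bytes).

    Both pipes are drained as data arrives, so a child that writes a
    lot to either stream cannot block on a full pipe.
    """
    child = subprocess.Popen(argv, stdin=subprocess.DEVNULL,
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd, err_fd = child.stdout.fileno(), child.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}

    poller = select.poll()
    for fd in chunks:
        poller.register(fd, select.POLLIN)

    open_fds = set(chunks)
    while open_fds:
        for fd, event in poller.poll():
            if event & select.POLLIN:
                data = os.read(fd, 4096)
                if data:
                    chunks[fd].append(data)
                    continue
            # Empty read, hang-up or error: this stream is finished.
            poller.unregister(fd)
            open_fds.discard(fd)

    child.stdout.close()
    child.stderr.close()
    return child.wait(), b"".join(chunks[out_fd]), b"".join(chunks[err_fd])
```

Reading stdout to the end before touching stderr, as the old code did, deadlocks once the child fills the stderr pipe; polling both descriptors avoids that ordering assumption.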