1. 18 Mar, 2008 4 commits
  2. 11 Mar, 2008 2 commits
    • Disable cluster init with a reachable IP · 411f8ad0
      Iustin Pop authored
      Make the cluster init fail if the IP to which the cluster name resolved
      is already reachable by the master node. This is not a foolproof
      solution, but it allows a cheap method of detecting simple mistakes.
      
      It will also disallow using the master node name as the cluster name
      (which is a good thing).
      
      The only drawbacks that I see are:
        - you are not allowed to do this, which might come in handy in cluster
          upgrades; but since we support rename, this is mitigated
        - cluster init takes longer now (+the timeout value, set to 5
          seconds), but since this is a one-off operation, it should be ok
      
      Reviewed-by: ultrotter
    • Modify utils.TcpPing to make source address optional · b15d625f
      Iustin Pop authored
      This patch modifies TcpPing and its callers to make the source address
      selection optional. Usually the kernel knows best which source address
      to use; only in some cases do we want to enforce a given source
      address, so it makes sense to make this optional.
      
      Reviewed-by: ultrotter
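
A minimal sketch of a TCP reachability check with an optional source address, in the spirit of the change described above. This is not the actual utils.TcpPing code: the function name `tcp_ping`, its parameters, and the default timeout are assumptions made for illustration.

```python
import socket


def tcp_ping(target, port, timeout=5.0, source=None):
    """Return True if a TCP connection to (target, port) succeeds.

    Hypothetical helper: when `source` is None the kernel picks the
    source address; otherwise the socket is bound to `source` first.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source is not None:
            # Port 0 lets the kernel choose an ephemeral source port.
            sock.bind((source, 0))
        sock.connect((target, port))
        return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    finally:
        sock.close()
```

Callers that do not care about the source address simply omit `source`; only the few that must enforce one pass it explicitly.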
  3. 05 Mar, 2008 1 commit
  4. 29 Feb, 2008 2 commits
    • Fix master role stop on cluster destroy · c9064964
      Iustin Pop authored
      Currently the cluster destroy doesn't remove the master role, which
      means that the IP address of the cluster remains assigned to the master
      node.
      
      This patch fixes this and also a docstring in backend.StopMaster().
      
      Reviewed-by: imsnah
    • Fix cluster rename operation · 488b540d
      Iustin Pop authored
      This one-liner fixes the cluster rename operation. As a side note, we
      should have a QA test for this too.
      
      Reviewed-by: imsnah
  5. 28 Feb, 2008 1 commit
    • Don't allow renaming to an existing instance · 7bde3275
      Guido Trotter authored
      Even if the target instance is down or we are not checking for IP conflicts,
      changing an instance name to one which is already in the cluster is
      doomed to fail, because in a lot of places (including the minds of
      most users/admins) instance names are assumed to be unique.
      
      Reviewed-by: imsnah
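
The uniqueness check described above can be sketched roughly as follows. The names `RenameError` and `check_rename_target` are hypothetical and do not come from the patch; the point is only that the check runs regardless of whether the target instance is up or IP checks are enabled.

```python
class RenameError(Exception):
    """Hypothetical error type for a rejected rename."""


def check_rename_target(new_name, existing_names):
    """Reject a rename whose target name is already used in the cluster."""
    if new_name in existing_names:
        raise RenameError(
            "Instance '%s' is already in the cluster" % new_name)
    return new_name
```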
  6. 27 Feb, 2008 1 commit
  7. 25 Feb, 2008 1 commit
  8. 22 Feb, 2008 1 commit
  9. 16 Feb, 2008 1 commit
    • Fix gnt-instance info i1 i2 ... · 515207af
      Guido Trotter authored
      Due to an indentation error, only the last instance queried got returned by
      LUQueryInstanceData. Move the append() call inside the for loop to fix this
      issue.
      
      This is a one-liner targeted at 1.2.3
      
      Reviewed-by: iustinp
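
To illustrate the class of bug described above, here is a small self-contained sketch. It is not the LUQueryInstanceData code; `query_buggy`, `query_fixed` and the `lookup` callback are made up for the example.

```python
def query_buggy(names, lookup):
    """append() sits outside the loop, so it runs only once."""
    result = []
    for name in names:
        info = lookup(name)
    result.append(info)  # only the last instance's info is kept
    return result


def query_fixed(names, lookup):
    """append() sits inside the loop, so every instance is kept."""
    result = []
    for name in names:
        info = lookup(name)
        result.append(info)
    return result
```

The two functions differ only in the indentation of a single line, which is why the fix is a one-liner.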
  10. 15 Feb, 2008 1 commit
  11. 14 Feb, 2008 1 commit
    • Modify the default output of gnt-instance list · d8052456
      Iustin Pop authored
      This patch adds a new field available for selection in gnt-instance list,
      named "status", which represents the combined value of "admin_state" and
      "oper_state". Since this is much easier to parse (e.g. gnt-instance list
      |grep ERROR), we also modify the default field list to use this instead
      of the admin/oper state fields.
      
      Reviewed-by: imsnah
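
One way such a combined field could be derived is sketched below. The exact status strings are assumptions chosen for illustration (the commit message only implies that problem states contain "ERROR"), and `combined_status` is a hypothetical name.

```python
def combined_status(admin_up, oper_up):
    """Fold the configured and the actual state into one string.

    admin_up: whether the instance is configured to be running.
    oper_up:  whether the instance is actually running.
    """
    if admin_up and oper_up:
        return "running"
    if admin_up and not oper_up:
        return "ERROR_down"   # should be up, but is not
    if not admin_up and oper_up:
        return "ERROR_up"     # should be down, but is running
    return "ADMIN_down"       # down on purpose
```

With both mismatch cases sharing a common prefix, a single grep for ERROR finds every instance whose actual state disagrees with its configured state.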
  12. 05 Feb, 2008 2 commits
    • Add a test opcode that sleeps for a given duration · 06009e27
      Iustin Pop authored
      This can be used for testing purposes.
      
      Reviewed-by: ultrotter,imsnah
    • Reduce the chance of DRBD errors with stale primaries · fdbd668d
      Iustin Pop authored
      This patch is a first step in reducing the chance of causing DRBD
      activation failures when the primary node has imperfect data.
      
      This issue is seen more often with DRBD8, which has an 'outdate' state
      (which it can enter more often). But it can (and before this patch,
      usually will) happen with both 7 and 8 if the primary has data to sync.
      
      The error comes from the fact that, before this patch, we activate the
      primary DRBD device and immediately (i.e. as soon as we can run another
      shell command) we try to make it primary. This might fail - since the
      primary knows it has some data to catch up to - but we ignored this
      error condition. The failure was visible later, in either md failing to
      activate over a read-only storage or by instance failing to start.
      
      The patch has two parts: one affecting bdev.py, which changes failures
      in BlockDev.Open() from returning False to raising
      errors.BlockDeviceError. No one (except a generic method inside bdev.py)
      checked this return value; we logged it, but the master didn't know
      about it. Now all classes raise errors from Open if they have a failure.
      
      The other part, affecting cmdlib.py, changes the activation sequence
      from:
        - activate on primary node as primary and secondary as secondary, in
          whatever order a function returns the nodes
      to the following:
        - activate all drives as secondaries, on both the primary and the
          secondary nodes of the instance
        - after that, on the primary node, re-activate the device stack as
          primary
      
      This is in order to give DRBD the chance to connect and complete the
      handshake. As noted in the comments, this just increases the chances of
      a handshake/connect; it does not entirely fix the problem. However, it is a
      good first step and it passes all tests of starting with stale (either
      full or partial) primaries, with both drbd 7 and 8, and also passes a
      burnin.
      
      Note that the patch might make the device activation a little bit
      slower, but it is a reasonable trade-off.
      
      Reviewed-by: imsnah
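
The reordered activation sequence described in this commit message can be sketched as a two-pass loop. This is only an illustration of the ordering, not the cmdlib.py code: `assemble_disks` and the `assemble` callback are hypothetical, and the nesting of the node and disk loops is an assumption.

```python
def assemble_disks(primary, secondary, disks, assemble):
    """Activate disks in two passes.

    `assemble(node, disk, as_primary)` is a hypothetical callback that
    performs the activation and raises an exception on failure.
    """
    # Pass 1: bring everything up as secondary on both nodes, so the
    # two sides get a chance to connect and handshake.
    for node in (primary, secondary):
        for disk in disks:
            assemble(node, disk, as_primary=False)
    # Pass 2: only now switch the device stack on the primary node
    # to the primary role.
    for disk in disks:
        assemble(primary, disk, as_primary=True)
```

Because the callback raises on failure rather than returning False, an error in either pass propagates to the caller instead of being silently ignored.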
  13. 04 Feb, 2008 2 commits
  14. 31 Jan, 2008 1 commit
  15. 28 Jan, 2008 2 commits
  16. 20 Jan, 2008 2 commits
    • Fix checking of node free disk in CreateInstance · 8d75db10
      Iustin Pop authored
      This patch does two things:
        - checks that the result values from call_node_info are valid integer
          values and aborts otherwise
        - skips disk space computation for the DT_DISKLESS case
      
      The most important point of the patch is the verification of results
      from the rpc call, as it prepares for a patch that allows failures to be
      better reported from the remote node.
      
      Reviewed-by: ultrotter
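
A rough sketch of the two checks described above: validating that the value returned by the remote node is an integer, and skipping the computation for diskless instances. The names `PrereqError` and `check_free_disk`, and the `"vg_free"` key, are assumptions for illustration rather than the patch's actual code.

```python
class PrereqError(Exception):
    """Hypothetical error for a failed prerequisite check."""


def check_free_disk(node, info, required_mb, diskless=False):
    """Abort unless `node` reports enough free disk space."""
    if diskless:
        return  # nothing will be allocated, so there is nothing to check
    vg_free = info.get("vg_free") if isinstance(info, dict) else None
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(vg_free, int) or isinstance(vg_free, bool):
        raise PrereqError(
            "Node %s returned invalid free disk data: %r" % (node, vg_free))
    if vg_free < required_mb:
        raise PrereqError(
            "Not enough disk space on node %s: needed %d MiB, free %d MiB"
            % (node, required_mb, vg_free))
```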
    • Abstract node memory checking into a separate function · d4f16fd9
      Iustin Pop authored
      The checking of a node's free memory (via rpc.call_node_info) is done in
      both start instance and failover. This patch abstracts this call,
      together with the appropriate error handling, into a separate function
      called _CheckNodeFreeMemory.
      
      The patch also has some related changes:
        - the check is done in prereq and not in exec for start instance
        - the redundant check in exec for failover has been removed
      
      Reviewed-by: ultrotter
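
As an illustration of what such a shared helper might look like, here is a sketch under stated assumptions: the signature, the `"memory_free"` key, the `PrereqError` type, and the injected `get_node_info` callback (standing in for the RPC call) are all hypothetical and not taken from the patch.

```python
class PrereqError(Exception):
    """Hypothetical error for a failed prerequisite check."""


def check_node_free_memory(node, reason, requested_mb, get_node_info):
    """Raise PrereqError unless `node` has `requested_mb` MiB free.

    `reason` is included in the message so that both callers (start
    instance and failover) produce a meaningful error.
    """
    info = get_node_info(node)
    free = info.get("memory_free") if isinstance(info, dict) else None
    if not isinstance(free, int) or isinstance(free, bool):
        raise PrereqError(
            "Can't get free memory on node %s for %s" % (node, reason))
    if requested_mb > free:
        raise PrereqError(
            "Not enough memory on node %s for %s: needed %d MiB, free %d MiB"
            % (node, reason, requested_mb, free))
```

Running the check in the prerequisite phase means a shortage is reported before any action is taken on the instance.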
  17. 16 Jan, 2008 1 commit
  18. 14 Jan, 2008 2 commits
    • Fix some misspellings · ba4b62cf
      Iustin Pop authored
      This patch fixes two name typos and a style issue (which makes pylint
      complain).
      
      Reviewed-by: ultrotter
    • Fix CreateInstance new optional parameters · 40ed12dd
      Guido Trotter authored
      Some new parameters of the CreateInstance opcode are optional (namely
      kernel_path, initrd_path and hvm_boot_order) but their absence makes the code
      crash. Fix this by initializing them to a default value if they're not present.
      
      Reviewed-by: iustinp
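
The general pattern of defaulting missing optional attributes can be sketched like this. The helper name `fill_optional_params` and the choice of `None` as the default are assumptions (the commit message does not say which default is used), and `SimpleNamespace` merely stands in for an opcode object.

```python
from types import SimpleNamespace

# Assumed defaults; the actual values used by the patch are not stated.
OPTIONAL_DEFAULTS = {
    "kernel_path": None,
    "initrd_path": None,
    "hvm_boot_order": None,
}


def fill_optional_params(op, defaults=OPTIONAL_DEFAULTS):
    """Set each missing optional attribute on `op` to its default."""
    for name, value in defaults.items():
        if not hasattr(op, name):
            setattr(op, name, value)
    return op


# Stand-in opcode that only sets one of the optional parameters.
op = fill_optional_params(
    SimpleNamespace(instance_name="i1", kernel_path="/boot/vmlinuz"))
```

After the call, later code can read all three attributes without guarding against their absence.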
  19. 11 Jan, 2008 1 commit
  20. 08 Jan, 2008 4 commits
  21. 07 Jan, 2008 1 commit
    • Improve verify-disks: broken/missing LV detection · b63ed789
      Iustin Pop authored
      This patch improves the ‘gnt-cluster verify-disks’ command by adding
      support for detecting broken volume groups and missing logical volume
      names.
      
      As such, we no longer try to activate disks for instances where
      activation is unlikely to succeed anyway, and instead report them.
      
      Reviewed-by: schreiberal
  22. 27 Dec, 2007 1 commit
  23. 20 Dec, 2007 1 commit
  24. 18 Dec, 2007 3 commits
    • Internal API change for instance console access. · 30989e69
      Alexander Schreiber authored
      Change the internal hypervisor API for GetShellCommandForConsole: we
      now call it with the instance instead of just the instance name.
      
      This is a prep patch for HVM, since HVM needs more than just the instance
      name to determine a way of console access.
      
      (this is a resend due to mail address tyop)
      
      Reviewed-by: iustinp
    • cleanup for hypervisor constants · 2584d4a4
      Alexander Schreiber authored
      Move constant definitions for hypervisor into constants.py
      
      Reviewed-by: ultrotter
    • Specify hint as a named argument · 79caa9ed
      Guido Trotter authored
      hint is declared as a named argument for the LogWarning function. Make its
      caller pass it by name.
      
      Reviewed-by: iustinp
  25. 17 Dec, 2007 1 commit