  1. Jul 05, 2012
    • Ensure that --wait-for-sync is used in QA · 32da72f3
      Iustin Pop authored
      
      We don't have a specific test for activate-disks, so let's add the
      flag in the cases where we (incidentally) run activate-disks.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
    • Add --wait-for-sync in gnt-instance · f30d8165
      Iustin Pop authored
      
      Note that this needs a new option (as for the opcode), but with the
      default inverted (False instead of True).
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
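      A minimal usage sketch for the new flag, driving the CLI from a
      Python script; the instance name below is a placeholder, not
      something taken from this change.

          # Run activate-disks and only return once the disks are in
          # sync, via the new --wait-for-sync flag.
          import subprocess

          subprocess.check_call(
              ["gnt-instance", "activate-disks", "--wait-for-sync",
               "instance1.example.com"])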
    • Add wait_for_sync flag to OpInstanceActivateDisks · b69437c5
      Iustin Pop authored
      
      This can be used to ensure that after activate-disks has returned, the
      instance's storage is consistent; currently there's no programmatic
      way to do this.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
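      As a hedged sketch of the programmatic use this enables: the opcode
      name and the wait_for_sync parameter come from this commit, while
      the instance_name keyword and the cli.SubmitOpCode() helper are
      assumptions based on common Ganeti conventions.

          # Activate an instance's disks and block until they are fully
          # synced (sketch only, not code from this change).
          from ganeti import cli, opcodes

          op = opcodes.OpInstanceActivateDisks(
              instance_name="instance1.example.com",  # placeholder name
              wait_for_sync=True)
          cli.SubmitOpCode(op)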
    • hbal: return exit status 0 in case of early exit · 2a2e2610
      Iustin Pop authored
      
      This derives from an internal bug, but the story is consistent across
      both internal and external usage of hbal.
      
      Right now, hbal returns exit code 1 if requested to exit early,
      even if all jobs are successful. This is counter-intuitive for two
      reasons:

      - hbal did what it was asked to do (exit early), so it shouldn't
        return an error
      - there were no job failures, so there's nothing to clean up or
        investigate on the Ganeti cluster, so again it shouldn't return
        an error
      
      Therefore the new behaviour is as follows:
      
      - for cases where all jobs were successful, even if terminated early
        via SIGINT or via --limit, we exit with code 0
      - for cases where jobs have failed or there were other errors in
        running hbal, the exit code is 1
      - for cases where hbal is asked to terminate immediately (SIGTERM),
        the exit code is 2, denoting "unknown whether the Ganeti cluster
        is consistent or not"
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
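      A small sketch of how a wrapper script could act on the new exit
      codes; the hbal invocation itself (-L for the Luxi backend, -X to
      execute the moves) is only an example and not part of this change.

          # Map hbal's exit status to the meanings described above.
          import subprocess

          rc = subprocess.run(["hbal", "-L", "-X"]).returncode
          if rc == 0:
              print("all jobs succeeded (possibly after an early exit)")
          elif rc == 1:
              print("job failures or other errors; inspect the cluster")
          elif rc == 2:
              print("terminated via SIGTERM; cluster state unknown")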
    • Fix DRBD resize code · cad0723b
      Iustin Pop authored
      
      There are two bugs in the current resize code, affecting mostly DRBD.
      
      First, due to bugs in old DRBD versions (pre 8.0.14), the code
      currently calls `drbdsetup resize' on both the primary and the
      secondary. However, this is actually wrong per current DRBD (from
      drbdsetup(8)):
      
           resize
             This causes DRBD to reexamine the size of the device's backing
             storage device. To actually do online growing you need to
             extend the backing storages on both devices and call the resize
             command on one of your nodes.
      
      So calling it just on the primary node should be enough. However,
      we can't simply remove the calls to the secondary nodes, since that
      would break the growth of the underlying storage (LVM) on the
      secondary. This leads to the second existing bug: we call resize on
      each node, even before finishing the growth of the underlying
      storage. This can lead to all kinds of issues if DRBD is not well
      behaved.
      
      So to fix both these bugs, we have to extend the current RPC call
      with another parameter, which denotes whether to extend the actual
      backing storage or just the "logical" one (DRBD being the only such
      device for now; MD would be another, if implemented). This allows
      us to do the growth in two steps: first the backing store on all
      nodes, then the logical storage on just the primary node.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
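      To illustrate the ordering described above, here is a purely
      hypothetical sketch; grow_backing_storage() and
      grow_logical_device() stand in for the real RPC calls and are not
      actual Ganeti function names.

          # Hypothetical stand-ins for the real RPC calls (illustration
          # only): they just log what would happen on each node.
          def grow_backing_storage(node, disk, amount):
              print("grow backing storage of %s on %s by %d" % (disk, node, amount))

          def grow_logical_device(node, disk, amount):
              print("grow DRBD device %s on %s by %d" % (disk, node, amount))

          # Two-step growth: backing storage (LVM) on every node first,
          # then the logical (DRBD) device on the primary only, so that
          # `drbdsetup resize' sees already-enlarged backing devices.
          def grow_drbd_disk(disk, primary, secondaries, amount):
              for node in [primary] + secondaries:
                  grow_backing_storage(node, disk, amount)
              grow_logical_device(primary, disk, amount)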
  2. Jun 29, 2012
  3. Jun 28, 2012
  4. Jun 27, 2012
  5. Jun 26, 2012
  6. Jun 25, 2012
  7. Jun 19, 2012
    • Allow single-homed <-> multi-homed transitions · 79829d23
      Guido Trotter authored
      
      To change the cluster from single-homed to multi-homed or vice
      versa, one must target the master node first and pass the --force
      option. All other nodes will then work as long as they are
      reachable by the master.

      Note that this will also prevent a node from being set to
      single-homed if the master is multi-homed, which wasn't disallowed
      before, and warn if a single-homed <-> multi-homed transition
      happens.
      
      Also note that it's still theoretically possible to flip a cluster
      inadvertently by changing the master node this way, and then doing a
      master failover before fixing the other nodes.
      
      Signed-off-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
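      A rough sketch of the procedure described above, driving gnt-node
      from Python; the node names and addresses are placeholders, and the
      --secondary-ip option spelling is an assumption rather than
      something taken from this commit.

          # Switch the master node first (with --force), then the rest.
          import subprocess

          nodes = [("master.example.com", "192.0.2.10"),  # master first
                   ("node2.example.com", "192.0.2.11"),
                   ("node3.example.com", "192.0.2.12")]
          for i, (node, sip) in enumerate(nodes):
              cmd = ["gnt-node", "modify", "--secondary-ip", sip]
              if i == 0:
                  cmd.append("--force")  # needed when changing the master
              cmd.append(node)
              subprocess.check_call(cmd)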
  8. Jun 15, 2012
  9. Jun 14, 2012
  10. Jun 12, 2012
  11. Jun 11, 2012
  12. Jun 08, 2012
  13. Jun 07, 2012