1. 08 Dec, 2008 2 commits
    • gnt-node modify: add the offline attribute · 3a5ba66a
      Iustin Pop authored
      This patch changes gnt-node modify and the associated opcode/lu to allow
      modification of the node offline attribute.
      
      Setting a node into offline mode automatically demotes it from the
      master candidate role.
      
      Reviewed-by: ultrotter
      3a5ba66a
    • RPC: do not make calls to offline nodes · ed83f5cc
      Iustin Pop authored
      This patch changes the _MultiNodeCall and _SingleNodeCall helpers to not
      actually make calls to offline nodes, but instead generate fake
      responses which have a parameter called 'offline' set so that callers
      can check for this value if they want (otherwise, it's just a failed RPC
      call).
      
      Reviewed-by: ultrotter
      ed83f5cc
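The idea of short-circuiting calls to offline nodes can be sketched as below. This is a minimal illustration under assumptions, not the code from the patch: the `FakeableResult` class, the `multi_node_call` function, and the `do_call` callback are all hypothetical names; only the `offline` flag and the "treated as a failed call" behaviour come from the commit message.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Set


@dataclass
class FakeableResult:
    """Hypothetical stand-in for an RPC result (not Ganeti's real class)."""
    node: str
    failed: bool = False
    offline: bool = False
    data: Any = None


def multi_node_call(nodes: Iterable[str],
                    offline_nodes: Set[str],
                    do_call: Callable[[str], Any]) -> Dict[str, FakeableResult]:
    """Call do_call(node) for online nodes; fake a failed result for offline ones."""
    results = {}
    for node in nodes:
        if node in offline_nodes:
            # No network traffic at all: the result is marked as failed,
            # with the extra 'offline' flag for callers that care.
            results[node] = FakeableResult(node=node, failed=True, offline=True)
        else:
            results[node] = FakeableResult(node=node, data=do_call(node))
    return results
```

A caller that does not look at `offline` simply sees a failed call, while a caller such as a verify step could use the flag to suppress its warning.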
  2. 05 Dec, 2008 11 commits
    • Make cluster verify understand offline nodes · 0a66c968
      Iustin Pop authored
      This patch changes cluster verify to not alert on offline nodes, but
      instead just show a note at the end with the number of such nodes.
      
      It also removes warnings in verify-disks and hooks about failures to
      make rpc calls to such nodes.
      
      Reviewed-by: ultrotter
      0a66c968
    • cmdlib: check node stats in prereqs · 7527a8a4
      Iustin Pop authored
      This patch adds checks for offline nodes in most instance LUs so that we
      can work with offline secondaries, but not with offline primaries. Some
      cases (like grow disk, which needs both sides up) do not allow
      offline nodes at all.
      
      Reviewed-by: ultrotter
      7527a8a4
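The prerequisite rule described above (offline secondaries are tolerated, offline primaries are not, and some operations tolerate neither) might look roughly like this. It is a sketch under assumptions: `PrereqError`, `check_instance_nodes`, and the `allow_offline_secondaries` parameter are hypothetical names chosen for the example, not the helpers added to cmdlib.

```python
class PrereqError(Exception):
    """Hypothetical error raised when an operation's prerequisites fail."""


def check_instance_nodes(primary, secondaries, offline_nodes,
                         allow_offline_secondaries=True):
    """Reject the operation if it would need to talk to an offline node."""
    if primary in offline_nodes:
        # The primary node is always needed.
        raise PrereqError("primary node %s is offline" % primary)
    if not allow_offline_secondaries:
        # Operations such as growing a disk need both sides up.
        bad = [node for node in secondaries if node in offline_nodes]
        if bad:
            raise PrereqError("secondary node(s) offline: %s" % ", ".join(bad))
```

An operation that needs every node would pass `allow_offline_secondaries=False`; most others would keep the default.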
    • Add two utility functions to cmdlib · a5961235
      Iustin Pop authored
      These will be used for parameter checking and node status checking.
      
      Reviewed-by: ultrotter
      a5961235
    • Add function to compute the master candidates · ec0292f1
      Iustin Pop authored
      Since some nodes can be offline, we can't just take the length of the
      node list as the maximum possible number of master candidates.
      
      The patch adds a utility function to correctly compute this value and
      replaces hardcoded computations with the use of this function. It then
      adds utility functions to automate the maintenance of the node lists.
      
      Reviewed-by: ultrotter
      ec0292f1
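Why the node count alone is not enough can be shown with a small sketch. The data layout (a dict of node name to a dict with an `offline` key) and both function names are assumptions made for this example; the `desired` pool size parameter is likewise hypothetical and not taken from the commit message.

```python
def max_possible_candidates(nodes):
    """Count the nodes that could be master candidates, i.e. the online ones."""
    return sum(1 for info in nodes.values() if not info.get("offline", False))


def effective_candidate_count(nodes, desired):
    """Clamp a desired candidate count (hypothetical setting) to what is possible."""
    return min(desired, max_possible_candidates(nodes))
```

With three nodes of which one is offline, `len(nodes)` would report 3 while only 2 nodes can actually hold the role.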
    • http: use slicing instead of string modification · b18dd019
      Iustin Pop authored
      The combination of the current buffer splitting method and the 4 KB buffer
      size is very inefficient when writing large amounts of data. Just walking
      over a 16 MB string using a 4 KB buffer takes (on a random computer)
      1m06s, whereas using slices decreases this to 0.080s, and slicing
      with a 32 KB buffer decreases it to 0.073s.
      
      This means that uploading a big config file (it nears 1 MB for big
      clusters) takes longer and longer as the number of nodes grows, since
      every upload needs a lot of buffer splitting.
      
      I happened upon this by accidentally setting all nodes as master
      candidates, at which point just uploading the config file to all nodes
      took 40s. Applying the patch decreases this to 15s (this probably can
      still be optimized).
      
      The patch also removes a duplicate constant (the one actually used is in
      http/client.py), and changes the receive buffer size to use the same
      constant.
      
      Reviewed-by: imsnah
      b18dd019
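The two ways of walking over a buffer can be contrasted with a short sketch. This is an illustration only, not the code changed by the patch; both function names are invented for the example. The first variant rebuilds the remaining buffer after every chunk, which copies the whole tail each time, while the second keeps the data untouched and only advances an offset.

```python
def chunks_by_modification(data, size):
    """Yield chunks by repeatedly cutting the front off the buffer.

    Each iteration copies the entire remaining tail, so the total work
    grows quadratically with the length of the data.
    """
    buf = data
    while buf:
        chunk = buf[:size]
        buf = buf[size:]  # copies everything that is left
        yield chunk


def chunks_by_slicing(data, size):
    """Yield chunks by slicing at a moving offset; the data is never rebuilt."""
    pos = 0
    while pos < len(data):
        yield data[pos:pos + size]  # copies only one chunk
        pos += size
```

Both generators produce the same chunks; only the amount of copying differs, which is why the gap widens as the payload grows.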
    • Add the offline node list to ssconf · a3316e4a
      Iustin Pop authored
      The patch also makes the generation of the various node lists more
      consistent.
      
      Reviewed-by: imsnah
      a3316e4a
    • Cleanup the config file on demotion from candidate · 56aa9fd5
      Iustin Pop authored
      This patch adds a simple rpc which makes a backup of the config file and
      then removes it. This is done so that cluster verify doesn't complain
      immediately after demoting a node.
      
      Reviewed-by: imsnah
      56aa9fd5
    • watcher: handle offline nodes better · cbfc4681
      Iustin Pop authored
      This patch changes the LUQueryInstances to show a different state for
      offline nodes and also modifies the watcher to understand the offline
      state in its checks.
      
      Reviewed-by: ultrotter
      cbfc4681
    • node list: add the offline field · 9ddb5e45
      Iustin Pop authored
      Reviewed-by: ultrotter
      9ddb5e45
    • Add a new node parameter 'offline' · fc0fe88c
      Iustin Pop authored
      This patch adds a new node parameter called offline that will be used to
      mark nodes which should not be touched by commands.
      
      We also add this flag at cluster init, node add, and export it to
      iallocator scripts.
      
      Reviewed-by: ultrotter
      fc0fe88c
    • ssconf: empty files should not add a newline · 02b31f32
      Iustin Pop authored
      Currently we add a newline in the ssconf writeout process, even if the
      file is empty. We change this case so that lists of values (e.g. offline
      nodes) are correct (not a list of one empty element).
      
      Reviewed-by: imsnah
      02b31f32
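The "list of one empty element" problem is easy to reproduce. The sketch below assumes a newline-separated file format and uses hypothetical helper names (`serialize_list`, `parse_list`); it is not the ssconf code itself.

```python
def serialize_list(values):
    """Join values one per line; an empty list produces an empty string."""
    text = "\n".join(values)
    if text:
        # Only non-empty content gets the trailing newline.
        text += "\n"
    return text


def parse_list(text):
    """Split file content back into a list of values."""
    return text.splitlines()
```

Writing a bare newline for an empty list would make `parse_list("\n")` return one empty string, whereas `parse_list("")` returns an empty list as intended.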
  3. 04 Dec, 2008 8 commits
  4. 03 Dec, 2008 16 commits
  5. 02 Dec, 2008 3 commits
    • Fix gnt-cluster verify w.r.t. rpc changes · 25361b9a
      Iustin Pop authored
      This partially reorganizes the cluster verify LU:
        - introduce constants for the node verify rpc call
        - move from additional rpc calls to a single rpc call, the
          call_node_verify, which gathers all data needed
      
      Also fix a small error (self.LogWarning instead of self.Warning).
      
      Reviewed-by: imsnah
      25361b9a
    • Fix cluster rename · 55cf7d83
      Iustin Pop authored
      With the recent configwriter/ssconf changes, cluster rename becomes
      trivial. This patch gets rid of the code and just updates the cluster
      object.
      
      Reviewed-by: imsnah
      55cf7d83
    • Convert rpc results to a custom type · 781de953
      Iustin Pop authored
      For a long time we had the problem that both RPC-layer errors and
      results from the remote node share the same "valuespace". This is
      because we shouldn't raise an exception when only one node failed
      (and lose the results from the other nodes).
      
      This patch attempts to address this problem by returning a special
      object from RPC calls, which separates the rpc-layer status and the
      remote results into different attributes.
      
      All the users of rpc (mainly cmdlib, but also bootstrap and the
      HooksMaster in mcpu) have been converted to this new model. The code has
      changed from, e.g. for boolean return types:
      
        if not self.rpc.call_...
      
      to
      
        result = self.rpc.call_...
        if result.failed or not result.data:
           ^ rpc-layer error    |
                                - result payload
      
      While this is slightly more complicated, it will allow cleaner checks in
      the future; right now the code is just a plain port, without
      optimizations.
      
      There's also a "result.Raise()" which raises an OpExecError if the
      rpc-layer had errors.
      
      One side-effect of the patch is that all return types from the
      rpc.call_ functions are now either RpcResult (single-node) or dicts of
      (node name, RpcResult); previously, some functions returned
      different object types based on error status.
      
      The code passes burnin (after many retries :).
      
      Reviewed-by: imsnah
      781de953
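The separation of rpc-layer status from the remote payload can be sketched as follows. The attribute names `failed` and `data`, the `Raise()` method, and the `OpExecError` exception are taken from the commit message; the constructor signature, the `node` attribute, and the `payload_ok` helper are assumptions made for this example, and the class is a stand-in rather than the real implementation.

```python
class OpExecError(Exception):
    """Stand-in for the exception named in the commit message."""


class RpcResult:
    """Sketch of a result object keeping rpc status and payload apart."""

    def __init__(self, data=None, failed=False, node=None):
        self.data = data      # whatever the remote node returned
        self.failed = failed  # True if the rpc layer itself had an error
        self.node = node      # hypothetical: which node was called

    def Raise(self):
        """Raise OpExecError if the rpc layer reported an error."""
        if self.failed:
            raise OpExecError("RPC call to node %s failed" % self.node)


def payload_ok(result):
    """The ported boolean check: no rpc-layer error and a true payload."""
    return not (result.failed or not result.data)
```

With this shape, a multi-node call can return a dict of node name to `RpcResult`, so one failed node no longer hides the answers from the others.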