1. 05 Dec, 2008 9 commits
    • Iustin Pop's avatar
      Add two utility functions to cmdlib · a5961235
      Iustin Pop authored
      These will be used for parameter checking and node status checking.
      Reviewed-by: ultrotter
    • Iustin Pop's avatar
      Add function to compute the master candidates · ec0292f1
      Iustin Pop authored
      Since some nodes can be offline, we can't just take the length of the
      node list as the maximum possible number of master candidates.
      The patch adds an utility function to correctly compute this value and
      replaces hardcoded computations with the use of this function. It then
      adds utility functions to automate the maintenance of the node lists.
      Reviewed-by: ultrotter
    • Iustin Pop's avatar
      http: use slicing instead of string modification · b18dd019
      Iustin Pop authored
      The combination of the current buffer splitting method and (4KB) buffer
      size is very inefficient when writing big amounts of data. Just walking
      over a 16 megabyte string using a 4K buffer takes (on a random computer)
      1m06s, whereas using slices will decrease this to 0.080s, and slicing
      with 32 KB size decreases this to 0.073s.
      This means that uploading a big config file (it nears 1MB for big
      clusters) will take more and more time per the number of nodes, since it
      needs lots of slicing.
      I happened upon this by accidentally setting all nodes as master
      candidates, at which point just uploading the config file to all nodes
      took 40s. Applying the patch decreases this to 15s (this probably can
      still be optimized).
      The patch also removes a duplicate constant (the one actually used is in
      http/client.py), and changes the receive buffer size to use the same
      Reviewed-by: imsnah
    • Iustin Pop's avatar
      Add the offline node list to ssconf · a3316e4a
      Iustin Pop authored
      The patch also changes the various node list generation to be more
      Reviewed-by: imsnah
    • Iustin Pop's avatar
      Cleanup the config file on demotion from candidate · 56aa9fd5
      Iustin Pop authored
      This patch adds a simple rpc which makes a backup of the config file and
      then removes it. This is done so that cluster verify doesn't complain
      immediately after demoting a node.
      Reviewed-by: imsnah
    • Iustin Pop's avatar
      watcher: handle offline nodes better · cbfc4681
      Iustin Pop authored
      This patch changes the LUQueryInstances to show a different state for
      offline nodes and also modifies the watcher to understand the offline
      state in its checks.
      Reviewed-by: ultrotter
    • Iustin Pop's avatar
      node list: add the offline field · 9ddb5e45
      Iustin Pop authored
      Reviewed-by: ultrotter
    • Iustin Pop's avatar
      Add a new node parameter 'offline' · fc0fe88c
      Iustin Pop authored
      This patch adds a new node parameter called offline that will be used to
      mark nodes which should be touched by commands.
      We also add this flag at cluster init, node add, and export it to
      iallocator scripts.
      Reviewed-by: ultrotter
    • Iustin Pop's avatar
      ssconf: empty files should not add a newline · 02b31f32
      Iustin Pop authored
      Currently we add a newline in the ssconf writeout process, even if the
      file is empty. We chage this case so that lists of values (e.g. offline
      nodes) are correct (not a list of one empty element).
      Reviewed-by: imsnah
  2. 04 Dec, 2008 10 commits
  3. 03 Dec, 2008 17 commits
  4. 02 Dec, 2008 4 commits
    • Guido Trotter's avatar
      Fix hooks_unittest with new rpc call structure · 7ccb3074
      Guido Trotter authored
      Reviewed-by: iustinp
    • Iustin Pop's avatar
      Fix gnt-cluster verify w.r.t. rpc changes · 25361b9a
      Iustin Pop authored
      This partially reorganizes the cluster verify LU:
        - introduce constants for the node verify rpc call
        - move from additional rpc calls to a single rpc call, the
          call_node_info, which gaters all data needed
      Also fix a small error (self.LogWarning instead of self.Warning).
      Reviewed-by: imsnah
    • Iustin Pop's avatar
      Fix cluster rename · 55cf7d83
      Iustin Pop authored
      With the recent configwriter/ssconf changes, cluster rename becomes
      trivial. This patch gets rids of the code and just updates the cluster
      Reviewed-by: imsnah
    • Iustin Pop's avatar
      Convert rpc results to a custom type · 781de953
      Iustin Pop authored
      For a long time we had the problem that both RPC-layer errors and
      results from the remote node share the same "valuespace". This is
      because we shouldn't raise an exception when only one node failed
      (and lose the results from the other nodes).
      This patch attempts to address this problem by returning a special
      object from RPC calls, which separates the rpc-layer status and the
      remote results into different attributes.
      All the users of rpc (mainly cmdlib, but also bootstrap and the
      HooksMaster in mcpu) have been converted to this new model. The code has
      changed from, e.g. for boolean return types:
        if not self.rpc.call_...
        result = self.rpc.call_
        if result.failed or not result.data:
           ^ rpc-layer error    |
                                - result payload
      While this is slightly more complicated, it will allow cleaner checks in
      the future; right now the code is just a plain port, without
      There's also a "result.Raise()" which raises an OpExecError if the
      rpc-layer had errors.
      One side-effect of the patch is that now all return types from the
      rpc.call_ functions are of either RpcResult (single-node) or dicts of
      (node name, RpcResult); previously, some functions were returning
      different object types based on error status.
      The code passes burnin (after many retries :).
      Reviewed-by: imsnah