1. 24 Apr, 2008 3 commits
    • Style fixes for trunk · b4de68a9
      Iustin Pop authored
      This small patch fixes:
        - wrong indentation in two places
        - use of 'os' variable that hides global scope os module
      Reviewed-by: imsnah
    • Implement replace secondary via the iallocator · b6e82a65
      Iustin Pop authored
      This patch implements secondary replace via the iallocator. The new
      opcode parameter 'iallocator' behaves like this: if passed, it will
      always compute and assign a new secondary, behaving in effect as if the
      secondary node has been passed. It conflicts with actually giving the
      secondary too.
      [Note: not tested with remote_raid1, but the code should behave the
      same, we only touch CheckPrereq and we assign a node.]
      The patch also adds burnin support for the replace secondary operation;
      with this in place, burnin can fully work with auto-assigned nodes.
      Reviewed-by: ultrotter
    • Fix generalized relocate mode of IAllocator · 29859cb7
      Iustin Pop authored
      The patch which generalized the IAllocator was half-true: it actually
      put the selection of the node inside the IAllocator, so callers were not
      able to ask for the primary node to be replaced.
      This patch does:
        - split the arguments to the constructor in three sets: mode and name
          are always passed, and then we differentiate between allocation
          parameters and relocation ones
        - add a new relocate_from option to the IAllocator constructor which
          is a list of nodes we want to move the instance off
        - rename the 'nodes' argument in the request object to 'relocate_from'
          since this is clearer and is not confused with the result field also
          called 'nodes'
      Reviewed-by: ultrotter
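
The three argument sets described above could be organised roughly as follows. This is only a sketch, not Ganeti's actual IAllocator class: the class name and the allocation key names (`mem_size`, `disks`, `required_nodes` as a constructor argument) are assumptions made for illustration; only `relocate_from` and the two mode names come from the commit messages.

```python
class IAllocatorSketch(object):
    """Illustrative only; not Ganeti's actual IAllocator class."""

    # Hypothetical key names, except 'relocate_from', which the patch names.
    _ALLOC_KEYS = frozenset(["mem_size", "disks", "required_nodes"])
    _RELOC_KEYS = frozenset(["relocate_from"])

    def __init__(self, mode, name, **kwargs):
        # mode and name are always passed; the remaining keys depend on mode.
        if mode == "allocate":
            expected = self._ALLOC_KEYS
        elif mode == "relocate":
            expected = self._RELOC_KEYS
        else:
            raise ValueError("unknown mode %r" % (mode,))
        given = frozenset(kwargs)
        if given != expected:
            raise TypeError("mode %r expects exactly %s, got %s" %
                            (mode, sorted(expected), sorted(given)))
        self.mode = mode
        self.name = name
        # The request carries the list of nodes to move the instance off.
        self.request = dict(kwargs, type=mode, name=name)
```

Keeping the two key sets separate means a relocation request cannot silently carry allocation parameters, and vice versa.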
  2. 23 Apr, 2008 4 commits
    • Add gnt-backup remove functionality · 9ac99fda
      Guido Trotter authored
      This patch also fixes the LUExportInstance Prereq docstring.
      Reviewed-by: iustinp
    • Generalize the replace_secondary mode in iallocator · 2a139bb0
      Iustin Pop authored
      Currently the replace_secondary mode is too restrictive. This patch
      changes this to a general 'relocate' mode where the node(s) to be
      changed are specified via a new key in the request dict ('nodes') so
      that we can change any of the instance's nodes.
      Note that for the relocate mode, len(nodes) == required_nodes, so the
      required nodes field is redundant, but it is provided for consistency
      with the allocate mode.
      Reviewed-by: ultrotter
    • Send required_nodes field to the iallocator scripts · 27579978
      Iustin Pop authored
      This patch adds the 'required_nodes' field in the request dict for the
      iallocator scripts.
      This means that the handmade-checks in the create instance can be
      simplified, and that the dumb allocator can be made simple. Therefore
      the patch also modifies it.
      The patch also sends the disk_space_total to the script in reallocate
      mode and includes a small fix for showing errors (stderr is included too).
      Reviewed-by: ultrotter
    • Move all iallocator functions into a class · d1c2dd75
      Iustin Pop authored
      This patch moves all the iallocator functions into a separate class that
      is then somewhat easier to use. It doesn't bring any new functionality.
      The patch also changes the way the iallocator is called - the
      OpTestAllocator opcode is no longer needed, and all its parameters
      should be passed directly to the IAllocator constructor.
      Reviewed-by: ultrotter
  3. 21 Apr, 2008 1 commit
    • Abstract the json functions into a separate module · 8d14b30d
      Iustin Pop authored
      This simple patch adds a new module that holds the simplejson functions
      for serialization/deserialization. This reduces the amount of redundant
      The patch also adds some normalizations to the json output:
        - the output text will always have an EOL as last char
        - extra spaces before EOL are removed
      Reviewed-by: ultrotter
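
The two output normalizations listed above can be sketched like this. The sketch uses the standard-library `json` module in place of simplejson, and the names `dump_json` and `load_json` are hypothetical, not taken from the patch.

```python
import json
import re

# Spaces or tabs that sit directly before an end of line.
_TRAILING_SPACE = re.compile(r"[ \t]+$", re.MULTILINE)


def dump_json(data, indent=True):
    """Serialize data, drop spaces before each EOL, and end with an EOL."""
    text = json.dumps(data, indent=2 if indent else None)
    text = _TRAILING_SPACE.sub("", text)
    if not text.endswith("\n"):
        text += "\n"
    return text


def load_json(text):
    """Deserialize text produced by dump_json (or any other JSON text)."""
    return json.loads(text)
```

Callers then go through these two wrappers instead of importing the JSON library themselves.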
  4. 16 Apr, 2008 5 commits
    • Add --readd option to “gnt-node add” · e7c6e02b
      Michael Hanselmann authored
      This allows us to readd a node after it failed and required a
      reinstallation or replacement.
      Reviewed-by: iustinp
    • IAllocator part 3: LUCreateInstance changes · 538475ca
      Iustin Pop authored
      This (final) patch allows the instance's nodes to be selected
      automatically based on the passed allocator algorithm.
      The patch changes the pnode opcode parameter from required to optional;
      now either the pnode or the iallocator must be passed.
      A possible improvement could be to organize all the _IAllocator
      functions into a separate class, but that can come later and the current
      version is functionally ok.
      Reviewed-by: ultrotter
    • Reorder checks in instance create · 901a65c1
      Iustin Pop authored
      This patch reorders the checks in the instance create prereq so that all
      checks and normalisations that are not node-dependent are done before
      the node dependent checks.
      This is done so that, after the instance-related opcode parameters are
      checked and fixed, we can run the allocator and compute the primary (and
      any secondary) nodes, and only then proceed with node-related checks.
      Reviewed-by: ultrotter
    • Implement 'out' direction on allocator tests · 298fe380
      Iustin Pop authored
      This patch adds the paths for searching for instance allocators and
      makes the LUTestAllocator code run the allocator and return the results
      if the direction specified is 'out'. 'out' means that the opcode will
      return the result of the allocator run, instead of the allocator input
      file ('in').
      The patch unifies all names to refer to 'iallocator' instead of plain
      'allocator'.
      The patch also adds an example allocator that can be used for testing
      this new functionality.
      Reviewed-by: ultrotter
    • Allocator framework, 1st part: allocator input generation · d61df03e
      Iustin Pop authored
      In preparation for the introduction of automatic instance allocator,
      this patch adds an allocator simulation opcode, that based on the input
      parameters, will return either the input message to the allocator
      (implemented) or the result of the allocator run (not yet implemented).
      This allows algorithm tests against simulated allocations and the
      current cluster state.
      The patch adds the following:
        - a function that generates the generic cluster information for the
        - a function that generates the 'new instance' information
        - a function that generates the 'replace_secondary' information
      These three functions will be used by the allocator framework later to
      generate the actual information for the external algorithms. Currently
      we just return the json-serialized text.
      Reviewed-by: imsnah
  5. 15 Apr, 2008 3 commits
  6. 10 Apr, 2008 12 commits
    • Verify: make skipping checks possible · e54c4c5e
      Guido Trotter authored
      Add a general way to skip some checks at cluster-verify time and make the N+1
      memory redundancy check optional.
      Reviewed-by: iustinp
    • Verify: add N+1 Memory redundancy verification · 2b3b6ddd
      Guido Trotter authored
      For every node we check that it can host all the instances it is currently
      secondary for that belong to the same primary. This ensures that if a node fails
      all its instances can fit on their secondary node. The code only works when
      failover is forced to go to the secondary node, and cannot go to an arbitrary
      node in the cluster, which is the case in Ganeti 1.2.
      Reviewed-by: iustinp
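
As a rough illustration of the check described above: for each node, sum the memory of the instances it is secondary for, one primary node at a time, and compare the sum with the node's free memory. The function name, the input shapes and the `mfree` / `sinst-by-pnode` key names are assumptions made for this sketch.

```python
def find_n_plus_one_failures(node_info, instance_mem):
    """Return (node, failed_primary) pairs that would not fit in memory.

    node_info: {node: {"mfree": free memory,
                       "sinst-by-pnode": {pnode: [instance, ...]}}}
    instance_mem: {instance: memory the instance needs}
    """
    failures = []
    for node, info in sorted(node_info.items()):
        for pnode, instances in sorted(info["sinst-by-pnode"].items()):
            # If pnode fails, all of these instances land on `node` at once.
            needed = sum(instance_mem[name] for name in instances)
            if needed > info["mfree"]:
                failures.append((node, pnode))
    return failures
```

Each primary is considered on its own because the check models a single node failure, not several at once.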
    • Verify: save instance config · 26b6af5e
      Guido Trotter authored
      Save the instance config after we queried it in an instance_cfg dict.  This can
      be used later by any function that wants it, without reloading it from the
      configuration module. It will be used for N+1 memory resilience checking.
      Reviewed-by: iustinp
    • Verify: add more instance information to node_info · 36e7da50
      Guido Trotter authored
      The sinst-by-pnode field contains all secondary instances of a node, grouped by
      their primary node. This information lets us see quickly whether, when a node
      dies, some of its instances cannot be started on their secondary node.
      Reviewed-by: iustinp
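
The grouping described above might be built along these lines. The function name and the input shape (a mapping from instance name to its primary node and list of secondary nodes) are assumptions for the sketch, not the patch's actual data structures.

```python
from collections import defaultdict


def group_secondaries_by_primary(instances):
    """Map each secondary node to {primary_node: [instance, ...]}.

    instances: {name: (primary_node, [secondary_node, ...])}
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for name, (pnode, snodes) in sorted(instances.items()):
        for snode in snodes:
            grouped[snode][pnode].append(name)
    # Convert to plain dicts so callers get ordinary, comparable objects.
    return dict((snode, dict(by_pnode)) for snode, by_pnode in grouped.items())
```

Instances without a secondary node contribute nothing to the result.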
    • Verify: add instance information to node_info · 93e4c50b
      Guido Trotter authored
      With this patch node_info is changed to store information about which primary
      and secondary instances are configured on a node. This information is useful to
      check memory and disk allocation. A list of non-redundant instances is also
      collected at this stage.
      Reviewed-by: iustinp
    • Verify: Add and populate node_info dict · 9c9c7d30
      Guido Trotter authored
      During information gathering we collect information from call_node_info, and
      then when we cycle through the nodes add it into a node_info dict containing a
      node's free memory and disk. This will be useful later to verify that the
      cluster is N+1 redundant. The disk space is saved as well because it can be
      useful for checks about disk space redundancy.
      Reviewed-by: iustinp
    • Rework the results of OpDiagnoseOS opcode · 1f9430d6
      Iustin Pop authored
      Currently, the opcode DiagnoseOS is the only opcode that returns a
      structure of objects.OS (which is a custom class, and not a simple
      python object) and furthermore all the processing of OS validity across
      nodes is left to the clients of this opcode.
      It would be more logical to have this opcode be similar to list
      instances/nodes, in the sense that:
        - it should return a table of results
        - the fields in the table should be selectable
      This patch does the above. The possible fields are:
        - name (os name)
        - valid (bool representing validity across all nodes)
        - node_status, which is a complicated structure required for ‘gnt-os
          diagnose’
      With this patch, gnt-os list becomes a very simple iteration over the
      list of results, filtering out non-valid ones. gnt-os diagnose is still
      complicated, but no more than before.
      The burnin tool has also been modified to work with the modified
      results, and is simpler because of this (it only needs to know if an OS
      is valid or not, not the per-node details).
      Reviewed-by: imsnah
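
A table of results with selectable fields, as described above, could look roughly like this. The function name and the input shape are hypothetical, and `node_status` is reduced here to a plain per-node boolean, which is much simpler than the structure the commit message refers to.

```python
def diagnose_os(os_by_node, fields):
    """Return one row per OS, with columns in the order given by `fields`.

    os_by_node: {node: {os_name: is_valid_on_that_node}}
    fields: a list drawn from "name", "valid", "node_status"
    """
    names = sorted(set(name for oses in os_by_node.values() for name in oses))
    rows = []
    for name in names:
        # An OS that is missing on a node counts as not valid there.
        node_status = dict((node, oses.get(name, False))
                           for node, oses in os_by_node.items())
        values = {
            "name": name,
            "valid": all(node_status.values()),
            "node_status": node_status,
        }
        rows.append([values[field] for field in fields])
    return rows
```

With this shape, a listing command only needs `diagnose_os(data, ["name", "valid"])` and a filter on the second column.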
    • Verify: remove useless check in _VerifyInstance · ceb76b36
      Guido Trotter authored
      The list of instances passed to _VerifyInstance is the one coming from
      self.cfg.GetInstanceList(). So there's no point, inside that function, in
      checking whether the current instance is a member of that list. Moreover
      orphaned instance verification is already done in a separate step.
      Reviewed-by: imsnah
    • Verify: instance verification cleanup · c5705f58
      Guido Trotter authored
      The instance configuration is grabbed both in the _VerifyInstance function and
      in the loop that calls it. Clean this up by passing the configuration as a
      Reviewed-by: imsnah
    • Verify: fix crash when a node is down · a872dae6
      Guido Trotter authored
      Currently if ganeti-noded doesn't respond on a node gnt-cluster verify will die
      when verifying primary instances for that node. Fix this by just emitting an
      error message if no information about running instances is returned from the
      Reviewed-by: iustinp
    • Verify: fix ERROR message indentation · c840ae6f
      Guido Trotter authored
      All ERROR messages in cluster verify are indented by four spaces; this one is
      indented by two. Fixing this skew.
      Reviewed-by: imsnah, iustinp
    • Small code style fix · 16687b98
      Manuel Franceschini authored
      Reviewed-by: imsnah
  7. 09 Apr, 2008 1 commit
  8. 08 Apr, 2008 3 commits
    • Two small code style fixes · 1c6e3627
      Manuel Franceschini authored
      Reviewed-by: imsnah
    • Modify LURenameInstance to support file backend · b23c4333
      Manuel Franceschini authored
      This patch does two things:
      - Modify LURenameInstance.Exec to rename directory
        when a file-based instance is renamed
      - Modify config.RenameInstance() to replace the directory name in
        config.data for file devices
      Reviewed-by: iustinp
    • Modify LUCreateInstance to support file backend · 0f1a06e3
      Manuel Franceschini authored
      - Modify _GenerateDiskTemplate to support file-based disk template
      - Modify _CreateDisks to create directory needed for file-based
        instances before creating the actual files
      - Modify _RemoveDisks to delete directory for file-based instances
        after deleting their VBDs
      - Add Prereq-check to check if given file-driver is valid
      - Add Prereq-check to check if given file-storage-dir path is relative
      Reviewed-by: iustinp
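
The two prerequisite checks at the end of the list above can be sketched as below. The function name, the exception type and the set of accepted driver names are assumptions for illustration; the commit message does not list the valid drivers.

```python
import posixpath

# Assumed set of valid file drivers; the commit message does not name them.
VALID_FILE_DRIVERS = frozenset(["loop", "blktap"])


def check_file_storage_params(file_driver, file_storage_dir):
    """Raise ValueError if either file-backend parameter is unacceptable."""
    if file_driver not in VALID_FILE_DRIVERS:
        raise ValueError("invalid file driver: %r" % (file_driver,))
    # The directory must be relative; an absolute path is rejected.
    if file_storage_dir and posixpath.isabs(file_storage_dir):
        raise ValueError("file storage dir must be a relative path: %r" %
                         (file_storage_dir,))
```

Running both checks before any disk is created keeps a bad parameter from leaving a half-created directory behind.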
  9. 02 Apr, 2008 5 commits
    • Improve disk consistency error message again · aa9d0c32
      Guido Trotter authored
      This new version includes all the possible failure options.
      Reviewed-by: iustinp
    • Fix misleading error message when checking disks · ad6d3f7d
      Guido Trotter authored
      _CheckDiskConsistency outputs "Can't get any data from node NODE" when no drbd
      is found on the target node. This causes a misleading error message to be
      output for example on failover (when the primary node is down, or the instance
      is not running), stating that no data could be got from the secondary node,
      which scares the user and misleads him. Changing this to "Disk degraded or not
      found on node %s" still reports that something is missing, but on the other
      hand doesn't make the user think the node is down, or has no data at all...
      Reviewed-by: imsnah
    • Handle better failing over non-running instances · a0aaa0d0
      Guido Trotter authored
      Right now if you try to failover an instance which is not marked as up the
      operation will fail unless you pass the --ignore-consistency flag because the
      disks won't be considered to be consistent. Allow them to be if we know the
      instance shouldn't be up.
      Reviewed-by: imsnah
    • Improve export and fix export-on-norun bug · fb300fb7
      Guido Trotter authored
      Currently gnt-backup export chains the ShutdownInstance and StartupInstance
      opcodes to itself. This works but (a) it's suboptimal, because there's no need
      to deactivate the instance's disks as we are about to restart it anyway, and
      (b) doesn't take care of instances which are already down (and should be). This
      patch takes care of this by just calling the shutdown rpc function instead of
      the whole opcode, and just starting up the instance if it's configured as up in
      the first place.
      Reviewed-by: imsnah
    • failover: only start instance if we should · 12a0cfbe
      Guido Trotter authored
      gnt-instance failover on an instance marked as down will mistakenly bring it
      up. The watcher will then shut it down again, but it's a lot better (and safer)
      not to start it at all.
      Reviewed-by: imsnah
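
Taken together with the consistency change in a0aaa0d0, the failover behaviour after these patches can be summarised by a small decision function. This is a sketch only: the function name, the step descriptions and the exception type are made up for illustration.

```python
def plan_failover(marked_up, disks_consistent, ignore_consistency=False):
    """Return the ordered failover steps, or raise if it must be refused."""
    # Degraded disks only block the failover of an instance that should be up.
    if marked_up and not disks_consistent and not ignore_consistency:
        raise RuntimeError("disks are degraded; use ignore_consistency to force")
    steps = ["shut down on the old primary", "switch primary and secondary"]
    # An instance marked as down is moved but never started.
    if marked_up:
        steps.append("start on the new primary")
    return steps
```

For an instance marked as down, the function returns only the first two steps regardless of the disk state.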
  10. 31 Mar, 2008 3 commits
    • Change the 'gnt-cluster command' execution order · 5f83e263
      Iustin Pop authored
      This patch makes the command execute last on the master (if the master
      is selected). The order for the other nodes is unchanged.
      The patch also updates the man page with some explanations and an
      Reviewed-by: imsnah
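
The ordering rule described above amounts to moving the master to the end of the node list while leaving the other nodes in place. A minimal sketch, with a hypothetical function name:

```python
def order_nodes_for_command(nodes, master):
    """Return nodes in their original order, with the master (if any) last."""
    others = [node for node in nodes if node != master]
    if master in nodes:
        others.append(master)
    return others
```

If the master is not among the selected nodes, the list is returned unchanged.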
    • parms->params Refactoring · 7767bbf5
      Manuel Franceschini authored
      - Substitute all occurrences of name 'parms' with 'params'
      - Small codestyle fix
      Reviewed-by: ultrotter
    • Skip HasValidVG when --no-lvm-storage on cluster init · efa14262
      Manuel Franceschini authored
      This patch does two things:
      - Remove "vg_name" from _OP_REQP due to the introduction of
        --no-lvm-storage. This is needed because the vg_name option now
        defaults to None and is only set to DEFAULT_VG if lvm_storage is
        enabled.
      - Change LUInitCluster.CheckPrereq() to skip the _HasValidVG check
        when initializing the cluster with --no-lvm-storage. The help
        message now also mentions that --no-lvm-storage can be used if no
        'xenvg' volume group is found.
      Reviewed-by: iustinp