1. 10 Apr, 2008 9 commits
    • Verify: add more instance information to node_info · 36e7da50
      Guido Trotter authored
      The sinst-by-pnode field contains all secondary instances of a node, grouped by
      their primary node. This information lets us quickly see whether, when a node
      dies, some of its instances cannot be started on their secondary node.
      Reviewed-by: iustinp
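      As an illustration of how grouping secondary instances by primary node
      helps, here is a minimal sketch, not the actual Ganeti code. The function
      names, the dictionary keys, and the memory figures are all assumptions
      made for this example.

```python
from collections import defaultdict


def build_node_info(instances):
    """Build per-node instance lists from (name, primary, secondaries) tuples.

    The key names used here are assumed for illustration only.
    """
    node_info = defaultdict(
        lambda: {"pinst": [], "sinst": [], "sinst-by-pnode": defaultdict(list)}
    )
    for name, pnode, snodes in instances:
        node_info[pnode]["pinst"].append(name)
        for snode in snodes:
            node_info[snode]["sinst"].append(name)
            # Group the secondary instance under the node it normally runs on.
            node_info[snode]["sinst-by-pnode"][pnode].append(name)
    return node_info


def failover_shortfalls(node_info, free_memory, instance_memory):
    """Return (secondary, failed_primary, needed_memory) for every case where
    a secondary node could not host all instances of one failed primary."""
    problems = []
    for snode, info in node_info.items():
        for pnode, names in info["sinst-by-pnode"].items():
            needed = sum(instance_memory[n] for n in names)
            if needed > free_memory[snode]:
                problems.append((snode, pnode, needed))
    return sorted(problems)
```

      With this grouping, checking what happens when one node dies only needs
      a sum over a single list per (secondary, primary) pair.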
    • Verify: add instance information to node_info · 93e4c50b
      Guido Trotter authored
      With this patch node_info is changed to store information about which primary
      and secondary instances are configured on a node. This information is useful to
      check memory and disk allocation. A list of non-redundant instances is also
      collected at this stage.
      Reviewed-by: iustinp
    • Verify: Add and populate node_info dict · 9c9c7d30
      Guido Trotter authored
      During information gathering we collect information from call_node_info, and
      then when we cycle through the nodes add it into a node_info dict containing a
      node's free memory and disk. This will be useful later to verify that the
      cluster is N+1 redundant. The disk space is saved as well because it can be
      useful for checks about disk space redundancy.
      Reviewed-by: iustinp
    • Rework the results of OpDiagnoseOS opcode · 1f9430d6
      Iustin Pop authored
      Currently, the opcode DiagnoseOS is the only opcode that returns a
      structure of objects.OS (which is a custom class, and not a simple
      python object) and furthermore all the processing of OS validity across
      nodes is left to the clients of this opcode.
      It would be more logical to have this opcode be similar to list
      instances/nodes, in the sense that:
        - it should return a table of results
        - the fields in the table should be selectable
      This patch does the above. The possible fields are:
        - name (os name)
        - valid (bool representing validity across all nodes)
        - node_status, which is a complicated structure required for ‘gnt-os
      With this patch, gnt-os list becomes a very simple iteration over the
      list of results, filtering out non-valid ones. gnt-os diagnose is still
      complicated, but no more than before.
      The burnin tool has also been modified to work with the modified
      results, and is simpler because of this (it only needs to know if an OS
      is valid or not, not the per-node details).
      Reviewed-by: imsnah
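      The table-of-results idea can be sketched as follows. This is a
      simplified illustration, not the opcode's real implementation: the
      function name and the input shape are assumptions, and node_status is
      reduced here to a plain per-node boolean map, whereas the real structure
      is more complicated.

```python
def diagnose_os(node_results, fields):
    """Return one row per OS, containing only the selected fields.

    node_results maps node name -> {os_name: is_valid} (assumed input shape).
    """
    os_names = sorted({name for per_node in node_results.values() for name in per_node})
    rows = []
    for name in os_names:
        # An OS missing from a node counts as invalid on that node.
        node_status = {
            node: per_node.get(name, False) for node, per_node in node_results.items()
        }
        values = {
            "name": name,
            "valid": all(node_status.values()),
            "node_status": node_status,
        }
        rows.append([values[field] for field in fields])
    return rows


def list_valid_os(node_results):
    """A 'list'-style client: iterate over the rows and drop invalid entries."""
    return [name for name, valid in diagnose_os(node_results, ["name", "valid"]) if valid]
```

      A client that only cares about validity asks for the name and valid
      fields and never touches the per-node details.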
    • Verify: remove useless check in _VerifyInstance · ceb76b36
      Guido Trotter authored
      The list of instances passed to _VerifyInstance is the one coming from
      self.cfg.GetInstanceList(). So there's no point, inside that function, in
      checking whether the current instance is a member of that list. Moreover
      orphaned instance verification is already done in a separate step.
      Reviewed-by: imsnah
    • Verify: instance verification cleanup · c5705f58
      Guido Trotter authored
      The instance configuration is grabbed both in the _VerifyInstance function and
      in the loop that calls it. Clean this up by passing the configuration as a
      Reviewed-by: imsnah
    • Verify: fix crash when a node is down · a872dae6
      Guido Trotter authored
      Currently, if ganeti-noded doesn't respond on a node, gnt-cluster verify will die
      when verifying primary instances for that node. Fix this by just emitting an
      error message if no information about running instances is returned from the
      Reviewed-by: iustinp
    • Verify: fix ERROR message indentation · c840ae6f
      Guido Trotter authored
      All ERROR messages in cluster verify are indented by four spaces, but this one
      is indented by two. Fix this skew.
      Reviewed-by: imsnah, iustinp
    • Small code style fix · 16687b98
      Manuel Franceschini authored
      Reviewed-by: imsnah
  2. 09 Apr, 2008 1 commit
  3. 08 Apr, 2008 3 commits
    • Two small code style fixes · 1c6e3627
      Manuel Franceschini authored
      Reviewed-by: imsnah
    • Modify LURenameInstance to support file backend · b23c4333
      Manuel Franceschini authored
      This patch does two things:
      - Modify LURenameInstance.Exec to rename directory
        when a file-based instance is renamed
      - Modify config.RenameInstance() to replace the directory name in
        config.data for file devices
      Reviewed-by: iustinp
    • Modify LUCreateInstance to support file backend · 0f1a06e3
      Manuel Franceschini authored
      - Modify _GenerateDiskTemplate to support file-based disk template
      - Modify _CreateDisks to create directory needed for file-based
        instances before creating the actual files
      - Modify _RemoveDisks to delete directory for file-based instances
        after deleting their VBDs
      - Add Prereq-check to check if given file-driver is valid
      - Add Prereq-check to check if given file-storage-dir path is relative
      Reviewed-by: iustinp
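      The two prerequisite checks listed above could look roughly like this.
      It is a hedged sketch: the function name, the exception type, and the
      set of accepted driver names are assumptions, not taken from the patch.

```python
import os.path

# Assumed set of accepted file drivers, for illustration only.
VALID_FILE_DRIVERS = frozenset(["loop", "blktap"])


def check_file_storage_params(file_driver, file_storage_dir):
    """Reject an unknown file driver or a non-relative storage directory."""
    if file_driver not in VALID_FILE_DRIVERS:
        raise ValueError("Invalid file driver name '%s'" % file_driver)
    # The directory is optional; when given, it must be a relative path.
    if file_storage_dir is not None and os.path.isabs(file_storage_dir):
        raise ValueError("File storage directory is not a relative path")
```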
  4. 02 Apr, 2008 5 commits
    • Improve disk consistency error message again · aa9d0c32
      Guido Trotter authored
      This new version includes all the possible failure options.
      Reviewed-by: iustinp
    • Fix misleading error message when checking disks · ad6d3f7d
      Guido Trotter authored
      _CheckDiskConsistency outputs "Can't get any data from node NODE" when no drbd
      is found on the target node. This causes a misleading error message to be
      output for example on failover (when the primary node is down, or the instance
      is not running), stating that no data could be got from the secondary node,
      which scares the user and misleads him. Changing this to "Disk degraded or not
      found on node %s" still reports that something is missing, but on the other
      hand doesn't make the user think the node is down, or has no data at all...
      Reviewed-by: imsnah
    • Handle better failing over non-running instances · a0aaa0d0
      Guido Trotter authored
      Right now if you try to failover an instance which is not marked as up the
      operation will fail unless you pass the --ignore-consistency flag because the
      disks won't be considered to be consistent. Allow them to be if we know the
      instance shouldn't be up.
      Reviewed-by: imsnah
    • Improve export and fix export-on-norun bug · fb300fb7
      Guido Trotter authored
      Currently gnt-backup export chains the ShutdownInstance and StartupInstance
      opcodes to itself. This works but (a) it's suboptimal, because there's no need
      to deactivate the instance's disks as we are about to restart it anyway, and
      (b) doesn't take care of instances which are already down (and should be). This
      patch takes care of this by just calling the shutdown rpc function instead of
      the whole opcode, and just starting up the instance if it's configured as up in
      the first place.
      Reviewed-by: imsnah
    • failover: only start instance if we should · 12a0cfbe
      Guido Trotter authored
      gnt-instance failover on an instance marked as down will mistakenly bring it
      up. The watcher will then shut it down again, but it's a lot better (and safer)
      not to start it at all.
      Reviewed-by: imsnah
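      The behaviour described above amounts to making the final start
      conditional on the configured state. A minimal sketch, assuming a
      dictionary-based instance record and caller-supplied stop/start
      callbacks (all hypothetical names):

```python
def failover(instance, stop_on, start_on):
    """Swap primary and secondary; start the instance only if marked 'up'.

    instance is a dict with 'name', 'primary', 'secondary' and 'status' keys
    (an assumed shape for this illustration).
    """
    source, target = instance["primary"], instance["secondary"]
    stop_on(source, instance["name"])
    instance["primary"], instance["secondary"] = target, source
    # An instance configured as down stays down on the new primary.
    if instance["status"] == "up":
        start_on(target, instance["name"])
    return instance
```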
  5. 31 Mar, 2008 5 commits
    • Change the 'gnt-cluster command' execution order · 5f83e263
      Iustin Pop authored
      This patch makes the command execute last on the master (if the master
      is selected). The order for the other nodes is unchanged.
      The patch also updates the man page with some explanations and an
      Reviewed-by: imsnah
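      The ordering rule is simple enough to sketch in a few lines. The helper
      name below is hypothetical and the snippet only illustrates the rule,
      not the patch itself.

```python
def execution_order(nodes, master):
    """Keep the given order for non-master nodes and run the master last,
    if the master is among the selected nodes."""
    ordered = [node for node in nodes if node != master]
    if master in nodes:
        ordered.append(master)
    return ordered
```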
    • parms->params Refactoring · 7767bbf5
      Manuel Franceschini authored
      - Substitute all occurrences of name 'parms' with 'params'
      - Small codestyle fix
      Reviewed-by: ultrotter
    • Skip HasValidVG when --no-lvm-storage on cluster init · efa14262
      Manuel Franceschini authored
      This patch does two things:
      - Remove "vg_name" from _OP_REQP due to the introduction of
        --no-lvm-storage. This is needed because the vg_name option now
        defaults to None and is only set to DEFAULT_VG if lvm_storage is
        enabled.
      - Change LUInitCluster.CheckPrereq() to skip the _HasValidVG check when
        initializing the cluster with --no-lvm-storage. Furthermore, the help
        message now mentions the possibility of using --no-lvm-storage if no
        'xenvg' volume group is found.
      Reviewed-by: iustinp
    • Add LUSetClusterParams to cmdlib · 8084f9f6
      Manuel Franceschini authored
      Add LUSetClusterParams, which is the LU to modify cluster options.
      This includes checks:
      - not to disable lvm storage when it's already disabled
      - not to enable lvm storage when it is already enabled
      - not to disable lvm when lvm-based instances are present
      - that the specified volume group is valid on all cluster-nodes
        when lvm-storage is going to be enabled
      Reviewed-by: iustinp
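      The four checks listed above can be sketched as a single validation
      function. The signature, the convention that a vg_name of None means
      "disable", and the exception type are assumptions for this example, not
      the LU's actual interface.

```python
def check_lvm_change(vg_name, current_vg, lvm_instances, node_vgs):
    """Validate a requested change of the cluster's LVM storage setting.

    vg_name:       requested volume group, or None to disable LVM storage
    current_vg:    currently configured volume group, or None if disabled
    lvm_instances: names of existing LVM-based instances
    node_vgs:      maps node name -> set of volume groups present on it
    """
    if vg_name is None:
        if current_vg is None:
            raise ValueError("LVM storage is already disabled")
        if lvm_instances:
            raise ValueError("Cannot disable LVM storage: LVM-based instances exist")
    else:
        if current_vg is not None:
            raise ValueError("LVM storage is already enabled")
        # The volume group must be present on every node of the cluster.
        missing = sorted(node for node, vgs in node_vgs.items() if vg_name not in vgs)
        if missing:
            raise ValueError(
                "Volume group %s not found on: %s" % (vg_name, ", ".join(missing))
            )
```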
    • Add lvm-storage check when creating instances · eedc99de
      Manuel Franceschini authored
      This adds a check to fail instance creation if lvm-storage is disabled
      (cluster-wide). If lvm-storage is disabled (by initializing the cluster
      with --no-lvm-storage), only instances whose disk template is in the
      frozenset DTS_NOT_LVM may be created.
      Reviewed-by: iustinp
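      A minimal sketch of such a check follows. The contents of DTS_NOT_LVM
      shown here are an assumption made for the example, as are the function
      name and the exception type.

```python
# Assumed contents, for illustration; the real constant may differ.
DTS_NOT_LVM = frozenset(["diskless", "file"])


def check_disk_template(disk_template, lvm_enabled):
    """Fail instance creation when the template needs LVM but LVM is disabled."""
    if not lvm_enabled and disk_template not in DTS_NOT_LVM:
        raise ValueError(
            "Cluster does not support lvm-based instances "
            "(disk template '%s')" % disk_template
        )
```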
  6. 30 Mar, 2008 1 commit
    • Change the order of config updates in some LUs · fe482621
      Iustin Pop authored
      In the start and stop instance LUs, the configuration update is done
      right at the end. This means that if, for example, the instance shutdown
      succeeds, but the drive deactivation fails, the next run of the watcher
      will start the instance again, as it's still marked in running mode.
      This patch changes these two LUs so that they first update the
      configuration to the desired state, and only then proceed to perform
      the actual operation. This ensures that the state saved is the desired state.
      Because the config might be updated even though the LU failed, this
      patch also modifies the mcpu.Processor.ExecOpCode method to run the
      RunConfigUpdate hook in a finally: phase while the lu.Exec is done in
      its try phase. This ensures that the config update hooks run (or are at
      least attempted) whenever the config is updated.
      Reviewed-by: schreiberal
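      The try/finally arrangement can be illustrated with a small stand-in.
      The class and function names below are hypothetical and the config is
      reduced to a dictionary; this is a sketch of the pattern, not the
      mcpu.Processor code.

```python
class StopInstance:
    """Toy logical unit: records the desired state first, then acts on it."""

    def __init__(self, config, name, deactivate_disks):
        self.config = config
        self.name = name
        self.deactivate_disks = deactivate_disks

    def Exec(self):
        # Save the desired state before doing anything that might fail.
        self.config[self.name] = "down"
        self.deactivate_disks(self.name)


def exec_opcode(lu, config_update_hook):
    """Run the LU; the config update hook runs even if the LU raises."""
    try:
        return lu.Exec()
    finally:
        config_update_hook()
```

      If disk deactivation fails, the exception still propagates, but the
      instance is already recorded as down and the hook has been called.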
  7. 27 Mar, 2008 1 commit
  8. 25 Mar, 2008 2 commits
    • Remove the option to create md/drbd7 instances · f9193417
      Iustin Pop authored
      This patch removes the options that allow creating local_raid1 or
      remote_raid1 instances. It also modifies the documentation and removes
      these disk templates from burnin and from qa.
      Reviewed-by: imsnah
    • Remove the add/remove mirror operations · 249069a1
      Iustin Pop authored
      These two operations are related to md/drbd7 code (remote_raid1). Remove
      them as part of the md/drbd7 removal.
      Reviewed-by: imsnah
  9. 20 Mar, 2008 1 commit
    • Modify cluster-init to create file-storage-dir · 2872a949
      Manuel Franceschini authored
      This patch adds three things:
      - it normalizes the file storage directory path passed to gnt-cluster init
      - if the file-storage-path doesn't exist on the master node, ganeti
        tries to create it
      - it adds a check for whether the passed file-storage-dir is a directory
      Reviewed-by: iustinp
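      The three steps above can be sketched with the standard library. The
      function name, the directory mode, and the exception type are
      assumptions; the real code may differ in all of them.

```python
import os


def prepare_file_storage_dir(path):
    """Normalize the path, create it if missing, and insist on a directory."""
    path = os.path.normpath(path)
    if not os.path.exists(path):
        # Assumed mode; chosen only for this illustration.
        os.makedirs(path, 0o750)
    if not os.path.isdir(path):
        raise ValueError("The file storage directory '%s' is not a directory" % path)
    return path
```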
  10. 19 Mar, 2008 1 commit
  11. 18 Mar, 2008 4 commits
  12. 11 Mar, 2008 2 commits
    • Disable cluster init with a reachable IP · 411f8ad0
      Iustin Pop authored
      Make the cluster init fail if the IP to which the cluster name resolved
      is already reachable by the master node. This is not a foolproof
      solution, but it allows a cheap method of detecting simple mistakes.
      It will also disallow using the master node name as cluster name (which
      is a good thing).
      The only drawbacks that I see are:
        - you are not allowed to do this, which might come in handy in cluster
          upgrades; but since we support rename, this is mitigated
        - cluster init takes longer now (+the timeout value, set to 5
          seconds), but since this is a one-off operation, it should be ok
      Reviewed-by: ultrotter
    • Modify utils.TcpPing to make source address optional · b15d625f
      Iustin Pop authored
      This patch modifies TcpPing and its callers to make the source address
      selection optional. Usually, the kernel knows better which source
      address to use; only in some cases do we want to enforce a given
      source address, so it makes sense to make this optional.
      Reviewed-by: ultrotter
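      A TCP reachability test with an optional source address might look like
      the sketch below. It is written against the standard socket module and
      is not the utils.TcpPing implementation; the function name, parameter
      order, and default timeout are assumptions.

```python
import socket


def tcp_ping(target, port, timeout=10, source=None):
    """Return True if a TCP connection to (target, port) succeeds.

    When source is None the kernel picks the source address; otherwise the
    socket is bound to the given address before connecting.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source is not None:
            sock.bind((source, 0))
        sock.connect((target, port))
        return True
    except OSError:
        # Covers refused connections, timeouts and unusable source addresses.
        return False
    finally:
        sock.close()
```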
  13. 05 Mar, 2008 1 commit
  14. 29 Feb, 2008 2 commits
    • Fix master role stop on cluster destroy · c9064964
      Iustin Pop authored
      Currently the cluster destroy doesn't remove the master role, which
      means that the IP address of the cluster remains assigned to the master
      This patch fixes this and also a docstring in backend.StopMaster().
      Reviewed-by: imsnah
    • Fix cluster rename operation · 488b540d
      Iustin Pop authored
      This one-liner fixes the cluster rename operation. As a side note, we
      should have a QA test for this too.
      Reviewed-by: imsnah
  15. 28 Feb, 2008 1 commit
    • Don't allow renaming to an existing instance · 7bde3275
      Guido Trotter authored
      Even if the target instance is down or we are not checking for IP conflicts,
      changing an instance name to one that already exists in the cluster is
      doomed to fail, because in a lot of places (among them the minds of
      most users/admins) instance names are assumed to be unique.
      Reviewed-by: imsnah
  16. 27 Feb, 2008 1 commit