1. 02 Nov, 2009 4 commits
    • Introduce two-argument style for OpPrereqError · 5c983ee5
      Iustin Pop authored
      
      
      This patch introduces a two-argument style for OpPrereqError. Only the
      direct raise calls in cmdlib.py are converted; other users will follow.
      
      cli.py is modified to handle both two-argument style and the current
      format. RAPI doesn't need modification as the way we encode errors is
      already using a list for the error arguments, so RAPI users only need to
      start checking the list length and the second argument.
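
A minimal sketch of how a caller might accept both styles, assuming the second argument is a short error-code string. The `OpPrereqError` class below is a stand-in rather than Ganeti's own, and `format_prereq_error` and the code value `"unknown_entity"` are hypothetical names used only for illustration.

```python
class OpPrereqError(Exception):
    """Stand-in for the real exception class (assumption, not Ganeti code)."""


def format_prereq_error(err):
    """Return (text, code) for an error raised in either style.

    The code is None when the error uses the old one-argument format.
    """
    args = err.args
    if len(args) == 2:
        # New style: (message, error code)
        msg, code = args
        return ("Prerequisites not met (%s): %s" % (code, msg), code)
    # Old style: a single message, or no arguments at all
    msg = args[0] if args else ""
    return ("Prerequisites not met: %s" % msg, None)


try:
    raise OpPrereqError("Instance not found", "unknown_entity")
except OpPrereqError as err:
    text, code = format_prereq_error(err)
```

Checking the argument count, rather than the exception type, mirrors what the message suggests for RAPI users: look at the list length first, then at the second element.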
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Remove the OpRetryError exception · 159d4ec6
      Iustin Pop authored
      
      
      This is only used in two places, in an error path that is no longer
      valid since Ganeti 2.0. We remove the try..except since we should not
      get it anymore (and if we do, then we should catch it in all
      config.Update cases) and we remove the exception class completely.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Activate disks while exporting an instance · 3e53a60b
      Michael Hanselmann authored
      
      
      Exporting an instance not running or without activated disks
      will fail. This patch makes sure to activate disks before
      exporting an instance if it's in the ADMIN_down state.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
    • Unify the query fields for the storage framework · 620a85fd
      Iustin Pop authored
      
      
      This patch unifies the query fields in the storage framework for all
      types. Note that the information is still computed on-demand, so if e.g.
      the used disk space is not requested for the ‘file’ type, it won't be
      computed on nodes.
      
      Summary of changes:
      - improve the LVM storage type to support multiple lvm fields in the
        LIST_FIELDS declaration and constant (not-computed via lvm commands)
        fields
      - rename utils.GetFilesystemFreeSpace to utils.GetFilesystemStats
        returning tuple of (total, free)
      - add used and free as valid fields for lvm-vg (used being computed as
        vg_size-vg_free)
      - make allocatable accepted for all types (ones which are always
        allocatable always return True)
      - add a new list field ‘type’ that gives the currently selected type; not
        very useful today (except for understanding what the default output
        is) but in the future might help if we want to list multiple types
      - add type, size and allocatable to the default output field list
      - update the man page with details on how, for file storage, size ≠ used
        + free for non-mountpoint cases
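
As a rough illustration of the (total, free) tuple and the used computation listed above, here is a sketch built on `os.statvfs`. The names `filesystem_stats` and `vg_used` are hypothetical, the values are in bytes (the commit message does not state the unit), and this is not the actual body of utils.GetFilesystemStats.

```python
import os


def filesystem_stats(path):
    """Return (total, free) in bytes for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    # f_bavail counts the blocks available to unprivileged users
    free = st.f_bavail * st.f_frsize
    return (total, free)


def vg_used(vg_size, vg_free):
    """Derive the used space of a volume group from its size and free space."""
    return vg_size - vg_free


total, free = filesystem_stats(".")
```

For a directory that is not a mount point, the space used under that directory is generally smaller than total minus free, which is why size ≠ used + free in that case.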
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  2. 29 Oct, 2009 1 commit
  3. 28 Oct, 2009 2 commits
  4. 27 Oct, 2009 2 commits
    • Provide feedback from redistributing configuration · a4eae71f
      Michael Hanselmann authored
      
      
      This is particularly useful for “gnt-cluster redist-conf”, but
      also for all other cases where the configuration files are
      rewritten on other nodes.
      
      $ gnt-cluster redist-conf
      … Copy of file /var/lib/ganeti/config.data to node … failed: Error while
      executing backend function: [Errno 1] Operation not permitted
      … Error while uploading ssconf files to node …: Error while executing backend
      function: [Errno 1] Operation not permitted
      
      $ gnt-node modify --offline no --force node3.example.com
      … - WARNING: Not enough master candidates (desired 10, new value will be 4)
      … Copy of file /var/lib/ganeti/config.data to node node8.example.com failed:
      Error while executing backend function: [Errno 1] Operation not permitted
      Modified node node3.example.com
       - offline -> True
       - master_candidate -> auto-demotion due to offline
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
    • Fix gnt-node evacuate w. iallocator · e9022531
      Iustin Pop authored

      Commit 2bb5c911 moved around and changed the _RunAllocator function in
      the DiskReplace → TaskLet conversion, but in the process it changed the
      relocate_from argument from a list of nodes to just the secondary node.
      This breaks the protocol and current iallocator scripts.
      
      This patch fixes that but also adds a local variable 'instance' since
      it's not nice to write self.instance so many times.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  5. 23 Oct, 2009 1 commit
  6. 22 Oct, 2009 1 commit
  7. 20 Oct, 2009 1 commit
  8. 13 Oct, 2009 1 commit
  9. 12 Oct, 2009 1 commit
  10. 09 Oct, 2009 2 commits
  11. 05 Oct, 2009 3 commits
  12. 02 Oct, 2009 4 commits
  13. 01 Oct, 2009 1 commit
  14. 25 Sep, 2009 1 commit
    • Fix the confusing ssh/hostname message in node add · 31821208
      Iustin Pop authored
      
      
      Before, it used to say:
      
        ssh/hostname verification failed node1.example.com -> hostname mismatch, got
        node2
      
      Now it says for wrong hostnames (maybe too verbose):
      
        ssh/hostname verification failed (checking from node1.example.com): hostname
        mismatch, expected node2.example.com but got node3
      
      And for non-FQDN hostnames:
      
        ssh/hostname verification failed (checking from node1.example.com): hostname
        not FQDN: expected node2.example.com but got node2
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  15. 24 Sep, 2009 3 commits
  16. 21 Sep, 2009 2 commits
  17. 17 Sep, 2009 3 commits
    • Michael Hanselmann's avatar
      3cebe102
    • Add an error-simulation mode to cluster verify · a0c9776a
      Iustin Pop authored
      
      
      One of the issues we have in ganeti is that it's very hard to test the
      error-handling paths; QA and burnin only test the OK code-path, since
      it's hard to simulate errors.
      
      LUVerifyCluster is special amongst the LUs in the fact that a) it has a
      lot of error paths and b) the error paths only log the error, they don't
      do any rollback or other similar actions. Thus, it's enough for this LU
      to separate the testing of the error condition from the logging of the
      error condition.
      
      This patch does this by converting code blocks of the form:
      
        if x:
          log_error()
          [y]
      
      into:
      
        log_error_if(x)
        [if x:
          y
        ]
      
      After this change, it's simple enough to turn on logging of all errors
      by adding a special case inside log_error_if such that if the incoming
      opcode has a special ‘debug_simulate_errors’ attribute and it's true, it
      will log the error unconditionally.
      
      Surprisingly this also turns into an absolute code reduction, since some
      of the if blocks were simplified. The only downside to this patch is
      that the various _VerifyX() functions are now stateful (modifying an
      attribute on the LU instance) instead of returning a boolean result.
      
      Last note: yes, this discovered some error cases in the logging.
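
The pattern described above can be sketched as follows. This is an illustration under assumptions, not the LUVerifyCluster code: the class `VerifySketch`, its `verify_node` check and the 128 threshold are hypothetical, and the simulation flag is passed to the constructor here instead of being read from the opcode.

```python
class VerifySketch:
    """Toy verifier that separates testing a condition from logging it."""

    def __init__(self, debug_simulate_errors=False):
        self.debug_simulate_errors = debug_simulate_errors
        self.bad = False      # stateful result, replaces a boolean return value
        self.messages = []

    def log_error_if(self, cond, msg):
        """Log msg if cond holds, or unconditionally in simulation mode."""
        if cond or self.debug_simulate_errors:
            self.messages.append("ERROR: " + msg)
            self.bad = True

    def verify_node(self, node, free_mem):
        test = free_mem is None
        self.log_error_if(test, "node %s: no memory data" % node)
        if test:
            # Follow-up action still depends on the real condition only
            return
        self.log_error_if(free_mem < 128, "node %s: low memory" % node)


normal = VerifySketch()
normal.verify_node("node1", 1024)

simulated = VerifySketch(debug_simulate_errors=True)
simulated.verify_node("node1", 1024)
```

In simulation mode every logging call fires even though the node is healthy, so each error-reporting path gets exercised, while the control flow still follows the real test results.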
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Introduce parseable error codes in LUVerifyCluster · 7c874ee1
      Iustin Pop authored
      
      
      Currently the output of cluster verify can be parsed for 'ERROR'
      messages, but that is the only indication we get (error or no error). In
      order to allow monitoring tools to separate different error conditions,
      this patch introduces a new output format (“gnt-cluster verify
      --error-codes”) that changes the output from human-friendly to
      machine-friendly. In this mode, an error line changes from:
        ERROR: node node1: drbd minor 1 of instance inst1 is not active
      
      to:
        ERROR:ENODEDRBD:node:node1:drbd minor 1 of instance inst1 is not active
      
      i.e. the error message is a ‘:’-separated field, with ERROR in the first
      place, the error code in the second, the object type (cluster, node,
      instance) in the third, the name of the object (for nodes/instances) in
      the fourth, and then the text message.
      
      The patch also removes some of the verbosity of the operation
      (“Verifying instance X”, “Verifying node X”) since on big clusters these
      informational messages can quickly fill up an entire screen. The
      original behaviour can be restored via the ‘--verbose’ option.
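
A monitoring tool could split such a line as sketched below. The names `VerifyError` and `parse_error_line` are hypothetical, and the sketch assumes the line starts with ERROR and has no other prefix. The split is limited to four separators so that any ‘:’ inside the text message is preserved.

```python
from collections import namedtuple

# Field names are illustrative; the order follows the description above
VerifyError = namedtuple("VerifyError", "code item_type item_name message")


def parse_error_line(line):
    """Parse one machine-friendly error line, or return None if it is not one."""
    fields = line.strip().split(":", 4)
    if len(fields) != 5 or fields[0] != "ERROR":
        return None
    return VerifyError(*fields[1:])


parsed = parse_error_line(
    "ERROR:ENODEDRBD:node:node1:drbd minor 1 of instance inst1 is not active")
```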
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  18. 16 Sep, 2009 1 commit
  19. 14 Sep, 2009 2 commits
  20. 11 Sep, 2009 2 commits
  21. 08 Sep, 2009 1 commit
  22. 03 Sep, 2009 1 commit