1. 05 May, 2009 2 commits
  2. 04 May, 2009 2 commits
  3. 24 Apr, 2009 3 commits
    • LUDiagnoseOS: change locking and error handling · a6ab004b
      Iustin Pop authored
      Since the “list OSes” call is exported via RAPI, it can be used quite
      easily to DoS the master daemon during long jobs.
      The implementation of LUDiagnoseOS makes an RPC call to all nodes; we
      lock nodes here in order to prevent node removal.
      However, after closer examination, the worst case is:
        - we get the list of nodes from the config
        - another thread removes a node
        - our RPC queries reach the removed node
      At this point, if ganeti-noded is stopped or doesn't accept our queries,
      the RPC call will return failed, and in the current implementation all
      OSes will become invalid.
      If we change the ‘failed RPC’ handling to ignore such nodes, this allows
      us to both remove locking, and to handle transient RPC failures better
      (not invalidating all OSes).
      This patch does both these things, with a single drawback: in gnt-os
      diagnose, the down nodes do not appear at all. I think this is a small
      drawback, and the alternative is to add them with status failed; this
      works (3-line patch), but then the output of “list” and “diagnose” will
      no longer be consistent. As such, my proposal is to not list the nodes.
      Reviewed-by: ultrotter
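The "ignore failed nodes" handling described in this commit can be sketched as follows. This is a minimal illustration, not the Ganeti code: the function names and the `(ok, os_names)` result layout are assumptions made for the example.

```python
def collect_os_results(rpc_results):
    """Keep OS lists only from nodes whose RPC call succeeded.

    rpc_results maps node name -> (ok, os_names). This layout is an
    assumption made for the sketch, not Ganeti's RPC result type.
    """
    per_node = {}
    for node, (ok, os_names) in rpc_results.items():
        if not ok:
            # Failed or unreachable node: skip it instead of
            # invalidating every OS.
            continue
        per_node[node] = set(os_names)
    return per_node


def valid_oses(per_node):
    """Return the OS names present on every node that answered."""
    if not per_node:
        return set()
    return set.intersection(*per_node.values())
```

With this shape, a node that was removed or is down simply does not appear in the result, which matches the drawback the commit message mentions for gnt-os diagnose.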
    • Fix verify-disks with broken volume groups · ea9ddc07
      Iustin Pop authored
      When a remote node returns invalid LVM data, we check it, but we don't
      stop; instead we continue with the rest of the checks (which require a
      valid volume group). This raises an internal error and breaks verify-disks.
      This seems unchanged for a long while, I don't know why it surfaced just
      Reviewed-by: ultrotter
    • Prevent errors when xenvg is broken cluster verify · 9a198532
      Iustin Pop authored
      When vg_name is not returned at all, we currently abort with an internal
      error. This is because we don't catch KeyError.
      This patch adds a custom message for this case, and also adds KeyError
      to the list of caught exceptions, just for safety.
      On the other hand, we could also just remove this piece of code, since
      the ["dfree"] value is not used at all.
      Reviewed-by: ultrotter
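A hedged sketch of the kind of handling this commit describes: a missing volume group produces a readable message instead of an uncaught KeyError. The function name and the `vg_list` layout (volume group name mapped to free space) are assumptions for illustration only.

```python
def free_disk_space(node, vg_list, vg_name):
    """Return (dfree, error) for one node.

    vg_list maps volume group names to the free space reported by the
    node; this layout is an assumption made for the sketch.
    """
    try:
        return int(vg_list[vg_name]), None
    except KeyError:
        # The node did not report the volume group at all.
        return None, "node %s: volume group '%s' not reported" % (node, vg_name)
    except (ValueError, TypeError):
        # The node reported something that is not a number.
        return None, ("node %s: invalid free space for volume group '%s'"
                      % (node, vg_name))
```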
  4. 15 Apr, 2009 1 commit
    • A bunch of doc and other small fixes · 949bdabe
      Iustin Pop authored
      This patch adds a couple of both externally and internally reported
        - missing SGML tags (Issue 54), report and patch by superdupont
        - wrong variable used in the init.d script, report and patch by
          Karsten Keil <karsten-keil@t-online.de>
        - man page for gnt-instance reinstall needs clarification (Issue 56)
        - gnt-instance man page missing --disks documentation for
        - gnt-node modify help output is unclear about the -C/-D/-O input
          format, and the man page doesn't document this command at all
        - “gnt-node modify -C yes” for offline or drained nodes had wrong
          error message
        - “gnt-instance reinstall --select-os” has wrong prompt, we only
          accept a number for the OS and not the template name
      Reviewed-by: ultrotter
  5. 06 Apr, 2009 2 commits
    • Fix Xen soft reboot via polling · 7dd106d3
      Iustin Pop authored
      This patch fixes the Xen soft reboot ("xm reboot") by polling, for a specific
      time, for either a changed domain ID or a decreased CPU run-time.
      This should prevent the race conditions discussed on the mailing list for
      Reviewed-by: imsnah
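The polling approach from this commit can be sketched roughly as below. It is an illustration under assumptions, not the hypervisor code: `get_state` is a hypothetical callable returning `(domid, cpu_time)` for the instance (or `None` while the domain is not listed), and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for_reboot(get_state, old_domid, old_cpu_time,
                    timeout=30.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the domain ID changes or the CPU time decreases.

    Returns True if a reboot was detected before the timeout expired,
    False otherwise.
    """
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state is not None:
            domid, cpu_time = state
            # A new domain ID or a smaller CPU run-time both indicate
            # that the domain was restarted.
            if domid != old_domid or cpu_time < old_cpu_time:
                return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Passing `sleep` and `clock` as parameters is a design choice for the sketch, so the loop can be exercised without real waiting.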
    • Add a new ssconf file with the cluster tags · 5d60b3bd
      Iustin Pop authored
      Since the cluster tags are/should be more-or-less static, add them as an
      ssconf key, so that querying them is possible without creating a
      job/requiring the masterd to be running.
      Reviewed-by: imsnah
  6. 20 Mar, 2009 2 commits
  7. 12 Mar, 2009 1 commit
    • kvm: use the correct vnc bind address · 19498d6c
      Guido Trotter authored
      There is a bug in kvm: when binding vnc to a specific address, the
      constant 'vnc_bind_address' is passed in instead of the actual
      requested address. This patch fixes it.
      Reviewed-by: iustinp
  8. 10 Mar, 2009 1 commit
  9. 09 Mar, 2009 2 commits
    • Handle ghost instances in temp DRBD map · c614e5fb
      Iustin Pop authored
      Currently cluster-verify doesn't handle the (admittedly invalid) case where we
      have reservations for instances that were removed in the meantime.
      This patch adds a check for this and prevents code errors in cluster-verify in
      this case:
       * Verifying node node4.example.com (master candidate)
         - ERROR: ghost instance 'instance3.example.com' in temporary DRBD map
      Reviewed-by: imsnah
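A minimal sketch of such a ghost-instance check, assuming the temporary DRBD map is a dict of node name to `{minor: instance name}`. Both the function name and that layout are illustrative assumptions.

```python
def find_ghost_instances(drbd_map, known_instances):
    """Return sorted (node, instance) pairs reserved for unknown instances.

    drbd_map maps node name -> {minor: instance name}; this layout is
    an assumption made for the sketch.
    """
    known = set(known_instances)
    ghosts = set()
    for node, minors in drbd_map.items():
        for instance in minors.values():
            if instance not in known:
                # The reservation refers to an instance that no longer
                # exists in the configuration.
                ghosts.add((node, instance))
    return sorted(ghosts)
```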
    • Fix error handling in replace-disks with new node · 82759cb1
      Iustin Pop authored
      Currently the _CreateSingleBlockDev function only raises OpExecError and not
      BlockDeviceError. This means that we don't release the instance's temporary
      minors properly, and this creates problems later if the instance is removed
      without master restart.
      We could just use OpExecError, but adding it and leaving
      BlockDeviceError in seems safer.
      Reviewed-by: imsnah
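The idea of catching both exception types so that the temporary minors are always released can be sketched as follows. The two exception classes are local stand-ins named after those in the commit message, and `create_dev` and `release_minors` are hypothetical callables.

```python
class OpExecError(Exception):
    """Stand-in for the opcode execution error named in the commit."""


class BlockDeviceError(Exception):
    """Stand-in for the block device error named in the commit."""


def create_disks_on_new_node(disks, create_dev, release_minors):
    """Create each disk, releasing the reserved minors on failure."""
    try:
        for disk in disks:
            create_dev(disk)
    except (OpExecError, BlockDeviceError):
        # Either error type means creation failed: free the temporary
        # minors before propagating the error.
        release_minors()
        raise
```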
  10. 02 Mar, 2009 4 commits
    • Export tags to cluster verify hooks · 35e994e9
      Iustin Pop authored
      This patch exports the cluster and node tags to the cluster verify hook
      scripts. The tags are exported as a space-separated list, which allows
      easy parsing from the shell (e.g. “for tag in $GANETI_CLUSTER_TAGS; do
      ...”) and therefore requires the previous “Don't allow spaces in tag
      names” patch.
      The patch also fixes a minor line length style problem.
      Reviewed-by: ultrotter
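To illustrate why a space-separated export depends on space-free tag names, here is a small sketch. The helper name is hypothetical, and sorting the tags is a choice made for the example; only the GANETI_CLUSTER_TAGS variable name comes from the commit message.

```python
def tags_to_env_value(tags):
    """Join tags into a single space-separated string for a hook variable."""
    return " ".join(sorted(tags))


# Tags without spaces survive a split on whitespace unchanged.
env = {"GANETI_CLUSTER_TAGS": tags_to_env_value(["prod", "rack_12"])}
round_trip = env["GANETI_CLUSTER_TAGS"].split()

# A tag containing a space would be split into two separate words.
broken = tags_to_env_value(["web server"]).split()
```

Here `round_trip` holds the two original tags, while `broken` holds two words for what was a single tag, which is the ambiguity the “Don't allow spaces in tag names” patch avoids.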
    • Don't allow spaces in tag names · 28ab6fed
      Iustin Pop authored
      This patch disallows spaces in tags, as they prevent clean exporting
      of tags to the environment in hooks. One can use underscores or dashes
      instead of spaces.
      Reviewed-by: schreiberal
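A restriction like this could look roughly like the following check. The function name and error messages are assumptions, and the sketch enforces only what the commit message states (no spaces), not any further rules on tag names.

```python
def validate_tag(tag):
    """Return the tag unchanged if it is acceptable, else raise ValueError."""
    if not isinstance(tag, str) or not tag:
        raise ValueError("tag must be a non-empty string")
    if " " in tag:
        # Spaces would make a space-separated export ambiguous.
        raise ValueError("tag %r contains a space; use '_' or '-' instead" % tag)
    return tag
```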
    • Update the iallocator documentation · 77031881
      Iustin Pop authored
      This updates the iallocator documentation to 2.0, bumps up the
      iallocator version (and moves a constant to lib/constants.py), and
      fixes a style issue in install.rst.
      Reviewed-by: ultrotter
    • Fix a bug in utils.EnsureDirs · 1b2c8f85
      Iustin Pop authored
      This fixes a bug introduced in rev 2562 and also fixes the indentation.
      Reviewed-by: ultrotter
  11. 27 Feb, 2009 4 commits
    • Use EnsureDirs in KVM as well. · 9afb67fe
      Guido Trotter authored
      The KVM hypervisor also has code to ensure that a list of directories
      exists. Replace it with our new utils function.
      Reviewed-by: iustinp
    • Create runtime dir in bootstrap · 9dae41ad
      Guido Trotter authored
      Some hypervisors (KVM) need RUN_GANETI_DIR to exist even at cluster init
      time. This patch creates it in InitCluster just before hv parameter
      checking. Since the code to create a list of directories is already
      repeated twice in the code, and this would be the third time, we abstract
      it into a utils.EnsureDirs function and we call that one from
      ganeti-noded, ganeti-masterd and bootstrap.
      Reviewed-by: iustinp
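A helper of the kind described here might look like the sketch below. It is not the actual utils.EnsureDirs: the name `ensure_dirs` and the `(path, mode)` tuple layout are assumptions for illustration.

```python
import errno
import os


def ensure_dirs(dirs):
    """Make sure every (path, mode) pair in dirs exists as a directory."""
    for path, mode in dirs:
        try:
            os.mkdir(path, mode)
        except OSError as err:
            # An already existing path is fine; anything else is an error.
            if err.errno != errno.EEXIST:
                raise
        if not os.path.isdir(path):
            raise OSError(errno.ENOTDIR, "not a directory", path)
```

Calling it twice with the same list is harmless, which is what lets several daemons and the bootstrap code share one call.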
    • LUVerifyCluster: Handle the "no volume group" case · cc9e1230
      Guido Trotter authored
      If we're only file based and our volume group is set to "None" there's
      no point in asking nodes for their volume groups, logical volumes, and
      drbd devices, and checking those.
      Reviewed-by: iustinp
    • Fix some epydoc style issues · 5fcc718f
      Iustin Pop authored
      99% of the epydoc return tags are "@return:", but each of the modified files
      had one "@returns:" line. We fix this for consistency.
      Reviewed-by: imsnah
  12. 26 Feb, 2009 1 commit
  13. 25 Feb, 2009 4 commits
    • Fix mixed pvm/hvm clusters and instance listing · b33b6f55
      Iustin Pop authored
      The current implementation combines the instance lists only for
      instances whose four fields all match in both hypervisors; however,
      this is broken for the dynamic fields (state, times), which can change
      between the invocations of the two different hypervisors if the
      instance is busy.
      The patch checks only the memory and VCPUs, and makes mixed clusters
      work even with 100% CPU instances.
      Reviewed-by: imsnah
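The comparison described in this commit can be sketched as follows, assuming each hypervisor reports a dict of instance name to a dict of fields. The function name, the field names and the error type are illustrative assumptions.

```python
STATIC_FIELDS = ("memory", "vcpus")


def merge_instance_lists(per_hypervisor):
    """Merge instance info from several hypervisors into one dict.

    An instance reported by more than one hypervisor is accepted as long
    as its static fields agree; dynamic fields such as state or CPU time
    are not compared, since they can change between the two queries.
    """
    merged = {}
    for hv_name, instances in per_hypervisor.items():
        for name, info in instances.items():
            if name not in merged:
                merged[name] = dict(info)
                continue
            for field in STATIC_FIELDS:
                if merged[name][field] != info[field]:
                    raise ValueError(
                        "instance %s reported with different %s by %s"
                        % (name, field, hv_name))
    return merged
```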
    • Fix xen-hvm and KERNEL_ARGS · b399ce1e
      Iustin Pop authored
      xen-hvm doesn't have KERNEL_ARGS, and I had blindly changed all old
      extra_args to HV_KERNEL_ARGS. This makes xen-hvm work again.
      Reviewed-by: imsnah
    • Update some version-related constants · f3e2e4c6
      Iustin Pop authored
      Since we are quite close to final RPC and hooks APIs, we update the hooks and
      protocol_version constants.
      Reviewed-by: imsnah
    • Update some hooks settings · 2c2690c9
      Iustin Pop authored
      While reviewing the hooks document, I realised we are not correctly
      exporting the instance properties.
      This patch fixes:
        - exports the disk and disk template in all LUs, not only (hardcoded)
          in the instance create
        - removes the instance create INSTANCE_ prefix on some non-instance
          variables (those are LU-related, not instance-related)
        - adds a couple more variables to other LUs
      The hook document will be updated in a separate patch.
      Reviewed-by: ultrotter
  14. 24 Feb, 2009 4 commits
  15. 19 Feb, 2009 1 commit
  16. 17 Feb, 2009 1 commit
  17. 16 Feb, 2009 3 commits
    • Fix some bugs in reboot · ae48ac32
      Iustin Pop authored
      There are two issues fixed in this patch:
        - first, the recent RPC changes caused loss of data in hard reboot
          type; we weren't reporting any results from the stop/start instance
        - second, in soft or hard reboots, we didn't initialize the disk
          physical ID; based on the last state of the instance's disks, this
          can create a failure in identifying the disks
      After this patch, burnin works again with reboot, and reports errors
      Reviewed-by: imsnah
    • Convert IOErrors for /proc/drbd into our errors · f6eaed12
      Iustin Pop authored
      If /proc/drbd can't be opened, this raises an IOError, but all the
      error-handling code in backend handles only BlockDeviceErrors. This
      creates a plain failure in cluster verify and in other RPC calls.
      This patch simply converts EnvironmentErrors into BlockDeviceErrors, and
      also changes the RPC result for NV_DRBDLIST and its handling to be able
      to show the error. The other RPC calls work by default now, due to the
      existing error handling.
      Reviewed-by: ultrotter
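The exception conversion described here can be sketched as below. `BlockDeviceError` is a local stand-in named after the class in the commit message, and `read_drbd_status` is a hypothetical helper, not the function in the Ganeti source.

```python
class BlockDeviceError(Exception):
    """Stand-in for the block device error named in the commit."""


def read_drbd_status(path="/proc/drbd"):
    """Return the lines of the DRBD status file.

    I/O failures are converted into BlockDeviceError, so callers that
    only handle that error type still report the problem cleanly.
    """
    try:
        with open(path) as handle:
            return handle.read().splitlines()
    except EnvironmentError as err:
        raise BlockDeviceError("can't read DRBD status file %s: %s" % (path, err))
```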
    • Convert default root partition to msdos style · 1cd8141c
      Guido Trotter authored
      As discussed, with 2.0 the msdos partition style should be the default in
      the instance OS, so we're changing the default instance params accordingly.
      A followup patch will update the debootstrap os.
      Reviewed-by: iustinp
  18. 13 Feb, 2009 2 commits
    • RAPI: documentation updates · bf4a90af
      Iustin Pop authored
      This patch fixes the version and makes some updates to the RAPI resources
      Reviewed-by: imsnah
    • RAPI: fixes related to write mode · 6e99c5a0
      Iustin Pop authored
      This patch fixes many small issues related to write functions:
        - update the documentation w.r.t. how to add users
        - update the instance add function for latest API
        - add instance delete
        - fix addition of tags
        - update some error messages
      Reviewed-by: imsnah