1. 29 Jan, 2009 13 commits
    • KVM: make the kernel and initrd arguments optional · df5ab9f0
      Guido Trotter authored
      Under KVM we don't strictly need a kernel and initrd. If some are passed
      we'll use them, otherwise the guest OS will need to behave as fully
      native, and have its own boot loader and kernel. 
      The root_path hypervisor parameter becomes mandatory only if a kernel is
      Reviewed-by: iustinp
    • KVM: add the HV_SERIAL_CONSOLE parameter · a2faf9ee
      Guido Trotter authored
      Up until now a KVM instance was forced to have a serial port.
      With this change it is no longer mandatory: by default we still use one,
      but if the HV_SERIAL_CONSOLE parameter is set to False we do without.
      Reviewed-by: iustinp
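      A minimal sketch of how an optional serial console could translate into
      KVM command-line arguments. The helper name serial_args and the socket
      path are hypothetical and not taken from the commit; only the QEMU
      -serial option itself is assumed to exist.

```python
def serial_args(serial_console, socket_path):
    """Return the serial-related KVM arguments (hypothetical helper)."""
    if serial_console:
        # Default behaviour: expose the serial port on a UNIX socket.
        return ["-serial", "unix:%s,server,nowait" % socket_path]
    # Serial console disabled: run the guest without a serial port.
    return ["-serial", "none"]


print(serial_args(True, "/tmp/instance1.serial"))
print(serial_args(False, "/tmp/instance1.serial"))
```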
    • GetShellCommand: get hvparams and beparams · 5431b2e4
      Guido Trotter authored
      Sometimes the hypervisor needs the instance hv and/or be parameters
      to determine the best shell command. This is currently not possible,
      because the instance hv/beparams are not filled, so we have to
      pass the filled versions separately.
      Reviewed-by: iustinp
    • Implement software release version checks too · e9ce0a64
      Iustin Pop authored
      Currently the LUVerifyCluster only reports the protocol version changes,
      not software ones. This is useful to know/monitor, so we add this too as
      a warning.
      Reviewed-by: ultrotter
    • gnt-instance list: accept input names · 5ffaa51d
      Iustin Pop authored
      Currently gnt-instance list will refuse to take arguments, and always
      return the full list of instances. This patch allows it to pass names to
      LUQueryInstances, so that we restrict the input to a given set of
      Reviewed-by: ultrotter
    • LUQueryInstances: keep the given order of names · a7f5dc98
      Iustin Pop authored
      Currently LUQueryInstances keeps the ordering of instances only in some cases,
      and in others it will reorder the list. This patch fixes this by more clearly
      separating the various cases (names passed or not and locking or not locking),
      so that the output order always matches the order of the names given.
      Of course, this disables the sorting when arguments are passed.
      Reviewed-by: ultrotter
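      The ordering rule can be sketched as follows. This is an illustration
      under stated assumptions, not the LUQueryInstances code: the function
      select_instances and its error handling are hypothetical.

```python
def select_instances(all_names, wanted=None):
    """Return the instance names to report (hypothetical helper).

    Without arguments every instance is returned, sorted; with arguments
    the caller's order is preserved and unknown names raise an error.
    """
    if not wanted:
        return sorted(all_names)
    known = set(all_names)
    missing = [name for name in wanted if name not in known]
    if missing:
        raise KeyError("unknown instances: %s" % ", ".join(missing))
    # Names were passed: keep them exactly in the order given.
    return list(wanted)
```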
    • locking.LockSet: don't modify input arguments · 2a21bc88
      Iustin Pop authored
      Currently LockSet.acquire() sorts its input argument in place if it is a
      list. This is not good, since callers might depend on a specific
      ordering of the input data, and this is a 'hidden' modification.
      We fix it by simply using a sorted copy, instead of sorting in place.
      Reviewed-by: ultrotter
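      The difference between the two approaches, as a small sketch. Both
      function names are made up for illustration and do not appear in
      locking.py.

```python
def acquire_in_place(names):
    names.sort()  # hidden side effect: the caller's list is reordered
    return names


def acquire_with_copy(names):
    ordered = sorted(names)  # new list; the caller's list is untouched
    return ordered


caller_list = ["node3", "node1", "node2"]
acquire_with_copy(caller_list)
print(caller_list)  # still in the caller's original order
```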
    • Re-wrap some lines to keep them under 80 chars · f12eadb3
      Iustin Pop authored
      This non-code change rewraps some lines in locking.py to keep them under
      80 chars.
      Reviewed-by: ultrotter
    • Check that instance exists before confirm. queries · a76f0c4a
      Iustin Pop authored
      Currently we ask the user for confirmation, and only afterwards (try to)
      remove, fail over or migrate the instance. This doesn't work nicely if
      the instance doesn't exist, so we query for the instance before
      the prompt, which raises an error if it doesn't exist.
      Side note: the way the query works today is not really nice. It would be
      better if we could query explicitly for a missing instance name, so that
      this is done cleanly (explicit check) rather than as a side effect (raised
      exception). We do add code for this explicit check, although today it
      won't actually be used.
      Reviewed-by: ultrotter
    • RAPI: tag work · 18cb43a2
      Oleksiy Mishchenko authored
      Generalize tag work for instances/nodes/cluster tag management.
      Reviewed-by: iustinp
    • RAPI: rlib1 removal · 4e5a68f8
      Oleksiy Mishchenko authored
      The resources we still need were moved to rlib2.
      Reviewed-by: iustinp
    • RAPI: Implement /2 resource · fc72a3a3
      Oleksiy Mishchenko authored
      Reviewed-by: iustinp
    • RAPI: Deprecate version Rapi version1 · dc824c9f
      Oleksiy Mishchenko authored
      It is impossible to keep backward compatibility due to
      significant changes in the Ganeti core.
      Reviewed-by: iustinp
  2. 28 Jan, 2009 3 commits
  3. 27 Jan, 2009 9 commits
    • Xen: use utils.WriteFile for the instance configs · 73cd67f4
      Guido Trotter authored
      Also raise HypervisorError rather than OpExecError.
      Reviewed-by: iustinp
    • Xen: use utils.Readfile to read the VNC password · 78f66a17
      Guido Trotter authored
      Also raise HypervisorError rather than OpExecError.
      Reviewed-by: iustinp
    • Implement disk verify checks in config verify · 332d0e37
      Iustin Pop authored
      This patch adds a simple check that the 'mode' attribute of top-level disks is
      correct. It does not recurse over children.
      The framework could be extended with other checks in the future.
      Reviewed-by: imsnah
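      A rough sketch of such a check. The data layout (plain dicts), the
      function name verify_disk_modes and the set of valid modes ("ro" and
      "rw") are assumptions made for the example, not details confirmed by
      the commit.

```python
VALID_MODES = frozenset(["ro", "rw"])  # assumed set of valid access modes


def verify_disk_modes(instances):
    """Return error strings for top-level disks with an invalid mode.

    `instances` maps an instance name to a list of disk dicts; children
    of a disk are deliberately not inspected.
    """
    errors = []
    for name in sorted(instances):
        for idx, disk in enumerate(instances[name]):
            mode = disk.get("mode")
            if mode not in VALID_MODES:
                errors.append("instance %s, disk %d: invalid mode %r"
                              % (name, idx, mode))
    return errors
```

      Returning a list rather than raising on the first problem leaves room
      for further checks to append their own messages.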
    • Fix the mode attribute of newly-created disks · 6ec66eae
      Iustin Pop authored
      Currently, only the LUSetInstanceParams correctly sets up the mode
      attribute via a manual operation. We remove this and instead do the
      correct setting in the generic _GenerateDiskTemplate function, so that
      we set the mode correctly for all disk creations.
      Reviewed-by: ultrotter
    • Rework the multi-instance gnt commands · 479636a3
      Iustin Pop authored
      This patch changes the multi-instance gnt-* commands (gnt-instance
      start/stop, gnt-node evacuate/failover) such that the individual
      operations are submitted in parallel, ideally improving the speed of the
      The patch does this by abstracting the job set functionality into a new
      class in cli.py, that takes care of the job submit, job poll and error
      Reviewed-by: ultrotter
    • Fix single-job archiving (gnt-job archive) · 5278185a
      Iustin Pop authored
      This is a simple typo from the conversion to multi-job archiving.
      Reviewed-by: imsnah
    • KVM and Xen: add the HV_ROOT_PATH parameter · 074ca009
      Guido Trotter authored
      This parameter allows a different path to be passed to the instance
      kernel. The new parameter is mandatory, and defaults to the old
      hardcoded value for both kvm and xen.
      Beta1 clusters will need to have this parameter added for their
      instances to be able to boot.
      Reviewed-by: iustinp
    • KVM: implement GetShellCommandForConsole · 637ce7f9
      Guido Trotter authored
      This is a class method, because it calls _InstanceSerial, which is
      another class method. The patch changes it to classmethod for all the
      hypervisor classes.
      Reviewed-by: iustinp
    • KVM: classify _Instance{Monitor,Serial,KVMRuntime} · 0df4d98a
      Guido Trotter authored
      Those methods need nothing from the instantiated class, and just
      manipulate strings, and fetch some class global variables, so they can
      be classmethods.
      Reviewed-by: iustinp
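      To illustrate why such helpers can be classmethods, here is a
      hypothetical stand-in class. The names KVMSketch, _ROOT_DIR and the
      socat-based console command are assumptions for the example and are not
      taken from the hypervisor code.

```python
class KVMSketch(object):
    """Hypothetical stand-in for a hypervisor class."""

    _ROOT_DIR = "/var/run/kvm-sketch"  # assumed location, for illustration

    @classmethod
    def _instance_serial(cls, instance_name):
        # Only string manipulation on a class-level variable: no instance
        # state is needed, so this can be a classmethod.
        return "%s/%s.serial" % (cls._ROOT_DIR, instance_name)

    @classmethod
    def get_shell_command_for_console(cls, instance_name):
        # A classmethod can call another classmethod through `cls`.
        socket_path = cls._instance_serial(instance_name)
        return "socat STDIO UNIX-CONNECT:%s" % socket_path


# No object needs to be instantiated to build the command.
print(KVMSketch.get_shell_command_for_console("instance1"))
```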
  4. 26 Jan, 2009 2 commits
    • Release 2.0 beta 1 · e33a0080
      Iustin Pop authored
      Even though alpha started at 0, we release beta 1 first as we did for
      Reviewed-by: imsnah, ultrotter
    • Update the NEWS documents for beta1 · 10f31783
      Iustin Pop authored
      Also import the NEWS entries from the 1.2 branch which were added since
      we created it.
      Reviewed-by: ultrotter
  5. 23 Jan, 2009 10 commits
    • Xen and KVM: correct a typo when checking args · 50cb2e2a
      Guido Trotter authored
      A 'be' was missing from the error string for both xen and kvm, shown
      when the kernel or initrd path was not absolute.
      Reviewed-by: imsnah
    • Sort the instance names in batcher · 7312b33d
      Iustin Pop authored
      In case we submit multiple instances via batcher, it's nicer to have them
      sorted.
      Reviewed-by: imsnah
    • Fix batcher for 2.0-style disks and nics · 9939547b
      Iustin Pop authored
      This patch fixes the gnt-instance batch-create command, and in doing so
      also slightly changes two other functions:
        - we change utils.ParseUnit so that it accepts integer values also
          (both ParseUnit(5) and ParseUnit("5") return the same value)
        - a bridge 'None' in LUCreateInstance will be converted to the default
          bridge; currently only missing bridges will be accepted to mean the
          default one
      The main change to batcher is the move to a variable number of disks
      and NICs.
      The patch also adds a batcher-instances.json example file copied from
      the 1.2 branch and properly modified.
      Reviewed-by: imsnah, killerfoxi
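      The ParseUnit change can be pictured with a simplified stand-in. This is
      not the utils.ParseUnit implementation: the function parse_unit, its
      accepted suffixes and the choice of mebibytes as the result unit are
      assumptions for the sketch.

```python
import re

_UNIT_RE = re.compile(r"^([.\d]+)\s*([a-zA-Z]+)?$")
_FACTORS = {"m": 1, "g": 1024, "t": 1024 * 1024}  # results are in mebibytes


def parse_unit(value):
    """Parse a size such as 5, "5" or "4g" into mebibytes (simplified)."""
    # str() lets integer input take the same path as string input.
    match = _UNIT_RE.match(str(value).strip())
    if not match:
        raise ValueError("invalid size: %r" % (value,))
    number = float(match.group(1))
    unit = (match.group(2) or "m").lower()
    if unit not in _FACTORS:
        raise ValueError("unknown unit: %r" % unit)
    return int(number * _FACTORS[unit])
```

      Converting the input with str() first is what makes parse_unit(5) and
      parse_unit("5") equivalent in this sketch.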
    • Make iallocator work with offline nodes · 1325da74
      Iustin Pop authored
      This patch changes the iallocator framework to work with offline nodes
      and to export them properly to plugins. It does this by exporting only the
      static configuration data for those nodes, and not attempting to parse
      the runtime data.
      The patch also fixes bugs in iallocator related to the RpcResult
      conversion, changes the should_run to admin_up attribute name (as per
      the internals change), and adds “-I” as a short option for
      “--iallocator” in gnt-instance, gnt-backup and burnin.
      Reviewed-by: ultrotter
    • Remove checking of DRBD metadata for validity · 3b559640
      Iustin Pop authored
      Currently the DRBD code checks that the metadata devices are valid
      before creation, initial disk attachment and add children.
      However, the process for checking validity requires a free DRBD minor,
      and this conflicts with parallel checking.
      There are at least three possible solutions:
        - serialize all checks, which means we reduce parallelism and need
          extra locks
        - don't pass a valid minor number, but one like “/dev/drbd256” (which
          is invalid); this works for the current version of DRBD, but since
          it's not guaranteed to remain so it doesn't look nice
        - don't do the checking at all, and rely on “drbdsetup ... disk ...”
          to fail by itself
      The reason for checking metadata was that in 1.2, this was much cheaper
      than trying to activate devices (and the subsequent iteration over the
      minors). However, in 2.0, they have the same cost, so we can choose
      option 3: just remove the explicit checking and rely on drbdsetup and
      the kernel to fail.
      Since DRBD8._InitMeta still requires a minor number, the two places
      where this is run are handled as follows:
        - Create: we just use our own (unused currently) minor number
        - AddChildren: we keep using FindUnusedMinor, with the caveat that
          this function (used by replace-disks -n ...) cannot be yet
      Reviewed-by: ultrotter
    • Rework the execution model in burnin · c723c163
      Iustin Pop authored
      This patch changes (significantly) the execution model in burnin:
        - for all runs, (almost) all instance mods in a single Burn* procedure
          are done as part of a job; so for example add disk, stop, remove
          disk, start are no longer done as separate jobs but as a single job
          consisting of four opcodes
        - for parallel runs, all Burn* procedures except the rename (which
          uses a single target name) run in parallel; before, only the
          creation was done in parallel
        - due to the single-job execution and also parallel execution, the
          log messages no longer appear synchronously with the execution,
          so they are informational rather than an actual execution log
      The end result is that burnin now properly tests multi-opcode jobs and
      also tests all opcodes (except rename) for parallel execution.
      Note: On a test cluster, parallelization reduces burnin time from 23m to
      Reviewed-by: ultrotter
    • Relax the restrictions on temporary DRBD minors · 79b26a7a
      Iustin Pop authored
      Currently the restrictions are too harsh: there is a time interval
      between an instance getting a new disk and it being added to the
      configuration in which the restriction is not met. We solve this by
      allowing temporary DRBD minors to match existing minors (for the same
      instance), such that parallel creations/minor allocations are OK.
      The change is done by moving the addition of temporary minors to the
      minor map after the instance minors are computed, and only considering
      them as duplicates if the instance name doesn't match.
      Reviewed-by: ultrotter
    • Introduce more configuration consistency checks · 4a89c54a
      Iustin Pop authored
      This patch enhances the duplicate DRBD minors checks (currently just a
      few) and adds automatic checks of configuration consistency at
      configuration file writing time.
      In order to do so and show meaningful error messages, the
      _UnlockedComputeDRBDMap function is changed to not raise errors in case
      of duplicates, but instead return both the minors map and the duplicate
      list, and its callers now raise the error. This allows the VerifyConfig
      function to return a complete list of duplicates.
      The new checks required some small updates to the unittests for the
      config module.
      Reviewed-by: ultrotter
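      The "return the duplicates instead of raising" pattern might look like
      this. It is a sketch under assumptions: the input layout (instance name
      mapped to (node, minor) pairs) and the names compute_drbd_map and
      verify_config are invented for the example.

```python
def compute_drbd_map(instances):
    """Return (minor_map, duplicates) without raising on duplicates.

    `instances` maps an instance name to a list of (node, minor) pairs.
    """
    minor_map = {}
    duplicates = []
    for name in sorted(instances):
        for node, minor in instances[name]:
            # The first instance seen for a (node, minor) pair owns it.
            owner = minor_map.setdefault(node, {}).setdefault(minor, name)
            if owner != name:
                duplicates.append((node, minor, name, owner))
    return minor_map, duplicates


def verify_config(instances):
    """Report every duplicate instead of stopping at the first one."""
    _, duplicates = compute_drbd_map(instances)
    return ["duplicate DRBD minor %d on %s: %s and %s" % (minor, node, a, b)
            for node, minor, a, b in duplicates]
```

      A caller that must fail hard can still raise when the duplicate list is
      non-empty, while a verification pass can print the complete list.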
    • Fill the 'call' attribute of offline rpc results · 84b45587
      Iustin Pop authored
      When creating ‘fake’ results for offline nodes, we currently don't pass
      the call attribute. This complicates debugging, so even though this
      should not matter in practice, it's better to fix it.
      Reviewed-by: imsnah
    • A couple of small fixes to iallocator · 8901997e
      Iustin Pop authored
      This removes some constraints:
        - only two disks supported, this is no longer true as the underlying
          functions can now compute size for a variable number of disks
        - error when the hypervisor was not being passed
        - typo error
      Reviewed-by: imsnah
  6. 22 Jan, 2009 1 commit
    • luxi: close and reopen the socket on errors · 8d5b316c
      Iustin Pop authored
      This is less of an actual issue for regular gnt-* clients, but it's
      easily reproducible with burnin and possible with RAPI (depending on how
      the program uses luxi.Client(s)).
      In case of burnin, if we interrupt the client (^C) while it polls the
      job, it will abort and raise an error. After that, burnin issues a
      remove instance job, and at this point, we send the submit job (remove)
      call but the first thing we read from the socket will be the response to
      the previous poll job request, since that was queued already from the
      To solve this, whenever we detect an error in Transport.Call(), we close
      that transport and re-create a new one, to start anew. The other
      alternative would be to introduce a sequence to the protocol, but this
      would be a design-level change and is not recommended
      at this stage.
      Reviewed-by: imsnah
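      The close-and-recreate idea can be sketched as follows. The classes
      ReconnectingClient and FlakyTransport are hypothetical and only mimic the
      behaviour described above; they are not the luxi.Client or Transport
      code.

```python
class ReconnectingClient(object):
    """Hypothetical client that discards its transport after any error."""

    def __init__(self, transport_factory):
        self._factory = transport_factory
        self._transport = None

    def call(self, request):
        if self._transport is None:
            self._transport = self._factory()
        try:
            return self._transport.call(request)
        except BaseException:
            # Also covers KeyboardInterrupt (^C). A stale reply may still
            # be queued on the old socket, so drop the transport; the next
            # call starts on a fresh connection.
            self._transport.close()
            self._transport = None
            raise


class FlakyTransport(object):
    """Fake transport used only to exercise the sketch."""

    opened = 0

    def __init__(self):
        FlakyTransport.opened += 1
        self.closed = False

    def call(self, request):
        if request == "fail":
            raise IOError("simulated transport error")
        return "reply to %s" % request

    def close(self):
        self.closed = True
```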
  7. 21 Jan, 2009 2 commits
    • ShutdownInstance: log instance name, not object · ca77edbc
      Guido Trotter authored
      When an instance fails to shut down we currently log its whole object,
      rather than just the instance name.
      Reviewed-by: iustinp
    • KVM live migration: handle failure · c087266c
      Guido Trotter authored
      If the KVM live migration ends up in a 'failed' state it has been
      aborted at the kvm level, and the machine is still running locally.
      We also support the 'cancelled' state, even though there should be no way
      of reaching it without manual intervention.
      Reviewed-by: iustinp
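      A hedged sketch of how the migration states might be handled. The
      function check_migration, the "completed" and "active" states and the
      assumed "Migration status: ..." monitor output format are illustrative
      and not taken from the commit.

```python
import re

_STATUS_RE = re.compile(r"Migration status:\s*(\S+)", re.IGNORECASE)


def check_migration(monitor_output):
    """Return True when done, False while in progress, raise on failure."""
    match = _STATUS_RE.search(monitor_output)
    if not match:
        raise RuntimeError("cannot parse migration status")
    status = match.group(1).lower()
    if status == "completed":
        return True
    if status == "active":
        return False
    if status in ("failed", "cancelled"):
        # Aborted at the KVM level: the guest keeps running on the source.
        raise RuntimeError("migration %s" % status)
    raise RuntimeError("unknown migration status %r" % status)
```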