  1. Aug 03, 2009
  2. Jul 31, 2009
  3. Jul 22, 2009
  4. Jul 17, 2009
  5. Jun 19, 2009
  6. Jun 08, 2009
  7. May 27, 2009
    • Add a node powercycle command · f5118ade
      Iustin Pop authored
      
      This (somewhat big) patch adds support for remotely rebooting the nodes
      via whatever support the hypervisor has for such a concept.
      
      For KVM/fake (and containers in the future) this just uses sysrq plus a
      ‘reboot’ call if the sysrq method failed. For Xen, it first tries the
      above, and then Xen-hypervisor reboot (we first try sysrq since that
      just requires opening a file handle, whereas xen reboot means launching
      an external utility).
      
      The user interface is:
      
          # gnt-node powercycle node5
          Are you sure you want to hard powercycle node node5?
          y/[n]/?: y
          Reboot scheduled in 5 seconds
      
      The node reboots hopefully after sending the reply. In case the clock is
      broken, “time.sleep(5)” might take ages (but then I suspect SSL
      negotiation wouldn't work).
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      f5118ade
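The fallback order described in this commit (a cheap method first, a heavier one only if it fails) can be sketched as below. This is an illustration, not the code from the patch: `powercycle`, `sysrq_reboot` and their parameters are hypothetical names, and the only outside fact assumed is that writing `b` to `/proc/sysrq-trigger` on Linux triggers an immediate reboot.

```python
import time
from typing import Callable, Sequence


def sysrq_reboot(path: str = "/proc/sysrq-trigger") -> bool:
    """Hypothetical helper: ask the kernel for an immediate reboot via sysrq."""
    with open(path, "w") as handle:
        handle.write("b")
    return True


def powercycle(methods: Sequence[Callable[[], bool]],
               delay: float = 5.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Wait `delay` seconds, then try each reboot method in order.

    The delay gives the caller time to send its reply before the node
    goes down. Returns True as soon as one method reports success and
    False if every method failed.
    """
    sleep(delay)
    for method in methods:
        try:
            if method():
                return True
        except OSError:
            # This method is unavailable; fall through to the next one.
            continue
    return False
```

The `sleep` argument is injectable so that the ordering logic can be exercised without waiting and without rebooting anything.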
  8. May 19, 2009
    • Add -H/-B startup parameters to gnt-instance · d04aaa2f
      Iustin Pop authored
      
      This patch modifies the start instance script, opcode and logical unit
      to support temporary startup parameters.
      
      Unlike 1.2, where only the kernel arguments could be changed (which
      made the feature xen-pvm specific), this version supports changing all
      hypervisor and backend parameters (with appropriate checks).
      
      This is much more flexible, and allows for example:
        - start with different, temporary kernel
        - start with different memory size
      
      Note: in later versions, this should be extended to cover disk
      parameters as well (e.g. start with drbd without flushes, start with
      drbd in async mode, etc.).
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      d04aaa2f
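The idea of temporary startup parameters can be sketched as an overlay on top of the stored values that is never written back. The function names and the `key=value,key=value` parsing below are assumptions made for illustration only; they are not taken from the patch.

```python
def parse_overrides(text):
    """Parse 'key=value,key=value' into a dict (values stay strings)."""
    result = {}
    for item in text.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % item)
        result[key] = value
    return result


def effective_params(stored, overrides, allowed):
    """Return the parameters to use for this start only.

    `stored` is left untouched, so the override is temporary. Unknown
    keys are rejected, standing in for the 'appropriate checks'.
    """
    unknown = set(overrides) - set(allowed)
    if unknown:
        raise ValueError("unknown parameters: %s" % ", ".join(sorted(unknown)))
    merged = dict(stored)
    merged.update(overrides)
    return merged
```

Because the merge happens on a copy, the next normal start sees the persistent values again.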
  9. Feb 24, 2009
    • Remove the extra_args parameter in instance start · 07813a9e
      Iustin Pop authored
      This patch removes the extra_args parameter and instead switches the
      instance to the HV_KERNEL_ARGS hypervisor option.
      
      This is a big change, but it's a needed cleanup: this extra parameter on
      all RPC calls is not generic, and we also need to have a persistent value
      here.
      
      Reviewed-by: imsnah
      07813a9e
  10. Feb 10, 2009
  11. Feb 06, 2009
    • Fix rapi job listing · ee69c97f
      Iustin Pop authored
      This patch fixes a couple of issues with the job listing:
        - in case of a non-existing job, nicely raise 404 instead of 500
        - in the job detail listing, also list the job log, the job
          timestamps, etc.
        - the opcode migrate instance was missing its description field
      
      Reviewed-by: imsnah
      ee69c97f
  12. Feb 04, 2009
    • Implement lockless query operations · ec79568d
      Iustin Pop authored
      This patch adds the framework for, and enables lockless OpQueryInstances. This
      means that instances will be shown in ERROR_up or ERROR_down state, even though
      this is not an error (but just an in-progress job).
      
      The framework is implemented as follows:
        - the OpQueryInstances, OpQueryNodes and OpQueryExports opcodes take
          an additional “use_locking” flag which will denote whether to lock
          or not; this patch only implements this for LUQueryInstances
        - the luxi query functions take an additional argument use_locking
          which is passed to the master daemon, and then passed to the above
          opcodes
        - cli.py exports a new SYNC_OPT command line option which sets
          this flag to True
        - except for gnt-instance list, which uses this option, and for
          name-only queries (e.g. QueryNodes(fields=["names"])), all other
          callers are setting this flag to True
        - RAPI also sets the flag to True
      
      The patch was tested with a continuous (0.2s sleep in-between)
      gnt-instance list during a burnin, and no problems were observed.
      
      Reviewed-by: ultrotter
      ec79568d
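The effect of a `use_locking` flag on a query can be sketched as follows. The `InstanceRegistry` class is hypothetical and much simpler than the opcode/LU machinery in the patch; it only shows that a lockless query returns immediately, possibly observing intermediate state, while a locking query waits for writers.

```python
import threading
from contextlib import nullcontext


class InstanceRegistry:
    """Hypothetical in-memory stand-in for the data a query reads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}

    def set_status(self, name, status):
        with self._lock:
            self._status[name] = status

    def query(self, use_locking=True):
        """Return a snapshot of all statuses.

        With use_locking=False the lock is skipped, so the call never
        blocks behind an in-progress job but may see transient states.
        """
        guard = self._lock if use_locking else nullcontext()
        with guard:
            return dict(self._status)
```

A frequently run listing command would pass `use_locking=False`, while callers that need a consistent view keep the default.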
  13. Jan 20, 2009
  14. Jan 13, 2009
    • Forward port the live migration from 1.2 branch · 53c776b5
      Iustin Pop authored
      This is a forward port via copy (rather than cherry-picking individual
      patches) of the latest migration-related code on the 1.2 branch.
      
      The changes compared to 1.2 are the fact that we don't need the
      IdentifyDisks step anymore (the drbd rpc calls are independent now), and
      the rpc module improvements.
      
      Reviewed-by: ultrotter
      53c776b5
  15. Jan 12, 2009
  16. Dec 08, 2008
    • gnt-node modify: add the offline attribute · 3a5ba66a
      Iustin Pop authored
      This patch changes gnt-node modify and the associated opcode/lu to allow
      modification of the node offline attribute.
      
      Setting a node into offline mode automatically demotes it from the
      master role.
      
      Reviewed-by: ultrotter
      3a5ba66a
  17. Dec 02, 2008
    • Add cluster candidate pool size parameter · 4b7735f9
      Iustin Pop authored
      This patch adds a new cluster parameter "candidate_pool_size" which
      tracks the desired size of the list of nodes with the master_candidate
      flag set.
      
      Reviewed-by: imsnah
      4b7735f9
    • Add a gnt-node modify operation · b31c8676
      Iustin Pop authored
      This patch adds the OpCode, LogicalUnit and gnt-node command for
      modifying node parameters, more specifically the master candidate flag
      for a node.
      
      Reviewed-by: imsnah
      b31c8676
  18. Nov 25, 2008
    • Implement support for multi devices changes · 24991749
      Iustin Pop authored
      This big patch adds support for:
        - changing NIC/disks in the multi-device model
        - adding/removing NICs
        - adding/removing disks
      
      The patch is big and not very nice; the error checking paths are not
      very clear.
      
      The biggest problem is that from a simple instance.ATTR=VAL change
      (which didn't throw errors before) now we are creating and removing
      disks in this LU.
      
      Reviewed-by: imsnah
      24991749
  19. Nov 24, 2008
  20. Nov 20, 2008
    • Initial multi-disk/multi-nic support · 08db7c5c
      Iustin Pop authored
      This patch adds support for multi-disk/multi-nic in:
        - instance add
        - burnin
      
      The start/stop/failover/cluster verify work as expected. Replace disk
      and grow disk are TODO.
      
      There's also a change to gnt-job to allow dictionaries to be listed in
      gnt-job info.
      
      Reviewed-by: imsnah
      08db7c5c
  21. Oct 16, 2008
    • Enable gnt-cluster modify to hv/beparams · 779c15bb
      Iustin Pop authored
      This patch enables the cluster modify to change:
        - enabled hypervisor list
        - hvparams (per hypervisor)
        - beparams (only the default group)
      
      Syntax:
        gnt-cluster modify -B vcpus=3 -H xen-pvm:no_initrd_path
      
      Validation for parameters is somewhat missing: the individual
      hypervisors will be checked for syntax and validation, but beparams
      doesn't have validation yet (anywhere); it should be added here once we
      have a global method (will come soon).
      
      Reviewed-by: imsnah
      779c15bb
  22. Oct 14, 2008
    • grow-disk: wait until resync is completed · 6605411d
      Iustin Pop authored
      The patch adds a new ‘--no-wait-for-sync’ parameter to grow-disk similar
      to the one in instance add, and changes the default to wait.
      
      This is cleaner: by the time the command returns, we either have a
      fully synced disk or an error.
      
      This is a forward-port of rev 1183 on the 1.2 branch.
      
      Reviewed-by: ultrotter
      6605411d
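Waiting by default until the resync completes amounts to a polling loop that ends either with a synced disk or with an error. The sketch below assumes a hypothetical `get_sync_percent` callable that returns `None` once the disk is fully synced; neither that name nor the timeout handling comes from the patch.

```python
import time


def wait_for_sync(get_sync_percent, poll_interval=1.0, timeout=None,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the disk reports it is fully synced.

    `get_sync_percent` returns a float while resyncing and None when
    done. Raises TimeoutError if `timeout` seconds pass first, so the
    caller ends up with either a synced disk or an error.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        percent = get_sync_percent()
        if percent is None:
            return True
        if deadline is not None and clock() >= deadline:
            raise TimeoutError("disk still at %.1f%% after timeout" % percent)
        sleep(poll_interval)
```

A `--no-wait-for-sync` style option would simply skip this call.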
    • Change over to beparams · 338e51e8
      Iustin Pop authored
      This big patch changes the master code to use the beparams. Errors might
      have crept in, but it passes a small burnin.
      
      Reviewed-by: ultrotter
      338e51e8
    • Allow instance info to only query the config file · 57821cac
      Iustin Pop authored
      This patch adds a new '-s' parameter to ‘gnt-instance info’ that makes
      it return only 'static' information. This is much faster, especially for
      drbd instances.
      
      This is a forward-port of rev 1570 on the ganeti-1.2 branch, resending
      due to some conflicts.
      
      Reviewed-by: imsnah
      57821cac
    • Change gnt-instance modify to the hvparams model · 74409b12
      Iustin Pop authored
      Reviewed-by: imsnah
      74409b12
    • Switch instance hypervisor parameters to hvparams · 6785674e
      Iustin Pop authored
      This big patch changes instance create to the new hvparams structure.
      Old parameters are removed, so old jobs or old instance files will break
      current clusters.
      
      Reviewed-by: ultrotter
      6785674e
  23. Oct 08, 2008
    • Move the hypervisor attribute to the instances · e69d05fd
      Iustin Pop authored
      This (big) patch moves the hypervisor type from the cluster to the
      instance level; the cluster attribute remains as the default hypervisor,
      and will be renamed accordingly in a next patch. The cluster also gains
      the ‘enable_hypervisors’ attribute, and instances can be created with
      any of the enabled ones (no provision yet for changing that attribute).
      
      The many many changes in the rpc/backend layer are due to the fact that
      all backend code read the hypervisor from the local copy of the config,
      and now we have to send it (either in the instance object, or as a
      separate parameter) for each function.
      
      The node list by default will list the node free/total memory for the
      default hypervisor; a new flag should be added to select another
      hypervisor. Instance list has a new field, hypervisor, that shows the
      instance hypervisor. Cluster verify runs for all enabled hypervisor
      types.
      
      The new FIXMEs are related to IAllocator, since now the node
      total/free/used memory counts are wrong (we can't reliably compute the
      free memory).
      
      Reviewed-by: imsnah
      e69d05fd
  24. Oct 01, 2008
    • Add new query to get cluster config values · ae5849b5
      Michael Hanselmann authored
      This can be used to retrieve certain cluster config values from
      within clients.
      
      OpDumpClusterConfig was not used anywhere, hence I'm just reusing
      it. The way ConfigWriter.DumpConfig returned the configuration
      was not thread-safe, anyway (no deepcopy).
      
      Reviewed-by: iustinp
      ae5849b5
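The remark about thread safety and the missing deepcopy can be illustrated with a small sketch. `ConfigStore` and its methods are hypothetical; the point is only that returning the internal object lets callers mutate shared state, while returning a deep copy made under a lock does not.

```python
import copy
import threading


class ConfigStore:
    """Hypothetical holder of a nested configuration dictionary."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data

    def dump_unsafe(self):
        # Hands out the internal object: any caller can change the
        # live configuration, possibly while another thread reads it.
        return self._data

    def dump(self):
        # Copy under the lock so the caller gets an independent snapshot.
        with self._lock:
            return copy.deepcopy(self._data)
```

Returning only selected values, as a dedicated query does, avoids exposing the whole structure in the first place.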
    • Remove last use of utils.RunCmd from the watcher · 5188ab37
      Iustin Pop authored
      The watcher has one last use of ganeti commands as opposed to sending
      requests via luxi. The patch changes this to use the cli functions.
      
      The patch also has two other changes:
        - fix the docstring for OpVerifyDisks (found out while converting
          this)
        - enable stderr logging on the watcher when “-d” is passed
      
      Reviewed-by: imsnah
      5188ab37
  25. Sep 29, 2008
    • Implement job summary in gnt-job list · 60dd1473
      Iustin Pop authored
      It is not currently possible to show a summary of the job in the output
      of “gnt-job list”. The closest is listing the whole opcode(s), but that
      is too verbose. Also, the default output (id, status) is not very
      useful, unless one looks for (and knows about) an exact job ID.
      
      The patch adds a “summary” description of a job composed of the list of
      OP_ID of the individual opcodes. Moreover, if an opcode has a ‘logical’
      target in a certain opcode field (e.g. start instance has the instance
      name as the target), then it is included in the formatting also. It's
      easier to explain via a sample output:
      
      gnt-job list
      ID Status  Summary
      1  error   NODE_QUERY
      2  success NODE_ADD(gnta2)
      3  success CLUSTER_QUERY
      4  success NODE_REMOVE(gnta2.example.com)
      5  error   NODE_QUERY
      6  success NODE_ADD(gnta2)
      7  success NODE_QUERY
      8  success OS_DIAGNOSE
      9  success INSTANCE_CREATE(instance1.example.com)
      10 success INSTANCE_REMOVE(instance1.example.com)
      11 error   INSTANCE_CREATE(instance1.example.com)
      12 success INSTANCE_CREATE(instance1.example.com)
      13 success INSTANCE_SHUTDOWN(instance1.example.com)
      14 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
      15 error   INSTANCE_CREATE(instance2.example.com)
      16 error   INSTANCE_CREATE(instance2.example.com)
      17 success INSTANCE_CREATE(instance2.example.com)
      18 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
      19 success INSTANCE_ACTIVATE_DISKS(instance2.example.com)
      20 success INSTANCE_SHUTDOWN(instance1.example.com)
      21 success INSTANCE_SHUTDOWN(instance2.example.com)
      
      This is done by a simple change to the opcode classes, which allows an
      opcode to format itself. The additional function is small enough that it
      can go in opcodes.py, where it could also be used by a client if needed.
      
      Reviewed-by: imsnah
      60dd1473
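An opcode that formats itself, as described in this commit, might look roughly like the sketch below. The attribute name `OP_DSC_FIELD`, the method name `summary` and the two example classes are assumptions for illustration; only `OP_ID` and the output format are taken from the commit message.

```python
class OpCode:
    """Minimal stand-in for an opcode base class."""

    OP_ID = "OP_ABSTRACT"
    OP_DSC_FIELD = None  # assumed name: field holding the 'logical' target

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)

    def summary(self):
        """Return OP_ID without its 'OP_' prefix, plus the target if any."""
        text = self.OP_ID
        if text.startswith("OP_"):
            text = text[3:]
        if self.OP_DSC_FIELD:
            target = getattr(self, self.OP_DSC_FIELD, None)
            if target is not None:
                text = "%s(%s)" % (text, target)
        return text


class OpQueryNodes(OpCode):
    OP_ID = "OP_NODE_QUERY"


class OpCreateInstance(OpCode):
    OP_ID = "OP_INSTANCE_CREATE"
    OP_DSC_FIELD = "instance_name"


def job_summary(opcodes):
    """Join the summaries of all opcodes in a job."""
    return ",".join(op.summary() for op in opcodes)
```

Keeping the formatting on the opcode class means a client holding the opcodes can produce the same text without asking the master.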
  26. Sep 01, 2008
    • Pass the force param to SetInstanceParms · 4300c4b6
      Guido Trotter authored
      It was already allowed in gnt-instance modify, but ignored.
      It will be used to force skipping parameter checks.
      
      This is a forward-port from branches/ganeti-1.2
      
      Original-Reviewed-by: imsnah
      Reviewed-by: iustinp
      4300c4b6
  27. Aug 29, 2008
  28. Aug 08, 2008
  29. Jul 30, 2008
    • Rework master startup/shutdown/failover · b1b6ea87
      Iustin Pop authored
      This (big) patch reworks the master startup/shutdown and fixes the
      master failover.
      
      What does the patch do?
      
      For master start/stop:
        - remove the old ganeti-master script and its associated man page
        - moves the ip start/stop directly into the backend.(Start|Stop)Master
        - adds start/stop of the master/rapi daemon into these functions,
          selectively based on the start/stop arguments
        - makes the master call via rpc StartMaster(start_daemons=False) to
          the local node so that the master IP is started
        - and finally changes the example init.d script to directly start and
          stop all three daemons, since they do the right thing (depending on
          master/not master role)
      
      For master failover:
        - moves the code from LUMasterFailover into bootstrap.MasterFailover,
          since we need to start/stop the master during this operation and
          thus it can't be executed from the master
        - removes the LUMasterFailover and its associated opcode
      
      Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not
      master' are not seen during startup on non-master nodes.
      
      Reviewed-by: ultrotter
      b1b6ea87
  30. Jul 15, 2008
    • Documentation updates · a7399f66
      Iustin Pop authored
      Reviewed-by: imsnah
      a7399f66
    • Rename BaseJO to BaseOpCode · 0e46916d
      Iustin Pop authored
      Since we no longer have a job definition object for now, we rename
      this class to BaseOpCode. It's still useful (and not merged with OpCode)
      since it holds all the 'pure' logic (no custom field handling, etc.)
      whereas OpCode holds opcode specific data (OP_ID handling, etc).
      
      The patch also fixes the module's docstring.
      
      Reviewed-by: imsnah
      0e46916d
  31. Jul 09, 2008
  32. Jun 23, 2008
    • Fix gnt-cluster “command” and “copyfile” · b3989551
      Iustin Pop authored
      Since the disabling of forking in the master daemon, the two ssh-based
      subcommands were not working anymore. However, there is no need at all
      for the commands to be run from the master daemon (permissions to read
      the cluster private ssh key notwithstanding), they can be run directly
      from the command line utilities.
      
      The patch removes the two opcodes OpRunClusterCommand and
      OpClusterCopyFile (and their associated LUs) and changes the code in
      ‘gnt-cluster’ to query the list of nodes and run directly the SshRunner
      over the list. As such, all forking is done from the gnt-cluster script,
      and the commands are working again.
      
      Reviewed-by: imsnah
      b3989551