Skip to content
Snippets Groups Projects
  1. Oct 08, 2008
    • Iustin Pop's avatar
      Sanitize the hypervisor names · 00cd937c
      Iustin Pop authored
      Since in 2.0 the user will possibly have more interaction with the
      hypervisor names, we sanitize them by removing the version numbers
      (the version can be a prerequisite for the ganeti installation, we
      shouldn't document it in variable names).
      
      Reviewed-by: schreiberal
      00cd937c
    • Oleksiy Mishchenko's avatar
      Fix for gnt-cluster init. · 02f99608
      Oleksiy Mishchenko authored
      Reviewed-by: iustinp
      02f99608
    • Iustin Pop's avatar
      Move the hypervisor attribute to the instances · e69d05fd
      Iustin Pop authored
      This (big) patch moves the hypervisor type from the cluster to the
      instance level; the cluster attribute remains as the default hypervisor,
      and will be renamed accordingly in a next patch. The cluster also gains
      the ‘enable_hypervisors’ attribute, and instances can be created with
      any of the enabled ones (no provision yet for changing that attribute).
      
      The many many changes in the rpc/backend layer are due to the fact that
      all backend code read the hypervisor from the local copy of the config,
      and now we have to send it (either in the instance object, or as a
      separate parameter) for each function.
      
      The node list by default will list the node free/total memory for the
      default hypervisor, a new flag to it should exist to select another
      hypervisor. Instance list has a new field, hypervisor, that shows the
      instance hypervisor. Cluster verify runs for all enabled hypervisor
      types.
      
      The new FIXMEs are related to IAllocator, since now the node
      total/free/used memory counts are wrong (we can't reliably compute the
      free memory).
      
      Reviewed-by: imsnah
      e69d05fd
  2. Oct 07, 2008
    • Iustin Pop's avatar
      rpc.call_instance_migrate: pass the whole instance · 9f0e6b37
      Iustin Pop authored
      Currently the call_instance_migrate call only passes the instance name;
      we need to pass the whole object for the hypervisor_type changes (all
      the other individual instance rpc calls already pass the instance
      object).
      
      Reviewed-by: imsnah
      9f0e6b37
    • Iustin Pop's avatar
      Implement job 'waiting' status · e92376d7
      Iustin Pop authored
      Background: when we have multiple jobs in the queue (more than just a
      few), many of the jobs (up to the number of threads) will be in state
      'running', although many of them could be actually blocked, waiting for
      some locks. This is not good, as one cannot easily see what is
      happening.
      
      The patch extends the opcode/job possible statuses with another one,
      waiting, which shows that the LU is in the acquire locks phase. The
      mechanism for doing so is simple, we initialize (in the job queue) the
      opcode with OP_STATUS_WAITLOCK, and when the processor is ready to give
      control to the LU's Exec, it will call a notifier back into the
      _JobQueueWorker that sets the opcode status to OP_STATUS_RUNNING (with
      the proper queue locking). Because this mechanism does not save the job,
      all opcodes on disk will be in status WAITLOCK and not RUNNING anymore,
      so we also change the load sequence to consider WAITLOCK as RUNNING.
      
      With the patch applied, creating in parallel (via burnin) five instances
      on a five node cluster shows that only two are executing, while three
      are waiting for locks.
      
      Reviewed-by: imsnah
      e92376d7
  3. Oct 06, 2008
    • Iustin Pop's avatar
      Implement job auto-archiving · 07cd723a
      Iustin Pop authored
      This patch adds a new luxi call that implements auto-archiving of jobs
      older than a certain age (or -1 for all completed jobs), and the gnt-job
      command that makes use of this (with 'all' for -1).
      
      Reviewed-by: imsnah
      07cd723a
    • Iustin Pop's avatar
      Add a simple timespec parsing function · 2241e2b9
      Iustin Pop authored
      This function will be used for auto-archiving jobs via the command line.
      The function is pretty simple, we only support up to weeks since months
      and higher are not 'precise' entities, and dealing with them would
      require us to start using calendar functions.
      
      Reviewed-by: imsnah
      2241e2b9
    • Iustin Pop's avatar
      backend.py change to get cluster name from master · 62c9ec92
      Iustin Pop authored
      Currently there are three function in backend that need the cluster name
      in order to instantiate an SshRunner. The patch changes these to get the
      cluster name from the master in the rpc call; once the multi-hypervisor
      change is implemented, then very few places in which we need the SCR
      remain in the backend.
      
      Reviewed-by: killerfoxi, imsnah
      62c9ec92
    • Iustin Pop's avatar
      Disable re-reading of config file · 3d3a04bc
      Iustin Pop authored
      Since the objects read from the config file are passed to the various
      threads, it's unsafe to re-read the config file (and throw away
      ConfigWriter._config_data). As such, we disable the re-reading of the
      file (since now the master is the owner the file, it makes not sense to
      re-read it), and any modifications to the file must be done offline,
      otherwise they will be overwritten.
      
      Reviewed-by: imsnah
      3d3a04bc
    • Iustin Pop's avatar
      Fix gnt-job list with empty timestamps · e0ec0ff6
      Iustin Pop authored
      In case the job object doesn't have a timestamp (which is a separate
      issue), the listing should not break. We fix this by changing the
      FormatTimstamp function itself to return '?' in case the timestamp
      doesn't look good (note that it still can break if non-integers are
      returned, but this is unlikely).
      
      Reviewed-by: imsnah
      e0ec0ff6
    • Iustin Pop's avatar
      Increase the number of threads to 25 · 1daae384
      Iustin Pop authored
      Since our locks are not gathered nicely, we can have jobs that are
      actually blocking on locks (parallel burnin shows this), so at least we
      need to increase the number of threads above the usual number of jobs we
      could have in a such a case.
      
      Reviewed-by: imsnah
      1daae384
    • Iustin Pop's avatar
      Fix SshRunner breakage from the changed API · 6b0469d2
      Iustin Pop authored
      More places actually use the SshRunner than just the gnt-cluster
      commands.
      
      Reviewed-by: ultrotter
      6b0469d2
    • Iustin Pop's avatar
      Change SshRunner usage · 56bece1f
      Iustin Pop authored
      Currently the SshRunner uses a SimpleConfigReader instance, however this
      is not best. We change it to use the cluster name directly (and its
      constructor now takes this as parameter, instead of SCR), and its
      callers are change to pass the name directly.
      
      As a consequence, we can now remove the initialization of SCR in
      gnt-cluster (copyfile and command), and instead we query the master for
      the cluster name).
      
      Reviewed-by: imsnah
      56bece1f
  4. Oct 05, 2008
  5. Oct 01, 2008
  6. Sep 30, 2008
    • Iustin Pop's avatar
      Enhance the job-related timestamps · c56ec146
      Iustin Pop authored
      This patch adds start, stop, and received timestamp for jobs (and allows
      querying of them), and allows querying of the opcode timestamps.
      
      Reviewed-by: imsnah
      c56ec146
    • Iustin Pop's avatar
      Abstract the timestamp formatting into cli.py · 3386e7a9
      Iustin Pop authored
      Currently we format the timestamp inside the gnt-job info function. We
      will need this more times in the future, so move it to cli.py as a
      separate, exported function.
      
      Reviewed-by: imsnah
      3386e7a9
  7. Sep 29, 2008
    • Iustin Pop's avatar
      Add opcode execution log in job info · 5b23c34c
      Iustin Pop authored
      This patch adds the job execution log in “gnt-job info” and also allows
      its selection in “gnt-job list” (however here it's not very useful as
      it's not easy to parse). It does this by adding a new field in the query
      job call, named ‘oplog’.
      
      With this, one can get a very clear examination of the job. What remains
      to be added would be timestamps for start/stop of the processing for the
      job itself and its opcodes.
      
      Reviewed-by: imsnah
      5b23c34c
    • Iustin Pop's avatar
      Move a hardcoded constant to constants.py · 3c03759a
      Iustin Pop authored
      For now we only use the ‘C’ protocol so we can put it in constants.py
      instead of hardcoding it.
      
      Reviewed-by: imsnah
      3c03759a
    • Iustin Pop's avatar
      Enable the use of shared secrets · 2899d9de
      Iustin Pop authored
      This patch enables the use of the shared secrets for DRBD8 disks, using
      (hardcoded in constants.py) the md5 digest algorithm.
      
      For making this more flexible, either we implement a cluster parameter
      (once the new model is in place), or we can make it ./configure-time
      selectable.
      
      Reviewed-by: imsnah
      2899d9de
    • Iustin Pop's avatar
      Extend DRBD disks with shared secret attribute · f9518d38
      Iustin Pop authored
      This patch, which is similar to r1679 (Extend DRBD disks with minors
      attribute), extends the logical and physical id of the DRBD disks with a
      shared secret attribute. This is generated at disk creation time and
      saved in the config file.
      
      The generation of the secret is done so that we don't have duplicates in
      the configuration (otherwise the goal of preventing cross-connection
      will not be reached), so we add to config.py more than just a simple
      call to utils.GenerateSecret().
      
      The patch does not yet enable the use of the secrets.
      
      Reviewed-by: imsnah
      f9518d38
    • Iustin Pop's avatar
      Implement job summary in gnt-job list · 60dd1473
      Iustin Pop authored
      It is not currently possibly to show a summary of the job in the output
      of “gnt-job list”. The closes is listing the whole opcode(s), but that
      is too verbose. Also, the default output (id, status) is not very
      useful, unless one looks for (and knows about) an exact job ID.
      
      The patch adds a “summary” description of a job composed of the list of
      OP_ID of the individual opcodes. Moreover, if an opcode has a ‘logical’
      target in a certain opcode field (e.g. start instance has the instance
      name as the target), then it is included in the formatting also. It's
      easier to explain via a sample output:
      
      gnt-job list
      ID Status  Summary
      1  error   NODE_QUERY
      2  success NODE_ADD(gnta2)
      3  success CLUSTER_QUERY
      4  success NODE_REMOVE(gnta2.example.com)
      5  error   NODE_QUERY
      6  success NODE_ADD(gnta2)
      7  success NODE_QUERY
      8  success OS_DIAGNOSE
      9  success INSTANCE_CREATE(instance1.example.com)
      10 success INSTANCE_REMOVE(instance1.example.com)
      11 error   INSTANCE_CREATE(instance1.example.com)
      12 success INSTANCE_CREATE(instance1.example.com)
      13 success INSTANCE_SHUTDOWN(instance1.example.com)
      14 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
      15 error   INSTANCE_CREATE(instance2.example.com)
      16 error   INSTANCE_CREATE(instance2.example.com)
      17 success INSTANCE_CREATE(instance2.example.com)
      18 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
      19 success INSTANCE_ACTIVATE_DISKS(instance2.example.com)
      20 success INSTANCE_SHUTDOWN(instance1.example.com)
      21 success INSTANCE_SHUTDOWN(instance2.example.com)
      
      This is done by a simple change to the opcode classes, which allows an
      opcode to format itself. The additional function is small enough that it
      can go in opcodes.py, where it could also be used by a client if needed.
      
      Reviewed-by: imsnah
      60dd1473
    • Iustin Pop's avatar
      Nicely sort the job list · 3b87986e
      Iustin Pop authored
      Unless we decide to change the job identifiers to integer, we should at
      least sort the list returned by _GetJobIDsUnlocked.
      
      Reviewed-by: imsnah
      3b87986e
  8. Sep 28, 2008
    • Iustin Pop's avatar
      Move the pseudo-secret generation to utils.py · 33081d90
      Iustin Pop authored
      The bootstrap code needs a pseudo-secret and this is currently generated
      inside the InitGanetiServerSetup function. Since more users will need
      this, move it to utils.py
      
      Reviewed-by: ultrotter
      33081d90
Loading