Skip to content
Snippets Groups Projects
  1. Oct 16, 2008
    • Iustin Pop's avatar
      Add an interface for the drain flag changes/query · 3ccafd0e
      Iustin Pop authored
      This adds the set/reset in the jqueue and luxi modules, and a way to
      query it in OpQueryConfigValues, and also the comand line interface for
      it:
      $ gnt-cluster queue info
      The drain flag is unset
      $ gnt-cluster queue drain
      $ gnt-cluster queue info
      The drain flag is set
      $ gnt-cluster queue undrain
      $ gnt-cluster queue info
      The drain flag is unset
      
      The choice of making the setting via luxi and not an opcode is that
      opcodes can't be executed when drained, but we don't query via luxi
      since in the future it might become a cluster property as opposed to a
      node one.
      
      Reviewed-by: imsnah
      3ccafd0e
  2. Oct 15, 2008
  3. Oct 10, 2008
    • Iustin Pop's avatar
      Convert rpc module to RpcRunner · 72737a7f
      Iustin Pop authored
      This big patch changes the call model used in internode-rpc from
      standalong function calls in the rpc module to via a RpcRunner class,
      that holds all the methods. This can be used in the future to enable
      smarter processing in the RPC layer itself (some quick examples are not
      setting the DiskID from cmdlib code, but only once in each rpc call,
      etc.).
      
      There are a few RPC calls that are made outside of the LU code, and
      these calls are left as staticmethods, so they can be used without a
      class instance (which requires a ConfigWriter instance).
      
      Reviewed-by: imsnah
      72737a7f
  4. Oct 07, 2008
    • Iustin Pop's avatar
      Implement job 'waiting' status · e92376d7
      Iustin Pop authored
      Background: when we have multiple jobs in the queue (more than just a
      few), many of the jobs (up to the number of threads) will be in state
      'running', although many of them could be actually blocked, waiting for
      some locks. This is not good, as one cannot easily see what is
      happening.
      
      The patch extends the opcode/job possible statuses with another one,
      waiting, which shows that the LU is in the acquire locks phase. The
      mechanism for doing so is simple, we initialize (in the job queue) the
      opcode with OP_STATUS_WAITLOCK, and when the processor is ready to give
      control to the LU's Exec, it will call a notifier back into the
      _JobQueueWorker that sets the opcode status to OP_STATUS_RUNNING (with
      the proper queue locking). Because this mechanism does not save the job,
      all opcodes on disk will be in status WAITLOCK and not RUNNING anymore,
      so we also change the load sequence to consider WAITLOCK as RUNNING.
      
      With the patch applied, creating in parallel (via burnin) five instances
      on a five node cluster shows that only two are executing, while three
      are waiting for locks.
      
      Reviewed-by: imsnah
      e92376d7
  5. Oct 06, 2008
    • Iustin Pop's avatar
      Implement job auto-archiving · 07cd723a
      Iustin Pop authored
      This patch adds a new luxi call that implements auto-archiving of jobs
      older than a certain age (or -1 for all completed jobs), and the gnt-job
      command that makes use of this (with 'all' for -1).
      
      Reviewed-by: imsnah
      07cd723a
  6. Oct 01, 2008
    • Michael Hanselmann's avatar
      Convert ganeti-master · a42872ff
      Michael Hanselmann authored
      Use simpleconfig instead of ssconf.
      
      Reviewed-by: iustinp
      a42872ff
    • Michael Hanselmann's avatar
      Add new query to get cluster config values · ae5849b5
      Michael Hanselmann authored
      This can be used to retrieve certain cluster config values from
      within clients.
      
      OpDumpClusterConfig was not used anywhere, hence I'm just reusing
      it. The way ConfigWriter.DumpConfig returned the configuration
      was not thread-safe, anyway (no deepcopy).
      
      Reviewed-by: iustinp
      ae5849b5
  7. Sep 09, 2008
    • Iustin Pop's avatar
      Implement master startup safety check · 36205981
      Iustin Pop authored
      This is an initial version of the master startup checks. It's a very
      rudimentary change, however in normal usage (an old master was started,
      the rest of the cluster is functioning normally) it will succeed in
      preventing wrong startups.
      
      Reviewed-by: imsnah
      36205981
  8. Aug 29, 2008
    • Iustin Pop's avatar
      Make WaitForJobChanges deal with long jobs · 5c735209
      Iustin Pop authored
      This patch alters the WaitForJobChanges luxi-RPC call to have a
      configurable timeout, so that the call behaves nicely with long jobs
      that have no update.
      
      We do this by adding a timeout parameter in the RPC call, and returning
      a special constant when the timeout is reached without an update. The
      luxi client will repeatedly call the WaitForJobChanges until it gets a
      real change. The timeout is hardcoded as half the RWTO value.
      
      The patch also removes an unused variable (new_state) from the
      WaitForJobChanges method.
      
      Reviewed-by: imsnah,ultrotter
      5c735209
  9. Aug 27, 2008
    • Michael Hanselmann's avatar
      Make sure that client programs get all messages · 6c5a7090
      Michael Hanselmann authored
      This is a large patch, but I can't figure out how to split it without
      breaking stuff. The old way of getting messages by always getting the
      last one didn't bring all messages to the client if they were added
      too fast, thereby making commands like “gnt-cluster verify” less than
      useful. These changes now introduce some sort a serial number per
      log entry to keep track what message a client already received. They
      also remove the log lock per opcode to make reading log entries thread
      safe.
      
      Reviewed-by: ultrotter
      6c5a7090
  10. Aug 18, 2008
    • Michael Hanselmann's avatar
      Use Linux-specific way to name master socket · 9894ece7
      Michael Hanselmann authored
      By using this Linux-specific way we don't have to care about removing the
      socket file when quitting or starting (after an unclean shutdown). For a
      more detailed description, see the comment in the patch.
      
      Reviewed-by: schreiberal
      9894ece7
  11. Aug 11, 2008
  12. Aug 08, 2008
  13. Aug 06, 2008
  14. Jul 30, 2008
    • Iustin Pop's avatar
      Unify SetupDaemon/SetupLogging · 59f187eb
      Iustin Pop authored
      The 'old-style' info, error, debug logs do not make much sense. This
      patch unifies the SetupLogging and SetupDaemon functions. As a result,
      all the commands logs to a 'commands.log' file.
      
      The patch also changes the log setup to keep going if there's an error
      in setting up the file logging but we're logging to stderr.
      
      Also, burnin now logs to its own file (burnin.log).
      
      Reviewed-by: ultrotter
      59f187eb
    • Iustin Pop's avatar
      Rework master startup/shutdown/failover · b1b6ea87
      Iustin Pop authored
      This (big) patch reworks the master startup/shutdown and the fixes the
      master failover.
      
      What does the patch do?
      
      For master start/stop:
        - remove the old ganeti-master script and its associated man page
        - moves the ip start/stop directly into the backend.(Start|Stop)Master
        - adds start/stop of the master/rapi daemon into these functions,
          selectively based on the start/stop arguments
        - makes the master call via rpc StartMaster(start_daemons=False) to
          the local node so that the master IP is started
        - and finally changes the example init.d script to directly start and
          stop all three daemons, since they do the right thing (depending on
          master/not master role)
      
      For master failover:
        - moves the code from LUMasterFailover into bootstrap.MasterFailover,
          since we need to start/stop the master during this operation and
          thus it can't be executed from the master
        - removes the LUMasterFailover and its associated opcode
      
      Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not
      master' are not seen during startup on non-master nodes.
      
      Reviewed-by: ultrotter
      b1b6ea87
    • Iustin Pop's avatar
      Implement checking for the master role in rapi · 5675cd1f
      Iustin Pop authored
      This patch moves the CheckMaster function from ganeti-masterd to ssconf
      (most logical place, it cannot go in utils since we would have recursive
      imports between ssconf and utils) and changes ganeti-rapi to also call
      this function.
      
      This is needed so that starting ganeti-rapi on a non-master node does
      the right thing.
      
      Reviewed-by: ultrotter
      5675cd1f
  15. Jul 29, 2008
  16. Jul 24, 2008
  17. Jul 23, 2008
  18. Jul 21, 2008
  19. Jul 14, 2008
  20. Jul 10, 2008
  21. Jul 09, 2008
    • Iustin Pop's avatar
      Fix double-logging in daemons · ff5fac04
      Iustin Pop authored
      Currently, in debug mode, both the logfile handler and the stderr
      handler will log debug messages. Since the stderr is redirected to the
      same logfile (to catch non-logged errors), it means log entries are
      doubled.
      
      The patch adds an extra parameter to the logger.SetupDaemon() function
      that allows disabling of the stderr logging. The master and node daemon
      will use this to enable stderr logging only when running in foreground.
      
      Reviewed-by: imsnah
      ff5fac04
    • Iustin Pop's avatar
      Remove the old locking functions · d4fa5c23
      Iustin Pop authored
      This removes (hopefully) all traces of the old locking functions and
      uses.
      
      Reviewed-by: imsnah
      d4fa5c23
    • Michael Hanselmann's avatar
      Remove old job queue code · 2467e0d3
      Michael Hanselmann authored
      Reviewed-by: iustinp
      2467e0d3
    • Michael Hanselmann's avatar
      Change masterd/client RPC protocol · 0bbe448c
      Michael Hanselmann authored
      - Introduce abstraction class on client side
      - Use constants for method names
      - Adopt legacy function SubmitOpCode to use it
      
      Reviewed-by: iustinp
      0bbe448c
    • Michael Hanselmann's avatar
      Make luxi RPC more flexible · 3d8548c4
      Michael Hanselmann authored
      - Use constants for dict entries
      - Handle exceptions on server side
      - Rename client function to CallMethod to match server side naming
      
      Reviewed-by: iustinp
      3d8548c4
    • Michael Hanselmann's avatar
      Instantiate new job queue in master daemon · 50a3fbb2
      Michael Hanselmann authored
      Reviewed-by: iustinp
      50a3fbb2
  22. Jul 03, 2008
    • Iustin Pop's avatar
      Add custom logging setup for daemons · 3b316acb
      Iustin Pop authored
      It's better for daemons if:
        - they log only to one log file
        - the log level is included
        - for debug runs, the filename/line number is included
      
      This patch moves the custom formatter from the watcher to the logging
      module and generalizes it; then it changes the master daemon to use this
      function instead of the generic logging (which might be deprecated
      anyway in the future).
      
      Reviewed-by: imsnah
      3b316acb
  23. Jul 02, 2008
  24. Jul 01, 2008
Loading