  1. 27 Aug, 2008 1 commit
    • Make sure that client programs get all messages · 6c5a7090
      Michael Hanselmann authored
      This is a large patch, but I can't figure out how to split it without
      breaking stuff. The old way of getting messages by always getting the
      last one didn't bring all messages to the client if they were added
      too fast, thereby making commands like “gnt-cluster verify” less than
      useful. These changes now introduce a serial number per log entry
      to keep track of which messages a client has already received. They
      also remove the log lock per opcode to make reading log entries thread
      Reviewed-by: ultrotter
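      The serial-number idea described above can be illustrated with a
      minimal sketch. It is not the code from this patch: the class name
      OpLog and its methods are hypothetical, and the sketch is
      single-threaded, so it says nothing about the locking changes.

```python
class OpLog(object):
    """Hypothetical log that gives every entry an increasing serial number."""

    def __init__(self):
        self._entries = []  # list of (serial, message) tuples
        self._serial = 0

    def append(self, message):
        """Store a message and return the serial number assigned to it."""
        self._serial += 1
        self._entries.append((self._serial, message))
        return self._serial

    def entries_after(self, last_serial):
        """Return all entries newer than the serial the client saw last."""
        return [entry for entry in self._entries if entry[0] > last_serial]


log = OpLog()
for text in ("step 1", "step 2", "step 3"):
    log.append(text)

# The client remembers the last serial it received and asks for newer ones,
# so messages added in quick succession are not skipped.
last_seen = 1
new_entries = log.entries_after(last_seen)
for serial, message in new_entries:
    last_seen = serial
```

      With "always fetch the last message" polling, "step 2" would have
      been lost here; with serial numbers the client receives both
      missing entries.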
  2. 11 Aug, 2008 2 commits
  3. 08 Aug, 2008 3 commits
  4. 06 Aug, 2008 3 commits
  5. 05 Aug, 2008 1 commit
  6. 04 Aug, 2008 1 commit
  7. 31 Jul, 2008 2 commits
  8. 30 Jul, 2008 2 commits
    • Fix pylint-detected issues · 38206f3c
      Iustin Pop authored
      This is mostly:
        - whitespace fix (space at EOL in some files, not all, broken
          indentation, etc)
        - variable names overriding others (one is a real bug in there)
        - too-long-lines
        - cleanup of most unused imports (not all)
      Reviewed-by: ultrotter
    • Rewrite job queue · 85f03e0d
      Michael Hanselmann authored
      We found several issues in the old job queue implementation. It had race
      conditions, deadlocks and other deficiencies.
      Short summary:
      - _QueuedOpCode and _QueuedJob are now more or less data structures with a few
        utility functions. __Setup is gone.
      - DiskJobStorage and JobQueue classes merged into one to reduce code complexity.
      - One lock in JobQueue for almost everything. There's also a lock per opcode
        for log messages.
      Reviewed-by: iustinp
  9. 29 Jul, 2008 1 commit
  10. 28 Jul, 2008 2 commits
  11. 25 Jul, 2008 1 commit
  12. 24 Jul, 2008 2 commits
  13. 23 Jul, 2008 6 commits
    • Move code formatting job ID into a base class · ce594241
      Michael Hanselmann authored
      A later patch will add a memory based job storage class, hence this
      code is going into a separate class. It also changes the number format
      to always use at least 10 digits, allowing up to 9'999'999'999 jobs to
      be sorted without using a custom function.
      Reviewed-by: iustinp
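      A short sketch of why a fixed minimum width helps: zero-padded
      strings sort in numeric order with the default string comparison.
      The helper name format_job_id is hypothetical, not taken from the
      patch.

```python
def format_job_id(number):
    """Format a job number with at least 10 digits (hypothetical helper)."""
    return "%010d" % number


numbers = (10, 9, 1234567890, 2)

# Zero-padded IDs: plain string sorting matches numeric order.
padded_sorted = sorted(format_job_id(n) for n in numbers)

# Unpadded IDs: string sorting puts "10" before "2".
unpadded_sorted = sorted(str(n) for n in numbers)
```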
    • Rename JobStorage to DiskJobStorage · 21cc1fbd
      Michael Hanselmann authored
      Reviewed-by: iustinp
    • Fix logging with string job IDs · 205d71fd
      Michael Hanselmann authored
      The job ID is now a string, hence logging must use %s instead of %d.
      Reviewed-by: iustinp
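      As an illustration of the problem (the message text is made up),
      formatting a string job ID with %d raises a TypeError, while %s
      accepts it:

```python
job_id = "42"  # job IDs are strings now

# %s works for string job IDs.
message = "Processing job %s" % job_id

# %d expects a number and fails for a string.
try:
    "Processing job %d" % job_id
    failed = False
except TypeError:
    failed = True
```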
    • Make job ID a string · 3be9a705
      Michael Hanselmann authored
      The docstring says that _NewSerialUnlocked returns “a string
      representing the job identifier”. Until now it returned an
      integer and this patch changes it.
      Reviewed-by: iustinp
    • Distribute the queue serial file after each update · c3f0a12f
      Iustin Pop authored
      This patch adds distribution of the queue serial file after each write
      to it (but before a new job is created and written with that ID, and
      before a response is returned, so we should be safe from crashes in
      Currently it only logs if a node cannot be contacted; it should
      abort if errors are seen on more than 50% of the nodes.
      Reviewed-by: imsnah
    • Make the job storage init reuse a serial file · c4beba1c
      Iustin Pop authored
      This will be needed for master failover. If we don't have a valid queue
      directory, we need to reinitialize it, but we should keep the existing
      serial number.
      As such, we abstract the reading of the serial and if we find a valid
      serial, we do not reset it.
      Reviewed-by: imsnah
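      The "read the serial, keep it if valid" behaviour could look
      roughly like the sketch below. The function names, the file name
      "serial" and the file format (one integer per file) are
      assumptions for illustration, not the actual implementation.

```python
import os
import tempfile


def read_serial(path):
    """Return the serial stored in path, or None if missing or invalid."""
    try:
        with open(path) as fd:
            return int(fd.read().strip())
    except (IOError, OSError, ValueError):
        return None


def init_serial(path):
    """Initialise the serial file, keeping an existing valid serial."""
    serial = read_serial(path)
    if serial is None:
        # No usable serial found: start from zero.
        serial = 0
        with open(path, "w") as fd:
            fd.write("%d\n" % serial)
    return serial


queue_dir = tempfile.mkdtemp()
serial_file = os.path.join(queue_dir, "serial")  # hypothetical file name

first = init_serial(serial_file)      # no file yet, so the serial is reset

with open(serial_file, "w") as fd:    # simulate jobs having been created
    fd.write("17\n")

second = init_serial(serial_file)     # valid serial found, so it is kept
```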
  14. 22 Jul, 2008 1 commit
    • Make argument to CleanCacheUnlocked mandatory · 57f8615f
      Michael Hanselmann authored
      Not passing the argument means it has the value None. Iterating None
      doesn't work:
        >>> "123" in None
        Traceback (most recent call last):
          File "<stdin>", line 1, in ?
        TypeError: iterable argument required
      Hence I rename it from "exceptions", which may be confusing, to
      "exclude" and make it mandatory. To clean all cache entries, pass
      an empty list.
      Reviewed-by: iustinp
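      A minimal sketch of a cache-cleaning function with a mandatory
      "exclude" argument. The function name clean_cache and the
      dictionary-based cache are hypothetical stand-ins for
      CleanCacheUnlocked and the real cache.

```python
def clean_cache(cache, exclude):
    """Remove every cached job whose ID is not listed in exclude.

    exclude is mandatory; pass an empty list to remove all entries.
    """
    for job_id in list(cache):
        if job_id not in exclude:
            del cache[job_id]


cache = {"1": "job one", "2": "job two", "3": "job three"}

clean_cache(cache, ["2"])       # keep only job "2"
kept_after_first = sorted(cache)

clean_cache(cache, [])          # empty list: clean everything
kept_after_second = sorted(cache)
```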
  15. 17 Jul, 2008 1 commit
  16. 15 Jul, 2008 1 commit
  17. 14 Jul, 2008 4 commits
    • First version of user feedback fixes · f1048938
      Iustin Pop authored
      This patch contains a raw version for fixing feedback_fn.
      The new mechanism works as follows:
        - instead of a per-Processor feedback_fn, there's one for each
          ExecOpCode, so that feedback for different opcodes goes via possibly
          different functions
        - each _QueuedOpCode gets a message buffer, a method for adding
          feedback and a method for retrieving (parts of) the feedback
        - the _QueuedJob object gets a new attribute that is equal to the
          index of the currently executing opcode
        - job queries get an extra parameter called 'ticker' that will return
          the latest message on the currently executing opcode
        - the cli.py job completion poll will show the new status if different
          from the old one
      Of course, quick messages will be lost, as currently only the latest one
      is available. Also changes between opcodes are not represented at all.
      Reviewed-by: imsnah
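      The mechanism listed above can be sketched as follows. The class
      and attribute names are simplified, hypothetical stand-ins for
      _QueuedOpCode and _QueuedJob; the real classes differ.

```python
class QueuedOpCode(object):
    """Opcode wrapper with a message buffer (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.messages = []

    def feedback(self, message):
        """Add a feedback message to the buffer."""
        self.messages.append(message)

    def retrieve(self, start=0):
        """Return the feedback messages from index start onwards."""
        return self.messages[start:]


class QueuedJob(object):
    """Job that remembers the index of the currently executing opcode."""

    def __init__(self, ops):
        self.ops = ops
        self.run_op_index = 0

    def ticker(self):
        """Return the latest message of the current opcode, if any."""
        messages = self.ops[self.run_op_index].messages
        if messages:
            return messages[-1]
        return None


job = QueuedJob([QueuedOpCode("first"), QueuedOpCode("second")])
job.ops[0].feedback("creating disks")
job.ops[0].feedback("starting instance")
latest = job.ticker()

job.run_op_index = 1      # the next opcode starts executing
after_switch = job.ticker()
```

      Only the latest message is visible through ticker(), which is why
      quick messages are lost with this scheme.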
    • Cache some jobs in memory · ac0930b9
      Iustin Pop authored
      This patch adds a caching mechanism to the JobStorage. Note that it
      does not make the memory cache authoritative.
      The algorithm is:
        - all jobs loaded from disk are entered in the cache
        - all new jobs are entered in the cache
        - at each job save (in UpdateJobUnlocked), jobs which are not
          executing or queued are removed from the cache
      The end effect is that running jobs will always be in the cache (which
      will fix the opcode log changes) and finished jobs will be kept for a
      while in the cache after being loaded.
      Reviewed-by: imsnah
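      The algorithm above can be sketched like this. The class name,
      the status strings and the method names are assumptions made for
      illustration; only the eviction rule follows the description.

```python
class JobCache(object):
    """Non-authoritative in-memory job cache (hypothetical sketch)."""

    # Assumed status names for jobs that must stay cached.
    ACTIVE = frozenset(["queued", "running"])

    def __init__(self):
        self.jobs = {}  # job ID -> status

    def add(self, job_id, status):
        """Enter a newly created or freshly loaded job in the cache."""
        self.jobs[job_id] = status

    def on_save(self):
        """On each job save, drop jobs that are not queued or running."""
        for job_id, status in list(self.jobs.items()):
            if status not in self.ACTIVE:
                del self.jobs[job_id]


cache = JobCache()
cache.add("1", "running")   # new job
cache.add("2", "success")   # finished job loaded from disk
before_save = sorted(cache.jobs)

cache.on_save()             # some job is saved
after_save = sorted(cache.jobs)
```

      The finished job stays cached from the time it is loaded until the
      next save, while the running job is never evicted.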
    • Fix JobStorage._GetJobIDsUnlocked · 8a70e415
      Iustin Pop authored
      The job ID returned must be an integer (and the regex enforces that),
      but we didn't convert it manually.
      Reviewed-by: imsnah
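      A sketch of the kind of conversion involved, assuming job files
      are named "job-<number>"; the pattern and the function name are
      hypothetical.

```python
import re

# Assumed file name pattern; the regex only admits digits as the ID.
JOB_FILE_RE = re.compile(r"^job-(\d+)$")


def get_job_ids(filenames):
    """Extract job IDs from file names and return them as sorted integers."""
    ids = []
    for name in filenames:
        match = JOB_FILE_RE.match(name)
        if match:
            # The regex guarantees digits, but the group is still a string
            # and has to be converted explicitly.
            ids.append(int(match.group(1)))
    return sorted(ids)


job_ids = get_job_ids(["job-10", "job-9", "serial", "job-2"])
```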
    • Change JobStorage to work with ids not filenames · 911a495b
      Iustin Pop authored
      Currently some of the functions in JobStorage work with filenames (which
      is an implementation detail and should only be used when dealing with
      the storage) and not with job IDs. We need to change this in order to
      implement a job cache.
      Reviewed-by: ultrotter
  18. 11 Jul, 2008 2 commits
    • Add experimental persistency to job queue · f1da30e6
      Michael Hanselmann authored
      It's not perfect and it's not finished, but it's a start.
      - Serial number is read only once, but written on each update
      - Jobs are kept only on disk (caching will be implemented)
      Reviewed-by: iustinp
    • Make "gnt-job list" work again · af30b2fd
      Michael Hanselmann authored
      "gnt-job list" was broken after my recent changes in the RPC
      between clients and the master. This patch makes it work again.
      Reviewed-by: iustinp
  19. 10 Jul, 2008 3 commits
  20. 09 Jul, 2008 1 commit