  1. Sep 13, 2010
  2. Sep 10, 2010
  3. Sep 07, 2010
  4. Aug 24, 2010
  5. Aug 19, 2010
    • jqueue: Remove lock status field · 9bdab621
      Michael Hanselmann authored
      
      With the job queue changes for Ganeti 2.2, watched and queried jobs are
      loaded directly from disk, rendering the in-memory “lock_status” field
      useless. Writing it to disk would be possible, but has a huge cost at
      runtime (when tested, processing 1'000 opcodes involved 4'000 additional
      writes to job files, even with replication turned off).
      
      Using an additional in-memory dictionary to just manage this field turned
      out to be a complicated task due to the necessary locking.
      
      The plan is to introduce a more generic lock debugging mechanism in the
      near future. Hence the decision is to remove this field now instead of
      spending a lot of time to make it work again.
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
  6. Aug 18, 2010
  7. Aug 17, 2010
  8. Jul 30, 2010
    • Fix a few job archival issues · aa9f8167
      Iustin Pop authored
      
      This patch fixes two issues with job archival. First, LoadJobFromDisk
      can return None when there is no such job, and we shouldn't add None to
      the job list; we can't anyway, as this raises an exception:
      
        node1# gnt-job archive foo
        Unhandled protocol error while talking to the master daemon:
        Caught exception: cannot create weak reference to 'NoneType' object
      
      After fixing this, job archival of missing jobs will just continue
      silently, so we modify gnt-job archive to log jobs which were not
      archived and to return exit code 1 for any missing jobs.
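
The behaviour described above can be sketched as follows. This is a minimal illustration and not the actual Ganeti code: `Job`, `load_job_from_disk`, `archive_jobs` and `main` are hypothetical names, the `store` dictionary stands in for the on-disk queue, and a `weakref.WeakSet` is assumed as the container that rejects None. The sketch is written for Python 3.

```python
import weakref


class Job:
    """Hypothetical stand-in for a queued job object."""

    def __init__(self, job_id):
        self.id = job_id


def load_job_from_disk(store, job_id):
    """Return the job for job_id, or None if there is no such job."""
    return store.get(job_id)


def archive_jobs(store, job_ids):
    """Archive the jobs that exist; return (archived_ids, missing_ids)."""
    archived, missing = [], []
    # Adding None to a WeakSet raises TypeError ("cannot create weak
    # reference to 'NoneType' object"), hence the explicit check below.
    tracked = weakref.WeakSet()
    for job_id in job_ids:
        job = load_job_from_disk(store, job_id)
        if job is None:
            missing.append(job_id)
            continue
        tracked.add(job)
        archived.append(job_id)
    return archived, missing


def main(store, job_ids):
    """Log jobs that were not archived; exit code 1 if any was missing."""
    _archived, missing = archive_jobs(store, job_ids)
    for job_id in missing:
        print("Job %s could not be archived" % job_id)
    return 1 if missing else 0
```

Returning the missing IDs separately lets the command-line layer decide how to report them, instead of the archival loop failing on the first unknown job.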
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  9. Jul 29, 2010
    • Change handling of non-Ganeti errors in jqueue · 599ee321
      Iustin Pop authored
      
      Currently, if a job execution raises a Ganeti-specific error (i.e.
      subclass of GenericError), then we encode it as (error class, [error
      args]). This matches the RAPI documentation.
      
      However, if we get a non-Ganeti error, then we encode it as simply
      str(err), a single string. This means that the opresult field does not
      conform to the RAPI docs, and thus it's hard to reliably parse the
      job results.
      
      This patch changes the encoding of a failed job (via failure) to always
      be an OpExecError, so that we always encode it properly. For the command
      line interface, the behaviour is the same, as any non-Ganeti errors get
      re-encoded as OpExecError anyway. For the RAPI clients, it only means
      that we always present the same type for results. The actual error value
      is the same, since the err.args is either way str(original_error);
      compare the original (doesn't contain the ValueError):
      
        "opresult": [
          "invalid literal for int(): aa"
        ],
      
      with:
      
        "opresult": [
          [
            "OpExecError",
            [
              "invalid literal for int(): aa"
            ]
          ]
        ],
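
A minimal sketch of the encoding rule described above, assuming locally defined stand-ins for the error classes (the `GenericError` and `OpExecError` classes below are simplified placeholders, and `encode_exception` is a hypothetical helper name, not the function used in jqueue). Written for Python 3.

```python
import json


class GenericError(Exception):
    """Placeholder for the Ganeti base error class."""


class OpExecError(GenericError):
    """Placeholder for the error used to wrap opcode execution failures."""


def encode_exception(err):
    """Encode an error as (error class name, [error args]).

    Non-Ganeti errors are first wrapped in OpExecError(str(err)), so the
    result always has the same shape.
    """
    if not isinstance(err, GenericError):
        err = OpExecError(str(err))
    return (err.__class__.__name__, list(err.args))


# A plain ValueError ends up encoded as an OpExecError.
encoded = encode_exception(ValueError("some failure"))
print(json.dumps({"opresult": [encoded]}, indent=2))
```

With this rule a client can always unpack one result entry into a class name and an argument list, whatever the original exception type was.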
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • workerpool: Change signature of AddTask function to not use *args · b2e8a4d9
      Michael Hanselmann authored
      
      By changing it to a normal parameter, which must be a sequence, we can
      start using keyword parameters.
      
      Before this patch all arguments to “AddTask(self, *args)” were passed as
      arguments to the worker's “RunTask” method. Priorities, which should be
      optional and will be implemented in a future patch, must be passed as a keyword
      parameter. This means “*args” can no longer be used as one can't combine *args
      and keyword parameters in a clean way:
      
      >>> def f(name=None, *args):
      ...   print "%r, %r" % (args, name)
      ...
      >>> f("p1", "p2", "p3", name="thename")
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      TypeError: f() got multiple values for keyword argument 'name'
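
The resulting calling convention might look like the sketch below. It is an illustration under assumptions: `WorkerPool`, `add_task` and `run_next` are simplified hypothetical names, and the `priority` keyword is only stored, not used for ordering, since the source says priorities come in a later patch. Written for Python 3.

```python
import queue


class WorkerPool:
    """Simplified pool that only queues tasks (no worker threads)."""

    def __init__(self):
        self._tasks = queue.Queue()

    def add_task(self, args, priority=None):
        """Queue one task.

        args must be a sequence; its items become the positional
        arguments of the run_task callable. priority is accepted as a
        keyword parameter but not yet used for ordering.
        """
        if not isinstance(args, (list, tuple)):
            raise TypeError("args must be a list or tuple")
        self._tasks.put((priority, tuple(args)))

    def run_next(self, run_task):
        """Take the next queued task and pass its arguments to run_task."""
        _priority, args = self._tasks.get_nowait()
        return run_task(*args)


pool = WorkerPool()
pool.add_task(("p1", "p2", "p3"), priority=10)
print(pool.run_next(lambda *args: args))
```

Because the task arguments travel as one sequence, further keyword parameters can be added to `add_task` without colliding with them.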
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
  10. Jul 16, 2010
    • Implement lock names for debugging purposes · 7f93570a
      Iustin Pop authored
      
      This patch adds lock names to SharedLocks and LockSets, that can be used
      later for displaying the actual locks being held/used in places where we
      only have the lock, and not the entire context of the locking operation.
      
      Since I realized that the production code doesn't call LockSet with the
      proper members= syntax, but directly as positional parameters, I've
      converted this (and the arguments to GlobalLockManager) into positional
      arguments.
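
As a rough illustration of the idea, a named lock and a named lock set could look like this. The classes below are hypothetical and much simpler than SharedLock and LockSet (a plain mutex instead of a shared/exclusive lock), and the "setname/member" naming scheme is an assumption made for this sketch. Written for Python 3.

```python
import threading


class NamedLock:
    """Plain mutex that carries a name for debugging output."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()

    def __repr__(self):
        return "<NamedLock %r>" % self.name


class NamedLockSet:
    """Set of named locks; members and name are positional arguments."""

    def __init__(self, members, name):
        self.name = name
        # Assumed naming scheme: "<set name>/<member name>".
        self._locks = dict((m, NamedLock("%s/%s" % (name, m)))
                           for m in members)

    def get(self, member):
        return self._locks[member]

    def lock_names(self):
        return sorted(lock.name for lock in self._locks.values())


nodes = NamedLockSet(["node1", "node2"], "nodes")
print(nodes.lock_names())
```

With a name attached, code that only holds a reference to the lock object can still report which lock it is.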
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  11. Jul 15, 2010
    • jqueue: Factorize code waiting for job changes · 989a8bee
      Michael Hanselmann authored
      
      By splitting the _WaitForJobChangesHelper class into multiple smaller
      classes, we gain in several places:
      
      - Simpler code, less interaction between functions and variables
      - Easy to unittest (close to 100% coverage)
      - Waiting for job changes has no direct knowledge of the queue anymore
        (it doesn't reference queue functions anymore, especially not private
        ones)
      - Activate inotify only if there was no change at the beginning (and
        checking again right away to avoid race conditions)
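
The last point in the list above can be sketched as follows. This is not the code from the patch: `JobChangeChecker`, `wait_for_changes`, `load_state` and `make_watcher` are hypothetical names, and the watcher is any object with `wait()` and `close()` methods standing in for inotify. Written for Python 3.

```python
class JobChangeChecker:
    """Compares the current job state against the previously seen one."""

    def __init__(self, prev_state):
        self._prev = prev_state

    def __call__(self, load_state):
        """Return the new state if it changed, otherwise None."""
        state = load_state()
        if state != self._prev:
            return state
        return None


def wait_for_changes(checker, load_state, make_watcher, max_waits):
    """Return the changed state, or None if nothing changed in time."""
    # First check without any watcher; often the change is already there.
    result = checker(load_state)
    if result is not None:
        return result

    # Only now set up the (comparatively expensive) file watcher.
    watcher = make_watcher()
    try:
        # Check again right away: a change may have happened between
        # the first check and the watcher becoming active.
        result = checker(load_state)
        waits = 0
        while result is None and waits < max_waits:
            watcher.wait()
            result = checker(load_state)
            waits += 1
        return result
    finally:
        watcher.close()
```

Since the checker only receives a `load_state` callable, it can be unit-tested with plain functions and a fake watcher, without a queue object.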
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  12. Jul 12, 2010
  13. Jul 09, 2010
  14. Jul 06, 2010
    • Fix opcode transition from WAITLOCK to RUNNING · 271daef8
      Iustin Pop authored
      
      With the recent changes in the job queue, an old bug surfaced: we never
      serialized the status change when in NotifyStart, thus a crash of the
      master would have left the job queue oblivious to the fact that the job
      was actually running.
      
      In the previous implementation, queries against the job status were
      using the in-memory object, so they 'saw' and reported correctly the
      running status. But the new implementation just looks at the on-disk
      version, and thus didn't see this transition.
      
      The patch also moves NotifyStart to a decorator-based version (like the
      other functions), which generates a lot of churn in the diff, sorry.
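
A minimal sketch of the fix, assuming a hypothetical `locked` decorator and a dictionary standing in for the job files on disk (the real decorator, class and serialization code in jqueue differ; every name below is illustrative only). Written for Python 3.

```python
import functools
import json
import threading


def locked(fn):
    """Hypothetical decorator: run the method while holding self._lock."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return fn(self, *args, **kwargs)
    return wrapper


class OpExecCallbacks:
    """Illustrative callbacks object for one running opcode."""

    def __init__(self, job, disk):
        self._job = job      # dict with "id" and "status" keys
        self._disk = disk    # dict standing in for on-disk job files
        self._lock = threading.Lock()

    @locked
    def notify_start(self):
        """Move the opcode from waitlock to running and persist it."""
        if self._job["status"] != "waitlock":
            raise ValueError("unexpected status %r" % self._job["status"])
        self._job["status"] = "running"
        # The step that was missing: serialize the new status, so that
        # readers of the on-disk version see the transition.
        self._disk[self._job["id"]] = json.dumps(self._job)


disk = {}
job = {"id": "42", "status": "waitlock"}
OpExecCallbacks(job, disk).notify_start()
print(disk["42"])
```

Writing the status at the moment of the transition keeps the on-disk view, which is what queries now read, in step with the in-memory object.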
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  15. Jun 28, 2010
  16. Jun 23, 2010
  17. Jun 17, 2010
  18. Jun 15, 2010
  19. Jun 11, 2010