Commit 271daef8 authored by Iustin Pop

Fix opcode transition from WAITLOCK to RUNNING



With the recent changes in the job queue, an old bug surfaced: we never
serialized the status change when in NotifyStart, thus a crash of the
master would have left the job queue oblivious to the fact that the job
was actually running.

In the previous implementation, queries against the job status used
the in-memory object, so they 'saw' and correctly reported the
running status. But the new implementation only looks at the on-disk
version, and thus didn't see this transition.
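
The failure mode described above can be reproduced with a small toy model. This is only a sketch under stated assumptions: the `Job` and `Queue` classes, the JSON file layout, and the status strings are hypothetical stand-ins and are not the project's actual job queue code. It shows that a status changed only in memory is invisible to a query that reads the serialized copy, and that writing the job out after the change fixes this.

```python
import json
import os
import tempfile

# Hypothetical status strings, for illustration only.
WAITLOCK = "waiting"
RUNNING = "running"


class Job:
    """Toy in-memory job object."""

    def __init__(self, job_id, status):
        self.job_id = job_id
        self.status = status


class Queue:
    """Toy queue that serializes jobs to one JSON file each."""

    def __init__(self, directory):
        self._dir = directory

    def _path(self, job_id):
        return os.path.join(self._dir, "job-%s.json" % job_id)

    def update_job(self, job):
        # Persist the current in-memory state of the job.
        with open(self._path(job.job_id), "w") as fh:
            json.dump({"status": job.status}, fh)

    def query_status_on_disk(self, job_id):
        # Queries read only the serialized copy.
        with open(self._path(job_id)) as fh:
            return json.load(fh)["status"]


def notify_start_buggy(job):
    # The transition happens in memory only.
    job.status = RUNNING


def notify_start_fixed(queue, job):
    # The transition is also written to disk.
    job.status = RUNNING
    queue.update_job(job)


queue = Queue(tempfile.mkdtemp())
job = Job(1, WAITLOCK)
queue.update_job(job)

notify_start_buggy(job)
stale = queue.query_status_on_disk(1)  # on-disk copy missed the transition

notify_start_fixed(queue, job)
fresh = queue.query_status_on_disk(1)  # on-disk copy now matches memory
```

In this model `stale` still holds the lock-waiting status while the in-memory object already says running, which is the discrepancy a master crash would have made permanent.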

The patch also moves NotifyStart to a decorator-based version (like the
other functions), which generates a lot of churn in the diff, sorry.
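
The move from explicit acquire/release to a decorator can be sketched as follows. This is a minimal illustration, assuming a hypothetical `synchronized` decorator and a plain exclusive `threading.Lock`; it is not the `locking.ssynchronized` implementation, which takes a `shared` argument that this sketch does not model.

```python
import functools
import threading


def synchronized(lock_attr):
    """Hypothetical decorator: hold the lock stored in the instance
    attribute named `lock_attr` for the duration of the call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(self, *args, **kwargs):
            lock = getattr(self, lock_attr)
            lock.acquire()
            try:
                return fn(self, *args, **kwargs)
            finally:
                # Released even if fn raises, like the old try/finally.
                lock.release()
        return wrapper
    return decorator


class Callbacks:
    def __init__(self):
        self._queue_lock = threading.Lock()
        self.status = "waiting"
        self.held_during_call = None

    def notify_start_manual(self):
        # Old style: the locking boilerplate wraps the whole body.
        self._queue_lock.acquire()
        try:
            self.status = "running"
        finally:
            self._queue_lock.release()

    @synchronized("_queue_lock")
    def notify_start_decorated(self):
        # New style: the body loses one indentation level.
        self.held_during_call = self._queue_lock.locked()
        self.status = "running"


cb = Callbacks()
cb.notify_start_decorated()
```

Because the body is no longer nested under `try:`, every line in it is re-indented, which is where the churn in the diff comes from.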

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
parent e8cd390d
@@ -418,6 +418,7 @@ class _OpExecCallbacks(mcpu.OpExecCbBase):
     self._job = job
     self._op = op
 
+  @locking.ssynchronized(_QUEUE, shared=1)
   def NotifyStart(self):
     """Mark the opcode as running, not lock-waiting.
 
@@ -427,22 +428,21 @@ class _OpExecCallbacks(mcpu.OpExecCbBase):
     Processor.ExecOpCode) set to OP_STATUS_WAITLOCK.
 
     """
-    self._queue.acquire(shared=1)
-    try:
-      assert self._op.status in (constants.OP_STATUS_WAITLOCK,
-                                 constants.OP_STATUS_CANCELING)
+    assert self._op.status in (constants.OP_STATUS_WAITLOCK,
+                               constants.OP_STATUS_CANCELING)
 
-      # All locks are acquired by now
-      self._job.lock_status = None
+    # All locks are acquired by now
+    self._job.lock_status = None
 
-      # Cancel here if we were asked to
-      if self._op.status == constants.OP_STATUS_CANCELING:
-        raise CancelJob()
+    # Cancel here if we were asked to
+    if self._op.status == constants.OP_STATUS_CANCELING:
+      raise CancelJob()
 
-      self._op.status = constants.OP_STATUS_RUNNING
-      self._op.exec_timestamp = TimeStampNow()
-    finally:
-      self._queue.release()
+    self._op.status = constants.OP_STATUS_RUNNING
+    self._op.exec_timestamp = TimeStampNow()
+
+    # And finally replicate the job status
+    self._queue.UpdateJobUnlocked(self._job)
 
   @locking.ssynchronized(_QUEUE, shared=1)
   def _AppendFeedback(self, timestamp, log_type, log_msg):
......