From 2915335f4e0a04ab58a8cab7271b8f11fa830787 Mon Sep 17 00:00:00 2001
From: Michael Hanselmann <hansmi@google.com>
Date: Fri, 15 Jul 2011 23:45:04 +0200
Subject: [PATCH] Add implementation details to design for chained jobs

As requested by Iustin.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
---
 doc/design-chained-jobs.rst | 80 +++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/doc/design-chained-jobs.rst b/doc/design-chained-jobs.rst
index 8cf870272..4061d96c7 100644
--- a/doc/design-chained-jobs.rst
+++ b/doc/design-chained-jobs.rst
@@ -115,6 +115,86 @@ Example data structures::
   }
 
 
+Implementation details
+----------------------
+
+Status while waiting for dependencies
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Jobs waiting for dependencies are certainly not in the queue anymore and
+therefore need to change their status from "queued". While waiting for
+opcode locks the job is in the "waiting" status (the constant is named
+``JOB_STATUS_WAITLOCK``, but the actual value is ``waiting``). There are the
+following possibilities:
+
+#. Introduce a new status, e.g. "waitdeps".
+
+   Pro:
+
+   - Clients know for sure a job is waiting for dependencies, not locks
+
+   Con:
+
+   - Code and tests would have to be updated/extended for the new status
+   - List of possible state transitions certainly wouldn't get simpler
+   - Breaks backwards compatibility, older clients might get confused
+
+#. Use existing "waiting" status.
+
+   Pro:
+
+   - No client changes necessary, less code churn (note that there are
+     clients which don't live in Ganeti core)
+   - Clients don't need to know the difference between waiting for a job
+     and waiting for a lock; it doesn't make a difference
+   - Fewer state transitions (see commit ``5fd6b69479c0``, which removed
+     many state transitions and disk writes)
+
+   Con:
+
+   - Not immediately visible what a job is waiting for, but it's the
+     same issue with locks; this is the reason why the lock monitor
+     (``gnt-debug locks``) was introduced; job dependencies can be shown
+     as "locks" in the monitor
+
+Based on these arguments, the proposal is to do the following:
+
+- Rename ``JOB_STATUS_WAITLOCK`` constant to ``JOB_STATUS_WAITING`` to
+  reflect its actual meaning: the job is waiting for something
+- While waiting for dependencies and locks, jobs are in the "waiting"
+  status
+- Export dependency information in lock monitor; example output::
+
+    Name      Mode Owner Pending
+    job/27491 -    -     success:job/34709,job/21459
+    job/21459 -    -     success,error:job/14513
+
+
+Cost of deserialization
+~~~~~~~~~~~~~~~~~~~~~~~
+
+To determine the status of a dependency job the job queue must have
+access to its data structure. Other queue operations already do this,
+e.g. archiving, watching a job's progress and querying jobs.
+
+Initially (Ganeti 2.0/2.1) the job queue shared the job objects
+in memory and protected them using locks. Ganeti 2.2 (see :doc:`design
+document <design-2.2>`) changed the queue to read and deserialize jobs
+from disk. This significantly reduced locking and code complexity.
+Nowadays inotify is used to wait for changes on job files when watching
+a job's progress.
+
+Reading from disk and deserializing certainly has some cost associated
+with it, but it's a significantly simpler architecture than
+synchronizing in memory with locks. At the stage where dependencies are
+evaluated the queue lock is held in shared mode, so different workers
+can read at the same time (deliberately ignoring CPython's interpreter
+lock).
+
+It is expected that the majority of executed jobs won't use
+dependencies and therefore won't be affected.
+
+
 Other discussed solutions
 =========================
 
-- 
GitLab