diff --git a/doc/design-2.2.rst b/doc/design-2.2.rst
index 1545d774ff992f00c8b0121ea5141d9fad7a6f0d..2424a9f3e307313e168350405f4506df0f86ecfd 100644
--- a/doc/design-2.2.rst
+++ b/doc/design-2.2.rst
@@ -58,20 +58,75 @@ decrease lock contention, log pollution and memory usage.
 
 Also, with the current architecture, masterd suffers from quite a few
 scalability issues:
 
-- Since the 16 client worker threads handle one connection each, it's
-  very easy to exhaust them, by just connecting to masterd 16 times and
-  not sending any data. While we could perhaps make those pools
-  resizable, increasing the number of threads won't help with lock
-  contention.
-- Some luxi operations (in particular REQ_WAIT_FOR_JOB_CHANGE) make the
-  relevant client thread block on its job for a relatively long time.
-  This makes it easier to exhaust the 16 client threads.
-- The job queue lock is quite heavily contended, and certain easily
-  reproducible workloads show that's it's very easy to put masterd in
-  trouble: for example running ~15 background instance reinstall jobs,
-  results in a master daemon that, even without having finished the
-  client worker threads, can't answer simple job list requests, or
-  submit more jobs.
+Core daemon connection handling
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since the 16 client worker threads handle one connection each, it's
+very easy to exhaust them by just connecting to masterd 16 times and
+not sending any data. While we could perhaps make those pools
+resizable, increasing the number of threads won't help with lock
+contention, nor with keeping the client of a long-running operation
+informed that everything is proceeding, so that it doesn't time out.
+
+Wait for job change
+^^^^^^^^^^^^^^^^^^^
+
+The REQ_WAIT_FOR_JOB_CHANGE luxi operation makes the relevant client
+thread block on its job for a relatively long time. This is another
+easy way to exhaust the 16 client threads, and a place where clients
+often time out. Moreover, this operation increases contention on the
+job queue lock (see below).
+
+Job Queue lock
+^^^^^^^^^^^^^^
+
+The job queue lock is quite heavily contended, and certain easily
+reproducible workloads show that it's very easy to put masterd in
+trouble: for example, running ~15 background instance reinstall jobs
+results in a master daemon that, even without having exhausted the
+client worker threads, can't answer simple job list requests or
+submit more jobs.
+
+Currently the job queue lock is an exclusive non-fair lock protecting
+the following job queue methods (called by the client workers):
+
+  - AddNode
+  - RemoveNode
+  - SubmitJob
+  - SubmitManyJobs
+  - WaitForJobChanges
+  - CancelJob
+  - ArchiveJob
+  - AutoArchiveJobs
+  - QueryJobs
+  - Shutdown
+
+Moreover, the job queue lock is acquired outside of the job queue in
+two other classes:
+
+  - jqueue._JobQueueWorker (in RunTask) before executing the opcode,
+    after finishing its execution and when handling an exception.
+  - jqueue._OpExecCallbacks (in NotifyStart and Feedback) when the
+    processor (mcpu.Processor) is about to start working on the opcode
+    (after acquiring the necessary locks) and when any data is sent
+    back via the feedback function.
+
+Of those, the major critical points are:
+
+  - Submit[Many]Job, QueryJobs and WaitForJobChanges, which can
+    easily slow down and block client threads, up to the point of
+    making the respective clients time out.
+  - The code paths in NotifyStart, Feedback and RunTask, which slow
+    down job processing across clients and otherwise unrelated jobs.
+
+To increase the pain:
+
+  - WaitForJobChanges is a bad offender because it's implemented with
+    a notified condition which wakes up the waiting threads, which
+    then try to acquire the global lock again.
+  - Many should-be-fast code paths are slowed down by replicating the
+    change to remote nodes, and thus waiting, with the lock held, on
+    remote RPCs to complete (starting, finishing, and submitting jobs).
 
 Proposed changes
 ++++++++++++++++
@@ -116,7 +171,10 @@ data, and not in fetching/serializing it.
 
 Other features to look at, when implementing this code are:
 
-  - Possibility not to need the job lock to know which updates to push.
+  - Possibility not to need the job lock to know which updates to
+    push: if the thread producing the data pushes a copy of the update
+    for the waiting clients, the thread sending it won't need to
+    acquire the lock again to fetch the actual data.
   - Possibility to signal clients about to time out, when no update has
     been received, not to despair and to keep waiting (luxi level
     keepalive).
@@ -126,11 +184,31 @@ Other features to look at, when implementing this code are:
 Job Queue lock
 ^^^^^^^^^^^^^^
 
-Our tests show that the job queue lock is a point of high contention.
-We'll try to decrease its contention, either by more granular locking,
-the use of shared/exclusive locks, or reducing the size of the critical
-sections. This section of the design should be updated with the proposed
-changes for the 2.2 release, with regards to the job queue.
+In order to decrease the job queue lock contention, we will initially
+change the code paths in the following ways:
+
+  - A per-job lock will be introduced. All operations affecting only
+    one job (for example feedback, starting/finishing notifications,
+    subscribing to or watching a job) will only require the job lock.
+    This should be a leaf lock, but if a situation arises in which it
+    must be acquired together with the global job queue lock, the
+    global one must always be acquired last (for the global section).
+  - The locks will be converted to shared locks. Any read-only
+    operation will be able to proceed in parallel.
+  - During remote updates (which already happen per-job) we'll drop
+    the job lock level to shared mode, so that activities only reading
+    the job (for example job change notifications or QueryJobs calls)
+    will be able to proceed in parallel.
+  - The wait for job changes improvements proposed above will be
+    implemented.
+
+In the future, other improvements may include splitting off some of
+the work (e.g. replication of a job to remote nodes) to a separate
+thread pool or asynchronous thread, not tied to the code path for
+answering client requests or the one executing the "real" work. This
+can be discussed again after we have used the more granular job queue
+in production and tested its benefits.
+
 
 Remote procedure call timeouts
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
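
As an illustration of the "wait for job change" rework sketched in the
diff above (the thread producing a job update pushes a copy of it to the
waiting clients, so the sender never needs to re-acquire the job queue
lock to fetch the data), a minimal Python sketch could look like the
following. The names (``JobWaiter``, ``PublishUpdate``, ``WaitForChange``)
and the interface are invented for this example and are not part of the
existing jqueue code::

  import threading
  import time


  class JobWaiter(object):
    """Hypothetical per-job update channel.

    The thread executing the job pushes an already-serialized copy of
    its state; client threads blocked in WaitForJobChanges consume it
    without touching the global job queue lock.
    """

    def __init__(self):
      self._cond = threading.Condition(threading.Lock())
      self._serial = 0          # bumped on every pushed update
      self._last_update = None  # private copy of the serialized job

    def PublishUpdate(self, update):
      """Called by the executing thread (e.g. from Feedback/NotifyStart)."""
      with self._cond:
        self._serial += 1
        self._last_update = update
        self._cond.notify_all()

    def WaitForChange(self, known_serial, timeout):
      """Called by a client worker thread.

      Returns (serial, update), or None on timeout, in which case the
      caller could send a luxi-level keepalive and keep waiting.
      """
      deadline = time.time() + timeout
      with self._cond:
        while self._serial == known_serial:
          remaining = deadline - time.time()
          if remaining <= 0:
            return None
          self._cond.wait(remaining)
        return (self._serial, self._last_update)

If the pushed copy is the same serialized form that is produced anyway
for replication, the producer pays no extra serialization cost for its
waiters.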
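
Similarly, the per-job lock and the shared/exclusive locking proposed
for the job queue could be sketched as below. ``SimpleSharedLock`` is a
deliberately simplified, non-fair stand-in and not the real Ganeti
``locking.SharedLock``; ``_ReplicateToRemoteNodes`` is a placeholder for
the slow remote RPCs mentioned in the diff::

  import threading


  class SimpleSharedLock(object):
    """Tiny reader/writer lock with an exclusive-to-shared downgrade."""

    def __init__(self):
      self._cond = threading.Condition(threading.Lock())
      self._readers = 0
      self._exclusive = False

    def acquire(self, shared=False):
      with self._cond:
        if shared:
          while self._exclusive:
            self._cond.wait()
          self._readers += 1
        else:
          while self._exclusive or self._readers:
            self._cond.wait()
          self._exclusive = True

    def downgrade(self):
      # Keep holding the lock, but let readers in while we do slow work.
      with self._cond:
        assert self._exclusive
        self._exclusive = False
        self._readers += 1
        self._cond.notify_all()

    def release(self):
      with self._cond:
        if self._exclusive:
          self._exclusive = False
        else:
          self._readers -= 1
        self._cond.notify_all()


  class QueuedJob(object):
    def __init__(self, job_id):
      self.id = job_id
      self.lock = SimpleSharedLock()  # per-job leaf lock
      self.serialized = None


  class JobQueue(object):
    def __init__(self):
      self.lock = SimpleSharedLock()  # global job queue lock
      self._jobs = {}

    def UpdateJob(self, job, data):
      """Update a single job; only its per-job lock is taken exclusively."""
      job.lock.acquire()
      try:
        job.serialized = data
        # Replication is slow: drop to shared mode so that readers
        # (QueryJobs, job change notifications) can proceed in parallel.
        job.lock.downgrade()
        self._ReplicateToRemoteNodes(job)
      finally:
        job.lock.release()

    def QueryJobs(self):
      """Read-only operation: shared mode only, readers run in parallel."""
      # Snapshot the job list under the global lock, then release it
      # before touching any per-job lock.
      self.lock.acquire(shared=True)
      try:
        jobs = list(self._jobs.values())
      finally:
        self.lock.release()
      result = []
      for job in jobs:
        job.lock.acquire(shared=True)
        try:
          result.append(job.serialized)
        finally:
          job.lock.release()
      return result

    def _ReplicateToRemoteNodes(self, job):
      pass  # placeholder for the remote RPC calls

A real implementation would also have to address fairness and the safe
acquisition of multiple per-job locks; the sketch only shows where the
shared mode and the downgrade buy parallelism.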