Commit 887c7aa6 authored by Michael Hanselmann

locking: Implement priorities in SharedLock and LockSet

For proper support of job priorities, jobs' locks need to respect
priorities.  Otherwise a job with a lower priority could get a lock
before a job with a higher priority (depending on timeouts and when
they start acquiring).

This patch adds support for priorities in SharedLock and LockSet and
provides (unfortunately non-trivial) unittests. Outdated comments are also
adjusted and improved.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Guido Trotter <>
parent cbccd9ca
......@@ -205,8 +205,8 @@ Opcode priorities are synchronized to disk in order to be restored after
a restart or crash of the master daemon.
Priorities also need to be considered inside the locking library to
-ensure opcodes with higher priorities get locks first, but the design
-changes for this will be discussed in a separate section.
+ensure opcodes with higher priorities get locks first. See
+:ref:`locking priorities <locking-priorities>` for more details.
Worker pool
......@@ -243,6 +243,59 @@ changing its own priority. This is useful for the following cases:
With these changes, the job queue will be able to implement per-job
.. _locking-priorities:
In order to support priorities in Ganeti's own lock classes,
``locking.SharedLock`` and ``locking.LockSet``, the internal structure
of the former class needs to be changed. The last major change in this
area was done for Ganeti 2.1 and can be found in the respective
:doc:`design document <design-2.1>`.
The plain list (``[]``) used as a queue is replaced by a heap queue,
similar to the `worker pool`_. The heap (priority queue) orders its
entries by priority, so priorities are taken care of automatically. For
each priority there's a plain list of pending acquires, like the single
queue of pending acquires before this change.
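A minimal sketch of such a structure, assuming Python 3 and the standard ``heapq`` module. The class ``PendingQueue`` and its method names are hypothetical and are not taken from ``locking.SharedLock``. It only illustrates a heap of ``(priority, list)`` entries in which lower numbers are served first, matching the ordering in the diagram further down.

```python
import heapq


class PendingQueue:
    """Hypothetical sketch: a heap of (priority, pending list) entries."""

    def __init__(self):
        self._heap = []          # heap of (priority, list of waiters)
        self._by_priority = {}   # priority -> the same list object

    def add(self, priority, waiter):
        """Append a waiter to the list for its priority."""
        pending = self._by_priority.get(priority)
        if pending is None:
            # First acquire at this priority: create its list and push
            # it onto the heap exactly once, so priorities stay unique.
            pending = []
            self._by_priority[priority] = pending
            heapq.heappush(self._heap, (priority, pending))
        pending.append(waiter)

    def peek(self):
        """Return (priority, first waiter) of the entry served next."""
        priority, pending = self._heap[0]
        return priority, pending[0]


queue = PendingQueue()
for prio, name in [(10, "threadE3"), (0, "threadE1"), (0, "threadE2")]:
    queue.add(prio, name)
```

Because each priority is pushed onto the heap only once, tuple comparison never has to fall back to comparing the lists themselves.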
When the lock is released, the code locates the list of pending acquires
for the highest priority waiting. The first condition (index 0) is
notified. Once all waiting threads have received the notification, the
condition is removed from the list. If the list of conditions is then
empty, it's removed from the heap queue.
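The release step can be sketched roughly as follows, again assuming Python 3 and ``heapq``. The function ``notify_next`` is a hypothetical name, and the sketch simplifies each condition to a single thread name, so "notify, then remove once everyone was notified" collapses into one ``pop``.

```python
import heapq


def notify_next(heap):
    """Serve the first pending entry of the most urgent priority.

    `heap` holds (priority, list of waiters) tuples with unique
    priorities. Returns the (priority, waiter) pair that was served.
    """
    priority, pending = heap[0]
    waiter = pending.pop(0)      # index 0 is notified and then removed
    if not pending:
        heapq.heappop(heap)      # drop the now-empty per-priority list
    return priority, waiter


heap = [(2, ["threadS4"]), (0, ["threadE1", "threadE2"])]
heapq.heapify(heap)
served = [notify_next(heap) for _ in range(3)]
```

After the three calls the heap is empty, because each per-priority list is removed as soon as its last entry has been served.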
Like before, shared acquires are grouped and skip ahead of exclusive
acquires if there's already an existing shared acquire for a priority.
To accomplish this, a separate dictionary of shared acquires per
priority is maintained.
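The grouping of shared acquires could look roughly like this. It is a sketch under assumptions, not the code from the patch: the function ``enqueue``, the ``("shr", [...])`` / ``("exc", [...])`` tuples and the thread name ``threadS7`` are made up for illustration.

```python
def enqueue(pending_by_prio, shared_by_prio, priority, shared, thread):
    """Queue an acquire, merging shared acquires of the same priority."""
    pending = pending_by_prio.setdefault(priority, [])
    if not shared:
        # Exclusive acquires always get their own entry at the end.
        pending.append(("exc", [thread]))
        return
    group = shared_by_prio.get(priority)
    if group is None:
        # First shared acquire at this priority: new entry at the end.
        group = ("shr", [])
        shared_by_prio[priority] = group
        pending.append(group)
    # Later shared acquires join the existing group and thereby skip
    # ahead of exclusive acquires queued after that group.
    group[1].append(thread)


pending, shared_groups = {}, {}
for shared, name in [(1, "threadS6"), (0, "threadE4"),
                     (1, "threadS7"), (0, "threadE5")]:
    enqueue(pending, shared_groups, 33, shared, name)
```

In a full implementation the dictionary entry would have to be dropped once the shared group has been served, so that later shared acquires start a new group.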
To simplify the code and reduce memory consumption, the concept of the
"active" and "inactive" condition for shared acquires is abolished. The
lock can't predict which priorities the next acquires will use, and even
keeping a cache can become computationally expensive for questionable
benefit (the underlying POSIX pipe, see ``pipe(2)``, needs to be
re-created for each notification anyway).
The following diagram shows a possible state of the internal queue from
a high-level view. Conditions are shown as (waiting) threads. Assuming
no modifications are made to the queue (e.g. more acquires or timeouts),
the lock would be acquired by the threads in this order (concurrent
acquires in parentheses): ``threadE1``, ``threadE2``, (``threadS1``,
``threadS2``, ``threadS3``), (``threadS4``, ``threadS5``), ``threadE3``,
``threadS6``, ``threadE4``, ``threadE5``.
(0, [exc/threadE1, exc/threadE2, shr/threadS1/threadS2/threadS3]),
(2, [shr/threadS4/threadS5]),
(10, [exc/threadE3]),
(33, [shr/threadS6, exc/threadE4, exc/threadE5]),
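As a worked check of the order stated above, the diagram can be written down as plain Python data and drained with ``heapq``. This is only an illustration under the assumption that nothing else touches the queue. The tuple layout and the function ``acquisition_order`` are hypothetical.

```python
import heapq

# The diagram as (priority, [(mode, [threads]), ...]) entries.
queue = [
    (33, [("shr", ["threadS6"]), ("exc", ["threadE4"]),
          ("exc", ["threadE5"])]),
    (10, [("exc", ["threadE3"])]),
    (2, [("shr", ["threadS4", "threadS5"])]),
    (0, [("exc", ["threadE1"]), ("exc", ["threadE2"]),
         ("shr", ["threadS1", "threadS2", "threadS3"])]),
]
heapq.heapify(queue)


def acquisition_order(heap):
    """Drain the heap; each inner list is one group of concurrent holders."""
    order = []
    while heap:
        _priority, pending = heapq.heappop(heap)
        for _mode, threads in pending:
            order.append(threads)
    return order


order = acquisition_order(queue)
```

Each inner list in ``order`` corresponds to one parenthesised group in the text, with single-element lists standing for acquires that hold the lock alone.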
IPv6 support
......@@ -28,10 +28,12 @@ import time
import Queue
import threading
import random
import itertools
from ganeti import locking
from ganeti import errors
from ganeti import utils
from ganeti import compat
import testutils
......@@ -701,6 +703,106 @@ class TestSharedLock(_ThreadedTestCase):
self.assertRaises(Queue.Empty, self.done.get_nowait)
def testPriority(self):
# Acquire in exclusive mode
# Queue acquires
def _Acquire(prev, next, shared, priority, result):
prev.wait(), priority=priority, test_notify=next.set)
counter = itertools.count(0)
priorities = range(-20, 30)
first = threading.Event()
prev = first
# Data structure:
# {
# priority:
# [(shared/exclusive, set(acquire names), set(pending threads)),
# (shared/exclusive, ...),
# ...,
# ],
# }
perprio = {}
# References shared acquire per priority in L{perprio}. Data structure:
# {
# priority: (shared=1, set(acquire names), set(pending threads)),
# }
prioshared = {}
for seed in [4979, 9523, 14902, 32440]:
# Use a deterministic random generator
rnd = random.Random(seed)
for priority in [rnd.choice(priorities) for _ in range(30)]:
modes = [0, 1]
for shared in modes:
# Unique name
acqname = "%s/shr=%s/prio=%s" % (counter.next(), shared, priority)
ev = threading.Event()
thread = self._addThread(target=_Acquire,
args=(prev, ev, shared, priority, acqname))
prev = ev
# Record expected acquire, see above for structure
data = (shared, set([acqname]), set([thread]))
priolist = perprio.setdefault(priority, [])
if shared:
priosh = prioshared.get(priority, None)
if priosh:
# Shared acquires are merged
for i, j in zip(priosh[1:], data[1:]):
assert data[0] == priosh[0]
prioshared[priority] = data
# Start all acquires and wait for them
# Check lock information
self.assertEqual(["name"]), [])
self.assertEqual(["mode", "owner"]),
["exclusive", [threading.currentThread().getName()]])
self.assertEqual(["name", "pending"]),
[(["exclusive", "shared"][int(bool(shared))],
sorted([t.getName() for t in threads]))
for acquires in [perprio[i]
for i in sorted(perprio.keys())]
for (shared, _, threads) in acquires]])
# Let threads acquire the lock
# Wait for everything to finish
# Check acquires by priority
for acquires in [perprio[i] for i in sorted(perprio.keys())]:
for (_, names, _) in acquires:
# For shared acquires, the set will contain 1..n entries. For exclusive
# acquires only one.
while names:
self.assertFalse(compat.any(names for (_, names, _) in acquires))
self.assertRaises(Queue.Empty, self.done.get_nowait)
class TestSharedLockInCondition(_ThreadedTestCase):
"""SharedLock as a condition lock tests"""
......@@ -1259,6 +1361,57 @@ class TestLockSet(_ThreadedTestCase):
self.assertEqual(self.done.get_nowait(), 'DONE')
def testPriority(self):
def _Acquire(prev, next, name, priority, success_fn):
self.assert_(, shared=0,
test_notify=lambda _: next.set()))
# Get all in exclusive mode
self.assert_(, shared=0))
done_two = Queue.Queue(0)
first = threading.Event()
prev = first
acquires = [("one", prio, self.done) for prio in range(1, 33)]
acquires.extend([("two", prio, done_two) for prio in range(1, 33)])
# Use a deterministic random generator
for (name, prio, done) in acquires:
ev = threading.Event()
args=(prev, ev, name, prio,
compat.partial(done.put, "Prio%s" % prio)))
prev = ev
# Start acquires
# Wait for last acquire to start
# Let threads acquire locks
# Wait for threads to finish
for i in range(1, 33):
self.assertEqual(self.done.get_nowait(), "Prio%s" % i)
self.assertEqual(done_two.get_nowait(), "Prio%s" % i)
self.assertRaises(Queue.Empty, self.done.get_nowait)
self.assertRaises(Queue.Empty, done_two.get_nowait)
class TestGanetiLockManager(_ThreadedTestCase):