Commit 28f34054 authored by Klaus Aehlig

Add a design document for removal of the config lock

This design document describes how the negative effect of the config
lock can be removed in an incremental way.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
parent 319f51ae
......@@ -616,6 +616,7 @@ docinput = \
doc/design-autorepair.rst \
doc/design-bulk-create.rst \
doc/design-ceph-ganeti-support.rst \
doc/design-configlock.rst \
doc/design-chained-jobs.rst \
doc/design-cmdlib-unittests.rst \
doc/design-cpu-pinning.rst \
......
===================================
Removal of the Config Lock Overhead
===================================
.. contents:: :depth: 4
This is a design document detailing how the adverse effect of
the config lock can be removed in an incremental way.
Current state and shortcomings
==============================
As a result of the :doc:`design-daemons`, the configuration is held
in a process different from the processes carrying out the Ganeti
jobs. Therefore, job processes have to contact WConfD in order to
change the configuration. Of course, these modifications of the
configuration need to be synchronised.
The current form of synchronisation is via ``ConfigLock``. Exclusive
possession of this lock guarantees that no one else modifies the
configuration. In other words, the current procedure for a job to
update the configuration is to
- acquire the ``ConfigLock`` from WConfD,
- read the configuration,
- write the modified configuration, and
- release ``ConfigLock``.
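The four steps above can be sketched from the point of view of a job. This is only a toy model: ``ToyWConfD`` and its method names are hypothetical stand-ins, not the real WConfD RPC interface, and persistence to disk is left out entirely.

```python
import copy
import threading


class ToyWConfD:
    """Hypothetical in-memory stand-in for WConfD (not the real RPC API)."""

    def __init__(self, config):
        self._config = config
        self._config_lock = threading.Lock()  # plays the role of ConfigLock

    def lock_config(self):
        self._config_lock.acquire()

    def unlock_config(self):
        self._config_lock.release()

    def read_config(self):
        # The whole configuration is handed to the job.
        return copy.deepcopy(self._config)

    def write_config(self, new_config):
        # The whole configuration is handed back by the job.
        self._config = copy.deepcopy(new_config)


def rename_cluster(wconfd, new_name):
    """Update one field using the lock / read / write / release cycle."""
    wconfd.lock_config()
    try:
        config = wconfd.read_config()
        config["cluster_name"] = new_name
        wconfd.write_config(config)
    finally:
        wconfd.unlock_config()


wconfd = ToyWConfD({"cluster_name": "old", "instances": {}})
rename_cluster(wconfd, "new")
```

Even in this simplified form, a one-field change moves the complete configuration in both directions while the lock is held.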
The current procedure has some drawbacks. These also affect the
overall throughput of jobs in a Ganeti cluster.
- At each configuration update, the whole configuration is
  transferred between the job and WConfD.
- More importantly, however, jobs can only release the ``ConfigLock`` after
  the write; the write, in turn, is only confirmed once the configuration
  is written to disk. In particular, we can only have one update per
  configuration write. Also, possession of the ``ConfigLock`` is only
  confirmed to the job once the new lock status is written to disk.
Additional overhead is caused by the fact that reads are synchronised over
a shared config lock. This used to make sense when the configuration was
modifiable in the same process, to ensure consistent reads. With the new
structure, all accesses to the configuration via WConfD are consistent
anyway, and local modifications by other jobs do not happen.
Proposed changes for an incremental improvement
===============================================
Ideally, jobs would just send patches for the configuration to WConfD
that are applied by means of atomically updating the respective ``IORef``.
This, however, would require changing all of Ganeti's logical units in
one big change. Therefore, we propose to keep the ``ConfigLock`` and,
step by step, reduce its impact until it is eventually used only
internally in the WConfD process.
Unlocked Reads
--------------
In a first step, all configuration operations that are synchronised over
a shared config lock, and therefore necessarily read-only, will instead
use WConfD's ``readConfig`` to obtain a snapshot of the configuration.
This will be done without modifying the locks. It is sound, as reads of
a Haskell ``IORef`` always yield a consistent value. From that snapshot
the required view is computed locally. This saves two lock-configuration
write cycles per read and, additionally, does not block any concurrent
modifications.
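The idea of reading a snapshot without touching any lock can be illustrated with a small Python analogue of an ``IORef``. The class ``ConfigRef`` and the helper ``instance_names`` are assumptions made for this sketch; only the name ``readConfig`` comes from the design itself.

```python
import threading
from types import MappingProxyType


class ConfigRef:
    """Hypothetical analogue of an IORef that holds a read-only snapshot."""

    def __init__(self, config):
        # Shallow read-only view; nested values are assumed not to be mutated.
        self._snapshot = MappingProxyType(dict(config))
        self._write_lock = threading.Lock()

    def read_config(self):
        # No lock is taken: the caller gets whichever complete snapshot
        # is current at this moment.
        return self._snapshot

    def replace(self, new_config):
        # Writers swap in a whole new snapshot; old snapshots stay valid.
        with self._write_lock:
            self._snapshot = MappingProxyType(dict(new_config))


def instance_names(snapshot):
    """Compute the required view locally from a snapshot."""
    return sorted(snapshot["instances"])


ref = ConfigRef({"instances": {"web1": {}, "db1": {}}})
snapshot = ref.read_config()
ref.replace({"instances": {}})          # concurrent change by another job
names_from_snapshot = instance_names(snapshot)
names_from_current = instance_names(ref.read_config())
```

The reader keeps working on the snapshot it obtained, while the writer is never blocked by it.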
In a second step, more specialised read functions will be added to ``WConfD``.
This will reduce the traffic for reads.
Set-and-release action
----------------------
A typical pattern is to change the configuration and afterwards release
the ``ConfigLock``. To avoid unnecessary delay in this operation (the next
modification of the configuration can already happen while the last change
is written out), WConfD will offer a combined command that will
- set the configuration to the specified value,
- release the config lock,
- and only then wait for the configuration write to finish; it will not
  wait for confirmation of the lock-release write.
If jobs use this combined command instead of the sequential set followed
by release, new configuration changes can come in during writeout of the
current change; in particular, a writeout can contain more than one change.
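The batching effect of the combined command can be modelled as follows. This is a sketch under simplifying assumptions: the method name ``write_config_and_unlock`` is hypothetical, the asynchronous disk write is replaced by an explicit ``flush`` call, and the caller's wait for the write to finish is not modelled.

```python
class ToyWConfD:
    """Hypothetical model of set-and-release; not the real WConfD."""

    def __init__(self):
        self.config = {}
        self.lock_owner = None
        self.dirty = False
        self.writeouts = []  # configurations that reached "disk"

    def lock_config(self, job):
        if self.lock_owner is not None:
            raise RuntimeError("ConfigLock is held by %s" % self.lock_owner)
        self.lock_owner = job

    def write_config_and_unlock(self, job, new_config):
        if self.lock_owner != job:
            raise RuntimeError("%s does not hold the ConfigLock" % job)
        self.config = dict(new_config)   # set the configuration
        self.dirty = True
        self.lock_owner = None           # release before the disk write
        # A real daemon would now let the caller wait until a writeout
        # covering this change has finished.

    def flush(self):
        # Stand-in for the asynchronous writer: one writeout persists
        # every change made since the previous one.
        if self.dirty:
            self.writeouts.append(dict(self.config))
            self.dirty = False


wconfd = ToyWConfD()
for job, key in (("job-1", "a"), ("job-2", "b")):
    wconfd.lock_config(job)
    updated = dict(wconfd.config, **{key: True})
    wconfd.write_config_and_unlock(job, updated)
wconfd.flush()
```

Because the lock is released before the write completes, the second job can make its change first, and a single writeout then covers both.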
Short-lived ``ConfigLock``
--------------------------
For a lot of operations, the regular locks already ensure that only
one job can modify a certain part of the configuration. For example,
only jobs with an exclusive lock on an instance will modify that
instance. Therefore, such a job can update that entity atomically,
without relying on the configuration lock to achieve consistency.
``WConfD`` will provide such operations. To
avoid interference with non-atomic operations that still take the
config lock and write the configuration as a whole, such an operation
will only be carried out at times when the config lock is not taken. To
ensure this, the thread handling the request will take the config lock
itself (hence no one else has it, if that succeeds) before the change
and release it afterwards; both operations will be done without
triggering a writeout of the lock status.
Note that the thread handling the request has to take the lock in its
own name and not in that of the requesting job. A writeout of the lock
status can still happen, triggered by other requests. If the lock were
acquired in the name of the job and ``WConfD`` got restarted after the
acquisition, the job would own a lock without knowing about it, and
hence that lock would never get released.
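A rough model of such an atomic operation is given below. All names (``ToyWConfD``, ``update_instance``, the owner string ``"wconfd"``) are hypothetical, and the counter ``lock_status_writeouts`` merely stands in for persisting the lock status; the sketch only shows that the daemon takes the lock in its own name and does not trigger a lock-status writeout.

```python
class ToyWConfD:
    """Hypothetical model of a short-lived, internally held ConfigLock."""

    OWN_NAME = "wconfd"

    def __init__(self, config):
        self.config = config
        self.lock_owner = None
        self.lock_status_writeouts = 0

    def lock_config(self, job):
        # Job-facing acquisition: the new lock status is persisted.
        if self.lock_owner is not None:
            return False
        self.lock_owner = job
        self.lock_status_writeouts += 1
        return True

    def unlock_config(self, job):
        if self.lock_owner == job:
            self.lock_owner = None
            self.lock_status_writeouts += 1

    def update_instance(self, name, changes):
        # Only proceed while nobody holds the config lock.
        if self.lock_owner is not None:
            return False  # the caller would have to retry later
        # Taken in the daemon's own name, without a lock-status writeout.
        self.lock_owner = self.OWN_NAME
        try:
            instance = dict(self.config["instances"][name])
            instance.update(changes)
            self.config["instances"][name] = instance
        finally:
            self.lock_owner = None
        return True


wconfd = ToyWConfD({"instances": {"web1": {"state": "down"}}})
applied = wconfd.update_instance("web1", {"state": "up"})

wconfd.lock_config("job-7")            # a non-atomic job holds the lock
blocked = wconfd.update_instance("web1", {"state": "down"})
wconfd.unlock_config("job-7")
```

Since the lock owner is never the requesting job, a restart of the daemon in the middle of ``update_instance`` cannot leave a job holding a lock it does not know about.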
......@@ -26,6 +26,7 @@ Design document drafts
design-reservations.rst
design-sync-rate-throttling.rst
design-network2.rst
design-configlock.rst
.. vim: set textwidth=72 :
.. Local Variables:
......