Some updates on the job queue design doc

This clarifies the job storage and the reason for choosing it. Reviewed-by: imsnah

Some updates on the job queue design doc
This clarifies the job storage and the reason for choosing it. Reviewed-by: imsnah
e9f242e4 · Iustin Pop · 93c4f7f1 · e9f242e4
Commit e9f242e4 authored 16 years ago by Iustin Pop
--- a/doc/design-2.0-job-queue.rst
+++ b/doc/design-2.0-job-queue.rst
@@ -21,8 +21,8 @@ Job execution—“Life of a Ganeti job”
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #. Job gets submitted by the client. A new job identifier is generated and
-   assigned to the job. The job is then automatically replicated to all nodes
+   assigned to the job. The job is then automatically replicated [#replic]_
-   in the cluster. The identifier is returned to the client.
+   to all nodes in the cluster. The identifier is returned to the client.
 #. A pool of worker threads waits for new jobs. If all are busy, the job has
   to wait and the first worker finishing its work will grab it. Otherwise any
   of the waiting threads will pick up the new job.
@@ -32,7 +32,43 @@ Job execution—“Life of a Ganeti job”
 #. As soon as the job is finished, its final result and status can be retrieved
   from the server.
 #. If the client archives the job, it gets moved to a history directory.
-   This could also be done regularily using a cron script.
+   There will be a method to archive all jobs older than a a given age.
+.. [#replic] We need replication in order to maintain the consistency across
+   all nodes in the system; the master node only differs in the fact that
+   now it is running the master daemon, but it if fails and we do a master
+   failover, the jobs are still visible on the new master (even though they
+   will be marked as failed).
+Failures to replicate a job to other nodes will be only flagged as
+errors in the master daemon log if more than half of the nodes failed,
+otherwise we ignore the failure, and rely on the fact that the next
+update (for still running jobs) will retry the update. For finished
+jobs, it is less of a problem.
+Future improvements will look into checking the consistency of the job
+list and jobs themselves at master daemon startup.
+Job storage
+~~~~~~~~~~~
+Jobs are stored in the filesystem as individual files, serialized
+using JSON (standard serialization mechanism in Ganeti).
+The choice of storing each job in its own file was made because:
+- a file can be atomically replaced
+- a file can easily be replicated to other nodes
+- checking consistency across nodes can be implemented very easily, since
+  all job files should be (at a given moment in time) identical
+The other possible choices that were discussed and discounted were:
+- single big file with all job data: not feasible due to difficult updates
+- in-process databases: hard to replicate the entire database to the
+  other nodes, and replicating individual operations does not mean wee keep
+  consistency
 Queue structure
@@ -126,12 +162,16 @@ Error status once the master started again.
 History
 ~~~~~~~
-Archived jobs are kept in a separate directory, /var/lib/ganeti/queue/archive/.
+Archived jobs are kept in a separate directory,
-The idea is to speed up the queue handling.
+/var/lib/ganeti/queue/archive/.  This is done in order to speed up the
+queue handling: by default, the jobs in the archive are not touched by
+any functions. Only the current (unarchived) jobs are parsed, loaded,
+and verified (if implemented) by the master daemon.
 Ganeti updates
 ~~~~~~~~~~~~~~
-The queue has to be completely empty for Ganeti updates with changes in the job
+The queue has to be completely empty for Ganeti updates with changes
-queue structure.
+in the job queue structure. In order to allow this, there will be a
+way to prevent new jobs entering the queue.