- Jan 18, 2011
-
-
Iustin Pop authored
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
- Dec 29, 2010
-
-
Michael Hanselmann authored
Since the recent change to leave jobs in the “waitlock” status (commit 5fd6b694), cancelling a job while it's back in the queue would break. This patch handles these cases and adds a unittest.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Dec 15, 2010
-
-
Michael Hanselmann authored
Iustin Pop reported that a job's file is updated many times while it waits for locks held by other thread(s). An investigation concluded that the cause was a design decision for job priorities: jobs were returned to the “queued” status if they couldn't acquire all locks. Changing a job's status or priority requires an update to permanent storage.

In a high-level view this is what happens:

1. Mark as waitlock
2. Write to disk as permanent storage (jobs left in this state by a crashing master daemon are resumed on restart)
3. Wait for lock (assume lock is held by another thread)
4. Mark as queued
5. Write to disk again
6. Return to workerpool

Another option originally discussed was to leave the job in the “waitlock” status. Ignoring priority changes, this is what would happen:

1. If not in waitlock
   1.1. Assert state == queued
   1.2. Mark as waitlock
   1.3. Set start_timestamp
   1.4. Write to disk as permanent storage
3. Wait for locks (assume lock is held by another thread)
4. Leave in waitlock
5. Return to workerpool

Now let's assume the lock is released by the other thread:

[…]
3. Wait for locks and get them
4. Assert state == waitlock
5. Set state to running
6. Set exec_timestamp
7. Write to disk

This change reduces the number of writes from two per lock acquire attempt to two per opcode plus one per priority increase (which happens after 24 acquire attempts, see mcpu._CalculateLockAttemptTimeouts, until the highest priority is reached). Here's the patch to implement it. Unittests are updated.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
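
The write counts claimed in this commit message can be checked with a small model. The following is only a sketch and not Ganeti code: the `Job` class, `save()` and the two `attempt_*` functions are hypothetical names, timestamps and priority changes are ignored, and the successful path of the old scheme is assumed to also cost two writes, as the "two per lock acquire attempt" figure suggests.

```python
# Hypothetical model, not Ganeti code: counts writes to permanent storage.
QUEUED, WAITLOCK, RUNNING = "queued", "waitlock", "running"


class Job:
    def __init__(self):
        self.status = QUEUED
        self.writes = 0

    def save(self):
        # Stand-in for writing the job file to disk.
        self.writes += 1


def attempt_old(job, got_lock):
    """Old scheme: a failed acquire returns the job to "queued"."""
    job.status = WAITLOCK
    job.save()
    # Assumption: the successful path also writes once after the wait.
    job.status = RUNNING if got_lock else QUEUED
    job.save()


def attempt_new(job, got_lock):
    """New scheme: a failed acquire leaves the job in "waitlock"."""
    if job.status != WAITLOCK:
        assert job.status == QUEUED
        job.status = WAITLOCK
        job.save()
    if got_lock:
        assert job.status == WAITLOCK
        job.status = RUNNING
        job.save()
    # On failure nothing changes, so nothing is written.


def run(attempt, failed_attempts):
    """Run some failed acquire attempts followed by one successful one."""
    job = Job()
    for _ in range(failed_attempts):
        attempt(job, got_lock=False)
    attempt(job, got_lock=True)
    return job.writes
```

With three failed attempts before the lock is obtained, `run(attempt_old, 3)` gives 8 writes while `run(attempt_new, 3)` gives 2, and the second number stays at 2 regardless of how many attempts fail.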
-
Michael Hanselmann authored
- Verify job file updates
- Ensure queue lock is released while executing opcode

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Oct 12, 2010
-
-
Michael Hanselmann authored
If a job was cancelled while it was waiting for locks, an assertion would've failed. This patch fixes the problem and provides a unit test to check for this situation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Sep 24, 2010
-
-
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
As already noted in the design document, an opcode's priority is increased when the lock(s) can't be acquired within a certain amount of time. The exception is the highest priority, where a blocking acquire is used instead. A unittest is provided. Priorities are not yet used for acquiring the lock(s); this will need further changes in mcpu.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
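
The escalation rule described in this commit message can be illustrated as follows. This is a simplified sketch under stated assumptions, not the jqueue implementation: `acquire_with_escalation` and `HIGHEST_PRIORITY` are hypothetical names, a smaller number is assumed to mean a more urgent priority, a plain `threading.Lock` stands in for Ganeti's locks, and the loop retries in place instead of returning the opcode to the queue between attempts.

```python
import threading

# Hypothetical convention for this sketch: smaller number = more urgent.
HIGHEST_PRIORITY = 0


def acquire_with_escalation(lock, priority, timeout):
    """Acquire `lock`, raising the priority after each timed-out attempt.

    Returns the priority at which the lock was finally obtained.
    """
    while True:
        if priority <= HIGHEST_PRIORITY:
            # Nothing left to escalate to: wait without a timeout.
            lock.acquire()
            return priority
        if lock.acquire(timeout=timeout):
            return priority
        # Timed out: become more urgent and try again.
        priority -= 1
```

A caller that finds the lock free keeps its original priority. A caller that keeps timing out ends up at `HIGHEST_PRIORITY` and then blocks until the lock is released.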
-
- Sep 23, 2010
-
-
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Sep 22, 2010
-
-
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Sep 20, 2010
-
-
Michael Hanselmann authored
In order to support priorities, the processing of jobs needs to be changed. Instead of processing jobs as a whole, the code is changed to process one opcode at a time and then return to the queue. See the Ganeti 2.3 design document for details. This patch does not yet use priorities for acquiring locks.

The enclosed unittests increase the test coverage of jqueue.py from about 34% to 58%. Please note that they also test some parts not added by this patch, but testing them only became possible with some infrastructure added by this patch. For the first time, many implications and assumptions for the job queue are codified in these unittests.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
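
The change from whole-job processing to opcode-at-a-time processing can be pictured with a toy queue. This is a minimal sketch, not the jqueue code: `process_queue` is a hypothetical function, jobs are plain lists of opcode names, and a FIFO `deque` stands in for the worker pool (priorities are left out, as in the patch itself).

```python
from collections import deque


def process_queue(jobs):
    """Execute one opcode per turn, then put the job back in the queue.

    `jobs` maps a job name to its list of opcodes. Returns the order in
    which (job, opcode) pairs were executed.
    """
    queue = deque((name, 0) for name, opcodes in jobs.items() if opcodes)
    order = []
    while queue:
        name, index = queue.popleft()
        order.append((name, jobs[name][index]))  # "execute" one opcode
        if index + 1 < len(jobs[name]):
            # Opcodes remain: return the job to the queue instead of
            # continuing with it right away.
            queue.append((name, index + 1))
    return order
```

For `{"job1": ["a", "b"], "job2": ["c"]}` the opcodes of the two jobs interleave: `job2` gets its turn before `job1` runs its second opcode. This re-queueing point is where a scheduler could later take priorities into account.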
-
- Sep 16, 2010
-
-
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Sep 13, 2010
-
-
Michael Hanselmann authored
This allows clients to submit opcodes with a priority. Except for being tracked by the job queue, it is not yet used by any code. Unittests for jqueue._QueuedOpCode and jqueue._QueuedJob are provided for the first time.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Sep 07, 2010
-
-
Michael Hanselmann authored
Comes with unittest.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Jul 15, 2010
-
-
Michael Hanselmann authored
By splitting the _WaitForJobChangesHelper class into multiple smaller classes, we gain in several places:

- Simpler code, less interaction between functions and variables
- Easy to unittest (close to 100% coverage)
- Waiting for job changes no longer has direct knowledge of the queue (it no longer references queue functions, especially not private ones)
- Activate inotify only if there was no change at the beginning (and check again right away to avoid race conditions)

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
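
The last point, checking first, arming the watch only when needed, and then checking again, is a common way to avoid missing a change that lands between the first check and the watch becoming active. A minimal sketch of that ordering follows. All names are hypothetical: `read_state`, `arm_watch` and `wait_event` are caller-supplied callables standing in for reading the job file, enabling inotify and blocking on a notification, and are not the helper classes introduced by this commit.

```python
def wait_for_change(read_state, previous, arm_watch, wait_event):
    """Return the job state once it differs from `previous`.

    The watch is armed only if the first check finds no change, and the
    state is read again right after arming to close the race window.
    """
    current = read_state()
    if current != previous:
        return current  # already changed, no watch needed

    arm_watch()

    # A change may have happened between the first read and arming the
    # watch; without this re-check it would go unnoticed.
    current = read_state()
    if current != previous:
        return current

    wait_event()
    return read_state()
```

If the state has already changed on the first read, neither `arm_watch` nor `wait_event` is called, which matches the "only if there was no change at the beginning" behaviour described in the list.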
-