- Apr 22, 2013
-
-
Michele Tartara authored
The reason trail will contain an item indicating the job_id and the index number of the current opcode inside the job queue. Signed-off-by:
Michele Tartara <mtartara@google.com> Reviewed-by:
Helga Velroyen <helgav@google.com>
-
- Apr 10, 2013
-
-
Michele Tartara authored
If split users are used, the queue directory could only be accessed by masterd, but confd also needs to be able to read it, e.g. when it is queried as part of "gnt-job list". This commit fixes the permissions to allow proper access. Fixes Issue 406. Signed-off-by:
Michele Tartara <mtartara@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Dec 13, 2012
-
-
Michael Hanselmann authored
This addresses issue 218. When the number of inotify watches is exhausted, for example by being set too low from the beginning or by other programs, waiting for a job to change would just report a lost job (e.g. “Error checking job status: Job with id 7817 lost”). This patch changes the job watcher to no longer catch “errors.InotifyError” and, this is by far the larger part of this patch, adds unittests for this situation. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Dec 11, 2012
-
-
Michael Hanselmann authored
Until now, the flag was unset on a master failover unless the “$localstatedir/lib/ganeti/queue/drain” file existed. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Dec 05, 2012
-
-
Michael Hanselmann authored
Commit 4679547e implemented the ability to change a job's priority after it was submitted. The code contained a bug whereby it would modify the input data for an opcode, something the job queue shouldn't do (logical units do for historical reasons). This patch removes the line modifying the opcode input and adjusts the tests. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 20, 2012
-
-
Iustin Pop authored
Currently, ht.py uses incorrect terminology for positive/non-negative numbers. Per http://en.wikipedia.org/wiki/Positive_number , this is the correct terminology:
- A number is positive if it is greater than zero.
- A number is negative if it is less than zero.
- A number is non-negative if it is greater than or equal to zero.
- A number is non-positive if it is less than or equal to zero.
So this patch renames things as follows:
- TPositiveInt ⇒ TNonNegativeInt
- TStrictPositiveInt ⇒ TPositiveInt
- TMaybePositiveInt ⇒ dropped, not used anywhere
- TMaybeStrictPositiveInt ⇒ TMaybePositiveInt
- TPositiveFloat ⇒ TNonNegativeFloat
- TStrictNegativeInt ⇒ TNegativeInt
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
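The renamed checks can be illustrated with a minimal sketch. The function bodies below are hypothetical stand-ins written for this example, not the actual ht.py code. Treating "Maybe" as "None or the given type" and rejecting booleans as integers are both assumptions of the sketch.

```python
# Hypothetical stand-ins for the renamed checks; not the actual ht.py code.

def _is_int(value):
    # bool is a subclass of int in Python; this sketch chooses to exclude it.
    return isinstance(value, int) and not isinstance(value, bool)


def TNonNegativeInt(value):
    """Integer greater than or equal to zero (formerly TPositiveInt)."""
    return _is_int(value) and value >= 0


def TPositiveInt(value):
    """Integer strictly greater than zero (formerly TStrictPositiveInt)."""
    return _is_int(value) and value > 0


def TNegativeInt(value):
    """Integer strictly less than zero (formerly TStrictNegativeInt)."""
    return _is_int(value) and value < 0


def TMaybePositiveInt(value):
    """None or a positive integer (formerly TMaybeStrictPositiveInt)."""
    return value is None or TPositiveInt(value)


def TNonNegativeFloat(value):
    """Float greater than or equal to zero (formerly TPositiveFloat)."""
    return isinstance(value, float) and value >= 0.0
```

Zero is the value that separates the old and new names: it passes TNonNegativeInt but fails TPositiveInt.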
-
- Nov 13, 2012
-
-
Michael Hanselmann authored
This is due to a feature request. Sometimes one wants to change the priority of a job after it has been submitted, e.g. after submitting an important job only to later notice many other pending jobs which will be processed first. Priority changes only take effect at the next lock acquisition or when the job is re-scheduled. The design is very similar to how jobs are cancelled. Unit tests for “_QueuedJob.ChangePriority” are included. Also rename “TestQueuedJob.test” to “TestQueuedJob.testError”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Michael Hanselmann authored
The job ID is re-used as the task ID, as job IDs are unique. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
- Nov 08, 2012
-
-
Michael Hanselmann authored
Instead of being given the priority for acquiring locks by means of a parameter, mcpu will now call back. This is in preparation for implementing a command to change a job's priority on the fly and allows the priority to be changed while locks are being acquired (taking effect on the next lock acquire). Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
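The difference between passing a priority value and passing a callback can be sketched as follows. The names FakeJob and acquire_locks are hypothetical and exist only for this illustration. They do not reflect the real mcpu or jqueue interfaces.

```python
class FakeJob:
    """Hypothetical job object holding a mutable priority."""

    def __init__(self, priority):
        self.priority = priority


def acquire_locks(lock_names, get_priority):
    """Pretend to acquire locks one by one.

    The priority is requested through the callback for every acquisition,
    so a change made in the meantime is seen by the next acquire.
    """
    acquired = []
    for name in lock_names:
        acquired.append((name, get_priority()))
    return acquired


job = FakeJob(priority=0)


def lock_names():
    # Simulates a priority change arriving between two acquisitions.
    yield "lock-a"
    job.priority = -10
    yield "lock-b"


result = acquire_locks(lock_names(), lambda: job.priority)
```

Had the priority been passed as a plain value, the second acquisition would still have used the value captured at call time.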
-
- Nov 01, 2012
-
-
Michael Hanselmann authored
When jobs are still waiting for locks and the queue is shutting down, they should be returned and not actually start processing. Until now, jobs which transitioned from “queued” to “waiting” were already considered to be running as far as the shutdown code was concerned. This fixes issue 296. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 25, 2012
-
-
Michael Hanselmann authored
A new function will be added to change a job's priority. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Michael Hanselmann authored
Somehow this was missed in commit 0422250e. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Helga Velroyen <helgav@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Helga Velroyen <helgav@google.com>
-
- Oct 11, 2012
-
-
Michael Hanselmann authored
If requested via a filter or by including the “archived” output, archived jobs will be loaded and shown. This is significantly slower than just listing normal jobs, therefore by default they are not loaded at all. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This attribute is set to True for jobs which were restored from an archived file. A new filter will act on this field. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
The description was not accurate. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 05, 2012
-
-
Michael Hanselmann authored
First: This enables the use of “gnt-job watch $id” for archived jobs. Now, the reason for actually making this work is that during sufficiently large group or node evacuations jobs are archived before the client gets to poll for their output. This led to situations where the jobs would finish successfully, but the client reported an error because it couldn't see the job anymore. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com> (cherry picked from commit 04569469)
-
Michael Hanselmann authored
First: This enables the use of “gnt-job watch $id” for archived jobs. Now, the reason for actually making this work is that during sufficiently large group or node evacuations jobs are archived before the client gets to poll for their output. This led to situations where the jobs would finish successfully, but the client reported an error because it couldn't see the job anymore. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
- Sep 25, 2012
-
-
Michael Hanselmann authored
- pathutils: Prepend node-specific prefix path
- RPC: Use virtual paths (see vcluster.py)
- SSH: Pass environment variables, use destination's node directory when copying files using scp, use GANETI_HOSTNAME to determine hostname
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Sep 18, 2012
-
-
Michael Hanselmann authored
File system paths moved from constants to pathutils. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 07, 2012
-
-
Iustin Pop authored
This has been a long-standing cleanup item, which we've always refrained from doing due to the high estimated effort needed. In reality, it turned out that after some infrastructure improvements (the previous patches), the actual job queue-related changes are quite small. We will need to update the NEWS file later, but so far the RAPI documentation doesn't mention that the job ID is a string (it only says it is "a number"), so it doesn't look like it needs an update. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Jun 15, 2012
-
-
Michael Hanselmann authored
These don't really need to be in jqueue, and a new function will be added to convert job IDs to an integer for queries. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Mar 30, 2012
-
-
Michael Hanselmann authored
This enables the use of filters through query2 when listing jobs. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
… instead of re-calculating it on every file change. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
This rather inefficient implementation (fields are evaluated on every call to GetInfo) is not good for WaitForJobChanges and doesn't support filters, but that will be rectified in later patches. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
There was a typo and it's not necessary to repeat the class name. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Dec 22, 2011
-
-
Michael Hanselmann authored
This allows for more unittesting. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Dec 21, 2011
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Serializing to JSON using “simplejson” is significantly slower when indentation and/or sorting of dictionary keys is used. In simplejson 1.x the difference isn't that big, but with simplejson 2.x the difference can be up to a factor of 7.5. The reason is that the latter no longer uses C functions when sorting or indentation is used. With this patch we revert everything to simplejson's defaults, which should provide us with the best performance available. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
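The two serialization styles compared in this commit message can be shown side by side. This sketch uses the standard library json module as a stand-in for simplejson, which is an assumption of the example. The job_data dictionary is made up, and no timing claims are made here.

```python
import json

# Made-up job data, for illustration only.
job_data = {
    "status": "queued",
    "id": 7817,
    "ops": [{"OP_ID": "OP_TEST_DELAY", "duration": 1}],
}

# Library defaults: no indentation, keys in insertion order.
compact = json.dumps(job_data)

# The style the message describes as slower: indented, with sorted keys.
pretty = json.dumps(job_data, indent=2, sort_keys=True)

# Both forms decode to the same data; only the text layout differs.
assert json.loads(compact) == json.loads(pretty)
```

The compact form is also shorter on disk, since it carries no indentation whitespace.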
-
Michael Hanselmann authored
When an opcode is about to be processed its dependencies are evaluated using “_JobDependencyManager.CheckAndRegister”. Due to its nature that function requires a lock on the manager's internal structures. All of this happens while the job queue lock is held in shared mode (required for the job processor). When a job has been processed any pending dependencies are re-added to the job workerpool. Before this patch that would require the manager's lock and then, for adding the jobs, the job queue lock. Since this is in reverse order it will lead to deadlocks. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
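The lock-ordering problem described above can be avoided in several ways. The sketch below shows one common approach, which is not necessarily what this patch did: collect the waiting jobs while holding only the manager's lock, release it, and take the queue lock afterwards. All names are hypothetical, and plain threading.Lock objects stand in for the shared job queue lock.

```python
import threading

queue_lock = threading.Lock()    # stands in for the job queue lock
manager_lock = threading.Lock()  # stands in for the dependency manager's lock

# Hypothetical state: finished job -> jobs waiting for it.
pending = {"job1": ["job2", "job3"]}
workerpool = []


def check_and_register(job_id, dep_id):
    """Register a dependency: queue lock first, then manager lock."""
    with queue_lock:
        with manager_lock:
            pending.setdefault(dep_id, []).append(job_id)


def notify_waiters(finished_job_id):
    """Re-add jobs that waited for finished_job_id to the worker pool."""
    # Collect the waiters under the manager lock only.
    with manager_lock:
        waiters = pending.pop(finished_job_id, [])
    # The manager lock is released before the queue lock is taken, so the
    # two locks are never held in the reverse order.
    with queue_lock:
        workerpool.extend(waiters)
    return waiters
```

With this structure no code path holds the manager lock while waiting for the queue lock, which removes the circular wait.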
-
- Nov 21, 2011
-
-
Michael Hanselmann authored
Doing so will prevent job submissions (similar to a drained queue), but won't affect currently running jobs. No further jobs will be executed. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 17, 2011
-
-
Michael Hanselmann authored
This is in preparation for a clean(er) shutdown of masterd. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 27, 2011
-
-
Michael Hanselmann authored
If cmdlib.LUNodeMigrate was called for a node without primary instances it would try to submit an empty list of jobs. This was never visible via CLI as there we check the list of primary instances first. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Oct 26, 2011
-
-
Michael Hanselmann authored
With these changes job queue RPC will finally show up on the lock monitor. See below for an example. A job queue-specific class is used to restrict the use of a static list for name resolution to the job queue. Further improvements can be made to not re-create the whole RPC client for every call (e.g. by using a more dynamic resolver), but for now this works.

  rpc/node8.example.com/jobqueue_update Jq8/Job9/TEST_DELAY

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Sep 06, 2011
-
-
Michael Hanselmann authored
Commit 66bd7445 added an assertion to ensure a finalized job has its “end_timestamp” attribute set. Unfortunately it didn't cover a case when the queue is recovering from an unclean master shutdown. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> (cherry picked from commit 45df0793)
-
- Aug 30, 2011
-
-
Andrea Spadaccini authored
Running pylint 0.24.0 revealed 2 errors and 1 warning. Here is how I fixed them:
* jqueue.py: silenced E1101
* netutils.py: rewrote the list comprehension using extend()
* watcher/__init__.py: fixed a missing format string parameter
These changes are backwards-compatible with pylint 0.21.1. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Andrea Spadaccini authored
In version 0.21, pylint unified all the disable-* (and enable-*) directives to disable (resp. enable). This leads to many DeprecationWarnings being emitted even if one uses the recommended version of pylint (0.21.1, as stated in devnotes.rst). This commit changes all the disable-msg directives to disable. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 19, 2011
-
-
Michael Hanselmann authored
This was a regression from 2.4. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Aug 02, 2011
-
-
Michael Hanselmann authored
By sleeping for 100ms after receiving a notification for a changed job file the job is given some additional time to change again. This significantly reduces the number of LUXI calls for WaitForJobChanges (depending on the job, in my tests with “gnt-cluster verify --debug-simulate-errors” by about 80%), and improves performance (the same job went from around 7 seconds to around 3.5 seconds). This method is not perfect. The algorithm could be made more complex, e.g. by increasing the delay on each change, etc., but for now this simple change provides a good improvement. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
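The idea of waiting briefly after the first notification so that later changes are picked up in the same round can be sketched like this. It is a simplified illustration, not the job watcher's code: a queue.Queue stands in for the stream of change notifications, and the function name and parameters are hypothetical.

```python
import queue
import time


def wait_for_change(events, delay=0.1, sleep=time.sleep):
    """Block until one notification arrives, then coalesce the ones after it.

    Returns the number of notifications folded into this single wake-up.
    """
    events.get()                 # wait for the first notification
    sleep(delay)                 # give the job file time to change again
    count = 1
    while True:
        try:
            events.get_nowait()  # drain notifications that arrived meanwhile
        except queue.Empty:
            break
        count += 1
    return count


notifications = queue.Queue()
for _ in range(3):
    notifications.put("job file changed")

# Three file changes result in a single wake-up of the caller.
coalesced = wait_for_change(notifications)
```

The sleep parameter is injectable so that tests can replace the real delay with a no-op.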
-
- Jul 21, 2011
-
-
Michael Hanselmann authored
This makes them visible to the user. Example:

  $ gnt-debug locks -o name,pending
  Name    Pending
  job/890 job:891,892
  job/892 job:894

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-