- 24 Apr, 2014 2 commits
-
-
Petr Pudlak authored
Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
.. because modifying the queue inside the handler can have unexpected consequences. Since Python 2 has no clean way to modify a variable from an inner function, we have to use a list as a wrapper. (Python 3 has the "nonlocal" keyword for this.) Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
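The list-wrapper idiom mentioned in this commit can be sketched as follows. This is only an illustration: the names `make_flag_handler` and `make_flag_handler_py3` are hypothetical and do not come from the Ganeti sources. The handler merely records that something happened, so the caller can act on it later, outside the handler.

```python
def make_flag_handler():
    # Python 2 compatible: wrap the value in a one-element list, so the
    # inner function mutates the list instead of rebinding the outer name.
    triggered = [False]

    def handler(*_args):
        triggered[0] = True

    def was_triggered():
        return triggered[0]

    return handler, was_triggered


def make_flag_handler_py3():
    # Python 3 only: "nonlocal" lets the inner function rebind the variable.
    triggered = False

    def handler(*_args):
        nonlocal triggered
        triggered = True

    def was_triggered():
        return triggered

    return handler, was_triggered
```

Both variants behave the same from the caller's point of view; only the first one also runs on Python 2.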
-
- 22 Apr, 2014 6 commits
-
-
Klaus Aehlig authored
When failing a job, add an entry to the reason trail, indicating what made the job fail (e.g., failed to fork or detected job death). Signed-off-by:
Klaus Aehlig <aehlig@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
-
Klaus Aehlig authored
...to simplify manipulation of them. Signed-off-by:
Klaus Aehlig <aehlig@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
-
Klaus Aehlig authored
...to be able to operate on the MetaOpCode that is behind an InputOpCode (if we're in the right component of the sum). Signed-off-by:
Klaus Aehlig <aehlig@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
-
Klaus Aehlig authored
...so that manipulations deep within such an object become simpler. Signed-off-by:
Klaus Aehlig <aehlig@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
-
Klaus Aehlig authored
Move all object definitions to a separate file. This way, the lens module for JQueue can use these objects, while JQueue can use the lenses. For use outside, we re-export the objects. Signed-off-by:
Klaus Aehlig <aehlig@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
-
Klaus Aehlig authored
Signed-off-by:
Klaus Aehlig <aehlig@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
-
- 17 Apr, 2014 32 commits
-
-
Petr Pudlak authored
.. and get rid of unnecessary variable binding. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
.. because with the new mechanism, the process can be slower and the job sometimes returned successfully before it could be cancelled. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Klaus Aehlig authored
Make the onTimeWatcher of the job queue scheduler also verify that all notionally running jobs are indeed alive. If a job is found dead, remove it from the list of running jobs and update the job file to reflect the unexpected death. Signed-off-by:
Klaus Aehlig <aehlig@google.com> Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
-
Petr Pudlak authored
We can only send the signal if the job is alive and if there is a process ID in the job file (which means that the signal handler has been installed). If it's missing, we need to wait and retry. In addition, after we send the signal, we wait for the job to actually die, to retain the original semantics. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
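The cancellation logic described in this commit (signal only a live job with a recorded process ID, retry otherwise, then wait for the job to die) might look roughly like the sketch below. All names here (`cancel_job`, `read_pid`, `is_alive`, `send_signal`) are hypothetical, and the retry count and delay are arbitrary assumptions.

```python
import os
import signal
import time


def cancel_job(read_pid, is_alive, send_signal=os.kill,
               sig=signal.SIGTERM, retries=10, delay=0.1):
    """Try to cancel a job; return True if a signal was sent and the job died.

    read_pid() returns the PID stored in the job file, or None if it is not
    there yet; is_alive() tells whether the job process is still running.
    """
    for _ in range(retries):
        if not is_alive():
            return False  # nothing left to cancel
        pid = read_pid()
        if pid is not None:
            # A PID in the job file means the signal handler is installed.
            send_signal(pid, sig)
            break
        time.sleep(delay)  # PID not written yet: wait and retry
    else:
        return False  # gave up waiting for the PID
    # Keep the original semantics: return only once the job is really dead.
    while is_alive():
        time.sleep(delay)
    return True
```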
-
Petr Pudlak authored
.. so that one can see which lock file was tested and with what result. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
The functionality is kept the same, but instead of comparing for equality, a more general version based on a predicate is added. This allows basing the condition on only a part of the output. In addition, 'bracket' is added so that the inotify data structure is properly cleaned up even if the inner IO action throws an exception. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
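The two ideas in this commit, a predicate instead of an equality test and guaranteed cleanup in the style of 'bracket', can be illustrated in Python. This is an analogue only, not the Haskell code: it polls instead of using inotify, and `wait_for`, `acquire` and `release` are hypothetical names.

```python
import time


def wait_for(read_value, predicate, timeout, acquire, release, delay=0.05):
    """Poll read_value(resource) until predicate(value) holds or time runs out.

    Returns the accepted value, or None on timeout. The resource obtained by
    acquire() is always passed to release(), even if read_value raises.
    """
    resource = acquire()
    try:
        deadline = time.monotonic() + timeout
        while True:
            value = read_value(resource)
            if predicate(value):
                return value
            if time.monotonic() >= deadline:
                return None
            time.sleep(delay)
    finally:
        # Plays the role of the release action of 'bracket'.
        release(resource)
```

Because the condition is a predicate, a caller can test only one part of the value, for example `lambda v: v.endswith(":1")`.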
-
Petr Pudlak authored
.. so that it's possible to use logging operations there. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
This is a bit problematic, as there is no portable way to list all open file descriptors, and we can't track them all, because they're also opened by third-party libraries such as inotify. Therefore we use /proc/self/fd and /dev/fd, which should work for all Linux flavors and most *BSD as well. If both are missing, we don't do anything and just log a warning. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
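A minimal Python sketch of the approach described in this commit: try /proc/self/fd, then /dev/fd, and only log a warning if neither exists. The function name `list_open_fds` is hypothetical; the actual change lives in the Ganeti sources and may differ.

```python
import logging
import os


def list_open_fds(candidates=("/proc/self/fd", "/dev/fd")):
    """Return the sorted open file descriptors, or None if they can't be listed."""
    for directory in candidates:
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # this directory is missing here, try the next one
        # Note: the listing may include the descriptor that was opened
        # temporarily to read the directory itself.
        return sorted(int(name) for name in names if name.isdigit())
    logging.warning("Cannot list open file descriptors: none of %s is usable",
                    ", ".join(candidates))
    return None
```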
-
Petr Pudlak authored
`orElse` works just like `mplus` of ResultT, but it only requires `MonadError` and doesn't accumulate the errors; if both actions fail, it just returns the second one. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
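The behaviour of `orElse` can be mimicked in Python with exceptions standing in for `MonadError`. This is an analogue under that assumption, not a translation of the Haskell function; `or_else` is a hypothetical name.

```python
def or_else(first, second):
    """Run first(); if it fails, run second() instead.

    Errors are not accumulated: if both actions fail, only the error of
    the second action reaches the caller.
    """
    try:
        return first()
    except Exception:
        return second()
```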
-
Petr Pudlak authored
If the endpoint (such as Luxid or WConfd) isn't running, don't fail immediately. Instead retry (within the given timeout) and try to reconnect. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
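The retry behaviour described in this commit could be sketched as below for a Unix domain socket. The function name, the fixed delay and the use of a raw socket are assumptions made for illustration; the actual client code is not shown in this log.

```python
import socket
import time


def connect_with_retry(path, timeout, delay=0.1):
    """Connect to a Unix socket, retrying until the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except OSError:
            sock.close()
            if time.monotonic() + delay > deadline:
                raise  # out of time: report the last connection error
            time.sleep(delay)  # endpoint may still be starting up
```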
-
Petr Pudlak authored
On the Python side it was assumed that the blacklisted private parameters were always dictionaries, but since they're optional, they could be 'None' as well. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
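The kind of bug fixed here is easy to reproduce: code that treats an optional dictionary as always present fails on `None`. A hedged illustration with a hypothetical helper (`private_param_names` is not a Ganeti function):

```python
def private_param_names(private_params):
    """Return the sorted keys of an optional dict of private parameters."""
    # Optional parameters may arrive as None rather than as an empty dict,
    # so normalize before calling dict methods on them.
    return sorted((private_params or {}).keys())
```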
-
Petr Pudlak authored
Since each process now only creates a 1-job queue, trying to use file locks only causes job deadlock. Also reduce the number of threads running in a job queue to 1. Later the job queue will be removed completely. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
If a Haskell program is compiled with -threaded, then inheriting open file descriptors doesn't work, which breaks our job death detection mechanism. (And on older GHC versions even forking doesn't work.) Therefore, let the Luxi daemon check for this and refuse to start if it detects that it has been compiled with -threaded. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Klaus Aehlig authored
As luxid forks off processes now, it may receive SIGCHLD signals. Hence add a handler for this. Since we obtain the success of the child from the job file, ignoring is good enough. Signed-off-by:
Klaus Aehlig <aehlig@google.com> Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Petr Pudlak <pudlak@google.com>
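For illustration, installing a do-nothing SIGCHLD handler looks like this in Python; luxid itself is written in Haskell, so this is only an analogue and `install_sigchld_handler` is a hypothetical name.

```python
import signal


def install_sigchld_handler():
    """Install a handler that receives SIGCHLD and deliberately does nothing."""
    def _ignore(signum, frame):
        # The child's outcome is read from the job file, so there is
        # nothing to do when the signal arrives.
        pass

    # signal.signal returns the previous handler, in case it must be restored.
    return signal.signal(signal.SIGCHLD, _ignore)
```

Note that a no-op handler is not the same as setting the disposition to `signal.SIG_IGN`: with `SIG_IGN`, POSIX systems may reap children automatically, which changes what `wait` calls observe.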
-
Petr Pudlak authored
.. instead of just letting the master daemon handle them. We try to start all given jobs independently and requeue those that failed. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
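Starting jobs independently and requeuing the failures amounts to partitioning the input by outcome. A sketch with hypothetical names (`start_jobs`, `start_job`), in Python rather than the Haskell used by the daemon:

```python
def start_jobs(jobs, start_job):
    """Try to start every job; return (started, to_requeue).

    A failure to start one job does not prevent the others from starting.
    """
    started, to_requeue = [], []
    for job in jobs:
        try:
            start_job(job)
        except Exception:
            to_requeue.append(job)  # put it back into the queue later
        else:
            started.append(job)
    return started, to_requeue
```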
-
Petr Pudlak authored
.. which will be used if the Luxi daemon attempts to start a job, but fails. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
The ID of the current process is stored in the job file. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
This will allow checking whether a particular job is alive and sending signals to it while it's running. For backwards compatibility, the fields aren't serialized if they are missing. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
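Omitting absent fields keeps job files written by the new code readable by older code that knows nothing about them. A sketch of that serialization rule follows; the field names `process_id` and `livelock` are assumptions made for the example, not taken from this log.

```python
import json


def serialize_job(job_id, process_id=None, livelock=None):
    """Serialize a job, leaving out optional fields that are not set."""
    data = {"id": job_id}
    # Hypothetical optional fields: written only when present, so files
    # without them look exactly like those produced by older versions.
    if process_id is not None:
        data["process_id"] = process_id
    if livelock is not None:
        data["livelock"] = livelock
    return json.dumps(data, sort_keys=True)
```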
-
Petr Pudlak authored
.. using the POSIX type ProcessID. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
They will be used by Luxi daemon to spawn jobs as separate processes. The communication protocol between the Luxi daemon and a spawned process is described in the documentation of module Ganeti.Query.Exec. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
Use the function where appropriate. Also, the handling of CancelJob is slightly refactored to use ResultT, which is used by the new function. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
The file is initialized and kept within JQStatus. It is temporarily assigned to jobs spawned by Luxi until they create their own livelock files. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
.. as it has nothing special to do with WConfd and fits the new module better. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
Currently it exports a function for creating livelock files. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
.. so that job processes can supply the livelock inherited from the master process. Also add a logging statement for creating the job queue (which will be removed when we get rid of Python job queues completely). Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
This will be used by job processes temporarily, until they stop using the job queue completely. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
.. so that it works for LiveLockName as well. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
Since job processes inherit their live-lock files from the master process, they don't directly work with the file, they just need to use the name. This class exposes the same interface as LiveLock for such pre-created livelocks. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
This allows setting up a client using the Luxi-like protocol over a pipe, which will be needed for job processes to communicate with their parent process. While at it, fix the style of calling __init__ in AbstractStubClient. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
.. so that Haskell code can create them at the proper place. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
This is needed for properly executing Python job processes. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-
Petr Pudlak authored
The purpose is to keep the communication channel open, while replacing a 'Client' with something else. Signed-off-by:
Petr Pudlak <pudlak@google.com> Reviewed-by:
Klaus Aehlig <aehlig@google.com>
-