- Oct 07, 2010
-
-
Iustin Pop authored
This makes almost all of the daemons show error messages and not return until they are listening on the appropriate sockets. Masterd is the only special case, as it doesn't do enough initialization during server creation, only later. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Currently, GenericMain uses a two-stage workflow:
  - Check, before forking
  - then Exec, after forking
This means we have no way to treat preparation work (before the daemon is ready for work) differently from the actual work. The patch adds another PreExec function that is run just before Exec, and which should ensure that the daemon is ready to serve clients before it returns. Its result is then passed as the third argument to Exec. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
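A rough sketch of the three-stage flow described in this message. The function name `generic_main` and the callback names are hypothetical, and the fork itself is left out; only the ordering and the hand-off of the preparation result as the third argument come from the message.

```python
def generic_main(options, args, check_fn, prepare_fn, exec_fn):
    """Run the check / prepare / exec stages in order (forking omitted)."""
    # Stage 1: validate options while still attached to the console.
    check_fn(options, args)
    # A real daemon would fork at this point (not shown in this sketch).
    # Stage 2: preparation; should return only once the daemon can serve.
    prep_result = prepare_fn(options, args)
    # Stage 3: the actual work, receiving the preparation result as the
    # third argument.
    return exec_fn(options, args, prep_result)


calls = []


def check(options, args):
    calls.append("check")


def prepare(options, args):
    calls.append("prepare")
    return {"socket": "listening"}


def serve(options, args, prep_result):
    calls.append("exec")
    return prep_result["socket"]


result = generic_main({}, [], check, prepare, serve)
```

Keeping preparation separate from the serving loop lets startup errors surface before the caller is told the daemon is up.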
-
- Sep 24, 2010
-
-
Michael Hanselmann authored
As already noted in the design document, an opcode's priority is increased when the lock(s) can't be acquired within a certain amount of time, except at the highest priority, where in such a case a blocking acquire is used. A unittest is provided. Priorities are not yet used for acquiring the lock(s)—this will need further changes on mcpu. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
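A minimal sketch of the escalation rule described in this message, not the actual implementation. It assumes that smaller numbers mean higher priority and that a hypothetical `try_acquire(priority, timeout)` callable returns True on success, with `timeout=None` meaning a blocking acquire; the constants are illustrative.

```python
HIGHEST_PRIORITY = -20  # assumption: smaller number means more urgent
DEFAULT_PRIORITY = 0


def acquire_with_escalation(try_acquire, priority=DEFAULT_PRIORITY,
                            timeout=1.0):
    """Return the priority at which the lock was finally acquired."""
    while True:
        if priority <= HIGHEST_PRIORITY:
            # Nothing left to escalate to: wait for as long as it takes.
            try_acquire(priority, None)
            return priority
        if try_acquire(priority, timeout):
            return priority
        # Timed out: raise the priority by one step and try again.
        priority -= 1


attempts = []


def flaky(priority, timeout):
    """Fake lock that fails three times, then succeeds."""
    attempts.append((priority, timeout))
    return timeout is None or len(attempts) > 3


final_priority = acquire_with_escalation(flaky, priority=0, timeout=0.5)
```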
-
- Sep 13, 2010
-
-
Michael Hanselmann authored
This patch moves the code watching the users file into a separate class so that it is not mixed with HTTP serving. The users file is now driven from outside the HTTP server class. The documentation is also updated to mention the automatic reloading. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Sep 10, 2010
-
-
René Nussbaumer authored
Please note: This only works if the file existed upon startup. If the file was created later, ganeti-rapi has to be restarted. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 07, 2010
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
This partially reverts commit 8b72b05c. Basically, it removes the user-related changes. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 06, 2010
-
-
René Nussbaumer authored
The startup of the daemons would take a lot of time otherwise. It is also not necessary to set the permissions of those files over and over again, because once the daemons have been migrated to the user they will keep creating the files for that user. The full run is intended as the initial upgrade. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Please note that this can and will be improved over time. There are discussions about automated file generation of ensure-dirs so we can _really_ keep all the permissions and file ownerships in one place. Because right now they are all in this file _and_ on every WriteFile call. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 02, 2010
-
-
Iustin Pop authored
Since the RAPI certificate is not necessarily self-signed, and we currently don't have any configuration variable for the real CA file, we disable the CA checks for now. This fixes the 'restart RAPI every 5 minutes' problem with non-self-signed certs. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Aug 24, 2010
-
-
Michael Hanselmann authored
This patch adds an initial implementation of a lock monitor, accessible to the user through “gnt-debug locks”. It currently shows all resource locks: BGL, nodes and instances. Config and job queue locks could be shown too, but wouldn't be of much help. The current owner(s) and mode are also shown. Showing pending acquires will require further changes on the SharedLock internals and is not yet implemented. Example output:

  $ gnt-debug locks -o name,mode,owner
  Name            Mode      Owner
  BGL/BGL         shared    JobQueue19/Job147
  instances/inst1 exclusive JobQueue19/Job147
  instances/inst2 -         -
  instances/inst3 -         -
  instances/inst4 -         -
  nodes/node1     exclusive JobQueue19/Job147
  nodes/node2     exclusive JobQueue19/Job147

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 23, 2010
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Aug 19, 2010
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 18, 2010
-
-
Manuel Franceschini authored
This patch enables IPv6 name resolution by using socket.getaddrinfo instead of socket.gethostbyname_ex. It renames the HostInfo class to Hostname and unifies its use throughout the code, using static calls where no object is needed. Some obsolete code is removed as well. For now, we just resolve to IPv4 addresses, but this will change once it is needed. Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
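For illustration, a small sketch of family-aware resolution with socket.getaddrinfo. The helper name `resolve` is hypothetical; it defaults to IPv4, matching the "for now" behaviour the message describes, and is not the Hostname class itself.

```python
import socket


def resolve(hostname, family=socket.AF_INET):
    """Return the addresses of hostname for one address family."""
    # getaddrinfo returns (family, type, proto, canonname, sockaddr)
    # tuples; the address string is the first element of sockaddr.
    results = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM)
    return [sockaddr[0] for (_fam, _type, _proto, _canon, sockaddr) in results]


addresses = resolve("127.0.0.1")
```

Passing `socket.AF_INET6` instead would return IPv6 addresses, which gethostbyname_ex cannot do.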
-
Manuel Franceschini authored
This patch unifies the netutils functions dealing with IP addresses into three classes:
  - IPAddress: Common IP address functionality
  - IPv4Address: IPv4-specific functionality
  - IPv6address: IPv6-specific functionality
Furthermore it adds methods to check whether an address is a loopback address, replacing the .startswith("127") for IPv4 and adding IPv6 support. It also provides the basis for future IPv6 address handling. Methods to convert IP strings to their corresponding integer values will make it possible to canonicalize IPv6 addresses. Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
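A hedged sketch of how such a class split might look. The class layout, attribute names and method names below are illustrative only and are not taken from netutils; the point is that comparing integer values handles both the IPv4 loopback network and non-canonical spellings of the IPv6 loopback address.

```python
import socket


class IPAddress:
    """Common functionality; subclasses set family, bits, loopback_net."""
    family = None
    bits = None
    loopback_net = None  # (network as integer, prefix length)

    @classmethod
    def to_int(cls, address):
        # inet_pton validates the string and yields network-order bytes.
        packed = socket.inet_pton(cls.family, address)
        return int.from_bytes(packed, "big")

    @classmethod
    def is_loopback(cls, address):
        net, prefix_len = cls.loopback_net
        shift = cls.bits - prefix_len
        # Compare only the network part of both values.
        return cls.to_int(address) >> shift == net >> shift


class IPv4Address(IPAddress):
    family = socket.AF_INET
    bits = 32
    loopback_net = (0x7F000000, 8)  # 127.0.0.0/8


class IPv6Address(IPAddress):
    family = socket.AF_INET6
    bits = 128
    loopback_net = (1, 128)  # ::1/128
```

For example, `IPv6Address.is_loopback("0:0:0:0:0:0:0:1")` and `IPv6Address.is_loopback("::1")` agree because both strings map to the same integer.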
-
- Jul 29, 2010
-
-
Michael Hanselmann authored
By changing it to a normal parameter, which must be a sequence, we can start using keyword parameters. Before this patch all arguments to “AddTask(self, *args)” were passed as arguments to the worker's “RunTask” method. Priorities, which should be optional and will be implemented in a future patch, must be passed as a keyword parameter. This means “*args” can no longer be used as one can't combine *args and keyword parameters in a clean way:

  >>> def f(name=None, *args):
  ...   print "%r, %r" % (args, name)
  ...
  >>> f("p1", "p2", "p3", name="thename")
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: f() got multiple values for keyword argument 'name'

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
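A minimal sketch of the resulting calling convention, using a hypothetical `WorkerPool.add_task` rather than the real worker pool. The `priority` keyword is shown only to illustrate why the change is needed; according to the message it arrives in a later patch.

```python
class WorkerPool:
    """Toy pool that only records tasks (no threads in this sketch)."""

    def __init__(self):
        self.tasks = []

    def add_task(self, args, priority=None):
        # The task arguments travel as one explicit sequence, so optional
        # keyword parameters can never collide with them.
        if not isinstance(args, (list, tuple)):
            raise TypeError("args must be a sequence")
        self.tasks.append((priority, tuple(args)))


pool = WorkerPool()
pool.add_task(("p1", "p2", "p3"))
pool.add_task(("p1",), priority=10)
```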
-
- Jul 26, 2010
-
-
Iustin Pop authored
Currently, the master IP activation is done in the Exec function. Since the original masterd process returns after forking, and Exec is run in the (grand)child process, this means that after 'ganeti-masterd' has returned there are still initialization tasks running. Normally this is not a problem, but in cases where one does quick master failovers, this creates a race condition which hits the QA scripts especially hard. To solve this, and make the startup process cleaner (the system is in steady state after the command has returned, even though masterd startup could still fail), we move the IP activation to Check(). This also allows error messages about the IP activation to be seen on the console. With this patch enabled, I can no longer reproduce the double-failover errors, which were occurring before in 4/5 cases. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
This is needed because not just the cli scripts need this decorator, but the master daemon too (and it already duplicated the code once). In cli.py we just leave a stub, so that we don't have to modify all the scripts to import rpc.py. We then change the master daemon code to reuse this decorator, instead of duplicating it. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
This patch implements a few changes to the instance handling. First, old instances which no longer exist on the cluster are removed from the state file, to keep things clean. Second, the instance restart counters are reset every 8 hours, since some error cases might be transient (e.g. networking issues, or machine temporarily down), and if the problem takes more than 5 restarts but is not permanent, watcher will not restart the instance. The value of 8 hours is, I think, both conservative (so as not to hammer the cluster too often with restarts) and fast enough to clear semi-transient problems. And last, if an instance is not restarted due to exhausted retries, a warning should be logged, otherwise it's hard to understand why watcher doesn't want to restart an ERROR_down instance. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
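The bookkeeping described in this message could look roughly like the sketch below. The record layout, function names and constant names are assumptions made for illustration; only the 5-restart limit and the 8-hour reset come from the message.

```python
MAX_RETRIES = 5
RETRY_EXPIRATION = 8 * 3600  # seconds


def should_restart(record, now):
    """Decide on a restart and update the per-instance record in place."""
    # Forget old failures once the expiration period has passed.
    if now - record["last_attempt"] > RETRY_EXPIRATION:
        record["attempts"] = 0
    if record["attempts"] >= MAX_RETRIES:
        # Retries exhausted; the caller is expected to log a warning.
        return False
    record["attempts"] += 1
    record["last_attempt"] = now
    return True


def prune(state, known_instances):
    """Drop records of instances that no longer exist on the cluster."""
    return {name: rec for name, rec in state.items()
            if name in known_instances}


record = {"attempts": 5, "last_attempt": 0}
blocked = should_restart(record, 100)
allowed = should_restart(record, RETRY_EXPIRATION + 1)
```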
-
- Jul 16, 2010
-
-
Michael Hanselmann authored
Instead of using our custom HTTP client, using PycURL's multi interface allows us to get rid of the HTTP client threadpool. The majority of the code is still in the ganeti.http.client module. A simple per-thread HTTP client pool gives cURL a chance to cache and retain as much information as possible (e.g. SSL certs). Unused HTTP clients (e.g. due to removed nodes) are deleted after 25 requests going through the pool. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
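One way to picture the expiry rule is the following sketch. It is not the ganeti.http.client code: the class name, the generation counter and the `factory` callable are assumptions, and a real per-thread pool would additionally be kept in thread-local storage.

```python
class ClientPool:
    """Reuse clients by name; drop those unused for 25 pool requests."""
    MAX_IDLE_REQUESTS = 25

    def __init__(self, factory):
        self._factory = factory
        self._generation = 0
        self._clients = {}  # name -> [client, generation of last use]

    def get(self, name):
        self._generation += 1
        entry = self._clients.get(name)
        if entry is None:
            entry = self._clients[name] = [self._factory(name), 0]
        entry[1] = self._generation
        # Expire clients that have not been requested recently.
        limit = self._generation - self.MAX_IDLE_REQUESTS
        stale = [key for key, (_client, used) in self._clients.items()
                 if used <= limit]
        for key in stale:
            del self._clients[key]
        return entry[0]

    def names(self):
        return sorted(self._clients)


pool = ClientPool(lambda name: object())
first = pool.get("node1")
for _ in range(24):
    pool.get("node2")
names_before = pool.names()
pool.get("node2")
names_after = pool.names()
```

Reusing one client object per node is what gives the underlying library the chance to keep connection state between requests.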
-
- Jul 12, 2010
-
-
Manuel Franceschini authored
This patch series basically adds a new parameter 'family' to the constructors of daemon.AsyncUDPSocket and confd.client.ConfdUDPClient. This enables the users of these two classes to support IPv6. In ganeti-confd.ConfdAsyncUDPClient a method to check the address families of all peers is added. Furthermore it adds unittests for the added functionality. Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 09, 2010
-
-
Manuel Franceschini authored
This patch moves network utility functions to a dedicated module. Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 07, 2010
-
-
Luca Bigliardi authored
Node daemon prints a lot of warnings if --no-mlock option is not specified and ctypes module is not present. With the following patch the warning is printed only at noded startup. Signed-off-by:
Luca Bigliardi <shammash@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jul 06, 2010
-
-
Luca Bigliardi authored
Signed-off-by:
Luca Bigliardi <shammash@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 02, 2010
-
-
Iustin Pop authored
This was "broken" for almost a year :) Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jul 01, 2010
-
-
Michael Hanselmann authored
Currently the RAPI client uses the urllib2 and httplib modules from Python's standard library. They're used with pyOpenSSL in a very fragile way, and there are known issues when receiving large responses from a RAPI server. By switching to PycURL we leverage the power and stability of the widely-used curl library (libcurl). This brings us much more flexibility than before, and timeouts were easily implemented (something that would have involved a lot of work with the built-in modules). There's one small drawback: Programs using libcurl have to call curl_global_init(3) (available as pycurl.global_init) while exactly one thread is running (e.g. before other threads) and are supposed to call curl_global_cleanup(3) (available as pycurl.global_cleanup) upon exiting. See the manpages for details. A decorator is provided to simplify this. Unittests for the new code are provided, increasing the test coverage of the RAPI client from 74% to 89%. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
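The decorator idea can be sketched generically as below. This is not the decorator shipped with the RAPI client; `with_global_state` and its parameters are hypothetical, and the init and cleanup callables are injected so the sketch runs without PycURL installed.

```python
import functools


def with_global_state(init_fn, cleanup_fn):
    """Wrap a program's main function in global init and cleanup calls."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Must happen while only one thread is running.
            init_fn()
            try:
                return fn(*args, **kwargs)
            finally:
                # Runs even if the wrapped function raises.
                cleanup_fn()
        return wrapper
    return decorator


events = []


# With PycURL one would pass something like
# lambda: pycurl.global_init(pycurl.GLOBAL_ALL) and pycurl.global_cleanup.
@with_global_state(lambda: events.append("init"),
                   lambda: events.append("cleanup"))
def main():
    events.append("main")
    return 0


exit_code = main()
```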
-
- Jun 30, 2010
-
-
Manuel Franceschini authored
Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Guido Trotter authored
Why it's needed here but not a few lines above is a mystery that only pylint understands. Also fix an indentation error in another disable, for the same function. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jun 29, 2010
-
-
Guido Trotter authored
Each luxi connection now creates an asyncore MasterClientHandler (which is an AsyncTerminatedMessageStream subclass, sending each message to a client worker). This makes it harder to DOS the master daemon by just creating luxi connections, as each of them will use memory and file descriptors, but not a dedicated thread. Each connection will only handle one message at a time. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jun 23, 2010
-
-
Iustin Pop authored
While we only support the 'parameters' check today, the RPC call is generic enough that it will be able to support other checks in the future. The backend function will validate both the parameters list (so as to make sure we don't pass in extra parameters that the OS validation doesn't care about) and the parameter values, via the OS verify script. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jun 14, 2010
-
-
Michael Hanselmann authored
This “magic” value will be used to ensure that we don't accidentally connect to the wrong daemon (e.g. due to a bug), comparable to DRBD's per-disk secret. Just depending on the SSL certificate isn't enough as it's always per instance and not per disk. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
The hostname and port received from the remote cluster should be validated, just in case. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
Upon sending signals, ESRCH can be reported when the target no longer exists. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jun 11, 2010
-
-
Guido Trotter authored
This call was introduced but never used. In two years. Since it's just creating/removing a file it can also be done in simpler ways, without a special rpc call, if/when we need it again. In the meantime, let's give it to history. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jun 10, 2010
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jun 08, 2010
-
-
Michael Hanselmann authored
Once we have a size for an export (in the context of the import/export daemon), we can provide the user with a percentage and ETA. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
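The arithmetic behind a percentage and ETA is simple once the total size is known. The function below is an illustrative sketch with hypothetical names, not the import/export daemon's code; sizes are in bytes and throughput in bytes per second.

```python
def progress(transferred, total, throughput):
    """Return (percent, eta_seconds); either may be None if unknown."""
    if total <= 0:
        return None, None
    percent = min(100.0, transferred * 100.0 / total)
    eta = None
    if throughput > 0:
        # Remaining bytes divided by the current rate.
        eta = max(0.0, (total - transferred) / throughput)
    return percent, eta


percent, eta = progress(250, 1000, 50)
```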
-
Michael Hanselmann authored
This reports the amount of data transferred and the throughput (averaged over 60 seconds) to the master daemon. While not yet fully implemented, once the export scripts report the expected data size, we can even provide an ETA and percentage. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
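A throughput figure averaged over a 60-second window could be computed along these lines. This is a sketch under assumptions: the class name is hypothetical, and samples are (timestamp, total bytes so far) pairs supplied by the caller.

```python
import collections


class ThroughputMeter:
    """Average transfer rate over a sliding time window."""

    def __init__(self, window=60.0):
        self._window = window
        self._samples = collections.deque()

    def add(self, timestamp, total_bytes):
        self._samples.append((timestamp, total_bytes))
        # Keep one sample at or before the window start as the baseline
        # and drop everything older than that.
        while (len(self._samples) > 2 and
               self._samples[1][0] <= timestamp - self._window):
            self._samples.popleft()

    def rate(self):
        """Bytes per second over the window, or None if unknown."""
        if len(self._samples) < 2:
            return None
        (t0, b0), (t1, b1) = self._samples[0], self._samples[-1]
        if t1 <= t0:
            return None
        return (b1 - b0) / (t1 - t0)


meter = ThroughputMeter()
meter.add(0, 0)
meter.add(30, 300)
meter.add(60, 900)
early_rate = meter.rate()
meter.add(120, 2100)
late_rate = meter.rate()
```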
-
- Jun 04, 2010
-
-
Guido Trotter authored
Sometimes a node has never been a master or run rapi. In that case we need to create the file (because if rapi gets started later, it won't be able to create it itself). Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
This is a workaround until we have fully switched to user separation; it fixes the owners of directories/log files so ganeti-rapi will start flawlessly. Right now this is run for every daemon, but as it operates on a relatively small subset its impact is small. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-