- Oct 19, 2010
-
-
René Nussbaumer authored
This includes a new option for gnt-cluster init and appropriate output in gnt-cluster info, though gnt-cluster modify is not yet prepared. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
* devel-2.2:
  Bump version to 2.2.1, update NEWS

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 15, 2010
-
-
Michael Hanselmann authored
* devel-2.2:
  http.client: Disable SSL session ID cache
  Crude workaround for pylint breakage

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Apollon Oikonomopoulos authored
This patch disables the SSL session ID cache for all cURL operations. This is needed because http.HttpBase's PyOpenSSL implementation does not currently set a context using SSL_set_session_id_context(3SSL). cURL tries to re-use the session ID and, according to SSL_set_session_id_context(3SSL): "If the session id context is not set on an SSL/TLS server and client certificates are used, stored sessions will not be reused but a fatal error will be flagged and the handshake will fail." Ideally, session caching should be either controlled or disabled in HttpBase; however, PyOpenSSL does not seem to implement SSL_CTX_set_session_cache_mode or SSL_CTX_set_session_id_context, which are used for these purposes (it seems that only M2Crypto's SSL module supports these). Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Apollon Oikonomopoulos authored
This patch disables the SSL session ID cache for all cURL operations. This is needed because http.HttpBase's PyOpenSSL implementation does not currently set a context using SSL_set_session_id_context(3SSL). cURL tries to re-use the session ID and, according to SSL_set_session_id_context(3SSL): "If the session id context is not set on an SSL/TLS server and client certificates are used, stored sessions will not be reused but a fatal error will be flagged and the handshake will fail." Ideally, session caching should be either controlled or disabled in HttpBase; however, PyOpenSSL does not seem to implement SSL_CTX_set_session_cache_mode or SSL_CTX_set_session_id_context, which are used for these purposes (it seems that only M2Crypto's SSL module supports these). Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
The way we currently call pylint, the exact order in which it inspects modules in lib/http/ depends on the filesystem order. This is not good: if lib/http/server.py is loaded before lib/http/__init__.py, pylint will throw a "R0921:763:HttpMessageReader: Abstract class not referenced" (as that class is used in server.py). As a short-term fix, we just add server.py after "ganeti", so that it gets parsed (again?) and pylint sees the usage of the class. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
This was missing from commit 2287b920. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 14, 2010
-
-
Iustin Pop authored
* stable-2.2:
  Release 2.2.1~rc1
  Require aclocal 1.11.1 or above for devel/release
  Revert "Require aclocal 1.11.1 or above for autogen.sh"
  Add mising --units in gnt-instance list man page
  Set list of trusted SSL CAs for client to verify
  Require aclocal 1.11.1 or above for autogen.sh

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
I forgot this in the original patch. Sorry! Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
The interaction with cron-launched watcher is a well-known failure mode of QA:

  ---- 2010-10-14 06:54:55.464839 time=0:00:56.764827 Test tools/move-instance
  For the following tests it's recommended to turn off the ganeti-watcher cronjob.
  ---- 2010-10-14 06:54:55.465255 start Test automatic restart of instance by ganeti-watcher
  …
  Error: Domain 'instance1' does not exist.
  Command: ssh -oEscapeChar=none -oBatchMode=yes -l root -t -oStrictHostKeyChecking=yes -oClearAllForwardings=yes -oForwardAgent=yes node2 'ganeti-watcher -d'
  2010-10-13 23:55:04,479: pid=1659 ganeti-watcher:626 ERROR Can't acquire lock on state file /var/lib/ganeti/watcher.data: File already locked
  ---- 2010-10-14 06:55:04.513948 time=0:00:09.048693 Test automatic restart of instance by ganeti-watcher

In order to fix this, we disable the watcher during these tests, and re-enable it afterwards. To protect against watcher being disabled, we enable it unconditionally at the start of the QA (we do want it enabled, in order to see the interaction between the watcher and many creation/disk replace jobs, etc.).

Note: even after this patch, if a cron-watcher was started and is still running during the test, we'll have locking issues. I think for now this is OK, we'll have to see how often that happens.

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
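
The disable-then-re-enable pattern described in this commit can be sketched as a context manager. This is only an illustration, not the QA code itself: `pause_watcher` and `resume_watcher` are hypothetical callables standing in for whatever the QA scripts actually run on the cluster.

```python
import contextlib


@contextlib.contextmanager
def watcher_paused(pause_watcher, resume_watcher):
    """Pause the watcher for the duration of a block, then resume it.

    Both arguments are hypothetical callables supplied by the caller.
    """
    pause_watcher()
    try:
        yield
    finally:
        # Resume even if the test body raised, so a failed test
        # does not leave the watcher disabled.
        resume_watcher()


def run_qa(tests, pause_watcher, resume_watcher):
    """Run (needs_pause, test) pairs, pausing only around tests that ask for it."""
    # Enable unconditionally at the start, in case an earlier run left it paused.
    resume_watcher()
    for needs_pause, test in tests:
        if needs_pause:
            with watcher_paused(pause_watcher, resume_watcher):
                test()
        else:
            test()
```

The `finally` clause is the part that matters: it covers the case where a test aborts halfway and would otherwise leave the watcher off for the rest of the run.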
-
Iustin Pop authored
During cluster maintenance, when the watcher is disabled, it's useful to run it just once. This is currently inconvenient, as the watcher needs to be unpaused, then run, then paused again. This patch adds an option “--ignore-pause” that can be used to ignore the cluster-level setting. The man page is also updated, as it was missing the available options. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
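
A minimal sketch of how such an override flag might interact with the pause setting. Only the option name “--ignore-pause” comes from the commit message; the parser, `should_run` and the `cluster_paused` argument are assumptions made for illustration, and the real watcher may structure this differently.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical watcher command line."""
    parser = argparse.ArgumentParser(prog="watcher-sketch")
    parser.add_argument("--ignore-pause", action="store_true",
                        help="run even if the cluster-level pause is set")
    return parser.parse_args(argv)


def should_run(cluster_paused, opts):
    """Return True if the watcher should do its work on this invocation."""
    return opts.ignore_pause or not cluster_paused
```

With this shape, a one-off run during maintenance needs no unpause/pause round trip: the pause state stays untouched and only the single invocation ignores it.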
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
* stable-2.2:
  Require aclocal 1.11.1 or above for devel/release
  Revert "Require aclocal 1.11.1 or above for autogen.sh"
  Set list of trusted SSL CAs for client to verify
  Require aclocal 1.11.1 or above for autogen.sh

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 13, 2010
-
-
Michael Hanselmann authored
I didn't know why the code previously used “pyinotify.EventsCodes.ALL_FLAGS” instead of using the flags from “pyinotify.EventsCodes” directly. Turns out that Pyinotify 0.8 has them in “pyinotify”, not “pyinotify.EventsCodes”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
We noticed several issues when just watching the file, among them race conditions upon replacing the file using rename(2) (the new watcher would be created too soon). By just watching the directory for events on the rapi_users file, this can be avoided. A nice side-effect is that now the users file is also reloaded if it didn't exist upon ganeti-rapi's start (see the documentation update). Since ganeti-rapi now becomes active for virtually every change in the configuration directory (…/lib/ganeti), moving the rapi_users file to a separate directory will be considered. It doesn't have to happen in or before this patch, though. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
The base class can contain code useful to other inotify users. As it stands, “SingleFileEventHandler” cannot be used in ganeti-rapi, so ganeti-rapi will use its own small inotify handler class based on this base class. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Reading the file before this function allows for better error reporting. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This is for cleanup, and for later reuse in other parts of the code (outside of LUs). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Currently, masterd startup with old software versions is very confusing for users: we present two tracebacks, with a message in the middle about "version mismatch". This can lead to users believing that all that needs to be done is to fix the config file. This patch attempts to improve this by handling this case in masterd itself (not in the child), and showing a more friendly message for this case. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Guido Trotter authored
1.11.1 is the version in squeeze and lucid, and we know it works. We also know that 1.10.1 in hardy and lenny doesn't, nor does 1.10 in etch or 1.9.6 in dapper. We haven't tested any other version. With older versions python.m4 is buggy, and results in the package being built not working on python 2.6 (which uses dist-packages rather than site-packages as a module directory). Version comparison is done component-by-component, over a bash array. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
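
The commit implements the check in bash; the following is a Python sketch of the same component-by-component idea, not a transcription of the script. The function names are made up for the example, and it assumes purely numeric dotted versions.

```python
def parse_version(text):
    """Split a dotted version such as "1.11.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def version_at_least(found, required):
    """Compare component by component; Python tuples compare that way natively."""
    return parse_version(found) >= parse_version(required)


# Numeric comparison rejects 1.9.6, because 9 < 11 in the second component.
print(version_at_least("1.9.6", "1.11.1"))
# A plain string comparison orders the same pair the other way round,
# because the character "9" sorts after "1".
print("1.9.6" < "1.11.1")
```

Comparing the components as integers is what keeps 1.9.6 and 1.10.1 below 1.11.1; comparing the raw strings does not give that ordering.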
-
Guido Trotter authored
The comparison is incorrect, and the check also breaks daily work on autobuilders and older distros. This reverts commit dbc4dda7. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Currently, the custom instance parameters (hv, be, nicp) are only queryable via LUQueryInstanceData. LUQueryInstance returns only the filled parameters, thus its users (especially RAPI) have no way to know if a parameter is custom or the default value. This patch adds three new parameters: custom_hvparams, custom_beparams, custom_nicparams, that are also exported in RAPI. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
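
To illustrate why the filled values alone are not enough, here is a small sketch. The helper names and the sample parameter names are hypothetical; only the idea of defaults overridden by per-instance custom values comes from the commit message.

```python
def fill_params(defaults, custom):
    """Return the effective ("filled") parameters: defaults overridden by custom."""
    filled = dict(defaults)
    filled.update(custom)
    return filled


def describe(defaults, custom):
    """Label each effective parameter as 'custom' or 'default'."""
    filled = fill_params(defaults, custom)
    return {name: (value, "custom" if name in custom else "default")
            for name, value in filled.items()}


# Hypothetical backend parameters for one instance.
defaults = {"memory": 128, "vcpus": 1}
custom = {"vcpus": 1}  # explicitly set, but equal to the default

print(fill_params(defaults, custom))
print(describe(defaults, custom))
```

In this example the filled dictionary is identical to the defaults, so a client that sees only the filled values cannot tell that `vcpus` was set explicitly. Exposing the custom dictionary separately removes that ambiguity.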
-
Iustin Pop authored
Also fixes some wrapping issues, and one typo. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> (cherry picked from commit f8409165) Conflicts: man/gnt-instance.sgml (re-wrapped)
-
- Oct 12, 2010
-
-
Apollon Oikonomopoulos authored
As per SSL_CTX_set_client_CA_list(3SSL), set the list of acceptable CAs advertised to SSL clients to include the server's own certificate. This evidently fixes the pycurl/gnutls RPC client.

During the TLS Handshake, when client verification is requested, the Server sends a CertificateRequest message which states that the client should send a valid certificate as a response. The CertificateRequest message contains a section called "certificate_authorities", which, according to the standard, is a list of the Distinguished Names (DNs) of acceptable certification authorities. The client uses this list to send a certificate signed by one of the acceptable CAs.

Under OpenSSL's server implementation, this list must be set manually using some appropriate call, otherwise the list is empty. TLS 1.0[1] does not state whether the list may be left blank, whereas TLS 1.1[2] and 1.2[3] state that in case the list is blank, then the client *may* send any certificate of a valid type (valid types are specified elsewhere in the handshake).

OpenSSL clients seem to obey the behaviour specified in TLS 1.1+, whereas at least curl+GnuTLS does not send any certificates if the list is empty (which is not wrong per the spec, but also evidently not configurable).

[1] http://tools.ietf.org/html/rfc2246
[2] http://tools.ietf.org/html/rfc4346
[3] http://tools.ietf.org/html/rfc5246

Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Guido Trotter authored
1.11.1 is the version in squeeze and lucid, and we know it works. We also know that 1.10.1 in hardy and lenny doesn't, nor does 1.10 in etch or 1.9.6 in dapper. We haven't tested any other version. With older versions python.m4 is buggy, and results in the package being built not working on python 2.6 (which uses dist-packages rather than site-packages as a module directory). The autogen.sh interpreter is changed to bash, as we need to use the [[ builtin to compare versions with "<". [ doesn't have that functionality, and of course we can't rely on dpkg, which won't be installed on all distributions. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
The current message is not entirely clear, as it doesn't show the reason why the instance is not running. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
And sorry! Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
If a job was cancelled while it was waiting for locks, an assertion would've failed. This patch fixes the problem and provides a unit test to check for this situation. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Commit 5ef699a0 had to roll back an earlier attempt at implementing this. With the improved job queue processor, this is finally possible. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
These fields can help with debugging. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Removes code duplication. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
This is the first step for the support of wiping block devices prior to creation of the instance. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 11, 2010
-
-
Iustin Pop authored
* devel-2.2:
  RPC: disable curl's Expect header

Conflicts:
  lib/rpc.py (trivial, copyright header)

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This patch solves the very slow (~8-9 seconds) gnt-instance modify behaviour. Well, it solves the slow RPC behaviour in general, but it was most visible in that LU.

It seems that curl's behaviour with regard to file uploads (via PUT) and the 'Expect' header interacts badly with our http server.

First, our http server doesn't properly handle this header. According to RFC 2616:

  Requirements for HTTP/1.1 origin servers: Upon receiving a request which includes an Expect request-header field with the "100-continue" expectation, an origin server MUST either respond with 100 (Continue) status and continue to read from the input stream, or respond with a final status code.

Our server doesn't do this, and hence it triggers this behaviour in curl (from the curl FAQ):

  4.16 My HTTP POST or PUT requests are slow!

  libcurl makes all POST and PUT requests (except for POST requests with a very tiny request body) use the "Expect: 100-continue" header. This header allows the server to deny the operation early so that libcurl can bail out already before having to send any data. This is useful in authentication cases and others.

  However, many servers don't implement the Expect: stuff properly and if the server doesn't respond (positively) within 1 second libcurl will continue and send off the data anyway.

  You can disable libcurl's use of the Expect: header the same way you disable any header, using -H / CURLOPT_HTTPHEADER, or by forcing it to use HTTP 1.0.

This behaviour was detected by watching the captured traffic (in non-SSL mode), where, after the initial HTTP headers (ending with the Expect one), there was a ~1-2 second pause before curl sent the body.

Properly RTFM-ing would have saved ~1 day of digging around, but hey…

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
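
The curl FAQ passage quoted in this commit says the header can be disabled through CURLOPT_HTTPHEADER. A rough sketch of what that looks like from Python follows. The helper `curl_headers` is made up for this example and is not the code from lib/rpc.py; the pycurl calls are shown only in comments so that the snippet has no third-party dependency.

```python
def curl_headers(extra_headers=()):
    """Build a header list that suppresses libcurl's "Expect: 100-continue".

    A header name followed by a colon and no value asks libcurl to omit
    that header from the request.
    """
    headers = ["Expect:"]
    headers.extend(extra_headers)
    return headers


# With pycurl installed, the list would be applied roughly like this
# (left as a comment on purpose; adapt to the actual client code):
#   curl = pycurl.Curl()
#   curl.setopt(pycurl.HTTPHEADER,
#               curl_headers(["Content-Type: application/json"]))
print(curl_headers(["Content-Type: application/json"]))
```

This removes the wait on the client side only. A server that answers "100-continue" expectations as RFC 2616 requires would not trigger the pause in the first place.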
-
- Oct 08, 2010
-
-
Guido Trotter authored
* devel-2.2:
  Release Ganeti 2.2.0.1
  Bump version to 2.2.1~rc0

Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Luca Bigliardi <shammash@google.com>
-
Guido Trotter authored
* commit 'v2.2.0.1':
  Release Ganeti 2.2.0.1

Conflicts:
  NEWS - merge
  configure.ac - keep 2.2.1~rc0 version

Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Luca Bigliardi <shammash@google.com>
-
Guido Trotter authored
2.2.0 was built with old autotools, and it's incompatible with Python 2.6. Rebuilding with a newer autotools version fixes this. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Luca Bigliardi <shammash@google.com>
-
Iustin Pop authored
Currently, the logging in QA doesn't show the duration of the various steps, and if it is needed one has to perform log manipulation.

This patch changes the output so that the log information is line-based (as opposed to block-based), such that it's easy to grep for all log lines:

  ./qa/ganeti-qa.py --yes-do-it qa.json 2>&1|grep ^----
  ---- 2010-10-08 14:40:21.730382 start Test SSH connection --------------
  ---- 2010-10-08 14:40:23.156633 time=0:00:01.426251 Test SSH connection
  ---- 2010-10-08 14:40:23.156735 start ICMP ping each node --------------
  ---- 2010-10-08 14:40:24.230479 time=0:00:01.073744 ICMP ping each node
  ---- 2010-10-08 14:40:24.230583 start Test availibility of Ganeti commands
  ---- 2010-10-08 14:40:32.314586 time=0:00:08.084003 Test availibility of Ganeti commands
  ---- 2010-10-08 14:40:32.314734 start gnt-node info --------------------
  ---- 2010-10-08 14:40:32.860884 time=0:00:00.546150 gnt-node info ------

or just for the duration of the steps:

  ./qa/ganeti-qa.py --yes-do-it ../qa-mpgntac5.fra.json 2>&1|grep ^----.*time=
  ---- 2010-10-08 14:42:12.630067 time=0:00:01.239256 Test SSH connection
  ---- 2010-10-08 14:42:14.204393 time=0:00:01.574221 ICMP ping each node
  ---- 2010-10-08 14:42:22.170828 time=0:00:07.966331 Test availibility of Ganeti commands
  ---- 2010-10-08 14:42:22.701030 time=0:00:00.530037 gnt-node info ------

This will help with identifying slow steps or even graphing the QA duration.

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
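
Since the commit message mentions finding slow steps and graphing durations, here is a small sketch of pulling step durations out of the line-based output. It assumes the `time=` lines look exactly like the samples in this commit message (six fractional digits, step name at the end); the function name and regular expression are illustrative and not part of the QA scripts.

```python
import re
from datetime import timedelta

# Matches lines such as:
# ---- 2010-10-08 14:40:23.156633 time=0:00:01.426251 Test SSH connection
TIME_LINE = re.compile(
    r"^---- \S+ \S+ time=(\d+):(\d{2}):(\d{2})\.(\d{6}) (.*)$")


def step_durations(lines):
    """Yield (step name, timedelta) for every 'time=' line; skip the rest."""
    for line in lines:
        match = TIME_LINE.match(line.rstrip("\n"))
        if match is None:
            continue
        hours, minutes, seconds, micros, name = match.groups()
        yield name, timedelta(hours=int(hours), minutes=int(minutes),
                              seconds=int(seconds), microseconds=int(micros))
```

Given an open log file `log`, the slowest step could then be found with `max(step_durations(log), key=lambda item: item[1])`.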
-