- Oct 11, 2010
-
-
Iustin Pop authored
* devel-2.2:
  RPC: disable curl's Expect header

Conflicts:
  lib/rpc.py (trivial, copyright header)

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This patch solves the very slow (~8-9 seconds) gnt-instance modify behaviour. Well, it solves the slow RPC behaviour in general, but it was most visible in that LU.

It seems that curl's behaviour with regard to file uploads (via PUT) and the 'Expect' header interacts badly with our http server. First, our http server doesn't properly handle this header. According to RFC 2616:

  Requirements for HTTP/1.1 origin servers: Upon receiving a request which includes an Expect request-header field with the "100-continue" expectation, an origin server MUST either respond with 100 (Continue) status and continue to read from the input stream, or respond with a final status code.

Our server doesn't do this, and hence it triggers this behaviour in curl (from the curl FAQ):

  4.16 My HTTP POST or PUT requests are slow!

  libcurl makes all POST and PUT requests (except for POST requests with a very tiny request body) use the "Expect: 100-continue" header. This header allows the server to deny the operation early so that libcurl can bail out already before having to send any data. This is useful in authentication cases and others.

  However, many servers don't implement the Expect: stuff properly and if the server doesn't respond (positively) within 1 second libcurl will continue and send off the data anyway.

  You can disable libcurl's use of the Expect: header the same way you disable any header, using -H / CURLOPT_HTTPHEADER, or by forcing it to use HTTP 1.0.

This behaviour was detected by watching the captured traffic (in non-SSL mode), where there was a ~1-2 second pause between the initial HTTP headers (ending with the Expect one) and curl sending the body.

Properly RTFM-ing would have saved ~1 day of digging around, but hey…

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
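
A minimal sketch of the workaround the curl FAQ describes, assuming a pycurl-based client. The helper names, the URL and body parameters, and the extra Content-type header are hypothetical and not taken from lib/rpc.py. An `Expect:` entry with an empty value is curl's documented way of suppressing a header.

```python
def build_headers(extra_headers=None):
    """Return a curl header list that suppresses "Expect: 100-continue".

    Hypothetical helper: an "Expect:" entry with an empty value tells
    curl to drop the header entirely.
    """
    headers = list(extra_headers or [])
    headers.append("Expect:")
    return headers


def configure_curl(curl, url, body):
    """Apply the header list to a pycurl.Curl object (assumed usage)."""
    import pycurl  # imported lazily so build_headers() works without pycurl

    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.CUSTOMREQUEST, "PUT")
    curl.setopt(pycurl.POSTFIELDS, body)
    curl.setopt(pycurl.HTTPHEADER,
                build_headers(["Content-type: application/json"]))
    return curl
```

With the header suppressed, curl sends the request body right away instead of waiting for a "100 Continue" that the server never sends.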
-
- Oct 08, 2010
-
-
Guido Trotter authored
* devel-2.2:
  Release Ganeti 2.2.0.1
  Bump version to 2.2.1~rc0

Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Luca Bigliardi <shammash@google.com>
-
Guido Trotter authored
* commit 'v2.2.0.1':
  Release Ganeti 2.2.0.1

Conflicts:
  NEWS - merge
  configure.ac - keep 2.2.1~rc0 version

Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Luca Bigliardi <shammash@google.com>
-
Guido Trotter authored
2.2.0 was built with old autotools, and it's incompatible with Python 2.6. Rebuilding with a newer autotools version fixes this. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Luca Bigliardi <shammash@google.com>
-
Iustin Pop authored
Currently, the logging in QA doesn't show the duration of the various steps, and if it is needed one has to perform log manipulation. This patch changes the output so that the log information is line-based (as opposed to block-based), such that it's easy to grep for all log lines:

  ./qa/ganeti-qa.py --yes-do-it qa.json 2>&1|grep ^----
  ---- 2010-10-08 14:40:21.730382 start Test SSH connection --------------
  ---- 2010-10-08 14:40:23.156633 time=0:00:01.426251 Test SSH connection
  ---- 2010-10-08 14:40:23.156735 start ICMP ping each node --------------
  ---- 2010-10-08 14:40:24.230479 time=0:00:01.073744 ICMP ping each node
  ---- 2010-10-08 14:40:24.230583 start Test availibility of Ganeti commands
  ---- 2010-10-08 14:40:32.314586 time=0:00:08.084003 Test availibility of Ganeti commands
  ---- 2010-10-08 14:40:32.314734 start gnt-node info --------------------
  ---- 2010-10-08 14:40:32.860884 time=0:00:00.546150 gnt-node info ------

or just for the duration of the steps:

  ./qa/ganeti-qa.py --yes-do-it ../qa-mpgntac5.fra.json 2>&1|grep ^----.*time=
  ---- 2010-10-08 14:42:12.630067 time=0:00:01.239256 Test SSH connection
  ---- 2010-10-08 14:42:14.204393 time=0:00:01.574221 ICMP ping each node
  ---- 2010-10-08 14:42:22.170828 time=0:00:07.966331 Test availibility of Ganeti commands
  ---- 2010-10-08 14:42:22.701030 time=0:00:00.530037 gnt-node info ------

This will help with identifying slow steps or even graphing the QA duration. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
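
As an illustration of what the line-based format makes possible, here is a small sketch that extracts step durations from the `time=` lines shown above. The function name and regular expression are assumptions based only on the sample output in this commit message. They are not part of the QA scripts.

```python
import re
from datetime import timedelta

# Matches lines such as:
# ---- 2010-10-08 14:42:12.630067 time=0:00:01.239256 Test SSH connection
_TIME_RE = re.compile(
    r"^---- \S+ \S+ time=(\d+):(\d+):(\d+)(?:\.(\d+))? (.*?)\s*-*$")


def parse_step_durations(lines):
    """Return a list of (step name, timedelta) for every time= line."""
    result = []
    for line in lines:
        match = _TIME_RE.match(line.strip())
        if not match:
            continue  # "start" lines and unrelated output are skipped
        hours, minutes, seconds, fraction, name = match.groups()
        # Normalise the fractional part to exactly six digits (microseconds)
        micros = int((fraction or "0").ljust(6, "0")[:6])
        duration = timedelta(hours=int(hours), minutes=int(minutes),
                             seconds=int(seconds), microseconds=micros)
        result.append((name, duration))
    return result
```

Sorting the result by duration, or taking `max(..., key=lambda item: item[1])`, then points at the slowest step.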
-
- Oct 07, 2010
-
-
Michael Hanselmann authored
This allows the use of “gnt-job cancel” in scripts. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This simplifies the code a bit--the status is only checked once. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Also update NEWS. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
* devel-2.2: Try again to fix the inter-cluster move QA test Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This time, we re-establish the old pri/sec nodes correctly. Unfortunately, this will now require at least a 3-node cluster for DRBD instances, hence it's somewhat suboptimal, but… The other option would be to simply move it from p:s to s:p and then back to p:s, without involving a third node (for the DRBD case), but I think that moving it to a completely separate node is slightly better for testing. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
I've seen cases where the result from str(sys.exc_info()[1]) is ""; this breaks the error reporting, as the parent relies on non-empty error messages to properly detect child status (otherwise it will try to read the pid and fail, and so on). While this always happened in the case of asserts, we need to ensure it doesn't happen at all. Therefore we abstract this functionality (writing the error message) and ensure we write a non-empty string in the new function. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
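
A minimal sketch of the idea, assuming the message is written to a pipe file descriptor. The function name and the fallback text are hypothetical. Only the non-empty guarantee comes from the commit message.

```python
import os


def write_error_to_fd(fd, err):
    """Write an error message for `err` to `fd`, never an empty string."""
    msg = str(err)
    if not msg:
        # e.g. a bare "assert x" raises AssertionError with an empty message
        msg = "Unknown error (%s)" % type(err).__name__  # hypothetical text
    os.write(fd, msg.encode("utf-8"))
    return msg
```

The parent can then treat "no data on the pipe" as success and any data as a failure report.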
-
Iustin Pop authored
Daemon startup errors will often be related to socket errors, so it makes sense to change the original reporting:

  Error when starting daemon process: "(98, 'Address already in use')"

Into:

  Error when starting daemon process: 'Socket-related error: Address already in use (errno=98)'

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
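
The formatting change can be sketched as below. The function name is hypothetical and the actual implementation is not shown in this log. Note that on Python 3 `socket.error` is an alias of `OSError`, so this sketch labels any `OSError` carrying an errno as socket-related.

```python
import socket


def format_startup_error(err):
    """Return a readable message for an error raised during daemon startup."""
    # socket.error is an alias of OSError on Python 3 (assumption noted above)
    if isinstance(err, socket.error) and err.errno is not None:
        return "Socket-related error: %s (errno=%s)" % (err.strerror,
                                                        err.errno)
    return str(err)
```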
-
Iustin Pop authored
This makes almost all of the daemons show error messages, and not return until they finished listening on the appropriate sockets. Masterd is the only one "special", as it doesn't do enough initialization in the server creation, only later. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This patch copies the pipe-based error reporting functionality from utils.StartDaemon (I gave up for now on trying to merge the two).

This patch will fix two longstanding bugs:

- if we fork, we lose all error reporting from the child to the original parent
- if we fork, the original parent exits before the child is ready to "work" (whatever the work might be)

Both of these are fixed once the users of daemon.GenericMain are converted to the three-state setup, as we'll get error reporting via the pipe and also not exit until the PrepFn is done. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
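
A simplified, POSIX-only sketch of pipe-based error reporting between a forked child and its parent. The function name is hypothetical, and unlike a real daemon the child exits right after the preparation step. It only illustrates how the parent can block until the child is ready or has reported an error.

```python
import os


def run_with_error_pipe(prep_fn):
    """Fork, run prep_fn in the child, return its error message ("" if none)."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report failures through the pipe, then exit
        os.close(rfd)
        try:
            prep_fn()
        except Exception as err:
            msg = str(err) or repr(err)  # never write an empty message
            os.write(wfd, msg.encode("utf-8"))
            os._exit(1)
        # Closing without writing signals "ready"; a real daemon would
        # continue with its main loop here instead of exiting.
        os.close(wfd)
        os._exit(0)

    # Parent: block until the child closes its end of the pipe
    os.close(wfd)
    chunks = []
    while True:
        data = os.read(rfd, 4096)
        if not data:
            break
        chunks.append(data)
    os.close(rfd)
    os.waitpid(pid, 0)
    return b"".join(chunks).decode("utf-8")
```

Because the parent reads until end-of-file, it cannot exit before the child has either finished preparing or failed, which addresses both bugs listed above.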
-
Iustin Pop authored
Currently, GenericMain does a two-staged workflow:

- Check, before forking
- then Exec, after forking

This means we don't have any possibility to treat preparation work (before the daemon is ready for work) differently from the actual work. The patch adds another PreExec function that is run just before Exec, and which should ensure that the daemon is ready for serving clients before it returns. Its result is then sent as the third argument to Exec. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
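
The three-stage workflow can be sketched as follows. The function and parameter names are illustrative assumptions, and the fork itself is left out to keep the control flow visible.

```python
def generic_main_sketch(check_fn, prep_fn, exec_fn, options, args):
    """Run Check, then the preparation step, then Exec (hypothetical names)."""
    # Stage 1: validation, done before forking
    check_fn(options, args)

    # (the real daemon would fork/daemonize at this point)

    # Stage 2: preparation; must return only once the daemon can serve clients
    prep_result = prep_fn(options, args)

    # Stage 3: the actual work, receiving the preparation result as 3rd argument
    return exec_fn(options, args, prep_result)
```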
-
Iustin Pop authored
This patch merges the pid file handling used for ganeti-* daemons and impexp daemons. The latter version is used, since it's more reliable: it uses locked pid files as opposed to checking 'live' processes. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
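
A minimal, POSIX-only sketch of a locked pid file, assuming `fcntl.lockf` is used for the lock. The function name is hypothetical and this is not the Ganeti implementation.

```python
import fcntl
import os


def write_locked_pidfile(path):
    """Create `path`, lock it exclusively and write our pid into it.

    Returns the open file descriptor; the lock is held for as long as the
    descriptor stays open, so the caller must keep it around.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking: raises OSError if another process holds the lock
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    os.ftruncate(fd, 0)
    os.write(fd, ("%d\n" % os.getpid()).encode("ascii"))
    return fd
```

The advantage over checking for a live process is that the kernel releases the lock when the owner dies, so a stale file with a recycled pid cannot be mistaken for a running daemon.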
-
Iustin Pop authored
This makes some slight changes:

- Daemonize() doesn't explicitly close the file descriptors anymore, but only implicitly via the usage of dup2
- StartDaemonChild uses separate devnull for stdin (rdonly) and stdout/stderr (wronly), or if using a log file, it uses it in append mode

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This patch abstracts the chdir/umask/setsid functionality, which is identical in both functions, except that Daemonize did the chdir/umask in the second child; with this change it does it in the first, as StartDaemon does. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 06, 2010
-
-
Iustin Pop authored
* devel-2.2: QA: Fix instance move tests Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
The instance move tests were moving the instance from node pair (A,_) to (B, A), and left it there. This patch makes sure that the first step moves the instance to (B,A) but the second one back to (A,B), so that the instance is left on the same primary node. The original secondary node is lost though, if I read the code correctly. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 05, 2010
-
-
Michael Hanselmann authored
* devel-2.2:
  Add simple unittest for utils.CommaJoin
  LUDelTags: Improve formatting of error message
  LUGetTags: Acquire locks in shared mode
  gnt-cluster: Replace hardcoded “xenvg” with value retrieved from master
  Export VG name via LUQueryConfigValues
  RAPI QA: Override MAC address when moving instance
  move-instance: Allow overriding instance parameters
  cli: Move parsing of --net option to separate function
  kvm: collapse two consecutive extend calls
  kvm: Introduce support for -mem-path

Conflicts:
  test/ganeti.cli_unittest.py: Trivial

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Use utils.CommaJoin to add spaces after comma, clean up code a bit.

Before:
  Tag(s) 'bar','baz','foo','moo' not found

After:
  Tag(s) 'bar', 'baz', 'foo', 'moo' not found

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
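
The effect on the message can be reproduced with a small stand-in for utils.CommaJoin. The helper names below are hypothetical and the real function may differ in detail.

```python
def comma_join(names):
    """Join values with ", " (stand-in for utils.CommaJoin)."""
    return ", ".join(str(value) for value in names)


def format_missing_tags(tags):
    """Build the 'not found' message for a collection of tag names."""
    quoted = ["'%s'" % tag for tag in sorted(tags)]
    return "Tag(s) %s not found" % comma_join(quoted)
```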
-
Michael Hanselmann authored
Retrieving tags can be done while the lock is shared. Only writing needs to be exclusive. Also add a FIXME for cluster tags, where the code currently doesn't use any locks except the config lock. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This fixes issue 125 (http://code.google.com/p/ganeti/issues/detail?id=125 ) Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This will be used by LUXI client programs to display the VG name. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This will make this test work again. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
When moving a single instance within the same cluster, the NIC is not allowed to re-use an existing MAC address. To avoid this, NIC parameters must be overridden. BE, HV, OS and NIC parameters can be overridden after applying this patch. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This function will also be used in tools/move-instance. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
- Typos
- Fix capitalization
- Fix quoting in some places
- Rewrite part of privilege separation section to match with subsection titles

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
This patch enables all tests by default, unless they're explicitly disabled in the config file. This will make sure newly added tests are run even when an old configuration file is used. A comment is also added to qa-sample.json. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
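
The "enabled unless explicitly disabled" rule amounts to a lookup with a default of true. The sketch below assumes the configuration is a dictionary with a "tests" mapping. Both the key name and the function name are assumptions for illustration.

```python
def is_test_enabled(config, name):
    """Return True unless `name` is explicitly disabled in the config."""
    # Tests missing from an older configuration file default to enabled
    return bool(config.get("tests", {}).get(name, True))
```

For example, with `{"tests": {"instance-export": False}}` only `instance-export` is skipped; a test added later and absent from the file still runs.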
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Miguel Di Ciurcio Filho authored
Using hugepages, KVM instances can get a good performance boost. To activate that, we need to pass the -mem-path argument to KVM along with the mount point of the hugetlbfs file system on the node. For the sake of memory availability computation, we use the -mem-prealloc argument when enabling hugepages, so KVM will reserve all hugepages it needs when it starts. This avoids allocating an instance on a node that will not have enough pages in case another instance needs more than what is available after it boots. Signed-off-by:
Miguel Di Ciurcio Filho <miguel.filho@gmail.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
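
A minimal sketch of how the two arguments could be appended to a KVM command line. The function name, the parameter name and the example mount point are hypothetical. Only the `-mem-path` and `-mem-prealloc` flags come from the commit message.

```python
def add_hugepage_args(kvm_cmd, mem_path):
    """Append hugepage-related arguments to a KVM command list.

    `mem_path` is the hugetlbfs mount point on the node; an empty value
    leaves the command unchanged.
    """
    if mem_path:
        # -mem-prealloc makes KVM reserve all needed hugepages at startup
        kvm_cmd.extend(["-mem-path", mem_path, "-mem-prealloc"])
    return kvm_cmd
```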
-
Iustin Pop authored
* devel-2.2:
  Rename the _oss cluster vars to _os

Conflicts:
  lib/objects.py (trivial, strange that this one, and only this one, conflicted)

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Per the mailing list discussion, rename _oss to _os, both in cluster parameters and in the rest of the code. This is just an s/_oss/_os, with the exception of a small bit of cleanup around the helper_os function in cmdlib.py. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
* devel-2.2:
  gnt-job info: Sort input fields
  KVM: Add function to check the hypervisor version
  Bump version to 2.2.0, update NEWS
  Fix instance rename regression from 3fe11ba3
  Fix instance rename regression from 3fe11ba3
  Update RAPI documentation for /2/nodes/[node_name]/migrate
  Sort OS names and variants in LUDiagnoseOS
  Add some trivial QA tests for the new OS states
  Change behaviour of OpDiagnoseOS w.r.t. 'valid'
  Allow gnt-os modify to change the new OS params
  Add two more _T-type tests
  Add blacklisted/hidden OS support in LUDiagnoseOS
  Restrict blacklisted OSes in instance installation
  Add two new cluster settings
  Abstract OS name/variant functions
  Add OS new states to the design doc
  Remove the RPC changes from the 2.2 design
  Remove 'Detailed Design' from design-2.2.rst

Conflicts:
  lib/cli.py
  lib/cmdlib.py
  lib/objects.py
  scripts/gnt-os

All conflicts were trivial.

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 04, 2010
-
-
Michael Hanselmann authored
* stable-2.2:
  Bump version to 2.2.0, update NEWS
  Fix instance rename regression from 3fe11ba3

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
This helps to find a value for complex opcodes. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-