- Oct 18, 2011
-
-
Michael Hanselmann authored
* devel-2.4:
  Update NEWS for unreleased 2.4.5
Conflicts:
  NEWS: Trivial
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
I need this for another 2.5 release. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Oct 17, 2011
-
-
Michael Hanselmann authored
Commit d1c172de inadvertently changed the “/2/instances/[instance_name]/replace-disks” resource to use body parameters. There were no QA tests, so the issue went unnoticed. This patch re-introduces support for query parameters and adds a QA test. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-
- Oct 12, 2011
-
-
Michael Hanselmann authored
* devel-2.4:
  rpc: Disable HTTP client pool and reduce memory consumption
  Fix assertion error on unclean master shutdown
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
We noticed that “ganeti-masterd” can use large amounts of memory, especially on large clusters. Measurements showed a single PycURL client using about 500 kB of heap memory (the actual usage depends on versions, build options and settings). The RPC client uses a per-thread HTTP client pool with one client per node. At this time there are 41 non-main threads (25 for the job queue and 16 for client requests). This means the HTTP client pools use a lot of memory (ca. 200 MB for 10 nodes, ca. 1 GB for 50 nodes). This patch disables the per-thread HTTP client pool. No cleanup of unused code is done. That will be done in the master branch only. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
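The memory figures in the commit message above can be reproduced with a short calculation. This is only a back-of-the-envelope sketch: the constants are the approximate numbers quoted in the message, and the helper name `pool_memory_mb` is made up for illustration.

```python
# Approximate figures quoted in the commit message; real usage depends on
# PycURL versions, build options and settings.
BYTES_PER_CLIENT = 500 * 1000   # about 500 kB of heap per PycURL client
NON_MAIN_THREADS = 25 + 16      # job queue threads + client request threads


def pool_memory_mb(node_count):
    """Estimate the memory held by all per-thread HTTP client pools, in MB.

    Hypothetical helper: each non-main thread keeps one client per node.
    """
    clients = NON_MAIN_THREADS * node_count
    return clients * BYTES_PER_CLIENT / 1e6


for nodes in (10, 50):
    print("%d nodes: about %d MB" % (nodes, pool_memory_mb(nodes)))
```

With 41 threads this gives roughly 205 MB for 10 nodes and 1025 MB for 50 nodes, in line with the "ca. 200 MB" and "ca. 1 GB" stated in the message.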
-
- Oct 07, 2011
-
-
Michael Hanselmann authored
According to the iallocator documentation the “node-evacuate” call needs to return a list of jobs, not a list of lists of jobs. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 04, 2011
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
If a cluster has any non-master-candidate nodes, those don't contain all files (e.g. config.data). With commit aef59ae7 (March 31st, 2011) the logic was changed and subsequently verifying a cluster with non-mc nodes would complain. This patch fixes this issue by changing the algorithm. It also adds an additional check for files which shouldn't exist on a machine. A newly added unittest is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 03, 2011
-
-
Michael Hanselmann authored
This reverts commit 34aa8b7c. Writing error messages to stderr would also include backtraces, something we tried to avoid in the past. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Commit 64c7b383 changed the RPC call for verifying SSH connections. Unfortunately this case in adding nodes was missed. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Sep 30, 2011
-
-
Michael Hanselmann authored
When verifying a group the code would always check SSH to all nodes in the same group, as well as the first node for every other group. On big clusters this can cause issues since many nodes will try to connect to the first node of another group at the same time. This patch changes the algorithm to choose a different node every time. A unittest for the selection algorithm is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
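One way to spread the cross-group SSH checks is to rotate through the nodes of each foreign group. The following is a minimal sketch of that idea, not the code from the patch: the function name, argument shapes and the use of `itertools.cycle` are assumptions made for illustration, and the checks within the node's own group are left out.

```python
import itertools


def select_ssh_check_nodes(group_nodes, other_groups):
    """Map each node of the verified group to foreign nodes it should contact.

    group_nodes: names of the nodes in the group being verified.
    other_groups: dict mapping a group name to the list of its node names.
    (Hypothetical interface, for illustration only.)
    """
    # One endless iterator per foreign group, so consecutive nodes are handed
    # consecutive targets instead of all contacting the first node.
    cyclers = [itertools.cycle(sorted(nodes))
               for _, nodes in sorted(other_groups.items()) if nodes]
    return {name: [next(c) for c in cyclers] for name in sorted(group_nodes)}


targets = select_ssh_check_nodes(
    ["a1", "a2", "a3"], {"B": ["b1", "b2"], "C": ["c1"]})
for name in sorted(targets):
    print(name, "->", targets[name])
```

In this sketch the load on group B alternates between its two nodes, while a single-node group such as C is necessarily contacted by everyone.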
-
Iustin Pop authored
When many pending jobs (> 100) are submitted to masterd, the JobExecutor 'spams' the master daemon with status requests for all the jobs, even though in the end it will only choose a single job for polling. This is very sub-optimal, because when the master is busy processing small/fast jobs, this query forces reading all the jobs from disk.

Restricting the 'window' of jobs that we query from the entire set to a smaller subset makes a huge difference (masterd only, 0s delay jobs, all jobs on tmpfs, thus no I/O involved):

- submitting/waiting for 500 jobs:
  - before: ~21 s
  - after: ~5 s
- submitting/waiting for 1K jobs:
  - before: ~76 s
  - after: ~8 s

This is with a batch of 25 jobs. With a batch of 50 jobs, it goes from 8s to 12s.

I think that choosing the 'best' job for nice output only matters with a small number of jobs, and that for more than that people will not actually watch the jobs. So changing from 'perfect job' to 'best job in the first 25' should be OK.

Note that most jobs won't execute as fast as 0 delay, but this is still a good improvement. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
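The windowing idea described above can be sketched as follows. This is a simplified illustration, not the JobExecutor code: `query_status` stands in for the status request to the master daemon, and the status names and preference order are assumptions.

```python
def pick_job_to_poll(pending_jobs, query_status, window=25):
    """Choose one job to poll, looking only at the first `window` jobs.

    pending_jobs: job IDs in submission order.
    query_status: callable taking a list of job IDs and returning their
        statuses in the same order (hypothetical stand-in for the query
        sent to the master daemon).
    """
    candidates = pending_jobs[:window]
    statuses = query_status(candidates)
    # Assumed preference: a running job first, then a waiting one.
    for wanted in ("running", "waiting"):
        for job_id, status in zip(candidates, statuses):
            if status == wanted:
                return job_id
    # Otherwise fall back to the oldest pending job, if there is one.
    return candidates[0] if candidates else None
```

However many jobs are pending, each call asks for at most `window` statuses, which is where the saving reported in the commit message comes from.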
-
Michael Hanselmann authored
If no arguments were specified the “exec_args” variable was “None”, leading to the command being run as “… ./… None”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
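The failure mode described above is easy to reproduce whenever an optional value is formatted into a command string without a check. The sketch below is a generic illustration with made-up function names and a made-up script name; it is not the code touched by the patch.

```python
def build_command_buggy(executable, exec_args=None):
    """Format the command without checking for a missing argument."""
    # With exec_args=None, "%s" renders the literal text "None".
    return "%s %s" % (executable, exec_args)


def build_command(executable, exec_args=None):
    """Return the command as a list, leaving out missing arguments."""
    cmd = [executable]
    if exec_args:  # skips None and empty strings
        cmd.append(exec_args)
    return cmd


print(build_command_buggy("./script"))        # "./script None"
print(build_command("./script"))              # ['./script']
print(build_command("./script", "--check"))   # ['./script', '--check']
```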
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
When “gnt-cluster copyfile” failed it would only print “Copy of file … to node … failed”. A detailed message is written using logging.error. Writing error messages to stderr can be helpful in figuring out what went wrong (the messages also go to the log file, but not everyone might know about it). Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Also remove a bug note, since hbal has long been able to execute jobs directly. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 28, 2011
-
-
Iustin Pop authored
The change to enforce boolean results for cluster verify group opcode missed the HooksCallBack, which uses a very ugly 1/0 logic. Furthermore, the logic is wrong, since it unconditionally resets the verify result to true. The patch is changed to simply treat hook failures as failures, and do nothing for offline/nodes. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This reverts to the old behaviour in Ganeti 2.4 and before. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 22, 2011
-
-
Michael Hanselmann authored
This would have detected the issue fixed in the previous patch. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Commit 7fa310f6 (April 1st, 2011) converted the RAPI resource for shutting down an instance to FillOpCode. Unfortunately it missed the fact that the shutdown resource gets its parameters as query arguments. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> (cherry picked from commit c6e1a3ee) Signed-off-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 06, 2011
-
-
Michael Hanselmann authored
Commit 66bd7445 added an assertion to ensure a finalized job has its “end_timestamp” attribute set. Unfortunately it didn't cover a case when the queue is recovering from an unclean master shutdown. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> (cherry picked from commit 45df0793)
-
- Aug 31, 2011
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 30, 2011
-
-
Michael Hanselmann authored
Some platforms apparently don't support “ln -s”, otherwise Autoconf wouldn't have AC_PROG_LN_S. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Andrea Spadaccini authored
Running pylint 0.24.0 revealed 2 errors and 1 warning. Here is how I fixed them:
* jqueue.py: silenced E1101
* netutils.py: rewrote the list comprehension using extend()
* watcher/__init__.py: fixed a missing format string parameter
These changes are backwards-compatible with pylint 0.21.1.
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Andrea Spadaccini authored
- Makefile.am: added QA directory to the paths checked by pep8
- qa/: fixed the reported errors
- Makefile.am: also, added qa_group.py to qa_scripts
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
This wasn't possible until now. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Andrea Spadaccini authored
In version 0.21, pylint unified all the disable-* (and enable-*) directives to disable (resp. enable). This leads to many DeprecationWarning messages being emitted even if one uses the recommended version of pylint (0.21.1, as stated in devnotes.rst). This commit changes all the disable-msg directives to disable. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 29, 2011
-
-
Michael Hanselmann authored
- str.split("/").pop() should be os.path.basename
- str.split("\n") should be str.splitlines()
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
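The two replacements listed in the commit message above behave as follows on simple inputs. The path and text values are illustrative only.

```python
import os.path

# Illustrative path; both forms return the last path component on POSIX.
path = "/tmp/example/config.data"
manual_name = path.split("/").pop()
base_name = os.path.basename(path)
print(manual_name, base_name)

# split("\n") keeps a trailing empty string and any "\r" characters,
# while splitlines() handles both cases.
text = "first\r\nsecond\n"
manual_lines = text.split("\n")
clean_lines = text.splitlines()
print(manual_lines)
print(clean_lines)
```

For the path the result is the same, but `os.path.basename` states the intent directly. For the text the results differ: `split("\n")` leaves a stray carriage return and a trailing empty element that `splitlines()` avoids.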
-
Tsachy Shacham authored
Signed-off-by:
Tsachy Shacham <tsachy@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Aug 26, 2011
-
-
René Nussbaumer authored
Conflicts:
  NEWS (trivial)
  configure.ac (trivial)
  daemons/ensure-dirs.in (deleted)
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
If a value passed to UnescapeAndSplit ended with a backslash, an exception would be raised:

  $ gnt-instance modify -H mem=x\\ inst1.example.com
  […]
      e2 = slist.pop(0)
  IndexError: pop from empty list

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
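The trailing-backslash case can be illustrated with a small split-and-unescape routine. This is a sketch written for this note under stated assumptions, not the UnescapeAndSplit implementation from Ganeti: the function name, the default separator and the unescaping rule are all chosen for illustration.

```python
import re


def unescape_and_split(text, sep=","):
    """Split `text` on `sep`, treating a backslash-escaped `sep` as literal.

    Hypothetical re-implementation for illustration only.
    """
    parts = text.split(sep)
    result = []
    while parts:
        item = parts.pop(0)
        trailing = len(item) - len(item.rstrip("\\"))
        # An odd number of trailing backslashes means the separator that
        # followed was escaped. Only join when another piece exists; a
        # backslash at the very end of the input has nothing to join with.
        while trailing % 2 == 1 and parts:
            item = item + sep + parts.pop(0)
            trailing = len(item) - len(item.rstrip("\\"))
        result.append(item)
    # Drop the escaping backslash in front of any character.
    return [re.sub(r"\\(.)", r"\1", item) for item in result]


print(unescape_and_split("a\\,b,c"))
print(unescape_and_split("mem=x\\"))
```

The `and parts` guard is the point of the sketch: without it, popping the next piece from an empty list raises the same kind of IndexError shown in the commit message.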
-
Andrea Spadaccini authored
Added a step in cluster-merge that removes the cluster IP from the master node of the mergee clusters. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
This utility checks whether the code conforms to PEP8. Some checks had to be disabled for Ganeti. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
cmdlib: Avoid wrapping using backslash
gnt_group: Avoid ** magic using keyword arguments (the “pep8” tool doesn't like the inline comment in this case and will complain about spaces around the “**” operator)
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 25, 2011
-
-
Michael Hanselmann authored
Until now it would only say that there was a line longer than 80 characters, but not where. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Identified using the “pep8” utility. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 24, 2011
-
-
Guido Trotter authored
Had to break it as well, today! ;) Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
Handle exceptions gracefully when trying to read the command's output. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 23, 2011
-
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Andrea Spadaccini <spadaccio@google.com>
-