  1. Feb 18, 2011
  2. Jan 28, 2011
  3. Jan 12, 2011
    • Run pylint over QA code too · 3582eef6
      Iustin Pop authored
      
      Right now, the QA code is not covered by pylint, and this shows at
      least one low-impact bug.
      
      This patch makes the necessary changes to make the QA code
      pylint-clean, and changes the makefile to run pylint over it.
      
      Notable changes:
      
      - qa_utils.GenericQueryTest: randfields was not used at all, and my
        belief is that it was intended to be used in order not to modify the
        input list; so I replaced randfields with fields, so we only shuffle
        our local copy
      - qa_node.TestOutOfBand was using its own copy of AcquireNode(), so I
        replaced it with the existing version
      - qa_os: was using 'dir' in a couple of places, replaced with dirname
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
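
      The GenericQueryTest change above boils down to shuffling a private
      copy of the field list instead of the caller's list. A minimal sketch
      of that pattern (the function body and surrounding details are
      invented for illustration, not the actual qa_utils code):

      import random

      def GenericQueryTest(cmd, fields):
        """Illustrative sketch: run a query command with fields in random order."""
        # Shuffle a local copy so the caller's list is left untouched,
        # which is the point of the randfields/fields fix described above.
        randfields = list(fields)
        random.shuffle(randfields)
        # ... running ``cmd`` with the shuffled field list is omitted here ...
        return randfields
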
    • QA: use a persistent SSH connection to the master · f7e6f3c8
      Iustin Pop authored
      
      The recent additions to QA (many more tests) make QA slow if the
      machine on which the QA runs is not very close to the tested nodes,
      or in general when the SSH handshake is costly.
      
      We discussed using a persistent connection before, and here is the
      patch that implements it. On a very small QA run (very, very small),
      it cuts the total time down a lot (almost half), so it should be
      useful even for a full QA.
      
      I've also thought about changing from external ssh to paramiko, but I
      estimated that it would be more work to correctly interleave the IO
      from the remote process than just running a background SSH.
      
      Also note that yes, the global dict is ugly, but I don't know of
      another simple way to implement this.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
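
      The commit text doesn't show how the persistent connection is kept
      alive; one common way to get this effect is OpenSSH connection
      multiplexing (ControlMaster/ControlPath). The sketch below is only an
      assumption of how such a helper could look; the helper names, the
      socket path and the global dict layout are invented, not the actual
      QA code:

      import subprocess

      _MULTIPLEXERS = {}  # node name -> (control socket path, master process)

      def StartSSHMaster(node):
        """Start a background master connection to ``node`` (illustrative only)."""
        sockpath = "/tmp/qa-ssh-%s.sock" % node  # hypothetical socket location
        proc = subprocess.Popen(["ssh", "-oControlMaster=yes",
                                 "-oControlPath=%s" % sockpath,
                                 "-oBatchMode=yes", "-l", "root", "-N", node])
        _MULTIPLEXERS[node] = (sockpath, proc)

      def RunOnNode(node, cmd):
        """Run ``cmd`` on ``node``, reusing the master connection and thus
        avoiding a new SSH handshake for every command."""
        sockpath, _ = _MULTIPLEXERS[node]
        return subprocess.call(["ssh", "-oControlPath=%s" % sockpath,
                                "-l", "root", node, cmd])
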
    • QA: Fix duplicated OOB tests · 69df9d2b
      Iustin Pop authored
      
      Patch f55312bd added the OOB tests to TestClusterVerify, which is not
      actually a test for cluster verify but a runner for cluster verify
      that is called multiple times, for each instance type, etc. This led
      to running the OOB commands multiple times, which is painful,
      especially as this is a slow test.
      
      The patch moves this to a separate test that is run only once.
      
      Furthermore, the way the data files are copied around is very
      inefficient: touch + mv + chmod + mv + rm for each node (5 calls
      times the number of nodes), whereas it could simply be: touch on
      master, chmod on master, cluster copyfile, chmod on master, cluster
      copyfile, cluster command rm, i.e. a small fixed number of ssh calls
      to the master. The code is changed accordingly, for increased speed.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
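
      As a rough illustration of the fixed-cost approach described above
      (all work driven through the master, with the per-node fan-out left
      to Ganeti's own gnt-cluster copyfile and gnt-cluster command), using
      an invented helper rather than the actual QA code:

      import subprocess

      def _OnMaster(master, cmd):
        """Run a single shell command on the master node over ssh (illustrative)."""
        subprocess.check_call(["ssh", "-l", "root", master, cmd])

      def DistributeDataFile(master, path):
        # Create and restrict the file on the master, then let Ganeti copy
        # it to all nodes in one cluster-wide operation.
        _OnMaster(master, "touch %s" % path)
        _OnMaster(master, "chmod 0600 %s" % path)
        _OnMaster(master, "gnt-cluster copyfile %s" % path)

      def RemoveDataFile(master, path):
        # One cluster-wide command instead of one ssh call per node.
        _OnMaster(master, "gnt-cluster command rm -f %s" % path)
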
  4. Jan 10, 2011
  5. Jan 06, 2011
  6. Dec 20, 2010
  7. Dec 17, 2010
  8. Dec 16, 2010
  9. Dec 14, 2010
  10. Dec 13, 2010
  11. Dec 10, 2010
  12. Dec 09, 2010
  13. Dec 08, 2010
  14. Dec 01, 2010
  15. Nov 30, 2010
    • Further cleanups on QA · 7d88f255
      Iustin Pop authored
      
      This is more of an RFC. The patch attempts to address two issues:
      
      - running conditional tests is ugly right now
      - we don't know what tests we skipped
      
      By using the new RunTestIf, we solve both. But a significant number
      of test decisions are more complex than just “is the test enabled”,
      so those still have to be run via RunTest, which means we don't get
      logging when they're not run. Hence the logging is not complete…
      Suggestions on how to solve this are welcome.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
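
      A minimal sketch of what a RunTestIf-style wrapper can look like; the
      config dict, the TestEnabled lookup and the log format here are
      assumptions for illustration, not the actual qa code:

      import datetime

      _CONFIG = {"tests": {"default": True, "instance-reboot": False}}  # hypothetical

      def TestEnabled(name):
        """Illustrative lookup of whether a named test is enabled in the QA config."""
        return _CONFIG["tests"].get(name, _CONFIG["tests"]["default"])

      def RunTest(fn, *args):
        print("%s start %s" % (datetime.datetime.now(), fn.__name__))
        fn(*args)

      def RunTestIf(testname, fn, *args):
        """Run ``fn`` only when ``testname`` is enabled; log the skip otherwise."""
        if TestEnabled(testname):
          RunTest(fn, *args)
        else:
          print("%s skipping %s (%s not enabled)"
                % (datetime.datetime.now(), fn.__name__, testname))
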
  16. Nov 17, 2010
  17. Nov 03, 2010
  18. Oct 28, 2010
  19. Oct 25, 2010
  20. Oct 20, 2010
  21. Oct 14, 2010
    • Brown-bag fix for leftover comment · 76917d97
      Iustin Pop authored
      
      I did forget this in the original patch. Sorry!
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
    • Rework QA interaction with the watcher · 8201b996
      Iustin Pop authored
      
      The interaction with the cron-launched watcher is a well-known failure mode of QA:
      
      ---- 2010-10-14 06:54:55.464839 time=0:00:56.764827 Test tools/move-instance
      
      For the following tests it's recommended to turn off the ganeti-watcher cronjob.
      
      ---- 2010-10-14 06:54:55.465255 start Test automatic restart of instance by ganeti-watcher
      …
      Error: Domain 'instance1' does not exist.
      Command: ssh -oEscapeChar=none -oBatchMode=yes -l root -t -oStrictHostKeyChecking=yes
        -oClearAllForwardings=yes -oForwardAgent=yes node2 'ganeti-watcher -d'
      2010-10-13 23:55:04,479:  pid=1659 ganeti-watcher:626
       ERROR Can't acquire lock on state file /var/lib/ganeti/watcher.data: File already locked
      ---- 2010-10-14 06:55:04.513948 time=0:00:09.048693 Test automatic restart of instance by ganeti-watcher
      
      In order to fix this, we disable the watcher during these tests and
      re-enable it afterwards. To protect against the watcher already being
      disabled, we enable it unconditionally at the start of the QA (we do
      want it enabled, in order to see the interaction between the watcher
      and the many creation/disk-replace jobs, etc.).
      
      Note: even after this patch, if a cron-launched watcher was started
      earlier and is still running during the test, we'll have locking
      issues. I think this is OK for now; we'll have to see how often that
      happens.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
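
      The log above doesn't show the mechanism used to disable and
      re-enable the watcher. One possible sketch, assuming the gnt-cluster
      watcher pause/continue subcommands are available and using an
      invented ssh helper (not the actual patch):

      import subprocess
      from contextlib import contextmanager

      def _MasterCommand(master, cmd):
        """Run a command on the cluster master via ssh (illustrative helper)."""
        subprocess.check_call(["ssh", "-l", "root", master, cmd])

      @contextmanager
      def PausedWatcher(master, duration="4h"):
        """Keep ganeti-watcher out of the way for the tests run inside the block."""
        _MasterCommand(master, "gnt-cluster watcher pause %s" % duration)
        try:
          yield
        finally:
          # Always re-enable: the rest of the QA wants the watcher running.
          _MasterCommand(master, "gnt-cluster watcher continue")
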
  22. Oct 08, 2010
    • Change QA log output · f89d59b9
      Iustin Pop authored
      
      Currently, the logging in QA doesn't show the duration of the various
      steps, and if that is needed one has to post-process the logs. This
      patch changes the output so that the log information is line-based
      (as opposed to block-based), such that it's easy to grep for all log
      lines:
      
      ./qa/ganeti-qa.py --yes-do-it qa.json  2>&1|grep ^----
      ---- 2010-10-08 14:40:21.730382 start Test SSH connection --------------
      ---- 2010-10-08 14:40:23.156633 time=0:00:01.426251 Test SSH connection
      ---- 2010-10-08 14:40:23.156735 start ICMP ping each node --------------
      ---- 2010-10-08 14:40:24.230479 time=0:00:01.073744 ICMP ping each node
      ---- 2010-10-08 14:40:24.230583 start Test availibility of Ganeti commands
      ---- 2010-10-08 14:40:32.314586 time=0:00:08.084003 Test availibility of Ganeti commands
      ---- 2010-10-08 14:40:32.314734 start gnt-node info --------------------
      ---- 2010-10-08 14:40:32.860884 time=0:00:00.546150 gnt-node info ------
      
      or just for the duration of the steps:
      ./qa/ganeti-qa.py --yes-do-it ../qa-mpgntac5.fra.json  2>&1|grep ^----.*time=
      ---- 2010-10-08 14:42:12.630067 time=0:00:01.239256 Test SSH connection
      ---- 2010-10-08 14:42:14.204393 time=0:00:01.574221 ICMP ping each node
      ---- 2010-10-08 14:42:22.170828 time=0:00:07.966331 Test availibility of Ganeti commands
      ---- 2010-10-08 14:42:22.701030 time=0:00:00.530037 gnt-node info ------
      
      This will help with identifying slow steps or even graphing the QA
      duration.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
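
      A small sketch of how the line-based timing output shown above can be
      produced; the helper names and the padding width are illustrative
      only:

      import datetime

      def _Banner(text):
        """Print one greppable '---- ' line, padded like the sample output above."""
        line = "---- %s %s " % (datetime.datetime.now(), text)
        print(line.ljust(72, "-"))

      def RunWithLog(name, fn, *args):
        """Run one QA step, logging its start time and its duration."""
        start = datetime.datetime.now()
        _Banner("start %s" % name)
        try:
          return fn(*args)
        finally:
          _Banner("time=%s %s" % (datetime.datetime.now() - start, name))
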
  23. Oct 07, 2010
    • Try again to fix the inter-cluster move QA test · 638a7266
      Iustin Pop authored
      
      This time, we re-establish the old primary/secondary nodes correctly.
      Unfortunately this now requires at least a 3-node cluster for DRBD
      instances, hence it's somewhat suboptimal, but… The other option
      would be to move the instance simply from p:s to s:p and then back to
      p:s, without involving a third node (for the DRBD case), but I think
      that moving it to a completely separate node is slightly better for
      testing.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  24. Oct 06, 2010
    • QA: Fix instance move tests · 677e16eb
      Iustin Pop authored
      
      The instance move tests were moving the instance from node pair (A,_)
      to (B,A) and leaving it there. This patch makes sure that the first
      step moves the instance to (B,A) but the second one moves it back to
      (A,B), so that the instance ends up on the same primary node.
      
      The original secondary node is lost though, if I read the code
      correctly.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  25. Sep 30, 2010
  26. Aug 19, 2010
  27. Aug 18, 2010
  28. Aug 10, 2010
  29. Jul 29, 2010
  30. Jul 26, 2010
  31. Jul 01, 2010
    • RAPI client: Switch to pycURL · 2a7c3583
      Michael Hanselmann authored
      
      Currently the RAPI client uses the urllib2 and httplib modules from
      Python's standard library. They're used with pyOpenSSL in a very fragile
      way, and there are known issues when receiving large responses from a RAPI
      server.
      
      By switching to PycURL we leverage the power and stability of the
      widely-used curl library (libcurl). This brings us much more flexibility
      than before, and timeouts were easily implemented (something that would
      have involved a lot of work with the built-in modules).
      
      There's one small drawback: Programs using libcurl have to call
      curl_global_init(3) (available as pycurl.global_init) while exactly one
      thread is running (e.g. before other threads) and are supposed to call
      curl_global_cleanup(3) (available as pycurl.global_cleanup) upon exiting.
      See the manpages for details. A decorator is provided to simplify this.
      
      Unittests for the new code are provided, increasing the test coverage of
      the RAPI client from 74% to 89%.
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
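
      The decorator mentioned above could look roughly like the sketch
      below; the decorator name and its placement are assumptions, but
      pycurl.global_init and pycurl.global_cleanup are the real bindings
      for curl_global_init(3) and curl_global_cleanup(3):

      import functools
      import pycurl

      def UsesCurlGlobalState(fn):
        """Initialise libcurl before any threads start and clean up on exit."""
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
          pycurl.global_init(pycurl.GLOBAL_ALL)
          try:
            return fn(*args, **kwargs)
          finally:
            pycurl.global_cleanup()
        return wrapper

      # Typical usage: decorate the program's entry point so both calls
      # happen while only the main thread is running.
      # @UsesCurlGlobalState
      # def Main():
      #   ...
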
    • qa: shutdown instance before trying disk convert · f9f0ce7f
      Guido Trotter authored
      
      Because we have to. :)
      
      Signed-off-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>