Commits · bc25ebb4e3f3dd4241dae956c47b43ed0f398ab2 · itminedu / snf-ganeti

Oct 16, 2012

Fixes to pass unittests (make check) · da1168c5

Dimitris Aragiorgis authored 12 years ago


Conflicts:

	doc/rapi.rst
	lib/ovf.py

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>

da1168c5

Aug 10, 2012

Add test for checking that all gnt-* subcommands run OK · b2631ce4

Iustin Pop authored 12 years ago


This is a bit of shell-munging trickery, but works for now. Making
it more generic can be done later.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

b2631ce4

Jun 28, 2012

Add a shell test for hbal and split instances · a95aa74c

Iustin Pop authored 12 years ago


This is not perfect, as we only test that hbal completes successfully
and that it shows a score improvement, but it's better than nothing.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

a95aa74c

Add newline at the end of shelltest files · 5b93f2ec

Agata Murawska authored 12 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

5b93f2ec

Add forgotten unittest changes for instance_os_add · 19fe9138

René Nussbaumer authored 12 years ago


The previous patch which fixed disk parameters didn't adapt the
unittests, so it led to failing QA.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

19fe9138

Jun 27, 2012

Style fixes in shelltests · 47ed1d79

Agata Murawska authored 12 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

47ed1d79

Jun 25, 2012

Shelltestrunner tests for hcheck · 165b385b

Agata Murawska authored 12 years ago


Simple tests for hcheck using shelltestrunner. Among other things, we
check that we can run hcheck on a multi-group cluster.

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

165b385b

Jun 20, 2012

Update the hooks documentation · 2fd213a6

René Nussbaumer authored 12 years ago


Also provide some extended unittests to catch those cases.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

2fd213a6

Jun 15, 2012

Verify user supplied dicts against defaults · 32be86da

René Nussbaumer authored 12 years ago


This verifies the user (especially in nested dicts) does not
provide a key which is not seen in the defaults dict for that dict.
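
A minimal sketch of the kind of check described above, not the code from this commit: the function name `find_unknown_keys` and the sample parameter names are hypothetical. It walks the user-supplied dict and reports every key, at any nesting level, that has no counterpart in the defaults.

```python
def find_unknown_keys(user, defaults, path=""):
    """Return dotted paths of keys in `user` that are absent from `defaults`."""
    unknown = []
    for key, value in user.items():
        full = f"{path}{key}"
        if key not in defaults:
            unknown.append(full)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse so that typos inside nested dicts are caught as well.
            unknown.extend(find_unknown_keys(value, defaults[key], full + "."))
    return unknown


# Hypothetical sample data; "bariers" is a deliberate typo.
defaults = {"drbd": {"resync-rate": 1024, "barriers": "n"}, "plain": {"stripes": 1}}
user = {"drbd": {"resync-rate": 2048, "bariers": "bf"}, "file": {}}

unknown = find_unknown_keys(user, defaults)
print(unknown)
```

Reporting the full dotted path rather than the bare key makes it clear which nested dict contains the offending entry.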

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

32be86da

Fix cfgupgrade unittests · a19d8cd5

Iustin Pop authored 12 years ago


Sorry, I broke the cfgupgrade unittests via 904910c4, since that
commit added the requirement for the "instances" dict in the
configuration.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

a19d8cd5

Jun 14, 2012

query2: Add <, >, <=, >= comparison operators · ad48eacc

Michael Hanselmann authored 12 years ago


These can be used, for example, to get jobs submitted after a certain
timestamp.
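
As an illustration only, the sketch below evaluates a single comparison filter against a list of job records. The list form `[operator, field, value]`, the field name `received_ts` and the helper `matches` are assumptions made for this example and are not taken from the query2 code.

```python
import operator

# Map the four comparison operators named in the commit title to functions.
_OPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge}


def matches(item, flt):
    """Evaluate a single [op, field, value] comparison against one record."""
    op, field, value = flt
    return _OPS[op](item[field], value)


# Hypothetical job records with a numeric submission timestamp.
jobs = [
    {"id": 1, "received_ts": 100},
    {"id": 2, "received_ts": 200},
    {"id": 3, "received_ts": 300},
]

# "Jobs submitted after a certain timestamp".
recent = [job["id"] for job in jobs if matches(job, [">", "received_ts", 150])]
print(recent)
```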

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ad48eacc

Jun 11, 2012

Fix race condition in test for *FileID functions · deb717a0

Michael Hanselmann authored 13 years ago


In this test the “file ID” of a temporary file is compared against the
file ID gathered via an open file descriptor to the same file. For
reasons unknown to me utime(2) is called in-between to update the
inode's a- and mtime. Depending on the file system's timestamp
resolution this can lead to a different file ID.

Found by chance during QA and reproduced by adding a delay before the
call to utime(2).
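
The effect can be reproduced with a small sketch. It assumes a file ID is the tuple (device, inode, mtime); the message only confirms that the modification time takes part in the comparison, and the helper `file_id` is hypothetical.

```python
import os
import tempfile


def file_id(path=None, fd=None):
    """Return (device, inode, mtime) for a path or an open file descriptor."""
    st = os.stat(path) if fd is None else os.fstat(fd)
    return (st.st_dev, st.st_ino, st.st_mtime)


with tempfile.NamedTemporaryFile() as tmp:
    by_path = file_id(path=tmp.name)
    by_fd = file_id(fd=tmp.fileno())
    # With no utime(2) call in between, both views of the file agree.
    same_without_utime = by_path == by_fd

    # Changing the timestamps changes the ID, although the file is the same.
    os.utime(tmp.name, (0, 0))
    changed_after_utime = file_id(path=tmp.name) != by_path

print(same_without_utime, changed_after_utime)
```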

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit fbd55434)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

deb717a0

May 23, 2012

Adjust cfgupgrade for new minor version · 93fd9bb1

Iustin Pop authored 12 years ago


Also does some abstracting of the versions.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

93fd9bb1

May 15, 2012

Beautify disk ipolicy violations in cluster-verify · 0c2e59ac

Iustin Pop authored 12 years ago


Currently, we only get:

  instance3: ['disk-size value 512 is not in range [1024, 1048576]'

which doesn't explain which disk we are talking about. This patch
extends the verification functions to take an additional parameter
that qualifies the disk:

  instance3: ['disk-size/0 value 512 is not in range [1024, 1048576]'

A future patch will make the formatting of the list better.
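
A rough sketch of a range check that takes such a qualifier; the function name `check_range` and its signature are hypothetical and only mirror the message format quoted above.

```python
def check_range(name, qualifier, value, lo, hi):
    """Return an error string if `value` is outside [lo, hi], else None."""
    if lo <= value <= hi:
        return None
    # The qualifier (here: the disk index) tells the reader which item failed.
    label = name if qualifier is None else f"{name}/{qualifier}"
    return f"{label} value {value} is not in range [{lo}, {hi}]"


disk_sizes = [2048, 512]  # hypothetical instance with two disks
errors = []
for index, size in enumerate(disk_sizes):
    err = check_range("disk-size", index, size, 1024, 1048576)
    if err is not None:
        errors.append(err)

print(errors)
```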

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

0c2e59ac

May 14, 2012

ganeti.query_unittest: Adding testcase for diskparams · d3b51156

René Nussbaumer authored 12 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

d3b51156

Make x509 unittest testClockSkew a bit less flaky · 302424e7

Iustin Pop authored 12 years ago


Since the tested function actually uses time.time(), it cannot be made
fully stable, but 1 second is very dangerous; let's just test SKEW * 2
and higher since that should be good (if the delta between _GenCert
and VerifyX509Certificate is 5 minutes, then…).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

302424e7

May 11, 2012

query: Expose diskparameters through query · 2c758845

René Nussbaumer authored 12 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

2c758845

Workaround changed LVM behaviour · 4c5dd3ff

Iustin Pop authored 13 years ago


The vgreduce command has changed behaviour from when we initially
wrote the code (2.02.02 versus 2.02.66, 4 years delta):

- if there are LVs which will be impacted, it requires --force
- otherwise refuses to proceed, but it still returns exit code 0

We handle this by looking to see if it returns "Wrote out consistent
volume group" (behaviour unchanged), or if it complains about
"--force"; in the case it didn't complete, we retry the operation.

We also slightly improve the checking of "vgs", as it used to fail
silently and we didn't detect it.

New tests for this function should test, I believe, all the expected
variations; at the least we now have data files with the expected
output.
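
The decision described above can be sketched as a small classifier. Since the exit code is 0 in both cases, only the command output is inspected. The function name, the return labels and the sample strings are illustrative; only the two quoted fragments come from the message.

```python
def classify_vgreduce(output):
    """Decide what to do next based on the text printed by vgreduce."""
    if "Wrote out consistent volume group" in output:
        return "done"
    if "--force" in output:
        # The command refused to proceed; the caller should retry.
        return "retry"
    return "unknown"


# Illustrative strings, not captured from a real LVM run.
ok_output = "... Wrote out consistent volume group ..."
refused_output = "... use --force ..."

print(classify_vgreduce(ok_output), classify_vgreduce(refused_output))
```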

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit 048eeb2b)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

4c5dd3ff

May 10, 2012

cmdlib: Remove all diskparams calculations not required anymore · 99ccf8b9

René Nussbaumer authored 12 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

99ccf8b9

May 09, 2012

Fix exception re-raising in Python Luxi clients · 98dfcaff

Iustin Pop authored 12 years ago


Commit e687ec01 (present in 2.5 since the 2.5 beta 3) did consistency
fixes across the code-base. Unfortunately this was done without enough
checks on the actual meaning of one of the fixes, which means error
re-raising in lib/errors.py is broken.

The problem is that:

  raise cls, args

is different than:

  raise cls(args)

And our unit-tests didn't catch this (this patch updates the tests).
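
The difference can be shown in isolation. In Python 2, `raise cls, args` with a tuple unpacks it into the constructor, which corresponds to `cls(*args)` in the Python 3 sketch below. The class here is only a stand-in for the error classes in lib/errors.py.

```python
class OpPrereqError(Exception):
    """Stand-in for an error class; not the real definition."""


args = ("Instance 'no-such-instance' not known", "unknown_entity")

unpacked = OpPrereqError(*args)  # what Python 2's `raise cls, args` builds
wrapped = OpPrereqError(args)    # what `raise cls(args)` builds

# The first keeps two separate arguments, the second nests them in a 1-tuple.
print(len(unpacked.args), len(wrapped.args))
```

Code that expects `err.args` to hold the message and the error type as separate items therefore sees a single nested tuple instead.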

This breakage is usually trivial, like wrong error messages:

  $ gnt-instance remove no-such-instance
  Failure: prerequisites not met for this operation:
  ("Instance 'no-such-instance' not known", 'unknown_entity')

versus:

  $ gnt-instance remove no-such-instance
  Failure: prerequisites not met for this operation:
  error type: unknown_entity, error details:
  Instance 'no-such-instance' not known

or:

  $ gnt-instance add … no-such-instance
  Failure: prerequisites not met for this operation:
  ('The given name (no-such-instance) does not resolve: Name or service not known', 'resolver_error')

versus:

  $ gnt-instance add … no-such-instance
  Failure: prerequisites not met for this operation:
  error type: resolver_error, error details:
  The given name (no-such-instance) does not resolve: Name or service not known

But in some cases where we rely on a certain data representation
(e.g. HooksAbort), this actually breaks because we try to iterate over
the wrong type:

  File "/usr/lib/python2.6/dist-packages/ganeti/cli.py", line 1907, in FormatError
     for node, script, out in err.args[0]:
  ValueError: need more than 1 value to unpack

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

98dfcaff

Allow clock skews in certificate verification · f97a7ada

Iustin Pop authored 12 years ago


Currently we allow for up to NODE_MAX_CLOCK_SKEW time difference
between nodes in some operations, but not everywhere: SSL certificate
verification (import/export, both intra and inter-cluster) has a zero
limit (downwards), and a week upwards. This can cause even
intra-cluster backup problems, if the source node has a time even two
seconds in the future.

To fix this, when verifying certificates we compare against the current
time offset by the max skew, which fixes the lower bound and reduces the
upper bound by an insignificant amount (0.04%).
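
One reading of this fix, as a sketch: the validity window is checked against a clock shifted forward by the maximum skew. The function name, the return strings and the 150-second skew value are assumptions made for the example.

```python
MAX_SKEW = 150  # seconds; placeholder for NODE_MAX_CLOCK_SKEW (assumed value)


def verify_validity(not_before, not_after, now, max_skew=MAX_SKEW):
    """Check a validity window against a clock shifted forward by max_skew."""
    shifted = now + max_skew
    if shifted < not_before:
        return "not yet valid"
    if shifted > not_after:
        return "expired"
    return None


now = 1_000_000
week = 7 * 24 * 3600

# Certificate generated on a node whose clock is two seconds ahead.
slightly_future = verify_validity(now + 2, now + 2 + week, now)
# Certificate from a node far outside the allowed skew.
far_future = verify_validity(now + 3600, now + 3600 + week, now)

print(slightly_future, far_future)
```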

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

f97a7ada

May 08, 2012

Correct capitalisation of two Luxi calls · 83c046a2

Iustin Pop authored 12 years ago


Two Luxi calls have an inconsistent name/value mapping (in the Python
code):

- REQ_AUTOARCHIVE_JOBS versus AutoArchiveJobs (versus AutoarchiveJobs)
- REQ_QUEUE_SET_DRAIN_FLAG versus SetDrainFlag (no Queue)

While these are only a consistency issue, let's fix them so that the
Haskell code (which uses the auto-generated camel-case form) doesn't
need to special-case them, and looks more like the Python
code (hah, joke!).
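
To illustrate the mismatch, here is a sketch of a mechanical camel-case conversion of the constant names. Stripping the `REQ_` prefix and the helper name `camel_case` are assumptions; the actual generator on the Haskell side may work differently.

```python
def camel_case(const_name, prefix="REQ_"):
    """Turn an upper-case constant name into a camel-case call name."""
    if const_name.startswith(prefix):
        const_name = const_name[len(prefix):]
    return "".join(part.capitalize() for part in const_name.split("_"))


for name in ("REQ_AUTOARCHIVE_JOBS", "REQ_QUEUE_SET_DRAIN_FLAG"):
    print(name, "->", camel_case(name))
```

Under this conversion the first constant yields `AutoarchiveJobs` rather than `AutoArchiveJobs`, and the second yields a name that includes `Queue`, which matches the two inconsistencies listed above.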

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

83c046a2

May 04, 2012

QA: Enable use of OR conditions in test checks · a0c3e726

Michael Hanselmann authored 12 years ago


Until now “TestRunIf” and “TestEnabled” could only handle AND. With this
patch a new class named “Either” is added to “qa_config” and allows OR.
The name “Either” was chosen instead of “Or” as the latter is very close
to the reserved keyword “or”.

Examples:
  ["rapi", Either(["instance-rename", "instance-reboot"])]

  Either(["node-list", "instance-list", "job-list"])
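
A simplified sketch of how such conditions could be evaluated. It is not the code in qa_config: the helper `is_enabled`, the `enabled` mapping and the choice to treat unknown names as disabled are all assumptions.

```python
class Either:
    """Marks a group of test names of which at least one must be enabled."""

    def __init__(self, tests):
        self.tests = tests


def is_enabled(tests, enabled):
    """AND over lists, OR over Either, plain lookup for single names."""
    if isinstance(tests, Either):
        return any(is_enabled(t, enabled) for t in tests.tests)
    if isinstance(tests, (list, tuple)):
        return all(is_enabled(t, enabled) for t in tests)
    return enabled.get(tests, False)


# Hypothetical QA configuration.
enabled = {"rapi": True, "instance-rename": False, "instance-reboot": True}

combined = is_enabled(["rapi", Either(["instance-rename", "instance-reboot"])], enabled)
only_and = is_enabled(["rapi", "instance-rename"], enabled)
print(combined, only_and)
```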

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

a0c3e726

Apr 27, 2012

Fix rapi.testutils unittest · 59e67682

Iustin Pop authored 12 years ago


Since we use a testutils.InputTestClient(), the actual error
expected is VerificationError, and not GanetiApiError (which is used
at real run-time).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

59e67682

Apr 26, 2012

Add more RAPI test utilities · 1afa108c

Michael Hanselmann authored 12 years ago


This patch adds a mock RAPI client to test input values to methods. All
methods either raise an exception if there was a problem or return None.
Third-party code can use this to test their input values to the RAPI
client.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

1afa108c

rapi.testutils.FakeCurl: Add header support · d9492490

Michael Hanselmann authored 12 years ago


With this patch headers are constructed from the PycURL options
and passed to the mock implementation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

d9492490

Move _FakeCurl from tests/ganeti.rapi.client to ganeti.rapi.testutils · f90a1ab5

René Nussbaumer authored 13 years ago


This is preparation for the mock system, where we need the same cURL
mock.

Signed-off-by: René Nussbaumer <rn@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

f90a1ab5

Apr 19, 2012

Convert listing exports to query2 · 0fdf247d

Michael Hanselmann authored 12 years ago


This solves one case where locks are acquired during LUXI queries.
Pretty late into the transition I noticed that OpBackupQuery had a
“use_locking” parameter for a long time, but didn't use it. Since
most of the other changes were already done and this allows exports to
be listed via RAPI (/2/query) I decided to finish.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

0fdf247d

Apr 16, 2012

Copy debug level, priority and set comment for LU-generated opcodes · 07923a3c

Michael Hanselmann authored 12 years ago


Before this patch, a node evacuation submitted with high priority would
only compute the solution at that priority, but the actual evacuation
ran at normal priority.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

07923a3c

Mar 30, 2012

Fix query unittests after converting jobs to query2 · d6f58310

Michael Hanselmann authored 12 years ago


I missed these among some shelltest-related failures.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

d6f58310

jqueue: Convert GetInfo to query2 · a06c6ae8

Michael Hanselmann authored 12 years ago


This rather inefficient implementation (fields are evaluated on every
call to GetInfo) is not good for WaitForJobChanges and doesn't support
filters, but that will be rectified in later patches.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

a06c6ae8

qlang.MakeFilter: Enable use of different name field · 03ec545a

Michael Hanselmann authored 12 years ago


Jobs don't have a “name” field, so we must be able to control
the field used for simple filters.
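
A sketch of what a simple filter with a configurable name field might look like. The nested-list representation, the operator spellings `"|"` and `"="`, and the function name are assumptions for this example rather than the qlang implementation.

```python
def make_simple_filter(namefield, values):
    """Build an OR filter matching any of `values` on the given field."""
    if not values:
        return None
    return ["|"] + [["=", namefield, value] for value in values]


# Instances are looked up by "name", jobs by "id".
print(make_simple_filter("name", ["inst1.example.com"]))
print(make_simple_filter("id", [1, 2]))
```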

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

03ec545a

Merge cli.FormatTimestamp and utils.FormatTime · 26a72a48

Michael Hanselmann authored 12 years ago


… to some degree at least. Unittests are included.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

26a72a48

constants: Don't hardcode priorities for LOCK_ATTEMPTS_TIMEOUT · 0b04b188

Michael Hanselmann authored 12 years ago


Also include unittest for LOCK_ATTEMPTS_TIMEOUT.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

0b04b188

Mar 28, 2012

Add whitelist for opcodes using BGL · c9c33a28

Michael Hanselmann authored 12 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

c9c33a28

RPC: Add a new client type for DNS only · bd6d1202

René Nussbaumer authored 13 years ago


This patch moves the “call_version” to a new RPC client definition and
then adds a new runner using the DNS resolver for getting the host
address.

The standard “BootstrapRunner”, where the call was before, tries to
resolve node names using ssconf first, which doesn't work properly when
re-adding a node with a new primary IP address.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

bd6d1202

Mar 26, 2012

Add trivial tests for gnt-* cli · 30f2802f

Iustin Pop authored 13 years ago


While testing some other stuff, I realised that the gnt-* commands
could be broken (as in, the script fails with syntax errors), but make
check doesn't detect it. Since we have shelltest, we can now add
trivial tests for this case.

One downside is that starting the scripts seems to be much slower
than the htools binaries, so we can't add as many tests.

The other downside is that shelltest is now required for all
development work, but I think this is a small disadvantage compared to
the increased testing possibilities.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

30f2802f

Mar 22, 2012

cmdlib: Stop forking in LUClusterQuery · a20e4768

Michael Hanselmann authored 13 years ago


While debugging another issue we realized that LUClusterQuery forks.
This turned out to be the “platform.architecture” function from the
Python library. It uses the “file” command to determine the architecture
of the Python binary.

This patch adds two new functions to the “runtime” module to get this
information once per process instead of doing it every single time
LUClusterQuery is used. Forking is a no-go in a multi-threaded
environment anyway.
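
The "once per process" idea can be sketched with a cached wrapper. The function name `get_arch_info` is hypothetical and this is not the code added to the "runtime" module; it only shows that the potentially expensive lookup runs a single time.

```python
import functools
import platform


@functools.lru_cache(maxsize=None)
def get_arch_info():
    """Return platform.architecture(), computed at most once per process."""
    return platform.architecture()


first = get_arch_info()
second = get_arch_info()  # served from the cache, no second lookup
print(first, first is second)
```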

A future change will also have to change the terminology in “gnt-cluster
info”: it reports the binary architecture simply as “architecture”, when
it's actually the binaries' architecture. Kernel and userland can be
different.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

a20e4768

Convert manual shell tests to shelltestrunner · 53d4cdf1

Iustin Pop authored 13 years ago


This is more of an RFC… Basically most of the shell-based tests are
converted from exec+grep to shelltestrunner.

Things are not all fine and nice though:

- we have dependencies between tests, as some generate some data files
  needed later; this is not nice, and we depend on serial execution in
  testrunner
- we can still fail with not-so-nice messages in the offline-test
  script (when we generate most of the data)

But overall, I think the tests are much nicer to
define/read/debug:

- each test is standalone, with the only dependency being an optional
  input data file; this is much better than a single monolithic shell
  script
- in case of failures, the failure is clearly shown by shell test,
  both for exit code and stdout/stderr
- shelltest can run in --debug mode, where the exact details are shown
  much better than the alternative of "set -x" for the shell script

Comments welcome!

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

53d4cdf1

locking: Handle spurious notifications on lock acquire · 8d7d8b57

Michael Hanselmann authored 13 years ago


This was already a TODO since the implementation of lock priorities in
September 2010. Under certain conditions a waiting acquire can be
notified at a time when it can't actually get the lock. In this case it
would try and fail to acquire the lock and then return to the caller
before the timeout ends.
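
The general technique for tolerating such notifications is to re-check the condition in a loop and wait again for the remaining time. The sketch below uses a toy `TimedLock` class built on `threading.Condition`; it is much simpler than Ganeti's locking code and does not implement shared locks or priorities.

```python
import threading
import time


class TimedLock:
    """Toy exclusive lock whose acquire() survives spurious wake-ups."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False

    def acquire(self, timeout):
        deadline = time.monotonic() + timeout
        with self._cond:
            # Loop: a notification does not guarantee the lock is free.
            while self._held:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                self._cond.wait(remaining)
            self._held = True
            return True

    def release(self):
        with self._cond:
            self._held = False
            self._cond.notify_all()


lock = TimedLock()
got_first = lock.acquire(0.01)

start = time.monotonic()
got_second = lock.acquire(0.05)  # lock is held, so this waits the full timeout
waited = time.monotonic() - start

lock.release()
got_third = lock.acquire(0.01)
print(got_first, got_second, got_third)
```

Because the deadline is computed once up front, a wake-up that arrives too early only shortens the next wait instead of returning to the caller before the timeout ends.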

While this is not bad (nothing breaks), it isn't nice either. A separate
patch will prevent unnecessary notifications when shared locks are
released.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>

8d7d8b57