Commits · a1ec8695a6b453acdc2fa746a27be73c614b2e87 · itminedu / snf-ganeti

Oct 11, 2011

Preserve bridge MTU in KVM ifup script · a1ec8695


Closes: #201 - KVM_IFUP does not set bridge-MTU on tap devices
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

a1ec8695

Oct 04, 2011

Merge branch 'stable-2.5' into devel-2.5 · a080bab8

Andrea Spadaccini authored 13 years ago


Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

a080bab8

cluster-merge: log an info message at node readd · 419bb2ef

Guido Trotter authored 13 years ago


node readd can take a long time, it's good to have info messages to see
progress.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

419bb2ef

Bump version to 2.5.0~rc1 · 07cea902

Michael Hanselmann authored 13 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

07cea902

Fix issue when verifying cluster files · 170b02b7

Michael Hanselmann authored 13 years ago


If a cluster has any non-master-candidate nodes, those don't contain all
files (e.g. config.data). With commit aef59ae7 (March 31st, 2011)
the logic was changed and subsequently verifying a cluster with non-mc
nodes would complain.

This patch fixes this issue by changing the algorithm. It also adds an
additional check for files which shouldn't exist on a machine. A newly
added unittest is included.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

170b02b7
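
The check described in 170b02b7 (nodes that lack a file they should have, plus files that exist where they should not) can be sketched roughly as below. This is only an illustration: the function name, the per-node dictionaries and the example filenames are assumptions, not Ganeti's actual verification code.

```python
def find_file_problems(expected_by_node, present_by_node):
    """Return (missing, unexpected) lists of (node, filename) pairs.

    Both arguments map a node name to a set of filenames. All names here
    are hypothetical and only illustrate the idea from the commit message.
    """
    missing = []
    unexpected = []
    for node in sorted(expected_by_node):
        expected = expected_by_node[node]
        present = present_by_node.get(node, set())
        # Files the node should have but does not.
        missing.extend((node, name) for name in sorted(expected - present))
        # Files the node has but should not (e.g. config.data on a non-mc node).
        unexpected.extend((node, name) for name in sorted(present - expected))
    return missing, unexpected
```

Keeping the expected set per node, rather than one global list, is what lets non-master-candidate nodes pass without config.data.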

Oct 03, 2011

Revert "utils.log: Write error messages to stderr" · d728ac75

Michael Hanselmann authored 13 years ago


This reverts commit 34aa8b7c. Writing
error messages to stderr would also include backtraces, something we
tried to avoid in the past.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

d728ac75

Fix adding nodes after commit · ca6b16e5

Michael Hanselmann authored 13 years ago


Commit 64c7b383 changed the RPC call for verifying SSH connections.
Unfortunately this case in adding nodes was missed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ca6b16e5

Sep 30, 2011

LUClusterVerifyGroup: Spread SSH checks over more nodes · 64c7b383

Michael Hanselmann authored 13 years ago


When verifying a group the code would always check SSH to all nodes in
the same group, as well as the first node for every other group. On big
clusters this can cause issues since many nodes will try to connect to
the first node of another group at the same time. This patch changes the
algorithm to choose a different node every time.

A unittest for the selection algorithm is included.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

64c7b383
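
One way to "choose a different node every time", as 64c7b383 describes, is a round-robin over each foreign group. The sketch below assumes that approach; the function name and data shapes are hypothetical, and the real selection algorithm may differ.

```python
import itertools


def select_ssh_check_nodes(group_nodes, other_groups):
    """Map each node of the verified group to the foreign nodes it checks.

    group_nodes: list of node names in the group being verified.
    other_groups: dict mapping group name to a list of node names.
    Round-robin selection is an assumption made for illustration.
    """
    # One endless iterator per non-empty foreign group, in a stable order.
    cyclers = [
        itertools.cycle(sorted(nodes))
        for _, nodes in sorted(other_groups.items())
        if nodes
    ]
    result = {}
    for node in sorted(group_nodes):
        # Each node takes the next candidate from every other group.
        result[node] = [next(cycler) for cycler in cyclers]
    return result
```

With this scheme consecutive nodes contact different targets, so no single node in another group receives every connection.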

Optimise cli.JobExecutor with many pending jobs · 11705e3d

Iustin Pop authored 13 years ago


In the case we submit many pending jobs (> 100) to the masterd, the
JobExecutor 'spams' the master daemon with status requests for the
status of all the jobs, even though in the end it will only choose a
single job for polling.

This is very sub-optimal, because when the master is busy processing
small/fast jobs, this query forces reading all the jobs from
disk. Restricting the 'window' of jobs that we query from the entire
set to a smaller subset makes a huge difference (masterd only, 0s
delay jobs, all jobs to tmpfs thus no I/O involved):

- submitting/waiting for 500 jobs:
  - before: ~21 s
  - after:   ~5 s
- submitting/waiting for 1K jobs:
  - before: ~76 s
  - after:   ~8 s

This is with a batch of 25 jobs. With a batch of 50 jobs, it goes from
8s to 12s. I think that choosing the 'best' job for nice output only
matters with a small number of jobs, and that for more than that
people will not actually watch the jobs. So changing from 'perfect
job' to 'best job in the first 25' should be OK.

Note that most jobs won't execute as fast as 0 delay, but this is
still a good improvement.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

11705e3d
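
The windowing idea from 11705e3d can be sketched as follows. The function name, the `query_status` callback and the status strings are assumptions for illustration; the commit does not say how the "best" job is ranked, so this sketch simply prefers a running job.

```python
def choose_job_to_poll(pending_job_ids, query_status, window=25):
    """Pick one job to poll, querying only the first `window` pending jobs.

    query_status is a hypothetical callback that takes a list of job IDs
    and returns their statuses in the same order.
    """
    candidates = pending_job_ids[:window]
    if not candidates:
        return None
    statuses = query_status(candidates)
    for job_id, status in zip(candidates, statuses):
        # Assumed preference: a job that is already running.
        if status == "running":
            return job_id
    # Otherwise fall back to the oldest pending job in the window.
    return candidates[0]
```

The cost of each status query is now bounded by the window size rather than by the total number of pending jobs.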

Merge branch 'stable-2.5' into devel-2.5 · cea3abbd

Andrea Spadaccini authored 13 years ago


* stable-2.5:
  listrunner: Don't pass arguments if there are none
  ssh: Quote strings in error message
  utils.log: Write error messages to stderr
  Add signal handling doc to hbal man page
  Fix handling of cluster verify hooks
  Redistribute the RAPI certificate
  QA: Add tests for instance start/stop via RAPI
  RAPI: Fix wrong check on instance shutdown
  baserlib: Accept empty body in FillOpcode

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

cea3abbd

Use --yes to deactivate master ip in cluster merge · aeb24d97

Guido Trotter authored 13 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

aeb24d97

Use deactivate-master-ip in cluster-merge · a3fad332

Andrea Spadaccini authored 13 years ago


Use the gnt-cluster deactivate-master-ip command in cluster-merge to
disable the master IP.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit e87e5afb)

a3fad332

Add gnt-cluster commands to toggle the master IP · fb44c6db

Andrea Spadaccini authored 13 years ago


lib/client/gnt_cluster.py:
* Add activate-master-ip and deactivate-master-ip commands

man/gnt-cluster.rst:
* Document the new commands

lib/opcodes.py lib/cmdlib.py
* Add two opcodes and the LU that call the relevant RPCs

test/docs_unittest.py
* Silence an error about RAPI not implemented for the two new opcodes

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit fb926117)

Conflicts:

	test/docs_unittest.py
	  - kept devel-2.5 version, without the RAPI opcode checks

fb44c6db

Split starting and stopping master IP and daemons · c06e0c83

Andrea Spadaccini authored 13 years ago


lib/backend.py
* split StartMaster() in ActivateMasterIp() and StartMasterDaemons()
* split StopMaster() in DeactivateMasterIp() and StopMasterDaemons()

lib/server/noded.py, lib/rpc.py
* adapt the call chains to the new functions, define new RPCs

lib/bootstrap.py, lib/cmdlib.py, lib/server/masterd.py
* use the new RPCs

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit fb460cf7)

c06e0c83

listrunner: Don't pass arguments if there are none · 0c009cc5

Michael Hanselmann authored 13 years ago


If no arguments were specified the “exec_args” variable was “None”,
leading to the command being run as “… ./… None”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

0c009cc5
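
The bug fixed in 0c009cc5 is the usual effect of formatting `None` into a string. A minimal sketch, with a hypothetical helper name and command layout (the real listrunner code is not shown here):

```python
def build_command(script, exec_args=None):
    """Build the command line, appending arguments only if there are any."""
    if exec_args:
        return "./%s %s" % (script, exec_args)
    # Without this branch, None would be formatted as the literal "None".
    return "./%s" % script
```

For comparison, the unguarded form `"./%s %s" % ("run.sh", None)` evaluates to `"./run.sh None"`.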

ssh: Quote strings in error message · 9dc45ab1

Michael Hanselmann authored 13 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

9dc45ab1

utils.log: Write error messages to stderr · 34aa8b7c

Michael Hanselmann authored 13 years ago


When “gnt-cluster copyfile” failed it would only print “Copy of file …
to node … failed”. A detailed message is written using logging.error.
Writing error messages to stderr can be helpful in figuring out what
went wrong (the messages also go to the log file, but not everyone might
know about it).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

34aa8b7c

Add signal handling doc to hbal man page · 2b634302

Iustin Pop authored 13 years ago


Also remove a bug note, since hbal has long been able to execute jobs
directly.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

2b634302

Sep 28, 2011

Migration: warn the user about hv version mismatch · 34fbc862

Andrea Spadaccini authored 13 years ago


* hv_kvm.py, hv_xen.py
  - return the hypervisor version (if available) from GetNodeInfo

* cmdlib.py
  - if hypervisor version is available during the migration, and the
    versions differ, warn the user

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

34fbc862
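
The warning logic from 34fbc862 might look roughly like this. The `hv_version` key and the function name are assumptions for illustration; the commit only states that the version is returned from GetNodeInfo when available.

```python
def hv_version_warning(source_info, target_info):
    """Return a warning string if both versions are known and differ."""
    # "hv_version" is an assumed key name for this sketch.
    src = source_info.get("hv_version")
    dst = target_info.get("hv_version")
    if src is not None and dst is not None and src != dst:
        return ("Hypervisor version mismatch between source (%s) and"
                " target (%s) node" % (src, dst))
    # Unknown or equal versions: nothing to warn about.
    return None
```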

Fix handling of cluster verify hooks · 3656c889

Iustin Pop authored 13 years ago


The change to enforce boolean results for cluster verify group opcode
missed the HooksCallBack, which uses a very ugly 1/0
logic. Furthermore, the logic is wrong, since it unconditionally
resets the verify result to true.

The logic is changed to simply treat hook failures as failures, and do
nothing for offline/nodes.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

3656c889

Redistribute the RAPI certificate · 835f8b23

Iustin Pop authored 13 years ago


This reverts to the old behaviour in Ganeti 2.4 and before.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

835f8b23

Sep 22, 2011

QA: Add tests for instance start/stop via RAPI · a7418448

Michael Hanselmann authored 13 years ago


This would have detected the issue fixed in the previous patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

a7418448

RAPI: Fix wrong check on instance shutdown · d71369d7

Michael Hanselmann authored 13 years ago


Commit 7fa310f6 (April 1st, 2011) converted the RAPI resource for
shutting down an instance to FillOpCode. Unfortunately it missed the
fact that the shutdown resource gets its parameters as query arguments.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

d71369d7

baserlib: Accept empty body in FillOpcode · fa411651

Michael Hanselmann authored 13 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
(cherry picked from commit c6e1a3ee)

Signed-off-by: Michael Hanselmann <hansmi@google.com>

fa411651

Sep 20, 2011

Add tls_ciphers and use_vdagent options · 3e40b587

Andrea Spadaccini authored 13 years ago


Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

3e40b587

Updated man pages with new SPICE TLS options · b8a10435

Andrea Spadaccini authored 13 years ago


man/gnt-cluster.rst:
* documented the --new-spice-certificate, --spice-certificate and
  --spice-ca-certificate options of renew-crypto.

man/gnt-instance.rst:
* documented the spice_use_tls KVM hypervisor option.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

b8a10435

Implementation of TLS-protected SPICE connections · b6267745

Andrea Spadaccini authored 13 years ago


Added support for TLS-protected SPICE connections:

client/gnt_cluster.py, cli.py:
* added three new parameters to renew-crypto (--new-spice-certificate,
  --spice-certificate, --spice-ca-certificate) and their validation.

utils/x509.py:
* changed GenerateSelfSignedSslCert so that it now also returns the
  generated key and certificate;
* added missing return value in the docstring of
  GenerateSelfSignedX509Cert.

lib/bootstrap.py:
* changed the signatures of the relevant functions and implemented
  certificate generation/writing.

tools/cfupgrade:
* changed GenerateClusterCrypto invocation to reflect the new signature;
* added SPICE certificate names.

lib/errors.py:
* added the X509CertError class.

lib/hypervisor/hv_kvm.py:
* silenced pylint warning R0915

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

b6267745

Added SPICE TLS option and related cert paths · bfe86c76

Andrea Spadaccini authored 13 years ago


Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

bfe86c76

Fix OS creation's error handling when pausing sync · fac30cea

Faidon Liambotis authored 13 years ago


Commit 41e1e79e introduced a feature in which when wait_for_sync is not
set, DRBD sync is paused during the OS installation.

Doing so, however, broke OS creation's error handling: the result value
from the instance_os_add RPC call was overwritten by the one of the
blockdev_pause_resume_sync call before there was a chance for it to
be raised, thus masking possible errors in the OS creation.

Note that the wipe method, from which the pause technique was inspired,
does not suffer from this bug.

Signed-off-by: Faidon Liambotis <faidon@noc.grnet.gr>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

fac30cea
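
The masking bug described in fac30cea comes from reusing one variable for two RPC results. The sketch below shows the safe shape under stated assumptions: `FakeResult`, `raise_on_error` and `install_os` are hypothetical stand-ins, not Ganeti's RPC classes, and the order in which the real fix checks the results is not given in the commit message.

```python
class FakeResult:
    """Hypothetical stand-in for an RPC result object."""

    def __init__(self, fail_msg=None):
        self.fail_msg = fail_msg

    def raise_on_error(self, what):
        if self.fail_msg:
            raise RuntimeError("%s: %s" % (what, self.fail_msg))


def install_os(os_add, pause_sync, resume_sync):
    """Run the OS installation with sync paused, without losing its result."""
    pause_sync().raise_on_error("pausing sync")
    os_result = os_add()
    # Keep the resume result in its own variable so that it cannot
    # overwrite (and thereby hide) a failed OS installation.
    resume_result = resume_sync()
    os_result.raise_on_error("OS installation")
    resume_result.raise_on_error("resuming sync")
```

Sync is resumed before the installation result is raised here, so a failed installation does not leave the disks paused; that ordering is a choice of this sketch.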

Sep 14, 2011

htools: remove dead code · 6804faa0

Iustin Pop authored 13 years ago


The tryEvac/evacuateInstance functions are no longer used in the new
multi-group world order, so we remove them and change the unit-test to
test the actual IAllocator function.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

6804faa0

hail: don't select the primary as new secondary · 7073b3a8

Iustin Pop authored 13 years ago


This just adds the primary node of the instance as 'non-allocable'
during the choosing of the new secondary.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

7073b3a8

hail: add an extra safety check in relocate · f25508be

Iustin Pop authored 13 years ago


If we select the primary as new secondary, better to fail than return
wrong data to Ganeti.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

f25508be

Sep 13, 2011

Fix RAPI documentation for gnt-instance console · 6f4a2e9d

Andrea Spadaccini authored 13 years ago


Fix a failing pyassert in the RAPI docs and update it to reflect the
addition of SPICE to gnt-instance console.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

6f4a2e9d

Add SPICE compression and streaming options · ea064d24

Andrea Spadaccini authored 13 years ago


Add the following SPICE audio/image compression and video streaming
detection hypervisor options:

* spice_image_compression
* spice_jpeg_wan_compression
* spice_zlib_glz_wan_compression
* spice_streaming_video
* spice_playback_compression

Also add the related documentation and silence pylint R0914 warning
about too many local variables in hv_kvm._GenerateKVMRuntime.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

ea064d24

Add SPICE support to gnt-instance console · 4d2cdb5a

Andrea Spadaccini authored 13 years ago


Also update related unit tests.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

4d2cdb5a

Sep 07, 2011

Make KVM use the QXL vga driver with SPICE · 2ebdfbb5

Andrea Spadaccini authored 13 years ago


Enable the QXL paravirtualized graphics card by default if SPICE is
enabled. The QXL driver is VESA compatible, so it degrades gracefully if
the guest OS does not have QXL drivers.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

2ebdfbb5

Use a loop to check SPICE parameters dependency · 0e1b03b9

Andrea Spadaccini authored 13 years ago


Use a loop to check whether the user specified any SPICE option while
SPICE support is disabled.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

0e1b03b9
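
A loop-based dependency check like the one 0e1b03b9 mentions could be sketched as below. The `spice_bind` key as the "SPICE enabled" switch and the function name are assumptions, and the option names are borrowed from other entries in this log purely for illustration.

```python
# Option names taken from other commits in this log; the list that the
# actual check iterates over is not shown in the commit message.
SPICE_DEPENDENT_PARAMS = [
    "spice_image_compression",
    "spice_streaming_video",
    "spice_playback_compression",
    "spice_use_tls",
]


def find_orphan_spice_params(hvparams):
    """Return SPICE options that are set although SPICE is disabled."""
    # Assumption: a non-empty "spice_bind" value means SPICE is enabled.
    if hvparams.get("spice_bind"):
        return []
    return [name for name in SPICE_DEPENDENT_PARAMS if hvparams.get(name)]
```

A caller would reject the parameters if the returned list is non-empty; adding a new SPICE option then only requires extending the list.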

Sep 06, 2011

import: Fix a logic error due to missing "not" · 945859e0

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

945859e0

Sep 05, 2011

import: Make sure the disk_dump path is in EXPORT_DIR · 748c9884

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

748c9884

Switch other commonprefix to IsBelowDir · cf00dba0

René Nussbaumer authored 13 years ago


Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

cf00dba0
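
The move away from commonprefix in cf00dba0 and 748c9884 is easier to follow with an example: `os.path.commonprefix` compares strings character by character, so it cannot tell a subdirectory from a sibling whose name merely starts the same way. The helper below is a hypothetical sketch of a directory-aware check; it is not the actual IsBelowDir implementation, and the paths in the note are made up.

```python
import posixpath


def is_below_dir(root, other_path):
    """Return True if other_path lies strictly below root.

    Both paths are assumed to be absolute POSIX paths. This helper is a
    hypothetical illustration only.
    """
    # Normalize first so that ".." components cannot escape the root.
    root = posixpath.normpath(root)
    other = posixpath.normpath(other_path)
    # Compare against the root with a trailing separator, so that
    # "/srv/export-evil" is not mistaken for a child of "/srv/export".
    prefix = root.rstrip(posixpath.sep) + posixpath.sep
    return other.startswith(prefix)
```

For instance, `posixpath.commonprefix(["/srv/export", "/srv/export-evil/disk0"])` returns `"/srv/export"`, which a naive prefix test would accept, while `is_below_dir("/srv/export", "/srv/export-evil/disk0")` is false.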