  Nov 24, 2011
    • ConfigWriter: Fix epydoc error · 1730d4a1
      Michael Hanselmann authored
      
      The parameter is called “mods”, not “modes”.
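      For illustration (the surrounding code is not shown here; only the
      epydoc field name changed):

        # before:  @param modes: List of modifications
        # after:   @param mods: List of modifications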
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
    • LUGroupAssignNodes: Fix node membership corruption · 218f4c3d
      Michael Hanselmann authored
      
      Note: This bug only manifests itself in Ganeti 2.5, but since the
      problematic code also exists in 2.4, I decided to fix it there.
      
      If a node was assigned to a new group using “gnt-group assign-nodes” the
      node object's group would be changed, but not the duplicate member list
      in the group object. The latter is an optimization to require fewer
      locks for other operations. The per-group member list is only kept in
      memory and not written to disk.
      
      Ganeti 2.5 starts to make use of the data kept in the per-group member
      list and consequently fails when it is out of date. The following
      commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was
      confirmed using additional logging):
      
        $ gnt-group add foo
        $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
        $ gnt-cluster verify  # Fails with KeyError
      
      This patch moves the code modifying node and group objects into
      “config.ConfigWriter”, so that the complete operation happens under
      the config lock and no longer relies on the side effects of modifying
      objects without calling “ConfigWriter.Update”. A unittest is included.
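      A minimal sketch of the approach, assuming hypothetical names (the
      actual method in “config.ConfigWriter” may be shaped differently):
      both the node's group pointer and the in-memory per-group member
      lists are updated in one step while the config lock is held.

        # Illustrative sketch, not the actual Ganeti code.
        def AssignGroupNodes(self, mods):
          """Changes the group of a list of nodes.

          @param mods: list of (node_name, new_group_uuid) pairs

          """
          self._lock.acquire()
          try:
            for node_name, new_group_uuid in mods:
              node = self._config_data.nodes[node_name]
              old_group = self._config_data.nodegroups[node.group]
              new_group = self._config_data.nodegroups[new_group_uuid]
              node.group = new_group_uuid
              # Keep the duplicate, in-memory member lists in sync; this
              # is the part that previously went stale.
              old_group.members.remove(node_name)
              new_group.members.append(node_name)
            # The member lists are never written to disk, but the changed
            # node objects must be persisted.
            self._WriteConfig()
          finally:
            self._lock.release()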
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
  Oct 12, 2011
    • rpc: Disable HTTP client pool and reduce memory consumption · 05927995
      Michael Hanselmann authored
      
      We noticed that “ganeti-masterd” can use large amounts of memory,
      especially on large clusters. Measurements showed a single PycURL client
      using about 500 kB of heap memory (the actual usage depends on versions,
      build options and settings).
      
      The RPC client uses a per-thread HTTP client pool with one client per
      node. At this time there are 41 non-main threads (25 for the job queue
      and 16 for client requests). This means the HTTP client pools use a lot
      of memory (ca. 200 MB for 10 nodes, ca. 1 GB for 50 nodes).
      
      This patch disables the per-thread HTTP client pool. No cleanup of
      unused code is done. That will be done in the master branch only.
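      The quoted figures follow directly from the numbers above; a quick
      back-of-the-envelope check (500 kB per client is the approximate
      measurement, so the results are rough):

        THREADS = 41     # 25 job queue + 16 client request threads
        CLIENT_KB = 500  # approximate heap usage per PycURL client

        for nodes in (10, 50):
          total_mb = THREADS * nodes * CLIENT_KB / 1024.0
          print("%d nodes: ca. %.0f MB" % (nodes, total_mb))
        # 10 nodes: ca. 200 MB
        # 50 nodes: ca. 1001 MB (ca. 1 GB)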
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
  Jul 11, 2011
    • ht: Add new check for numbers · 697f49d5
      Michael Hanselmann authored
      
      Places which receive floats can usually also deal with integers, e.g.
      OpTestDelay. Tests are added, and the new check function is used for
      the aforementioned opcode and for verifying query results.
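      A sketch of such a check in the style of the “ht” module (the exact
      name and implementation in the commit may differ; on Python 2, “long”
      would be accepted as well):

        def TNumber(val):
          """Checks whether the value is a number (integer or float)."""
          # bool is excluded explicitly since it is a subclass of int.
          return isinstance(val, (int, float)) and not isinstance(val, bool)

        assert TNumber(3) and TNumber(2.5)
        assert not TNumber(True) and not TNumber("3")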
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
    • Fix off-by-one bug in job serial generation · 3c88bf36
      Michael Hanselmann authored
      
      Commit 009e73d0 (September 2009) changed the job queue to generate
      multiple job serials at once. Ever since, it has returned one more
      serial than requested.
      
      The “serial” file in the job queue directory is defined to contain the
      “last job ID used” (design-2.0). With the change above, the serial file
      would always contain the next serial number. The first value returned by
      the generating function was the one contained in the file, so during the
      switch in 2009 one job may have been overwritten.
      
      This patch changes the code to always return the exact number of
      serials requested and to keep the last used serial on disk, and adds
      an assertion.
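      The shape of the bug and of the fix, in an illustrative sketch (names
      and file handling are simplified, not the actual job queue code):

        def _new_serials_buggy(serial_file, count):
          last = int(open(serial_file).read())  # defined as "last job ID used"
          # Off by one: yields count + 1 values, the first of which
          # duplicates the already-used serial from the file.
          result = [last + i for i in range(count + 1)]
          open(serial_file, "w").write("%d" % (result[-1] + 1))  # stores "next"
          return result

        def _new_serials_fixed(serial_file, count):
          last = int(open(serial_file).read())
          result = [last + i for i in range(1, count + 1)]  # exactly count
          assert len(result) == count
          open(serial_file, "w").write("%d" % result[-1])   # keeps "last used"
          return result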
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
  Jun 28, 2011
    • Fix bug in recreate-disks for DRBD instances · b768099e
      Iustin Pop authored
      
      The new functionality in 2.4.2 for recreate-disks to change nodes is
      broken for DRBD instances: it simply changes the nodes without caring
      for the DRBD minors mapping, which will lead to conflicts in non-empty
      clusters.
      
      This patch changes the Exec() method of this LU significantly, both to
      fix the DRBD minor usage and to make sure that we don't leave partial
      modifications to the instance objects:
      
      - the first half of the method makes all the checks and computes the
        needed configuration changes
      - the second half then performs the configuration changes and
        recreates the disks
      
      This way, instances will either be fully modified or not at all;
      whether the disks are successfully recreated is a separate matter, but
      at least the configuration will be sane.
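      The two-phase structure described above, as an illustrative skeleton
      (method and helper names are made up for the sketch, not taken from
      the actual LU code):

        def Exec(self, feedback_fn):
          # Phase 1: run all checks and compute the configuration changes,
          # including the new DRBD minors, without touching the instance.
          # Any failure here leaves the configuration unmodified.
          changes = [self._ComputeNewDiskConfig(disk)
                     for disk in self.instance.disks]

          # Phase 2: apply the changes and recreate the disks. The instance
          # is either fully modified or not at all; recreating the disks
          # may still fail, but the configuration stays sane.
          for disk, new_config in zip(self.instance.disks, changes):
            self._ApplyDiskConfig(disk, new_config)
          self.cfg.Update(self.instance, feedback_fn)
          self._RecreateDisks(feedback_fn)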
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Fix a lint warning · 78ff9e8f
      Iustin Pop authored
      
      Patch db8e5f1c removed the use of feedback_fn, hence pylint now warns
      about it.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
  Jun 27, 2011
    • KVM: configure bridged NICs at migration start · cc8a8ed7
      Apollon Oikonomopoulos authored
      
      Commit 5d9bfd87 moved tap interface handling from KVM to Ganeti, partly
      to also solve the problem of routed interfaces getting configured too
      early during live migrations, causing network anomalies. In that
      direction, configuration of NICs of incoming instances was deferred to
      FinalizeMigration time.
      
      However, this causes minor issues with bridged interfaces: KVM sends
      out an ARP-like packet upon migration finish, which is lost because
      the tap interface is not yet configured. As a consequence,
      intermediate network equipment (e.g. switches) does not get notified
      about the topology change until the instance transmits another packet
      after the bridge has been configured, or the switch's ARP cache
      expires.
      
      The proper solution to that is to support different phases in network
      configuration (pre/post migration), which also requires separate ifup
      scripts. Until then we fall back to configuring bridged interfaces on
      incoming instances at migration start, instead of finish.
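      A sketch of the fallback logic under assumed helper names (the
      “constants.NIC_MODE*” values exist in Ganeti; “_ConfigureTap” is
      illustrative): bridged NICs are brought up when the incoming instance
      starts, everything else still waits for FinalizeMigration.

        def _ConfigureIncomingNICs(instance, at_migration_start):
          for nic in instance.nics:
            bridged = (nic.nicparams[constants.NIC_MODE] ==
                       constants.NIC_MODE_BRIDGED)
            if at_migration_start and bridged:
              # Configure now, so the ARP-like packet KVM sends when the
              # migration finishes actually reaches the switches.
              _ConfigureTap(instance, nic)
            elif not at_migration_start and not bridged:
              # Routed NICs are still configured at FinalizeMigration time
              # to avoid network anomalies during the migration.
              _ConfigureTap(instance, nic)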
      
      Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
      Signed-off-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Iustin Pop authored · 96747bda
    • Fix bug in drbd8 replace disks on current nodes · db8e5f1c
      Iustin Pop authored
      
      Currently the drbd8 replace-disks on the same node (i.e. -p or -s) has
      a bug: it modifies the instance disk temporarily before changing it
      back to the same value. However, we don't need to, and shouldn't, do
      that: all this operation does is change the LVM configuration on the
      node; the instance disks keep the same configuration as before.
      
      In the current code, this change back-and-forth is fine *unless* we
      fail during attaching the new LVs to DRBD; in which case, we're left
      with a half-modified disk, which is entirely wrong.
      
      So we change the code in two ways:
      
      - use temporary copies of the disk children in the old_lvs var
      - stop updating disk.children
      
      This means that the instance should not be modified anymore (except
      maybe by SetDiskID, which is a legacy and unfortunate decision that
      will have to be cleaned up sometime).
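      The gist of the two changes, in an illustrative sketch (variable names
      follow the description above; the surrounding LU code and helpers are
      made up):

        import copy

        # Work on temporary copies instead of aliasing disk.children, and
        # never write the new LVs back into disk.children.
        old_lvs = [copy.deepcopy(lv) for lv in disk.children]
        new_lvs = _CreateNewLVs(disk)  # illustrative helper
        # ... detach old_lvs from DRBD, rename the LVs, attach new_lvs ...
        # If attaching fails, the instance's disk object was never touched,
        # so the configuration remains consistent.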
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>