Commits · 5fd6b69479c09ddac17121a31ec18504b089cf37 · itminedu / snf-ganeti

Dec 15, 2010

jqueue: Keep jobs in “waitlock” while returning to queue · 5fd6b694

Michael Hanselmann authored 14 years ago


Iustin Pop reported that a job's file is updated many times while it
waits for locks held by other thread(s). An investigation traced this
to a design decision made for job priorities: jobs are returned to the
“queued” status if they can't acquire all locks. Changing a job's
status or priority requires an update to permanent storage.

In a high-level view this is what happens:
1. Mark as waitlock
2. Write to disk as permanent storage (jobs left in this state by a
   crashing master daemon are resumed on restart)
3. Wait for lock (assume lock is held by another thread)
4. Mark as queued
5. Write to disk again
6. Return to workerpool

Another option originally discussed was to leave the job in the
“waitlock” status. Ignoring priority changes, this is what would happen:
1. If not in waitlock
1.1. Assert state == queued
1.2. Mark as waitlock
1.3. Set start_timestamp
1.4. Write to disk as permanent storage
3. Wait for locks (assume lock is held by another thread)
4. Leave in waitlock
5. Return to workerpool

Now let's assume the lock is released by the other thread:
[…]
3. Wait for locks and get them
4. Assert state == waitlock
5. Set state to running
6. Set exec_timestamp
7. Write to disk

This change reduces the number of writes from two per lock acquire
attempt to two per opcode, plus one per priority increase (as happens
after 24 acquire attempts, see mcpu._CalculateLockAttemptTimeouts, until
the highest priority is reached). This patch implements it and updates
the unittests.
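
The difference in write counts can be illustrated with a toy model. This is only a sketch, not the jqueue code: the `Job` class, `attempt_old`, `attempt_new` and `count_writes` are hypothetical names, priority changes and timestamps are ignored, and the old flow's successful attempt (mark waitlock, write, mark running, write) is an assumption based on the steps listed above.

```python
class Job:
    """Toy job that counts how often it is written to permanent storage."""

    def __init__(self):
        self.status = "queued"
        self.writes = 0

    def save(self):
        # Stand-in for serializing the job file to disk.
        self.writes += 1


def attempt_old(job, got_locks):
    """Old flow: go back to "queued" whenever the locks can't be acquired."""
    job.status = "waitlock"
    job.save()
    if not got_locks:
        job.status = "queued"
        job.save()
        return False
    job.status = "running"
    job.save()
    return True


def attempt_new(job, got_locks):
    """New flow: stay in "waitlock" between lock acquire attempts."""
    if job.status != "waitlock":
        assert job.status == "queued"
        job.status = "waitlock"
        job.save()
    if not got_locks:
        return False  # left in waitlock, nothing written
    assert job.status == "waitlock"
    job.status = "running"
    job.save()
    return True


def count_writes(attempt, failed_attempts):
    """Run some failed lock attempts followed by one successful attempt."""
    job = Job()
    for _ in range(failed_attempts):
        attempt(job, got_locks=False)
    attempt(job, got_locks=True)
    return job.writes
```

In this model the old flow's writes grow with the number of failed attempts, while the new flow writes twice per opcode regardless of how long the job waits.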

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

5fd6b694

Improve jqueue unittests · ebb2a2a3

Michael Hanselmann authored 14 years ago


- Verify job file updates
- Ensure queue lock is released while executing opcode

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ebb2a2a3

Dec 14, 2010

Update manpages to display version 2.3 · e7441f80

Miguel Di Ciurcio Filho authored 14 years ago


Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e7441f80

Dec 09, 2010

Merge branch 'devel-2.2' into devel-2.3 · d1a0ab50

Guido Trotter authored 14 years ago


* devel-2.2:
  Fix rename for file-backed instances

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

d1a0ab50

Merge branch 'stable-2.2' into devel-2.2 · be9f4904

Guido Trotter authored 14 years ago


* stable-2.2:
  Fix rename for file-backed instances

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

be9f4904

Fix rename for file-backed instances · 3721d2fe

Guido Trotter authored 14 years ago


Currently the code wrongly changes the disk logical/physical id
component representing the path from "$storage_dir/$iname/disk$seq" to
"$storage_dir/$iname/disk/$seq" (note the additional slash) breaking the
rename.
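
A small sketch of the two path shapes, for illustration only. The function names and the use of `posixpath.join` are assumptions; the actual code in the patch is not shown in this message.

```python
import posixpath


def disk_path(storage_dir, iname, seq):
    """Intended layout: $storage_dir/$iname/disk$seq."""
    return posixpath.join(storage_dir, iname, "disk%s" % seq)


def buggy_disk_path(storage_dir, iname, seq):
    """Layout with the extra slash: $storage_dir/$iname/disk/$seq."""
    return posixpath.join(storage_dir, iname, "disk", str(seq))
```

With a hypothetical storage directory `/srv/storage` and instance `inst1`, the first yields `/srv/storage/inst1/disk0` and the second `/srv/storage/inst1/disk/0`, a path that does not exist on disk.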

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

3721d2fe

Dec 02, 2010

Merge branch 'stable-2.3' into devel-2.3 · 9a91d357

Michael Hanselmann authored 14 years ago


* stable-2.3:
  Bump version for 2.3.1~rc1 release

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

9a91d357

Dec 01, 2010

locking: Clarify message for removed locks · e1137eb6

Michael Hanselmann authored 14 years ago


Just being told that a lock doesn't exist can be confusing. One case
where this happens is when a job (e.g. instance modify) waits for a job
removing the instance (e.g. export with remove).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

e1137eb6

Bump version for 2.3.1~rc1 release · 563d5e72

Michael Hanselmann authored 14 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

563d5e72

impexpd: Disable OpenSSL compression in socat if possible · 29e8788e

Michael Hanselmann authored 14 years ago


This uses an option only available in patched socat versions. More
information is available from the INSTALL update included in this
patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

29e8788e

Merge branch 'stable-2.3' into devel-2.3 · cd22574b

Michael Hanselmann authored 14 years ago


* stable-2.3:
  Bump version for 2.3.0

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

cd22574b

Bump version for 2.3.0 · 7c324b88

Michael Hanselmann authored 14 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

7c324b88

Nov 30, 2010

Merge branch 'devel-2.2' into devel-2.3 · 5d9f9cba

Michael Hanselmann authored 14 years ago


* devel-2.2:
  Correct version check for release candidates
  Fix version check
  Add script to check version format

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

5d9f9cba

Correct version check for release candidates · cdb303ab

Michael Hanselmann authored 14 years ago


The tilde needs to be escaped and I forgot the space which should be
used instead.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

cdb303ab

config.py: need explicit %-formatting in errors.OpPrereqError. · c49b0092

Adeodato Simo authored 14 years ago


Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

c49b0092

Nov 25, 2010

Fix version check · 35576615

Michael Hanselmann authored 14 years ago


Don't ask … all I say is distcheck.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

35576615

Nov 24, 2010

Add script to check version format · 96602be4

Michael Hanselmann authored 14 years ago


Only versions of the format “x.y.z” and “x.y.z~(rc|beta)N” (for N>0) are
allowed.
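
The rule can be written as a regular expression. The following is a sketch in Python, not the script added by this commit; `is_valid_version` is a hypothetical name.

```python
import re

# x.y.z, optionally followed by ~rcN or ~betaN with N > 0
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+(?:~(?:rc|beta)[1-9]\d*)?")


def is_valid_version(version):
    """Return True if the whole string matches the allowed version format."""
    return _VERSION_RE.fullmatch(version) is not None
```

Under this pattern `2.3.0` and `2.3.1~rc1` are accepted, while `2.3`, `2.3.0~rc0` and `2.3.0-rc1` are rejected.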

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

96602be4

Merge branch 'devel-2.2' into devel-2.3 · b6ac86e0

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

b6ac86e0

Fix coverage reports · 577b170b

Iustin Pop authored 14 years ago


Currently, the coverage reports include the unittests themselves, which
unfairly skews the reports, as the coverage for the tests is very
high (since they all run).

To fix this, we export the ganeti temp dir from run-in-temp-dir, and we
use that to exclude the tests directory. The patch also fixes a bug
related to multiple directories to be omitted (--omit a --omit b is
wrong, it needs to be --omit a,b).
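
As an illustration of the second fix, a helper that builds the option from a list of directories could look like this. It is a sketch only; `omit_args` is a hypothetical name and the patch itself changes the build scripts rather than adding such a function.

```python
def omit_args(paths):
    """Build a single --omit option with all paths joined by commas."""
    paths = [p for p in paths if p]
    if not paths:
        return []
    return ["--omit", ",".join(paths)]
```

For the directories `a` and `b` this gives `["--omit", "a,b"]` instead of repeating the option once per directory.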

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

577b170b

Nov 19, 2010

Updates NEWS and configure.ac for 2.3.0~rc1 · ca6c2dcd

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

ca6c2dcd

Merge branch 'devel-2.2' into devel-2.3 · 2b613de4

Iustin Pop authored 14 years ago


* devel-2.2:
  Update NEWS & configure.ac for the 2.2.2 release
  Fix documentation regarding conversion to drbd

Conflicts:
	NEWS         (integrated 2.2 changes)
	configure.ac (kept our version)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

2b613de4

Update NEWS & configure.ac for the 2.2.2 release · 2596526d

Iustin Pop authored 14 years ago


This imports the 2.1.8 NEWS entry and adds the 2.2.2 one, then updates the
configure.ac version.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

2596526d

Fix documentation regarding conversion to drbd · a22eb33b

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

a22eb33b

Fix documentation regarding conversion to drbd · 3e039592

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

3e039592

Nov 18, 2010

Reinstall instance: disallow offline secondaries · 9aacb199

Iustin Pop authored 14 years ago


Currently, reinstalling a DRBD instance whose secondary node is offline fails like this:

node1# gnt-instance reinstall -f instance1
Waiting for job 139053 for instance1...
Thu Nov 18 01:36:09 2010  - WARNING: Could not prepare block device disk/0 on node node3 (is_primary=False, pass=1): Node is marked offline
Thu Nov 18 01:36:09 2010  - WARNING: Could not shutdown block device disk/0 on node node3: Node is marked offline
Job 139053 for instance1 has failed: Failure: command execution error:
Disk consistency error

Since this fails anyway, let's check the secondary nodes, thus
preventing any modifications to the instance (e.g. OS type change):

node1# gnt-instance reinstall -f instance1
Waiting for job 139058 for instance1...
Job 139058 for instance1 has failed: Failure: prerequisites not met for this operation:
error type: wrong_state, error details:
Instance secondary node offline, cannot reinstall: node3

The patch needs modifications to the _CheckNodeOnline function, in order
to display meaningful messages ("Can't use offline node" would be very
confusing for an instance reinstall, since we didn't select a node
manually).
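
The idea of a caller-supplied message can be sketched as follows. This is not the real _CheckNodeOnline: the signature, the `offline_nodes` argument and the local `OpPrereqError` class are simplifications assumed for illustration.

```python
class OpPrereqError(Exception):
    """Stand-in for the prerequisite error raised by the real code."""


def check_node_online(node_name, offline_nodes, msg=None):
    """Raise OpPrereqError if the node is offline, with an optional custom message."""
    if msg is None:
        msg = "Can't use offline node"
    if node_name in offline_nodes:
        raise OpPrereqError("%s: %s" % (msg, node_name))
```

A reinstall check could then pass `msg="Instance secondary node offline, cannot reinstall"`, which produces the error detail shown in the second console output above.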

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

9aacb199

QA: check that doubly modifying an OS state is OK · 89e8af70

Iustin Pop authored 14 years ago


This would have prevented the bug fixed in the previous patch :(

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

89e8af70

Fix breakage in OS state modify · e2334900

Iustin Pop authored 14 years ago


I was using the feedback_fn function incorrectly (it doesn't
automatically expand the arguments).
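
The pitfall looks roughly like this. The sketch uses a stand-in callback, not Ganeti's feedback_fn, and the `report` helper and message text are hypothetical.

```python
messages = []


def feedback_fn(msg):
    """Stand-in callback that takes one already formatted string."""
    messages.append(msg)


def report(os_name, state):
    # Unlike logging calls, the callback does not expand "%s" arguments,
    # so the string has to be formatted explicitly before it is passed on.
    feedback_fn("OS %s is now %s" % (os_name, state))
```

Calling the stand-in as `feedback_fn("OS %s is now %s", name, state)` would raise a TypeError here, since it accepts a single argument.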

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

e2334900

Nov 17, 2010

Merge branch 'devel-2.2' into devel-2.3 · 86c340af

Iustin Pop authored 14 years ago


* devel-2.2:
  QA: add tests for gnt-cluster modify -B
  LUSetClusterParms: fix validation of beparams

Conflicts:
	lib/cmdlib.py (reverted & applied manually the change)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

86c340af

QA: add tests for gnt-cluster modify -B · 9738ca94

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

9738ca94

LUSetClusterParms: fix validation of beparams · 52b783c2

Iustin Pop authored 14 years ago

Since the contents of the dict are validated via ForceDictType, we can
simply require that it is a dict here. The previous check was wrong, as it was
copied from the HV checks (which also don't verify the leaf dict type).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

52b783c2

Nov 11, 2010

Add unittests for TemporaryReservationManager · 28a7318f

Iustin Pop authored 14 years ago


And fix an error message.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

28a7318f

TempReservationManager: Reserved() doesn't work · a7359d91

David Knowles authored 14 years ago


Note: It appears this has been around since the initial checkin of
TemporaryReservationManager. I have no idea what this could break, so
someone else may want to test this more thoroughly.

Signed-off-by: David Knowles <dknowles@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

a7359d91

Nov 09, 2010

Merge branch 'devel-2.2' into devel-2.3 · 1809bde5

Michael Hanselmann authored 14 years ago


* devel-2.2:
  devel/release: Use release-specific Makefile targets
  Makefile: Add new dist target for releases
  Makefile: Stricter checks for release distchecks

Conflicts:
	Makefile.am: Trivial

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

1809bde5

devel/release: Use release-specific Makefile targets · 2ba14c2f

Michael Hanselmann authored 14 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

2ba14c2f

Makefile: Add new dist target for releases · e627fe09

Michael Hanselmann authored 14 years ago


A new script, autotools/check-tar, is used to check the resulting
.tar.gz file for unwanted contents like wrong file owners or
permissions.
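
A check of this kind could be sketched in Python with the standard tarfile module. This is not autotools/check-tar itself: the two policies (owner must be uid/gid 0, nothing world-writable) and all function names are assumptions made for the example.

```python
import io
import tarfile


def check_tar(fileobj):
    """Return a list of complaints about members of a tar archive."""
    problems = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            # Assumed policy: everything must be owned by uid/gid 0.
            if member.uid != 0 or member.gid != 0:
                problems.append("%s: owner %d:%d"
                                % (member.name, member.uid, member.gid))
            # Assumed policy: nothing may be world-writable.
            if member.mode & 0o002:
                problems.append("%s: world-writable" % member.name)
    return problems


def build_sample_archive():
    """Build a small in-memory .tar.gz with one clean and one bad member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        good = tarfile.TarInfo("pkg/README")
        good.mode = 0o644
        tar.addfile(good)
        bad = tarfile.TarInfo("pkg/scratch")
        bad.uid = 1000
        bad.mode = 0o666
        tar.addfile(bad)
    buf.seek(0)
    return buf
```

Running `check_tar(build_sample_archive())` reports two problems for `pkg/scratch` and none for `pkg/README`.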

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

e627fe09

Nov 08, 2010

Update ganeti-os-interface documentation · f1a791b6

Apollon Oikonomopoulos authored 14 years ago


man/ganeti-os-interface.sgml lacked complete information for the NIC-related
environment variables. Added a reference to NIC_%N_LINK and NIC_%N_MODE and
clarified the reference to NIC_%N_BRIDGE.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

f1a791b6

Nov 04, 2010

Makefile: Check for empty files and dirs on distcheck · bf0b21da

Michael Hanselmann authored 14 years ago


Including empty files can cause unnecessary warnings for packagers.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

bf0b21da

Revert commit , work around Automake bug · 2750b299

Michael Hanselmann authored 14 years ago

After commit e7e23e73 the build would fail in distcheck on systems with
Automake 1.10. An investigation identified Automake bug #533[1] as the
cause. Applying the changes in Automake commit 3a12ed5e[2] to the
generated Makefile.in file made distcheck work again.

The underlying problem is that in our case both doc/html and
doc/html/.dir were included in the distributed files. When distcheck
copied the former from the source to the staging directory, it was
marked as read-only (distcheck makes the whole source read-only). It
then tried to copy doc/html/.dir from the build directory, which failed.
Automake 1.11 and newer avoid this problem by adjusting the permissions.

Since depending on Automake 1.11 or above is not an option at this time,
a work-around was found by not using a “.dir” file in doc/html, but
using “index.html” as a flag for creating the directory.

[1] http://sourceware.org/cgi-bin/gnatsweb.pl?cmd=view&database=automake&pr=533
[2] http://git.savannah.gnu.org/gitweb/?p=automake.git;a=commit;h=3a12ed5e97dc193a38dd14e031658cbd329b50ca



Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

2750b299

Nov 03, 2010

Fix disk checks in “gnt-cluster verify” · c6a9dffa

Michael Hanselmann authored 14 years ago


Tests have shown that the changes in commit b8d26c6e don't work as
wanted. If any disk wasn't found on the node, all disks located on the
same node would show as faulty. The cause was incorrect exception
handling on the node.

This patch changes the RPC call to return a per-disk success/error
status, avoiding the problem.
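
The per-disk result shape can be sketched like this. It is an illustration under assumptions, not the RPC code: `collect_disk_status` and the `find_disk` callback are hypothetical names, and the `(success, payload)` tuple is one possible encoding of a per-disk status.

```python
def collect_disk_status(disks, find_disk):
    """Return one (success, payload) tuple per disk.

    An error while looking up one disk is recorded for that disk only,
    instead of failing the whole call for every disk on the node.
    """
    results = []
    for disk in disks:
        try:
            results.append((True, find_disk(disk)))
        except Exception as err:
            results.append((False, str(err)))
    return results
```

With this shape a missing disk marks only its own entry as failed, and the other disks on the same node keep their real status.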

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>

c6a9dffa

QA: Run “gnt-cluster verify” while DRBD instance exists · 7b4eed05

Michael Hanselmann authored 14 years ago


This tests some parts of the disk information collection.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>

7b4eed05