Commits · 99cafe0f05b9e19e004a93bd3fe5ac271c1fa37b · itminedu / snf-ganeti

Nov 21, 2011

build-rpc: Fail if call is defined more than once · 99cafe0f

Michael Hanselmann authored 13 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

99cafe0f


In the last merge I erroneously discarded the changes introduced by
commit 2a6de57a "Check the results of master IP RPCs". This commit
reintroduces them.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

edea391e

Fix QA breakage caused by merge · f73e5568

Michael Hanselmann authored 13 years ago


Patch tested and confirmed to work by Andrea Spadaccini
<spadaccio@google.com>.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

f73e5568

masterd: Initialize job queue only after RPC client · cb4d3314

Michael Hanselmann authored 13 years ago


Otherwise jobs started after an unclean master shutdown will fail as
they depend on the RPC client.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

cb4d3314

masterd: Shutdown only once running jobs have been processed · 5483fd73

Michael Hanselmann authored 13 years ago


Until now, if masterd received a fatal signal, it would start shutting
down immediately. In the meantime it would hang while jobs were still
being processed, and clients could no longer connect to retrieve a
job's status.

With this patch, masterd checks whether any job is running before
shutting down. If there is, it checks again every five seconds. Once all
jobs are finished, it waits another five seconds to give clients a chance
to retrieve the jobs' status. After that, masterd shuts down in a clean
fashion.

If a second signal is received the old behaviour is preserved.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

5483fd73
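
The shutdown sequence described in 5483fd73 can be sketched roughly as follows. This is a simplified, blocking illustration and not the masterd code itself: `wait_for_jobs`, `has_running_jobs` and the injectable `sleep` parameter are hypothetical names, and only the two five-second intervals come from the commit message.

```python
import time


def wait_for_jobs(has_running_jobs, poll_interval=5.0, grace_period=5.0,
                  sleep=time.sleep):
    """Wait until no job is running, then give clients a grace period.

    has_running_jobs is a hypothetical callable returning True while any
    job is still being processed. Returns how often running jobs were found.
    """
    polls = 0
    while has_running_jobs():
        polls += 1
        # Jobs are still running: check again after the poll interval.
        sleep(poll_interval)
    # All jobs finished: leave time for clients to fetch the final status.
    sleep(grace_period)
    return polls
```

Passing `sleep` as a parameter is only a convenience of this sketch so that the timing can be checked without actually waiting.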

daemon: Support clean daemon shutdown · 2d6b5414

Michael Hanselmann authored 13 years ago


Instead of aborting the main loop as soon as a fatal signal (SIGTERM or
SIGINT) is received, additional logic allows waiting for tasks to finish
while I/O is still being processed.

If no callback function is provided the old behaviour--shutting down
on the first signal--is preserved.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

2d6b5414
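
A minimal sketch of the decision logic described in 2d6b5414, assuming a hypothetical callback that reports whether pending work is done. The class and method names are illustrative and are not taken from the daemon module; the signal counter mirrors the idea of counting signals rather than storing a boolean.

```python
class MainloopSketch(object):
    """Tracks fatal signals and decides when the main loop may stop."""

    def __init__(self, shutdown_ready_fn=None):
        # Hypothetical callback: returns True once pending tasks are done.
        self._shutdown_ready_fn = shutdown_ready_fn
        self._signals = 0

    def on_fatal_signal(self):
        """Record one fatal signal (SIGTERM or SIGINT)."""
        self._signals += 1

    def should_stop(self):
        """Return True if the main loop should be left now."""
        if self._signals == 0:
            return False
        if self._shutdown_ready_fn is None or self._signals > 1:
            # Without a callback, or on a repeated signal, fall back to
            # the old behaviour and stop right away.
            return True
        # First signal with a callback: keep processing I/O until ready.
        return bool(self._shutdown_ready_fn())
```

With no callback, a single signal ends the loop; with a callback that still reports work in progress, the loop keeps running until the callback returns True or a second signal arrives.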

daemon: Allow custom maximum timeout for scheduler · f5acf5d9

Michael Hanselmann authored 13 years ago


This is needed in case the scheduler user (daemon.Mainloop in this case)
has other timeouts at the same time. Needed for clean master shutdown.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

f5acf5d9

jqueue: Add code to prepare for queue shutdown · 6d5ea385

Michael Hanselmann authored 13 years ago


Doing so will prevent job submissions (similar to a drained queue),
but won't affect currently running jobs. No further jobs will be
executed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

6d5ea385
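
The behaviour described in 6d5ea385 might look roughly like this. The sketch is an assumption-laden toy, not the jqueue code: `JobQueueSketch` and its methods are hypothetical names, and the exception type is chosen only for illustration.

```python
class JobQueueSketch(object):
    """Toy queue that can be prepared for shutdown."""

    def __init__(self):
        self._accepting = True
        self.pending = []
        self.running = []

    def prepare_shutdown(self):
        """Stop accepting and starting jobs; running jobs are untouched."""
        self._accepting = False

    def submit(self, job):
        """Queue a job, refusing it once shutdown is being prepared."""
        if not self._accepting:
            raise RuntimeError("job queue is shutting down")
        self.pending.append(job)

    def start_next(self):
        """Move one pending job to running, or return None."""
        if not self._accepting or not self.pending:
            return None
        job = self.pending.pop(0)
        self.running.append(job)
        return job
```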

workerpool: Export function to check for running tasks · ef52306a

Michael Hanselmann authored 13 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ef52306a

daemon: Use counter instead of boolean for mainloop abortion · e0545ee9

Michael Hanselmann authored 13 years ago


Also log a message when a fatal signal was received and use dict.items.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

e0545ee9

Nov 18, 2011

htools: adjust imports for newer compilers · 7345b69b

Iustin Pop authored 13 years ago


While testing with ghc 7.2, I saw that some imports we are using are
very old (from ghc 6.8 time), even though current libraries are using
different names.

We fix this and bump minimum documented version to ghc 6.12, as I
don't have 6.10 to test anymore (possibly still works with that
version, but better safe - both Ubuntu Lucid and Debian Squeeze ship
with 6.12 nowadays).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

7345b69b

Merge branch 'devel-2.5' · 0e82dcf9

Andrea Spadaccini authored 13 years ago


* devel-2.5: (24 commits)
  LUInstanceCreate: Release unused node locks
  htools: rework message display construction
  hbal: handle empty node groups
  Document OpNodeMigrate's result for RAPI
  Ensure unused ports return to the free port pool
  Re-wrap a paragraph to eliminate a sphinx warning
  Fix newer pylint's E0611 error in compat.py
  Fail if node/group evacuation can't evacuate instances
  Update init script description
  LUInstanceRename: Compare name with name
  LUClusterRepairDiskSizes: Acquire instance locks in exclusive mode
  Update synopsis for “gnt-cluster repair-disk-sizes”
  Move hooks PATH environment variable to constants
  Check the results of master IP RPCs
  Add documentation for the master IP hooks
  Add master IP turnup and turndown hooks
  Add RunLocalHooks decorator
  Generalize HooksMaster
  Update NEWS for 2.5.0~rc4
  Bump version to 2.5.0~rc4
  ...

Conflicts:
	NEWS
	doc/hooks.rst
	lib/backend.py
	lib/cmdlib.py
	lib/constants.py

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

0e82dcf9

Merge branch 'stable-2.5' into devel-2.5 · 0c1441eb

Michael Hanselmann authored 13 years ago


* stable-2.5:
  htools: rework message display construction
  hbal: handle empty node groups
  Document OpNodeMigrate's result for RAPI
  Fail if node/group evacuation can't evacuate instances
  LUInstanceRename: Compare name with name
  LUClusterRepairDiskSizes: Acquire instance locks in exclusive mode
  Update NEWS for 2.5.0~rc4
  Bump version to 2.5.0~rc4
  jqueue: Allow zero jobs to be submitted at once
  hail: don't select the primary as new secondary
  hail: add an extra safety check in relocate
  Bump version to 2.5.0~rc3

Conflicts:
	configure.ac: Trivial

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

0c1441eb

Merge branch 'devel-2.4' into devel-2.5 · 05c2e624

Michael Hanselmann authored 13 years ago


* devel-2.4:
  Ensure unused ports return to the free port pool
  Re-wrap a paragraph to eliminate a sphinx warning

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

05c2e624

Nov 17, 2011

admin.rst update regarding offline state of the instance · edc282ad

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

edc282ad

NEWS update - offline instance state · 555d5304

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

555d5304

Backwards compatibility - added admin_up to query · 754cc530

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

754cc530

Man page update: online/offline state of instance · bafb5067

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

bafb5067

Add small note in admin.rst about confd disabling · 10d3f678

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

10d3f678

Warn if we enable maintain-node-health without confd · d29036c1

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

d29036c1

Adapt daemon-util to ENABLE_CONFD · c4e5d11e

Iustin Pop authored 13 years ago


We still allow explicit shutdown of confd, but we prevent manual
or automatic start-up.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

c4e5d11e

Adapt watcher for ENABLE_CONFD · aa224134

Iustin Pop authored 13 years ago


If confd is disabled, do not automatically restart it. Furthermore, we
can't run maintenance actions if it is disabled so log a warning.

Note that I haven't completely disabled the NodeMaintenance class with
ENABLE_CONFD = False because I think they are at two different levels
(e.g. we might have other maintenance actions done even with confd
disabled).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

aa224134

Prevent running of confd tests in burnin · db3780f9

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

db3780f9

Add toggle for enabling/disabling confd · cd8b0072

Iustin Pop authored 13 years ago


Doesn't do anything yet.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

cd8b0072

Fix unittest bug related to offline instances · b99d1638

Iustin Pop authored 13 years ago


Currently, the code in Node.hs is overly strict: once a node's free
memory reaches 0, it will refuse to add any instances (offline or
not). I think this is a reasonable safeguard (I don't expect nodes to run
without at least 1MB of free memory), so rather than change this
behaviour we need to restrict the Node generation in the unittest to
skip such nodes.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

b99d1638

htools: reindent the rest of the files · ebf38064

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

ebf38064

htools: re-indent IAlloc.hs · 00dd69a2

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

00dd69a2

htools: reindent hspace · 3c3690aa

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

3c3690aa

htools: reindent hbal · 2ba17362

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

2ba17362

htools: reindent CLI.hs · cd08cfa4

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

cd08cfa4

htools: re-indent QC.hs · d5dfae0a

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

d5dfae0a

htools: re-indent Node.hs · fd7a7c73

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

fd7a7c73

htools: finish re-indenting Cluster.hs · 9fc18384

Iustin Pop authored 13 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

9fc18384

masterd: Don't pass mainloop to server class · e8a701f6

Michael Hanselmann authored 13 years ago


It is not used.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

e8a701f6

workerpool: Allow processing of new tasks to be stopped · 27caa993

Michael Hanselmann authored 13 years ago

This is different from “Quiesce” in the sense that this function just
changes an internal flag and doesn't wait for the queue to be empty.
Tasks already being processed continue normally, but no new tasks will
be started. New tasks can still be added, but won't be processed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

27caa993
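
The difference from "Quiesce" described in 27caa993 can be illustrated with a small, single-threaded sketch. `PoolSketch`, `set_active`, `add_task` and `next_task` are hypothetical names used only here; the real worker pool is threaded and is not reproduced.

```python
from collections import deque


class PoolSketch(object):
    """Toy task pool whose processing of new tasks can be stopped."""

    def __init__(self):
        self._tasks = deque()
        self._active = True

    def set_active(self, active):
        """Only flips an internal flag; does not wait for the queue to empty."""
        self._active = active

    def add_task(self, task):
        """Adding tasks stays possible even while processing is stopped."""
        self._tasks.append(task)

    def next_task(self):
        """Return the next task for a worker, or None if none may start."""
        if self._active and self._tasks:
            return self._tasks.popleft()
        return None
```

A task that a worker has already taken is outside the queue and therefore unaffected by the flag, which matches the statement that tasks already being processed continue normally.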

workerpool: Use loop to ignore spurious notifications · 2db05c94

Michael Hanselmann authored 13 years ago


This saves us from returning to the worker code when there is no
task to be processed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

2db05c94
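
The general technique named in 2db05c94, waiting on a condition inside a loop so that a wake-up without work simply goes back to waiting, can be sketched as follows. `WaitingQueue` is a hypothetical class built on the standard `threading.Condition`; it is not the workerpool implementation.

```python
import threading
from collections import deque


class WaitingQueue(object):
    """Queue whose get() ignores notifications that carry no task."""

    def __init__(self):
        self._cond = threading.Condition()
        self._tasks = deque()

    def put(self, task):
        """Add a task and wake up one waiting consumer."""
        with self._cond:
            self._tasks.append(task)
            self._cond.notify()

    def get(self):
        """Block until a task is available and return it."""
        with self._cond:
            # Re-check the predicate after every wake-up: a notification
            # without a task just leads back into wait().
            while not self._tasks:
                self._cond.wait()
            return self._tasks.popleft()
```

Because the check lives inside `get()`, the caller never sees a wake-up that has nothing to process.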

jqueue: Factorize code checking for drained queue · c8d0be94

Michael Hanselmann authored 13 years ago


This is in preparation for a clean(er) shutdown of masterd.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

c8d0be94

LUInstanceCreate: Release unused node locks · ac2c8bc0

Michael Hanselmann authored 13 years ago


After the iallocator has run, we can release any unused node locks.
Since they must be held in exclusive mode, this should improve
parallelization during instance creation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ac2c8bc0

cmdlib.TLReplaceDisks: Use itertools.count · 69f0340a

Michael Hanselmann authored 13 years ago


… instead of a variable which needs to be incremented for every step.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

69f0340a
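
As a small illustration of the idea in 69f0340a, `itertools.count` can hand out step numbers on demand, which avoids manual increments when the order of steps depends on a condition. The function name, the `early_release` flag and the step descriptions below are made up for this sketch and are not taken from cmdlib.

```python
import itertools


def plan_steps(early_release):
    """Return numbered step descriptions for a hypothetical operation."""
    cstep = itertools.count(1)
    log = []

    def step(description):
        # next() yields 1, 2, 3, ... with no counter variable to maintain.
        log.append("STEP %d: %s" % (next(cstep), description))

    step("check devices")
    step("create new storage")
    if early_release:
        step("remove old storage")
        step("sync devices")
    else:
        step("sync devices")
        step("remove old storage")
    return log
```

The numbering stays consecutive in both branches without either branch having to know how many steps came before it.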

Nov 16, 2011

htools: rework message display construction · bdd8c739

Iustin Pop authored 13 years ago


While diagnosing some (unrelated) memory usage in htools, I
stumbled upon some very bad behaviour in checkData: mapAccum is
non-strict, and so is the tuple we use, which makes the list of
lists of messages very expensive space-wise (hundreds of MB of memory
for a simulated cluster with thousands of nodes, all with errors).

The new, explicit reuse of the old message list has linear memory
behaviour. The only downside is that messages are listed in
reverse order (which I'll fix on master).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

bdd8c739