Commits · dca1764e76bff0bebd616d4f369cb20866d6c92c · itminedu / snf-ganeti

Jul 14, 2008

First version of user feedback fixes · f1048938

Iustin Pop authored 16 years ago

This patch contains a raw version for fixing feedback_fn.

The new mechanism works as follows:
  - instead of a per-Processor feedback_fn, there's one for each
    ExecOpCode, so that feedback for different opcodes goes via possibly
    different functions
  - each _QueuedOpCode gets a message buffer, a method for adding
    feedback and a method for retrieving (parts of) the feedback
  - the _QueuedJob object gets a new attribute that is equal to the
    index of the currently executing opcode
  - job queries get an extra parameter called 'ticker' that will return
    the latest message on the currently executing opcode
  - the cli.py job completion poll will show the new status if different
    from the old one

Of course, quick messages will be lost, as currently only the latest one
is available. Also, changes between opcodes are not represented at all.
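
The mechanism described in this message can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the class and method names (QueuedOpCode, QueuedJob, run_op_index, exec_op_code, run_job) are assumptions made for the example.

```python
class QueuedOpCode:
    """Hypothetical stand-in for _QueuedOpCode with a message buffer."""

    def __init__(self, name):
        self.name = name
        self._messages = []

    def feedback(self, msg):
        """Add one feedback message to this opcode's buffer."""
        self._messages.append(msg)

    def retrieve(self, start=0):
        """Return (part of) the feedback, from index `start` onwards."""
        return self._messages[start:]

    def latest(self):
        """Return the most recent message, or None if there is none."""
        return self._messages[-1] if self._messages else None


class QueuedJob:
    """Hypothetical stand-in for _QueuedJob."""

    def __init__(self, ops):
        self.ops = ops
        self.run_op_index = -1  # index of the currently executing opcode

    def ticker(self):
        """Latest message of the currently executing opcode, if any."""
        if 0 <= self.run_op_index < len(self.ops):
            return self.ops[self.run_op_index].latest()
        return None


def exec_op_code(op, feedback_fn):
    """Pretend to execute an opcode, reporting progress via feedback_fn."""
    feedback_fn("starting %s" % op.name)
    feedback_fn("finished %s" % op.name)


def run_job(job):
    """Run each opcode with its own feedback function."""
    for idx, op in enumerate(job.ops):
        job.run_op_index = idx
        exec_op_code(op, op.feedback)
```

A poller that only reads `ticker()` sees just the last message, which is the limitation the message itself points out.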

Reviewed-by: imsnah

Jul 08, 2008

Processor: Acquire locks before executing an LU · 68adfdb2

Guido Trotter authored 16 years ago

If we're running in a "new style" LU we may need some locks, as required
by the ExpandNames function, to be able to run. We'll walk up the lock
levels present in the needed_locks dictionary and acquire them, then run
the actual LU. LUs can release some or all of the acquired locks, if they
want, before terminating, provided they update their needed_locks
dictionary appropriately, so that we know not to release a level if they
have already done so.
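
As a rough sketch of the acquire/run/release flow described above (not the actual Processor code): the FakeLockManager class, the `levels` ordering, and the method names expand_names and run are assumptions made for illustration only.

```python
class FakeLockManager:
    """Hypothetical lock manager that just records what happens."""

    def __init__(self):
        self.held = {}
        self.log = []

    def acquire(self, level, names):
        self.held[level] = set(names)
        self.log.append(("acquire", level))

    def release(self, level):
        self.held.pop(level, None)
        self.log.append(("release", level))


class ExampleLU:
    """Hypothetical "new style" LU that releases one level early."""

    def __init__(self):
        self.needed_locks = {}

    def expand_names(self):
        # Declare which locks are needed at which level.
        self.needed_locks = {"node": ["node1"], "instance": ["inst1"]}

    def run(self, glm):
        # Release the node level early and update needed_locks so the
        # caller knows not to release that level again.
        glm.release("node")
        del self.needed_locks["node"]
        return "done"


def run_with_locks(lu, glm, levels):
    """Acquire the LU's locks level by level, run it, then clean up."""
    lu.expand_names()
    acquired = []
    try:
        for level in levels:
            if level in lu.needed_locks:
                glm.acquire(level, lu.needed_locks[level])
                acquired.append(level)
        return lu.run(glm)
    finally:
        for level in reversed(acquired):
            # Skip levels the LU has already released itself.
            if level in lu.needed_locks:
                glm.release(level)
```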

Reviewed-by: iustinp

LogicalUnit: add ExpandNames function · d465bdc8

Guido Trotter authored 16 years ago

New concurrent LUs will need to call ExpandNames so that any names
passed in by the user are canonicalized, and can be used by hooks,
locking and other parts of the code. This was done in CheckPrereq
before, but it's now split out, as it's needed for locking, which in
turn CheckPrereq needs. Old LUs can be converted gradually.

Reviewed-by: iustinp

Processor: Move LU execution to its own method · 36c381d7

Guido Trotter authored 16 years ago

This makes the try...finally code simpler, and helps with adding a more
complex locking structure before the actual execution. It also fixes a
concurrency bug caused by the fact that write_count was read before
acquiring the BGL, and thus spurious config update hook runs could have
been triggered. This doesn't solve the issue of running config update
hooks for concurrent LUs.

Reviewed-by: iustinp

Pass context to LUs · 77b657a3

Guido Trotter authored 16 years ago

Rather than passing a ConfigWriter to the LUs we'll pass the whole
context, from which a ConfigWriter can be extracted, but we can also
access the GanetiLockManager. This also fixes the places where a FakeLU
is created.

Reviewed-by: iustinp

Jul 01, 2008

Context: s/GLM/glm/ · 984f7c32

Guido Trotter authored 16 years ago

Make the GanetiLockManager instance of GanetiContext lowercase

Reviewed-by: imsnah

Processor: acquire the BGL for LUs requiring it · 04864530

Guido Trotter authored 16 years ago

If a LU requires the BGL (all LUs do, right now, by default) we'll
acquire it in the Processor before starting them. For LUs that don't
we'll still acquire it, but in a shared fashion, so that they cannot run
together with LUs that do.

We'll also note down whether we own the BGL exclusively, and if we don't
and we try to chain a LU that does, we'll fail.
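
The exclusive/shared rule above might look something like this in outline. It is only a sketch under stated assumptions: the REQ_BGL attribute name, the FakeBGL class, and the exec_lu/chain_lu method names are invented for the example and are not taken from the patch.

```python
class FakeBGL:
    """Hypothetical Big Ganeti Lock that only remembers its mode."""

    def __init__(self):
        self.mode = None

    def acquire(self, shared):
        self.mode = "shared" if shared else "exclusive"

    def release(self):
        self.mode = None


class Processor:
    def __init__(self, bgl):
        self.bgl = bgl
        self.exclusive_bgl = False  # do we own the BGL exclusively?

    def exec_lu(self, lu):
        """Top-level execution: always take the BGL, shared if possible."""
        self.bgl.acquire(shared=not lu.REQ_BGL)
        self.exclusive_bgl = lu.REQ_BGL
        try:
            return lu.run(self)
        finally:
            self.exclusive_bgl = False
            self.bgl.release()

    def chain_lu(self, lu):
        """Run another LU from within a running one."""
        if lu.REQ_BGL and not self.exclusive_bgl:
            raise RuntimeError("cannot chain a BGL-requiring LU "
                               "without owning the BGL exclusively")
        return lu.run(self)


class OldLU:
    REQ_BGL = True  # assumed default for unconverted LUs

    def run(self, proc):
        return "old"


class NewLU:
    REQ_BGL = False

    def run(self, proc):
        # Chaining an old-style LU from here is expected to fail.
        return proc.chain_lu(OldLU())
```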

More work will need to be done, of course, to convert LUs not to require
the BGL, but this basic infrastructure should guarantee the coexistence
of the old and new world for the time being.

Reviewed-by: iustinp

Processor: pass context in and use it. · 1c901d13

Guido Trotter authored 16 years ago

The processor used to create a new ConfigWriter when it was initialized.
We now have one in the context, so we'll just recycle it. First of all
we'll pass the context in when creating a new Processor object, then
we'll just use context.cfg, which is guaranteed to be initialized, wherever
we used self.cfg, and stop checking whether the config is already
initialized or not.

In the future the Processor will be able to use the context also to
acquire the BGL for LUs that require it, and to push the context down to
LUs that don't in order for them to manage their own locking.

Reviewed-by: iustinp

Jun 30, 2008

Fix sstore handling in Processor · c6868e1d

Guido Trotter authored 16 years ago

- no need to keep the sstore as an object member, remove it
- reinitialize sstore unconditionally, not only if self.cfg is None
    The old behaviour is not an issue, as the Processor is recycled for
    every opcode, but in general we know that (a) we might need a different
    type of sstore for different opcodes and (b) initializing them is cheap
- recreate sstore when chaining opcodes
    Without this fix chaining an opcode which requires a writable sstore
    to one which doesn't would fail. This doesn't happen today, but it's
    better to fix it anyway

These changes are possible because nowadays all opcodes already require
a working cluster/configuration.

Reviewed-by: iustinp

Jun 23, 2008

Fix gnt-cluster “command” and “copyfile” · b3989551

Iustin Pop authored 16 years ago

Since the disabling of forking in the master daemon, the two ssh-based
subcommands have not been working. However, there is no need at all
for the commands to be run from the master daemon (permissions to read
the cluster private ssh key notwithstanding); they can be run directly
from the command line utilities.

The patch removes the two opcodes OpRunClusterCommand and
OpClusterCopyFile (and their associated LUs) and changes the code in
‘gnt-cluster’ to query the list of nodes and run the SshRunner directly
over the list. As such, all forking is done from the gnt-cluster script,
and the commands are working again.

Reviewed-by: imsnah

Jun 17, 2008

Implement disk grow at LU level · 8729e0d7

Iustin Pop authored 16 years ago

This patch adds a new opcode and LU for growing an instance's disk.

The opcode allows growing only one disk at a time, and will throw an error
if the operation fails midway (e.g. on the primary node after it has
been increased on the secondary node). As such, it might actually leave
different sized LVs on different nodes, but this will not create
problems.

Reviewed-by: imsnah

Jun 16, 2008

Move SetKey to WritableSimpleStore and use it · 05f86716

Guido Trotter authored 16 years ago

Previously we were able to update SimpleStore by just calling SetKey; this
feature is now moved to a separate class, which inherits from it. In this
patch the new WritableSimpleStore class is also put to use in the LUs that
need it. Rather than making each LU instantiate it, we have a new LogicalUnit
flag REQ_WSSTORE which defaults to False, but when declared to be True asks the
LogicalUnit to be initialized with a writable version of the SimpleStore.
LUMasterFailover and LURenameCluster are then changed to use it.

InitCluster is also changed to instantiate a WritableSimpleStore, rather
than a normal one.
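
The REQ_WSSTORE idea can be illustrated with a small sketch. The in-memory stores, the snake_case method names, the make_lu helper, and the RenameClusterLU class are assumptions for the example; only the flag name and the two store class names come from the message above.

```python
class SimpleStore:
    """Read-only key/value store (in-memory stand-in)."""

    def __init__(self, data):
        self._data = data

    def get_key(self, key):
        return self._data[key]


class WritableSimpleStore(SimpleStore):
    """Adds the ability to change keys."""

    def set_key(self, key, value):
        self._data[key] = value


class LogicalUnit:
    REQ_WSSTORE = False  # read-only store unless a subclass opts in

    def __init__(self, sstore):
        self.sstore = sstore


class RenameClusterLU(LogicalUnit):
    REQ_WSSTORE = True

    def run(self, new_name):
        self.sstore.set_key("cluster_name", new_name)


def make_lu(lu_class, data):
    """Hypothetical helper: pick the store type based on the LU's flag."""
    if lu_class.REQ_WSSTORE:
        store = WritableSimpleStore(data)
    else:
        store = SimpleStore(data)
    return lu_class(store)
```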

Reviewed-by: imsnah

Jun 12, 2008

Move InitCluster opcode into a single function · a0c9f010

Michael Hanselmann authored 16 years ago

This allows us to initialize a new cluster. The code certainly contains
bugs and hooks aren't implemented yet.

Reviewed-by: iustinp

Remove REQ_CLUSTER from opcode handling code · c6d58a2b

Michael Hanselmann authored 16 years ago

It's not needed anymore now that all opcodes require a cluster. Cluster
initialization was the only exception.

Reviewed-by: iustinp

Apr 30, 2008

Add a LU Hooks notification function · 1fce5219

Guido Trotter authored 17 years ago

Previously LUs could be failed by pre-hooks, and post-hooks just had effects by
themselves. This patch allows a LU to define the HooksCallBack function if it
wants to know about its hooks' results and alter its own results in response.

The ChainOpCode execution path contains some commented-out hooks code, which
this patch modifies to run the HooksCallBack function, so that this is not
forgotten if it ever gets uncommented.

Reviewed-by: iustinp

HooksMaster: Make RunPhase return the rpc output · b07a6922

Guido Trotter authored 17 years ago

Right now the hooks output is propagated from the nodes all the way up to
HooksMaster.RunPhase, which uses it for debugging PRE hooks, but then silently
discards it. We'll now propagate it up to the Processor.ExecOpCode function,
where it can be handled for other purposes (or discarded again, of course).
This patch also slightly improves the HooksMaster.RunPhase docstring.

Reviewed-by: iustinp

Apr 23, 2008

Add gnt-backup remove functionality · 9ac99fda

Guido Trotter authored 17 years ago

This patch also fixes the LUExportInstance Prereq docstring.

Reviewed-by: iustinp

Apr 16, 2008

Allocator framework, 1st part: allocator input generation · d61df03e

Iustin Pop authored 17 years ago

In preparation for the introduction of an automatic instance allocator,
this patch adds an allocator simulation opcode that, based on the input
parameters, will return either the input message to the allocator
(implemented) or the result of the allocator run (not yet implemented).

This allows algorithm tests against simulated allocations and the
current cluster state.

The patch adds the following:
  - a function that generates the generic cluster information for the
    allocator
  - a function that generates the 'new instance' information
  - a function that generates the 'replace_secondary' information

These three functions will be used by the allocator framework later to
generate the actual information for the external algorithms. Currently
we just return the json-serialized text.

Reviewed-by: imsnah

Mar 31, 2008

parms->params Refactoring · 7767bbf5

Manuel Franceschini authored 17 years ago

- Substitute all occurrences of name 'parms' with 'params'
- Small codestyle fix

Reviewed-by: ultrotter

Map OpSetClusterParams to corresponding LU · 0cc05d44

Manuel Franceschini authored 17 years ago

Reviewed-by: iustinp

Mar 30, 2008

Change the order of config updates in some LUs · fe482621

Iustin Pop authored 17 years ago

In the start and stop instance LUs, the configuration update is done
right at the end. This means that if, for example, the instance shutdown
succeeds, but the drive deactivation fails, the next run of the watcher
will start the instance again, as it's still marked in running mode.

This patch changes these two LUs so that they first update the
configuration to the desired state, and only then proceed to perform the
actual operation. This ensures that the state saved is the desired state.

Because the config might be updated even though the LU failed, this
patch also modifies the mcpu.Processor.ExecOpCode method to run the
RunConfigUpdate hook in a finally: phase while the lu.Exec is done in
its try phase. This ensures that the config update hooks (try to) run at
all times when the config is updated.
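
A minimal sketch of the try/finally arrangement described above, assuming a config object with a write_count counter (the name appears in a later commit message in this log). FakeConfig, shutdown_instance, and this exec_op_code signature are hypothetical and not the real mcpu code.

```python
class FakeConfig:
    """Hypothetical stand-in for the cluster configuration."""

    def __init__(self):
        self.write_count = 0
        self.instance_state = "up"

    def update(self, state):
        self.instance_state = state
        self.write_count += 1


def shutdown_instance(cfg):
    """Record the desired state first, then attempt the real work."""
    cfg.update("down")
    # Simulate the failure case mentioned in the commit message.
    raise RuntimeError("drive deactivation failed")


def exec_op_code(cfg, lu_exec, run_config_update_hook):
    """Run the LU; fire the config update hook even if the LU fails."""
    write_count = cfg.write_count
    try:
        return lu_exec(cfg)
    finally:
        # This runs whether lu_exec returned normally or raised.
        if cfg.write_count != write_count:
            run_config_update_hook()
```

With this shape, a failed shutdown still leaves the instance marked as down in the config, and the hook still fires for that config write.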

Reviewed-by: schreiberal

Mar 25, 2008

Remove the add/remove mirror operations · 249069a1

Iustin Pop authored 17 years ago

These two operations are related to md/drbd7 code (remote_raid1). Remove
them as part of the md/drbd7 removal.

Reviewed-by: imsnah

Mar 05, 2008

Codestyle fixes: adding a few empty lines · 7c0d6283

Michael Hanselmann authored 17 years ago

Reviewed-by: ultrotter

Feb 22, 2008

Fixes small spelling mistakes and comments · c99a3cc0

Manuel Franceschini authored 17 years ago

Feb 05, 2008

Add a test opcode that sleeps for a given duration · 06009e27

Iustin Pop authored 17 years ago

This can be used for testing purposes.

Reviewed-by: ultrotter,imsnah

Dec 12, 2007

Add the ‘gnt-cluster verify-disks’ command · f4d4e184

Iustin Pop authored 17 years ago

This patch adds the OpVerifyDisks handling in mcpu.py and the
verify-disks command in the gnt-cluster script, which for every instance
computed by LUVerifyDisks submits a new OpActivateInstanceDisks request.

Reviewed-by: imsnah

Nov 09, 2007

Soften the requirements for hooks execution · 2395c322

Iustin Pop authored 17 years ago

Currently, an unreachable node (or one that returns an undetermined failure)
in the hooks pre-phase will abort the current operation. This is not
good, as a down node could prevent many operations on the cluster.

This patch changes an RPC-level failure (and not a hook execution
failure) into a warning. It also modifies the related test cases.
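
The distinction drawn above, between a node that could not be reached and a hook that actually failed, can be sketched as below. The result layout (None for an RPC-level failure, a list of (script, ok) pairs otherwise), the function name, and the exception class defined here are assumptions for illustration, not the real HooksMaster interface.

```python
class HooksAbort(Exception):
    """Raised when a pre-hook script itself reports failure."""


def check_pre_hook_results(results, warn):
    """Inspect per-node pre-hook results.

    `results` maps a node name to None (RPC-level failure) or to a
    list of (script_name, succeeded) pairs.
    """
    for node, scripts in sorted(results.items()):
        if scripts is None:
            # Unreachable node: warn and carry on with the operation.
            warn("Communication failure to node %s" % node)
            continue
        failed = [name for name, ok in scripts if not ok]
        if failed:
            # A real hook failure still aborts the operation.
            raise HooksAbort("node %s: failed hooks: %s"
                             % (node, ", ".join(failed)))
```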

This fixes issue 11.

Reviewed-by: ultrotter

Nov 08, 2007

Changes related to logging · 5bfac263

Iustin Pop authored 17 years ago

This patch modifies:
  - mcpu.Processor.LogWarning to have its 'hint' parameter as optional
    and only log it if not None
  - cmdlib._WaitForSync to not log directly to stdout/stderr but via the
    proc.Log(Info|Warning) methods
  - the LU attribute 'processor' is renamed to 'proc' to shorten the
    name

Reviewed-by: imsnah

Nov 06, 2007

Add better error logging functions for LUs · 0fbbf897

Iustin Pop authored 17 years ago

Currently, some LUs use logger.Error, others just feedback_fn, etc. This
patch adds three functions to mcpu.Processor that can be used to log
messages to both the log and to the user.

These functions will be used to enhance the output of replace-disks for
drbd8 (at least).

Reviewed-by: imsnah

Nov 03, 2007

Implement tag searching · 73415719

Iustin Pop authored 17 years ago

This patch adds a search command for locating tags on all objects of the
cluster using a regex pattern.

Reviewed-by: aat

Oct 29, 2007

Change the signature of some methods of mcpu.Processor · 1a8c0ce1

Iustin Pop authored 17 years ago

This patch moves the passing of the feedback_fn argument from the
(Exec|Chain)OpCode to the initialization of the Processor instance.

Reviewed-by: imsnah

Oct 18, 2007

Patch series for reboot feature, part 2 · bf6929a2

Alexander Schreiber authored 17 years ago

This patch series implements the reboot command for gnt-instance. It
supports three types of reboot: soft (hypervisor reboot), hard (instance
config rebuild and reboot) and full (full instance shutdown and startup
again).

This patch contains the opcode and lu part.

Reviewed-by: iustinp

Oct 11, 2007

Implement post-configuration-update hook · 6a4aa7c1

Iustin Pop authored 17 years ago

This patch adds a special hook: the post-configuration update hook. This
hook has only a post phase that runs after a top-level LU that modified
the configuration.

Since the hook is a post-phase one, no error checking is done on the
results. The hook runs only on the master.

Reviewed-by: imsnah

Split the hooks env building in two parts · 4167825b

Iustin Pop authored 17 years ago

This patch moves some of the environment processing from _BuildEnv to a
new _RunWrapper command which does the stringification and adds the
sstore variables.

The reasoning is that the sstore can be fresher than before the
execution (e.g. in case of cluster init).

In order to support this, we also need to modify cmdlib.LUInitCluster:
  - memorize the sstore and cfgw newly created in the Exec function
  - no need to build the custom environment in the BuildHooks

Move hook execution decision to HooksMaster · 9a395a76

Iustin Pop authored 17 years ago

Currently, the HooksMaster creation and execution decision is in the
Processor class. This is not optimal, so we change to always create a
hooks master and instead make the decision inside that class, by
creating empty node lists for both pre and post if the lu doesn't
support hooks. This way, hooks decisions are moved to HooksMaster (where
they belong).

Reviewed-by: imsnah

Remove cfg and sstore parameters to HooksMaster · f97a6b10

Iustin Pop authored 17 years ago

The HooksMaster class doesn't use the cfg parameter, and it's better to
use it from the LU anyway (if needed). Let's remove it.

Also, the sstore of the LU can be fresher than the sstore we got at init
time, so use that instead and remove our own.

Reviewed-by: imsnah

Oct 10, 2007

Remove the shebang from modules · 2f31098c

Iustin Pop authored 17 years ago

Since modules are not directly executable, remove the shebang from
them. This helps with lintian warnings.

Also make the autogenerated _autoconf.py contain two comment lines at
the beginning, like the other modules.

Reviewed-by: ultrotter

Oct 08, 2007

Change tags add/remove to process multiple tags · f27302fa

Iustin Pop authored 17 years ago

This patch changes the tags opcodes to work with multiple tags at once
instead of only one. As such, the opcodes and some parameters are
renamed.

Reviewed-by: imsnah

Sep 18, 2007

Implement cluster rename operation · 07bd8a51

Iustin Pop authored 17 years ago

This patch adds a new OpCode (and corresponding LU) that implements the
cluster rename functionality.

This is done by shutting down the master role, making the needed sstore
modifications and distributing the changed files to all nodes, and then
re-enabling the master role.

The modification to the man page of gnt-cluster also moves the section
on gnt-cluster destroy in order to correct alphabetical ordering.

Reviewed-by: imsnah

Sep 17, 2007

Implement instance rename operation · decd5f45

Iustin Pop authored 17 years ago

This patch adds support for the instance rename operation at all remaining
layers: RPC, OpCode/LU and CLI.

Reviewed-by: imsnah
