Commits · 8c96d01fcdba4c757c3760daca58c67032057d20 · itminedu / snf-ganeti

Jul 24, 2009

Guido Trotter authored 15 years ago


Currently rapi is the only daemon which accepts a port option, rather
than querying its own port from services, and falling back to the
default if not found. Changing this to conform to what other daemons do.

Also update the ganeti-rapi(8) manpage

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

8c96d01f

Change GetNodeDaemonPort to GetDaemonPort in utils · cd50653c

Guido Trotter authored 15 years ago


GetNodeDaemonPort is used to look up the node daemon port in the services
file and, if it is not found, to return the default one. We make it a
generic function, which accepts the daemon name as input, so that it can
be used by confd as well, to look up its own UDP port.
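
A minimal sketch of this kind of lookup, assuming the standard library's socket.getservbyname is used to query the services database. The function name, the DEFAULT_PORTS table and the port numbers in it are illustrative assumptions, not code from this commit:

```python
import socket

# Hypothetical defaults for illustration; the real values would come
# from Ganeti's constants module.
DEFAULT_PORTS = {
    "ganeti-noded": ("tcp", 1811),
    "ganeti-confd": ("udp", 1814),
}


def get_daemon_port(daemon_name):
    """Return the port for daemon_name from services, or its default."""
    if daemon_name not in DEFAULT_PORTS:
        raise ValueError("unknown daemon: %s" % daemon_name)
    proto, default_port = DEFAULT_PORTS[daemon_name]
    try:
        return socket.getservbyname(daemon_name, proto)
    except OSError:
        # Not listed in the services file: fall back to the default.
        return default_port
```

Keeping the protocol next to the default port lets the same helper serve both TCP daemons and a UDP daemon such as confd.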

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

cd50653c

Merge branch 'next' into branch-2.1 · 775c6d3e

Guido Trotter authored 15 years ago

* next:
  lvmstrap: Change diskinfo to use GenerateTable
  Get rid of constants.RAPI_ENABLE
  Remove references to utils.debug
  ganeti-rapi, replace hardcoded exit value
  Add the bind-address option to ganeti-rapi
  noded: Abstract hard-coded sys.exit value
  Add an example "ethers" hook
  burnin: move batch init/commit into a decorator
  burnin: move instance alive checks to a decorator
  burnin: Implement retryable operations
  Ignore vim swap files
  burnin: fix removal errors hiding real errors

775c6d3e

lvmstrap: Change diskinfo to use GenerateTable · e194129a

Stephen Shirley authored 15 years ago


This way the produced table is formatted nicely.

Signed-off-by: Stephen Shirley <diamond@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e194129a

Get rid of constants.RAPI_ENABLE · e1876432

Guido Trotter authored 15 years ago


This constant is unused, except in qa. Removing it since it's always True.

This patch also removes the unused qa_rapi.PrintRemoteAPIWarning
function, and removes a comment about temporary constants "until we have
cluster parameters".

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e1876432

Jul 23, 2009

cmdlib: Add __init__ to Tasklet class · 464243a7

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

464243a7

Remove references to utils.debug · 68b1fcd5

Guido Trotter authored 15 years ago


Various modules set it to True when called in debugging mode, but the
utils module supports no such global.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

68b1fcd5

ganeti-rapi, replace hardcoded exit value · be73fc79

Guido Trotter authored 15 years ago


Substitute exit(1) with exit(constants.EXIT_FAILURE).
Also fix a wrongly indented line.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

be73fc79

Add the bind-address option to ganeti-rapi · 8790ac54

Guido Trotter authored 15 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

8790ac54

Jul 22, 2009

cmdlib: Move LUMigrateInstance functionality to tasklet · 3e06e001

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

3e06e001

gnt-node: Use new opcode to evacuate nodes · 80dd50bf

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

80dd50bf

Add new opcode to evacuate nodes · 7ffc5a86

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

7ffc5a86

cmdlib: Convert _DiskReplacer to tasklet · c68174b6

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

c68174b6

cmdlib: Function to get all secondary instances on a certain node · 692738fc

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

692738fc

noded: Abstract hard-coded sys.exit value · 46479775

Guido Trotter authored 15 years ago


On machines without the ssl file noded exits with '5'.
Changing this to constants.EXIT_NOTCLUSTER.

Also utils.GetNodeDaemonPort hasn't raised errors.ConfigurationError for
a while, so removing that try/except block.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

46479775

cmdlib: Add tasklet support to logical unit base class · 6fd35c4d

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

6fd35c4d

cmdlib: Add tasklet base class · 9a6800e1

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

9a6800e1

Add an example "ethers" hook · 93db3d8f

Guido Trotter authored 15 years ago


This hook can be used to update /etc/ethers with the instances' MAC
addresses. A DHCP server on the nodes can then serve the instances
their correct addresses. (This has been tested with dnsmasq's DHCP
implementation.)

Signed-off-by: Guido Trotter <ultrotter@google.com>

93db3d8f

Jul 21, 2009

ganeti-confd design doc · c0446a46

Guido Trotter authored 15 years ago


Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

c0446a46

burnin: move batch init/commit into a decorator · c70481ab

Iustin Pop authored 15 years ago


Many burnin steps initialize the batch queue at the beginning and commit
it at the end of their operation. This patch moves this code to a
decorator, in order to reduce redundant code.
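
The pattern can be sketched as follows. This is an illustration only: the decorator name, the Burner class and its queue methods are hypothetical stand-ins, not the names used in burnin.

```python
import functools


def batched(fn):
    """Reset the batch queue before the step and commit it afterwards."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        self.clear_queue()
        result = fn(self, *args, **kwargs)
        self.commit_queue()
        return result
    return wrapper


class Burner:
    """Toy stand-in for the burnin driver (hypothetical names)."""

    def __init__(self):
        self.queued_ops = []
        self.committed = []

    def clear_queue(self):
        self.queued_ops = []

    def commit_queue(self):
        # A real implementation would submit the queued jobs here.
        self.committed.extend(self.queued_ops)
        self.queued_ops = []

    @batched
    def burn_create(self, names):
        # The step only queues work; init and commit live in the decorator.
        for name in names:
            self.queued_ops.append(("create", name))
```

With the decorator in place, each step body contains only the queueing logic.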

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>

c70481ab

burnin: move instance alive checks to a decorator · d9b7a0b4

Iustin Pop authored 15 years ago


Many burnin steps do a manual check of instance aliveness, via duplicate
code. This patch moves this code to a decorator.
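
A rough sketch of such a decorator, assuming the aliveness check runs once per instance after the wrapped step has finished. All names here are hypothetical:

```python
import functools


def check_instances_alive(fn):
    """Run the step, then verify that every instance is still alive."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        result = fn(self, *args, **kwargs)
        for name in self.instances:
            self.check_alive(name)
        return result
    return wrapper


class Burner:
    """Toy stand-in for the burnin driver (hypothetical names)."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.checked = []

    def check_alive(self, name):
        # A real check would probe the instance; here we only record it.
        self.checked.append(name)

    @check_instances_alive
    def burn_reboot(self):
        return "rebooted"
```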

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

d9b7a0b4

burnin: Implement retryable operations · 73ff3118

Iustin Pop authored 15 years ago


Some burnin steps are idempotent: e.g. reinstalling an instance (from
burnin's p.o.v.) can be done multiple times without any side-effects that
would affect later burnin steps. As such, failing the whole burnin
process due to a reinstall failure is undesirable.

This patch modifies burnin by marking each opcode (in case of individual
execution) and job set retryable or not. Retryable actions will be
retried up to a number of times, after which we give up and return
failure.

One side-effect is that in case of full-failure in retryable job sets we
lose the original exception (but we do log its string format), so we
have a little bit less information in this case.
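
The retry logic can be sketched roughly as below. The helper name, its parameters and the default retry count are assumptions made for illustration. Unlike the job-set case described above, this sketch re-raises the last exception unchanged instead of only logging its string form:

```python
def run_with_retries(fn, retryable, max_retries=3, log=print):
    """Call fn(); if retryable, retry up to max_retries times."""
    attempts = max_retries + 1 if retryable else 1
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as err:
            if attempt == attempts:
                raise  # out of attempts: report failure to the caller
            log("attempt %d failed (%s), retrying" % (attempt, err))
```

A non-retryable operation gets exactly one attempt, so its first failure is reported immediately.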

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

73ff3118

Jul 20, 2009

Generate a shared HMAC key at cluster init time · 4a34c5cf

Guido Trotter authored 15 years ago


This key is shared on all nodes (via cmdlib._RedistributeAncillaryFiles)
and will be used for HMAC authentication of confd messages.
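
As an illustration of how a shared key can authenticate messages, here is a minimal sketch using Python's standard hmac module. The function names, the key length and the choice of SHA-256 are assumptions, not details taken from this commit:

```python
import hashlib
import hmac
import secrets


def generate_hmac_key():
    """Return a random hex key, generated once (e.g. at cluster init)."""
    return secrets.token_hex(32)


def sign(key, payload):
    """Return the hex HMAC of payload (bytes) under key (str)."""
    return hmac.new(key.encode("ascii"), payload, hashlib.sha256).hexdigest()


def verify(key, payload, signature):
    """Constant-time check that signature matches payload."""
    return hmac.compare_digest(sign(key, payload), signature)
```

Any node holding the same key can verify a message, which is why the key has to be distributed to all nodes.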

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

4a34c5cf

Fix unittests broken by commit · c071c5b3

Michael Hanselmann authored 15 years ago


File "../test/ganeti.hooks_unittest.py", line 239, in setUp
  self.lu = FakeLU(FakeProc(), self.op, self.context, None)
File "…/ganeti/cmdlib.py", line 92, in __init__
  self.LogStep = processor.LogStep
AttributeError: FakeProc instance has no attribute 'LogStep'

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

c071c5b3

cmdlib: Move code doing disk replacements into separate class · 2bb5c911

Michael Hanselmann authored 15 years ago


This class will be used for a new opcode to evacuate nodes.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

2bb5c911

cmdlib: Pass config and rpc objects directly to IAllocator · 923ddac0

Michael Hanselmann authored 15 years ago


Previously, IAllocator accessed them using “self.lu.cfg” and “self.lu.rpc”.
It shouldn't know about the internals of the LU.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

923ddac0

Ignore vim swap files · 699d856f

Michael Hanselmann authored 15 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

699d856f

Fix backend import errors from GetHypervisorClass · e5a45a16

Iustin Pop authored 15 years ago


The merge of commit 360b0dc2 into branch-2.1 broke import of backend,
since it uses hypervisor.GetHypervisor() which returns an instance of
the hypervisor. Some of the hypervisors create directories at init time,
thus the import of backend failed due to this chain if it's not done on a
(proper) ganeti node, such as during unittest time.

This patch adds in hypervisor a GetHypervisorClass() function, which
returns the class not the instance of the hypervisor, and uses that in
_BuildUploadFiles(). The existing GetHypervisor is then changed to use
this function.
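
The split between class lookup and instantiation might look like the sketch below. The FakeHypervisor class, the registry and the file path are hypothetical; the instance counter stands in for init-time side effects such as creating directories:

```python
class FakeHypervisor:
    """Stand-in hypervisor; the counter marks init-time side effects."""
    ANCILLARY_FILES = ["/hypothetical/path/hv.conf"]
    instances_created = 0

    def __init__(self):
        # Real hypervisors may create directories here, which can fail
        # on a machine that is not a proper node.
        FakeHypervisor.instances_created += 1


_HYPERVISOR_MAP = {"fake": FakeHypervisor}


def get_hypervisor_class(name):
    """Return the hypervisor class without instantiating it."""
    if name not in _HYPERVISOR_MAP:
        raise ValueError("unknown hypervisor: %s" % name)
    return _HYPERVISOR_MAP[name]


def get_hypervisor(name):
    """Return a hypervisor instance, built on top of the class lookup."""
    return get_hypervisor_class(name)()
```

Code that only needs class-level data, such as a list of files, can use the class lookup and never trigger the constructor.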

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e5a45a16

Jul 19, 2009

burnin: fix removal errors hiding real errors · 8629a543

Iustin Pop authored 15 years ago


A long-standing bug in burnin makes errors during the removal phase
(e.g. because an import has failed, or because the initial creation has
failed) hide the original error.

This patch suppresses removal errors if we are already in ‘has_err’
mode, and otherwise it displays them normally.
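
A minimal sketch of this control flow, with hypothetical function and parameter names:

```python
def run_burnin(run_steps, remove_instances, log=print):
    """Run the burnin steps, then always attempt removal."""
    has_err = True
    try:
        run_steps()
        has_err = False
    finally:
        try:
            remove_instances()
        except Exception as err:
            if not has_err:
                raise  # removal was the only failure: show it normally
            # An earlier error is already propagating; do not mask it.
            log("note: error during removal: %s" % err)
```

When the steps fail, the caller sees the original exception even if removal fails too.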

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

8629a543

Merge branch 'next' into branch-2.1 · b397a7d2

Iustin Pop authored 15 years ago


Conflicts:
	lib/backend.py: non-trivial conflict but easy to solve

b397a7d2

backend: Only build once the list of upload files · 360b0dc2

Iustin Pop authored 15 years ago


The list of upload files is built currently at every UploadFile() call.
This patch moves it to a separate variable which is initialized only
once.

This won't make much difference but I regard it as cleanup.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

360b0dc2

Merge commit 'origin/next' into branch-2.1 · 25f9901f

Iustin Pop authored 15 years ago


Conflicts:
	lib/cli.py: trivial extra empty line

25f9901f

Fix gnt-instance reinstall · b8f31860

Iustin Pop authored 15 years ago


Commit 55efe6da "Convert instance reinstall to multi instance model"
actually broke instance reinstall for single-instance cases. This
one-liner fixes it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit b6e243ab)

b8f31860

Fix a couple of epydoc warnings · 6af6270a

Iustin Pop authored 15 years ago


It seems epydoc needs fully-qualified references, and doesn't deal with
relative ones (not even in the current module) if there are any
ambiguities.

There are other epydoc warnings, in the rapi docstrings, but those are
left as-is as they're removed in 2.1.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

6af6270a

job queue: fix loss of finalized opcode result · 34327f51

Iustin Pop authored 15 years ago


Currently, unclean master daemon shutdown overwrites all of a job's
opcode status and result with error/None. This is incorrect, since
any already finished opcode(s) should have their status and result
preserved, and only not-yet-processed opcodes should be marked as
‘error’. Cancelling jobs between opcodes does the same (but this is not
allowed currently by the code, so it's not as important as unclean
shutdown).

This patch adds a new _QueuedJob function that only overwrites the
status and result of non-finalized opcodes, which is then used in job
queue init and in the cancel job functions. The patch also adds some
comments and a new set of constants in constants.py highlighting the
finalized vs. non-finalized opcode statuses.
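
The intended behaviour (finished opcodes keep their status and result, everything else is overwritten) can be sketched as follows. The constant names, status strings and the QueuedOp class are assumptions made for illustration:

```python
# Hypothetical status constants; the real ones live in constants.py.
OP_STATUS_QUEUED = "queued"
OP_STATUS_RUNNING = "running"
OP_STATUS_SUCCESS = "success"
OP_STATUS_ERROR = "error"
OPS_FINALIZED = frozenset([OP_STATUS_SUCCESS, OP_STATUS_ERROR])


class QueuedOp:
    """Toy opcode record holding only a status and a result."""

    def __init__(self, status, result=None):
        self.status = status
        self.result = result


def mark_unfinished_ops(ops, status, result):
    """Overwrite status/result only for opcodes that have not finalized."""
    for op in ops:
        if op.status not in OPS_FINALIZED:
            op.status = status
            op.result = result
```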

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

34327f51

Switch gnt-debug submit-job to JobExecutor · b59252fe

Iustin Pop authored 15 years ago


Currently gnt-debug submits jobs individually, but in 2.1 JobExecutor
uses the optimized SubmitManyJobs luxi call and as such should be used
whenever multiple jobs need to be submitted.

This patch converts gnt-debug submit-job to use it and also removes an
extra empty line in the JobExecutor class.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

b59252fe

Convert instance reinstall to multi instance model · 3d2ca95d

Iustin Pop authored 15 years ago


This patch converts ‘gnt-instance reinstall’ from single-instance to
multi-instance model; since this is dangerous, it's required to pass
“--force --force-multiple” to skip the confirmation.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit 55efe6da)

3d2ca95d

gnt-instance batch-create: use the job executor · dd7dcca7

Iustin Pop authored 15 years ago


This small patch changes the batch create functionality to use the job
executor instead of single-job submits.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit d4dd4b74)

dd7dcca7

Modify cli.JobExecutor to use SubmitManyJobs · f2921752

Iustin Pop authored 15 years ago


This patch changes the generic "multiple job executor" to use the many
jobs submit model, which automatically makes all its users use the new
model.

This makes, for example, startup/shutdown of a full cluster much more
logical (all the submitted job IDs are visible fast, and then waiting
for them proceeds normally).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit 23b4b983)

f2921752

Add a luxi call for multi-job submit · 56d8ff91

Iustin Pop authored 15 years ago


As a workaround for the job submit timeouts that we have, this patch
adds a new luxi call for multi-job submit; the advantage is that all the
jobs are added to the queue first, and only then can the workers start
processing them.

This is definitely faster than per-job submit, where the submission of
new jobs competes with the workers processing jobs.
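
A toy model of the difference, assuming a queue protected by a single lock. The class and method names are hypothetical and do not reflect the luxi protocol itself:

```python
import threading


class JobQueue:
    """Toy job queue contrasting per-job and batched submission."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 1
        self.jobs = {}

    def _add_job_unlocked(self, ops):
        job_id = self._next_id
        self._next_id += 1
        self.jobs[job_id] = list(ops)
        return job_id

    def submit_job(self, ops):
        # One lock round-trip per job; workers may interleave in between.
        with self._lock:
            return self._add_job_unlocked(ops)

    def submit_many_jobs(self, jobs):
        # One lock acquisition for the whole batch, so workers only see
        # the jobs once all of them are queued.
        with self._lock:
            return [self._add_job_unlocked(ops) for ops in jobs]
```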

On a pure no-op OpDelay opcode (not on master, not on nodes), we have:
  - 100 jobs:
    - individual: submit time ~21s, processing time ~21s
    - multiple:   submit time 7-9s, processing time ~22s
  - 250 jobs:
    - individual: submit time ~56s, processing time ~57s
                  run 2:      ~54s                  ~55s
    - multiple:   submit time ~20s, processing time ~51s
                  run 2:      ~17s                  ~52s

which shows that we indeed gain on the client side, and maybe even on
the total processing time for a high number of jobs. For just 10 or so I
expect the difference to be just noise.

This will probably require increasing the timeout a little when
submitting too many jobs - 250 jobs at ~20 seconds is close to the
current rw timeout of 60s.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit 2971c913)

56d8ff91