- Oct 12, 2008
-
-
Iustin Pop authored
Currently, we check whether we have a given IP address (i.e. it's alive on one of our interfaces) by manually calling TcpPing(source=localhost). This works, but having it spread all over the code makes it hard to change the implementation. The patch abstracts this into a separate utils.OwnIpAddress(addr) function. We add an RPC call for it, which we use instead of the single use of call_node_tcp_ping. We leave node_tcp_ping in, as it seems useful; eventually it should be removed in a separate patch. Reviewed-by: imsnah
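A minimal sketch of what such a helper could look like, assuming a simple bind-based check rather than the TcpPing-based approach the patch actually uses; the function name mirrors utils.OwnIpAddress, but the body is purely illustrative:

```python
import errno
import socket

def own_ip_address(addr):
    """Return True if 'addr' is configured on one of our interfaces.

    Illustrative only: binding a UDP socket succeeds only for addresses
    that are locally owned, so a failed bind means "not ours".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((addr, 0))
    except socket.error as err:
        if err.errno == errno.EADDRNOTAVAIL:
            return False
        raise
    finally:
        sock.close()
    return True
```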
-
- Oct 10, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Guido Trotter authored
Allow multiple API versions in an OS. This follows the OS API changes design doc: an OS can support multiple versions of the Ganeti API, and it will work as long as at least one of them is supported by Ganeti. Since the API up to version 5 mandated that an OS could support only one version, this change is backwards compatible with it and requires no version bump. Reviewed-by: iustinp
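A hedged illustration of the negotiation this implies (the constant and function names here are assumptions, not the real Ganeti code):

```python
GANETI_OS_API_VERSIONS = frozenset([5])  # versions this Ganeti build knows

def pick_os_api_version(os_declared_versions):
    """Return the highest API version both sides support, or None."""
    common = GANETI_OS_API_VERSIONS.intersection(os_declared_versions)
    return max(common) if common else None

# pick_os_api_version([5]) -> 5; pick_os_api_version([6, 7]) -> None
```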
-
Guido Trotter authored
When calling node_verify leads to an error, _VerifyNodes tries to iterate over a non-sequence. Catch the error beforehand and prevent this from happening. Reviewed-by: iustinp
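A sketch of the kind of guard this describes (the helper below is hypothetical, not the actual cmdlib code):

```python
def _safe_verify_result(node, raw_result, feedback_fn):
    """Only iterate over a verify result that really is a sequence."""
    if not isinstance(raw_result, (list, tuple)):
        feedback_fn("  - ERROR: invalid verify result from node %s: %r"
                    % (node, raw_result))
        return []
    return list(raw_result)
```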
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
This patch adds another implementation of an HTTP server. It's based on code from Python's BaseHTTPServer, in both the 2.4 and 3k versions. In the future we can write code to decide whether we should fork for a request or not. Keep-alive is not supported. Reviewed-by: iustinp
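Not the Ganeti http module itself, just a minimal sketch in the same spirit, written against the stdlib handler classes (http.server in current Python; the patch itself drew on Python 2.4's BaseHTTPServer):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.0"  # no keep-alive, as noted above

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), DemoHandler).serve_forever()
```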
-
Michael Hanselmann authored
This mainloop can be used in daemons like ganeti-noded. Reviewed-by: iustinp
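A rough sketch of such a mainloop (class and method names are assumptions, not the actual ganeti.daemon API): register per-fd callbacks and dispatch them from a select() loop until asked to stop.

```python
import select

class Mainloop(object):
    def __init__(self):
        self._readers = {}      # fd -> callback(fd)
        self._running = False

    def register_read(self, fd, callback):
        self._readers[fd] = callback

    def stop(self):
        self._running = False

    def run(self, timeout=1.0):
        self._running = True
        while self._running:
            ready, _, _ = select.select(list(self._readers), [], [], timeout)
            for fd in ready:
                self._readers[fd](fd)
```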
-
Iustin Pop authored
This clarifies the job storage and the reason for choosing it. Reviewed-by: imsnah
-
René Nussbaumer authored
With change 1773, a new status, WAITLOCK, was introduced for jobs/opcodes waiting for a lock. This change updates the job-queue document accordingly. Reviewed-by: iustinp
-
Iustin Pop authored
This big patch changes the call model used in inter-node RPC from standalone function calls in the rpc module to calls via an RpcRunner class that holds all the methods. This can be used in the future to enable smarter processing in the RPC layer itself (some quick examples: not setting the DiskID from cmdlib code, but only once in each rpc call, etc.). There are a few RPC calls that are made outside of the LU code, and these calls are left as staticmethods, so they can be used without a class instance (which requires a ConfigWriter instance). Reviewed-by: imsnah
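The shape this suggests, sketched with made-up method bodies (the transport and method names below are placeholders, not the real ganeti.rpc code):

```python
class RpcRunner(object):
    """Per-LU entry point for inter-node calls."""

    def __init__(self, cfg):
        self._cfg = cfg  # e.g. a ConfigWriter, used to fill in disk IDs once

    def _call(self, node, procedure, args):
        # Placeholder transport; the real code performs a network request.
        return {"node": node, "procedure": procedure, "args": args}

    def call_instance_start(self, node, instance_name):
        return self._call(node, "instance_start", [instance_name])

    @staticmethod
    def call_node_tcp_ping(node, source, target, port, timeout):
        # Kept static so callers without a ConfigWriter can still use it.
        return {"node": node, "procedure": "node_tcp_ping",
                "args": [source, target, port, timeout]}
```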
-
Iustin Pop authored
This patch is a cleanup of the standalone functions in cmdlib. Many of them took as argument a ConfigWriter instance, but some also took other parameters from the LU (e.g. proc), and in the future, if we want to also pass the RpcRunner, we would have to add yet another parameter. One option would be to make all of these methods of the top-level LogicalUnit class. I took another approach and made (almost) all of these functions take the LU instance as their first parameter. They behave like methods, just not declared under LogicalUnit. Reviewed-by: imsnah
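Purely illustrative of that convention (the helper and the attributes it reads are stand-ins, not real cmdlib code): the LU comes first, and everything else is reached through it.

```python
def _expand_wanted_nodes(lu, node_names):
    """'Method-like' helper: the LU instance is the first argument."""
    if node_names:
        return [lu.cfg.ExpandNodeName(name) for name in node_names]
    return lu.cfg.GetNodeList()
```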
-
Iustin Pop authored
Indentation in bootstrap was wrong and some names in cmdlib.py were not right. Reviewed-by: imsnah
-
- Oct 09, 2008
-
-
Iustin Pop authored
This check can be done earlier, in ExpandNames, and is needed here for the hypervisor parameter check. Reviewed-by: ultrotter
-
- Oct 08, 2008
-
-
Alexander Schreiber authored
Reviewed-by: imsnah
-
Oleksiy Mishchenko authored
Reviewed-by: iustinp
-
Alexander Schreiber authored
Reviewed-by: ultrotter
-
Alexander Schreiber authored
Reviewed-by: iustinp
-
Iustin Pop authored
Since in 2.0 the user will possibly have more interaction with the hypervisor names, we sanitize them by removing the version numbers (the version can be a prerequisite for the Ganeti installation; we shouldn't document it in variable names). Reviewed-by: schreiberal
-
Iustin Pop authored
The idea is that if the OSes support multiple versions (e.g. both 1.2 and 2.0), then Ganeti should be able to talk to them using version 2.0, but then the script needs to be told what version Ganeti is using. Reviewed-by: imsnah
-
Oleksiy Mishchenko authored
Reviewed-by: iustinp
-
Iustin Pop authored
This (big) patch moves the hypervisor type from the cluster to the instance level; the cluster attribute remains as the default hypervisor, and will be renamed accordingly in a later patch. The cluster also gains the ‘enable_hypervisors’ attribute, and instances can be created with any of the enabled ones (no provision yet for changing that attribute). The many, many changes in the rpc/backend layer are due to the fact that all backend code read the hypervisor from the local copy of the config, and now we have to send it (either in the instance object, or as a separate parameter) for each function. The node list will by default show the node free/total memory for the default hypervisor; a new flag should be added to select another hypervisor. Instance list has a new field, hypervisor, that shows the instance hypervisor. Cluster verify runs for all enabled hypervisor types. The new FIXMEs are related to IAllocator, since now the node total/free/used memory counts are wrong (we can't reliably compute the free memory). Reviewed-by: imsnah
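A toy model of the layout described above; the attribute names follow the commit message but are otherwise assumptions, not the real objects.py:

```python
class Cluster(object):
    def __init__(self, default_hypervisor, enabled_hypervisors):
        self.default_hypervisor = default_hypervisor
        self.enabled_hypervisors = list(enabled_hypervisors)

class Instance(object):
    def __init__(self, name, hypervisor=None):
        self.name = name
        self.hypervisor = hypervisor    # None -> fall back to the default

def effective_hypervisor(cluster, instance):
    """Pick the instance hypervisor, enforcing the enabled list."""
    hv = instance.hypervisor or cluster.default_hypervisor
    if hv not in cluster.enabled_hypervisors:
        raise ValueError("hypervisor %s is not enabled" % hv)
    return hv
```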
-
- Oct 07, 2008
-
-
Iustin Pop authored
This patch changes formatting and the DRBD shared secret details, and adds master daemon socket details to the security doc. Reviewed-by: imsnah
-
Iustin Pop authored
Reviewed-by: imsnah
-
Iustin Pop authored
Reviewed-by: imsnah
-
Iustin Pop authored
This is just some additions of not-yet-mentioned docs. Reviewed-by: ultrotter
-
Alexander Schreiber authored
Merged r1777 from branches/ganeti/ganeti-1.2 Reviewed-by: imsnah
-
Alexander Schreiber authored
Reviewed-by: imsnah
-
Iustin Pop authored
Currently the call_instance_migrate call only passes the instance name; we need to pass the whole object for the hypervisor_type changes (all the other individual instance rpc calls already pass the instance object). Reviewed-by: imsnah
-
Alexander Schreiber authored
Reviewed-by: iustinp
-
Alexander Schreiber authored
Reviewed-by: ultrotter
-
Iustin Pop authored
Background: when we have multiple jobs in the queue (more than just a few), many of the jobs (up to the number of threads) will be in state 'running', although many of them could be actually blocked, waiting for some locks. This is not good, as one cannot easily see what is happening. The patch extends the opcode/job possible statuses with another one, waiting, which shows that the LU is in the acquire locks phase. The mechanism for doing so is simple, we initialize (in the job queue) the opcode with OP_STATUS_WAITLOCK, and when the processor is ready to give control to the LU's Exec, it will call a notifier back into the _JobQueueWorker that sets the opcode status to OP_STATUS_RUNNING (with the proper queue locking). Because this mechanism does not save the job, all opcodes on disk will be in status WAITLOCK and not RUNNING anymore, so we also change the load sequence to consider WAITLOCK as RUNNING. With the patch applied, creating in parallel (via burnin) five instances on a five node cluster shows that only two are executing, while three are waiting for locks. Reviewed-by: imsnah
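A schematic version of the notifier mechanism (status constants and function names are assumptions, not the real jqueue/processor code): the opcode starts in the waiting state, and a callback from the processor flips it to running just before Exec gets control.

```python
OP_STATUS_WAITLOCK = "waiting-lock"
OP_STATUS_RUNNING = "running"

class OpCode(object):
    def __init__(self):
        self.status = OP_STATUS_WAITLOCK    # initial status in the queue

def process_opcode(op, exec_fn, notify_running):
    # ... lock acquisition happens here, status still WAITLOCK ...
    notify_running(op)       # queue-side callback marks the opcode running
    return exec_fn()         # only now does the LU's Exec run

def _mark_running(op):
    op.status = OP_STATUS_RUNNING   # the real code also locks the queue

# usage: process_opcode(OpCode(), lambda: "result", _mark_running)
```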
-
Guido Trotter authored
Reviewed-by: imsnah
-
Guido Trotter authored
Reviewed-by: imsnah
-
- Oct 06, 2008
-
-
Iustin Pop authored
This patch adds a new luxi call that implements auto-archiving of jobs older than a certain age (or -1 for all completed jobs), and the gnt-job command that makes use of this (with 'all' for -1). Reviewed-by: imsnah
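A sketch of the age-based selection such a call implies (function and field names are assumptions): completed jobs older than 'age' seconds are archived, with -1 meaning all completed jobs.

```python
import time

def select_jobs_to_archive(jobs, age, now=None):
    """'jobs' is an iterable of (job_id, status, end_timestamp) tuples."""
    now = time.time() if now is None else now
    finished = ("success", "error", "canceled")
    return [job_id for (job_id, status, end_ts) in jobs
            if status in finished and (age == -1 or end_ts + age < now)]
```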
-
Iustin Pop authored
This function will be used for auto-archiving jobs via the command line. The function is pretty simple, we only support up to weeks since months and higher are not 'precise' entities, and dealing with them would require us to start using calendar functions. Reviewed-by: imsnah
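One possible shape for such a parser, as a sketch rather than the actual cli helper: accept plain seconds or a single s/m/h/d/w suffix, and nothing above weeks.

```python
_SUFFIXES = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 7 * 86400}

def parse_time_spec(value):
    """Convert e.g. '30m', '2d' or '3600' into a number of seconds."""
    if value and value[-1] in _SUFFIXES:
        return int(value[:-1]) * _SUFFIXES[value[-1]]
    return int(value)
```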
-
Iustin Pop authored
Currently there are three functions in backend that need the cluster name in order to instantiate an SshRunner. The patch changes these to get the cluster name from the master in the rpc call; once the multi-hypervisor change is implemented, very few places in the backend will still need the SCR. Reviewed-by: killerfoxi, imsnah
-
Iustin Pop authored
Since the objects read from the config file are passed to the various threads, it's unsafe to re-read the config file (and throw away ConfigWriter._config_data). As such, we disable the re-reading of the file (since the master is now the owner of the file, it makes no sense to re-read it), and any modifications to the file must be done offline, otherwise they will be overwritten. Reviewed-by: imsnah
-
Oleksiy Mishchenko authored
Reviewed-by: iustinp
-
Iustin Pop authored
This patch introduces a simple framework for executing jobs in parallel in burnin (the ExecJobSet function) and the "--parallel" command line flag. The patch also changes the instance creation to run in parallel when the above flag is given. Error handling/instance removal is currently flaky with this option if there are errors during instance creation. We also modify burnin to reuse a single client. Reviewed-by: imsnah
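A rough illustration of the job-set idea (the client interface assumed here, SubmitJob and WaitForJobCompletion, is an assumption rather than the exact luxi/cli API): submit a whole set of jobs through one client, then wait for all of them.

```python
def exec_job_set(client, job_sets):
    """Run each set of jobs in parallel, one set after another."""
    results = []
    for ops_list in job_sets:
        job_ids = [client.SubmitJob(ops) for ops in ops_list]
        results.append([client.WaitForJobCompletion(jid) for jid in job_ids])
    return results
```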
-
Iustin Pop authored
In case the job object doesn't have a timestamp (which is a separate issue), the listing should not break. We fix this by changing the FormatTimestamp function itself to return '?' in case the timestamp doesn't look good (note that it can still break if non-integers are returned, but this is unlikely). Reviewed-by: imsnah
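A hedged sketch of that defensive formatting (not necessarily the exact cli implementation): anything that isn't a (seconds, microseconds) pair of integers comes back as '?'.

```python
import time

def format_timestamp(ts):
    """Render a (seconds, microseconds) pair, or '?' if it looks wrong."""
    if not isinstance(ts, (tuple, list)) or len(ts) != 2:
        return "?"
    sec, usec = ts
    if not isinstance(sec, int) or not isinstance(usec, int):
        return "?"
    base = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(sec))
    return "%s.%06d" % (base, usec)
```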
-