- Oct 08, 2008
-
Iustin Pop authored
Since in 2.0 the user will interact more with the hypervisor names, we sanitize them by removing the version numbers (the version can be a prerequisite for the Ganeti installation; we shouldn't encode it in variable names). Reviewed-by: schreiberal
-
Oleksiy Mishchenko authored
Reviewed-by: iustinp
-
Iustin Pop authored
This (big) patch moves the hypervisor type from the cluster to the instance level; the cluster attribute remains as the default hypervisor and will be renamed accordingly in a later patch. The cluster also gains the ‘enable_hypervisors’ attribute, and instances can be created with any of the enabled hypervisors (no provision yet for changing that attribute). The many changes in the rpc/backend layer are due to the fact that all backend code used to read the hypervisor from the local copy of the config, and now we have to send it (either in the instance object or as a separate parameter) for each function. The node list will by default show the node free/total memory for the default hypervisor; a new flag should be added to select another hypervisor. Instance list has a new field, hypervisor, that shows the instance's hypervisor. Cluster verify runs for all enabled hypervisor types. The new FIXMEs are related to IAllocator, since the node total/free/used memory counts are now wrong (we can't reliably compute the free memory). Reviewed-by: imsnah
-
- Oct 07, 2008
-
Iustin Pop authored
Currently the call_instance_migrate call only passes the instance name; we need to pass the whole object for the hypervisor_type changes (all the other individual instance rpc calls already pass the instance object). Reviewed-by: imsnah
-
Iustin Pop authored
Background: when we have multiple jobs in the queue (more than just a few), many of them (up to the number of threads) will be in state 'running', although many could actually be blocked, waiting for some locks. This is not good, as one cannot easily see what is happening. The patch extends the possible opcode/job statuses with a new one, 'waiting', which shows that the LU is in the lock-acquisition phase. The mechanism for doing so is simple: we initialize (in the job queue) the opcode with OP_STATUS_WAITLOCK, and when the processor is ready to give control to the LU's Exec, it calls a notifier back into the _JobQueueWorker that sets the opcode status to OP_STATUS_RUNNING (with the proper queue locking). Because this mechanism does not save the job, all opcodes on disk will be in status WAITLOCK instead of RUNNING, so we also change the load sequence to consider WAITLOCK as RUNNING. With the patch applied, creating five instances in parallel (via burnin) on a five-node cluster shows only two executing while three wait for locks. Reviewed-by: imsnah
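A minimal sketch of the notifier mechanism described here; the status names follow the commit message, but the class, method, and attribute names are illustrative assumptions, not the actual Ganeti job-queue code:

```python
import threading

OP_STATUS_WAITLOCK = "waiting"
OP_STATUS_RUNNING = "running"

class _JobQueueWorker:
    def __init__(self):
        self._lock = threading.Lock()  # stands in for the queue lock

    def RunOpCode(self, op, processor):
        # Opcodes start out waiting for locks; the processor invokes
        # notify_start() right before handing control to the LU's Exec.
        op.status = OP_STATUS_WAITLOCK

        def notify_start():
            with self._lock:  # the "proper queue locking"
                op.status = OP_STATUS_RUNNING

        processor.ExecOpCode(op, notify_start)

def _LoadOpStatus(status):
    # The notifier does not persist the job, so on-disk opcodes may
    # still read "waiting"; treat that as "running" when loading.
    return OP_STATUS_RUNNING if status == OP_STATUS_WAITLOCK else status
```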
-
- Oct 06, 2008
-
Iustin Pop authored
This patch adds a new luxi call that implements auto-archiving of jobs older than a given age (or -1 for all completed jobs), and a gnt-job command that makes use of it (with 'all' mapping to -1). Reviewed-by: imsnah
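A sketch of the age check such a call could perform; the queue interface and field names here are assumptions, not the actual luxi API:

```python
import time

ARCHIVE_ALL = -1  # the "all completed jobs" special case above

def AutoArchiveJobs(queue, age):
    """Archive completed jobs whose end timestamp is older than age seconds."""
    now = time.time()
    for job in queue.GetCompletedJobs():  # hypothetical accessor
        if age == ARCHIVE_ALL or now - job.end_timestamp > age:
            queue.ArchiveJob(job.id)
```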
-
Iustin Pop authored
This function will be used for auto-archiving jobs via the command line. The function is pretty simple: we only support units up to weeks, since months and longer are not 'precise' entities, and dealing with them would require calendar functions. Reviewed-by: imsnah
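The parser could look roughly like this; the suffix letters and error handling are assumptions, shown only to illustrate the weeks-and-below limitation:

```python
_SUFFIX_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 24 * 3600,
    "w": 7 * 24 * 3600,  # weeks are the largest fixed-length unit
}

def ParseTimespec(value):
    """Parse e.g. "30m", "2d" or "1w" into a number of seconds."""
    if not value:
        raise ValueError("Empty time specification")
    if value[-1].isdigit():
        return int(value)  # plain number of seconds
    suffix, number = value[-1], value[:-1]
    if suffix not in _SUFFIX_SECONDS:
        raise ValueError("Unknown suffix %r" % suffix)
    return int(number) * _SUFFIX_SECONDS[suffix]
```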
-
Iustin Pop authored
Currently there are three functions in backend.py that need the cluster name in order to instantiate an SshRunner. The patch changes these to get the cluster name from the master via the rpc call; once the multi-hypervisor change is implemented, very few places will remain in the backend where we need the SimpleConfigReader (SCR). Reviewed-by: killerfoxi, imsnah
-
Iustin Pop authored
Since the objects read from the config file are passed to the various threads, it's unsafe to re-read the config file (and throw away ConfigWriter._config_data). As such, we disable re-reading of the file (since the master is now the owner of the file, it makes no sense to re-read it), and any modifications to the file must be done offline, otherwise they will be overwritten. Reviewed-by: imsnah
-
Iustin Pop authored
In case the job object doesn't have a timestamp (which is a separate issue), the listing should not break. We fix this by changing the FormatTimestamp function itself to return '?' in case the timestamp doesn't look valid (note that it can still break if non-integers are returned, but this is unlikely). Reviewed-by: imsnah
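A sketch of the defensive check, assuming timestamps are (seconds, microseconds) pairs as elsewhere in the job queue; the formatting details are illustrative:

```python
import time

def FormatTimestamp(ts):
    # A missing or malformed timestamp yields "?" instead of a
    # TypeError that would abort the whole job listing.
    if not isinstance(ts, (tuple, list)) or len(ts) != 2:
        return "?"
    sec, usec = ts
    return "%s.%06d" % (
        time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(sec)), usec)
```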
-
Iustin Pop authored
Since our locks are not gathered nicely, we can have jobs that are actually blocking on locks (parallel burnin shows this), so at the least we need to increase the number of threads above the usual number of jobs we could have in such a case. Reviewed-by: imsnah
-
Iustin Pop authored
More places actually use the SshRunner than just the gnt-cluster commands. Reviewed-by: ultrotter
-
Iustin Pop authored
Currently the SshRunner uses a SimpleConfigReader instance, but this is not ideal. We change it to use the cluster name directly (its constructor now takes the name as a parameter, instead of an SCR), and its callers are changed to pass the name directly. As a consequence, we can now remove the initialization of the SCR in gnt-cluster (copyfile and command) and instead query the master for the cluster name. Reviewed-by: imsnah
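In terms of the constructor, the change amounts to something like the sketch below; the BuildCmd method and the HostKeyAlias usage are assumptions about how the name is consumed, shown only for context:

```python
class SshRunner:
    # Before: SshRunner(cfg), with cfg a SimpleConfigReader used only
    # to look up the cluster name. After: the name is passed directly.
    def __init__(self, cluster_name):
        self.cluster_name = cluster_name

    def BuildCmd(self, hostname, user, command):
        # Hypothetical use of the name as the ssh host-key alias.
        return ["ssh", "-oHostKeyAlias=%s" % self.cluster_name,
                "%s@%s" % (user, hostname), command]
```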
-
- Oct 05, 2008
-
Iustin Pop authored
The ssconf migration left this out. Reviewed-by: imsnah, ultrotter
-
- Oct 01, 2008
-
Michael Hanselmann authored
Remove leftovers from ssconf. Reviewed-by: iustinp
-
Michael Hanselmann authored
sstore is no longer used in LUs. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replace ssconf with configuration. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replacing ssconf with configuration. Cluster rename is broken and stays that way. Reviewed-by: iustinp
-
Michael Hanselmann authored
Get rid of ssconf and convert to configuration instead. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replacing ssconf with utility functions. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replacing ssconf with configuration. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replacing ssconf with configuration. Reviewed-by: iustinp
-
Michael Hanselmann authored
The configuration version is now again in the configuration file. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replacing ssconf with simpleconfig. Reviewed-by: iustinp
-
Michael Hanselmann authored
This can be used to retrieve certain cluster config values from within clients. OpDumpClusterConfig was not used anywhere, hence I'm just reusing it. The way ConfigWriter.DumpConfig returned the configuration was not thread-safe, anyway (no deepcopy). Reviewed-by: iustinp
-
Michael Hanselmann authored
These functions will be used to access config values instead of using ssconf. Reviewed-by: iustinp
-
Michael Hanselmann authored
This will be used to read the configuration file in the node daemon. The write functionality is needed for master failover. Reviewed-by: iustinp
-
Iustin Pop authored
The watcher has one last use of ganeti commands as opposed to sending requests via luxi. The patch changes this to use the cli functions. The patch also has two other changes:
- fix the docstring for OpVerifyDisks (found while converting this)
- enable stderr logging in the watcher when “-d” is passed
Reviewed-by: imsnah
-
Michael Hanselmann authored
ssconf will become write-only from ganeti-masterd's point of view, therefore all settings in there need to go into the main configuration file. Reviewed-by: iustinp
-
Michael Hanselmann authored
Future patches will add even more variables to the cluster config. Adding more parameters wouldn't make the function easier to use and it doesn't make sense to pass them to another function, as it's only done once in bootstrap.py on cluster initialization. Reviewed-by: iustinp
-
Iustin Pop authored
Currently PollJob accepts a generic job but returns (a historical artifact) only the first opcode's result. This is wrong, as it doesn't allow polling a job with multiple results. Its only caller (for now) is also changed, so there should be no functional changes. Reviewed-by: ultrotter, amishchenko
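A sketch of the changed return value; the client interface and the field names ("status", "opresult") are assumptions about a luxi-like API, not the actual one:

```python
import time

def PollJob(client, job_id):
    """Wait for a job to finish and return the results of all its opcodes."""
    while True:
        ((status, result),) = client.QueryJobs([job_id],
                                               ["status", "opresult"])
        if status in ("success", "error"):
            break
        time.sleep(1)
    if status == "error":
        raise RuntimeError("Job %s failed: %s" % (job_id, result))
    return result  # previously: return result[0]
```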
-
- Sep 30, 2008
-
Iustin Pop authored
This patch adds start, stop, and received timestamps for jobs (and allows querying them), and allows querying the opcode timestamps. Reviewed-by: imsnah
-
Iustin Pop authored
Currently we format the timestamp inside the gnt-job info function. We will need this more times in the future, so move it to cli.py as a separate, exported function. Reviewed-by: imsnah
-
- Sep 29, 2008
-
Iustin Pop authored
This patch adds the job execution log in “gnt-job info” and also allows its selection in “gnt-job list” (although there it's not very useful, as it's not easy to parse). It does this by adding a new field in the query-job call, named ‘oplog’. With this, one can get a very clear examination of the job. What remains to be added are timestamps for the start/stop of processing for the job itself and its opcodes. Reviewed-by: imsnah
-
Iustin Pop authored
For now we only use the ‘C’ protocol so we can put it in constants.py instead of hardcoding it. Reviewed-by: imsnah
-
Iustin Pop authored
This patch enables the use of the shared secrets for DRBD8 disks, using (hardcoded in constants.py) the md5 digest algorithm. For making this more flexible, either we implement a cluster parameter (once the new model is in place), or we can make it ./configure-time selectable. Reviewed-by: imsnah
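In constants.py terms, the hardcoding described above amounts to something like the following excerpt; the constant names are assumptions:

```python
# Hypothetical constants.py excerpt: the DRBD network protocol (see the
# previous entry) and the digest algorithm for shared secrets, both
# fixed for now rather than configurable.
DRBD_NET_PROTOCOL = "C"
DRBD_DIGEST = "md5"
```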
-
Iustin Pop authored
This patch, which is similar to r1679 (Extend DRBD disks with minors attribute), extends the logical and physical id of the DRBD disks with a shared secret attribute. This is generated at disk creation time and saved in the config file. The generation of the secret is done so that we don't have duplicates in the configuration (otherwise the goal of preventing cross-connections would not be reached), so we add to config.py more than just a simple call to utils.GenerateSecret(). The patch does not yet enable the use of the secrets. Reviewed-by: imsnah
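A sketch of the uniqueness logic config.py needs beyond a plain utils.GenerateSecret() call; the helper names and retry bound are illustrative assumptions:

```python
import binascii
import os

def GenerateSecret():
    # Stand-in for utils.GenerateSecret(): a short random hex string.
    return binascii.hexlify(os.urandom(10)).decode("ascii")

def _GenerateUniqueSecret(existing, retries=64):
    """Return a secret not already present in the configuration.

    Duplicate secrets would defeat the cross-connection check, so we
    retry until an unused one is found.
    """
    for _ in range(retries):
        secret = GenerateSecret()
        if secret not in existing:
            return secret
    raise RuntimeError("Unable to generate a unique DRBD secret")
```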
-
Iustin Pop authored
It is not currently possible to show a summary of a job in the output of “gnt-job list”. The closest is listing the whole opcode(s), but that is too verbose. Also, the default output (id, status) is not very useful unless one looks for (and knows about) an exact job ID. The patch adds a “summary” description of a job, composed of the list of OP_IDs of the individual opcodes. Moreover, if an opcode has a ‘logical’ target in a certain opcode field (e.g. start instance has the instance name as the target), it is included in the formatting as well. It's easier to explain via a sample output:
gnt-job list
ID Status  Summary
1  error   NODE_QUERY
2  success NODE_ADD(gnta2)
3  success CLUSTER_QUERY
4  success NODE_REMOVE(gnta2.example.com)
5  error   NODE_QUERY
6  success NODE_ADD(gnta2)
7  success NODE_QUERY
8  success OS_DIAGNOSE
9  success INSTANCE_CREATE(instance1.example.com)
10 success INSTANCE_REMOVE(instance1.example.com)
11 error   INSTANCE_CREATE(instance1.example.com)
12 success INSTANCE_CREATE(instance1.example.com)
13 success INSTANCE_SHUTDOWN(instance1.example.com)
14 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
15 error   INSTANCE_CREATE(instance2.example.com)
16 error   INSTANCE_CREATE(instance2.example.com)
17 success INSTANCE_CREATE(instance2.example.com)
18 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
19 success INSTANCE_ACTIVATE_DISKS(instance2.example.com)
20 success INSTANCE_SHUTDOWN(instance1.example.com)
21 success INSTANCE_SHUTDOWN(instance2.example.com)
This is done by a simple change to the opcode classes, which allows an opcode to format itself. The additional function is small enough that it can go in opcodes.py, where it could also be used by a client if needed. Reviewed-by: imsnah
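A sketch of how such self-formatting opcodes could look; the OP_DSC_FIELD mechanism for naming the "logical target" field is an assumption used for illustration:

```python
class OpCode:
    OP_ID = "OP_ABSTRACT"
    OP_DSC_FIELD = None  # name of the "logical target" field, if any

    def Summary(self):
        # "OP_INSTANCE_SHUTDOWN" -> "INSTANCE_SHUTDOWN", optionally
        # with the value of the target field appended in parentheses.
        summary = self.OP_ID[3:]
        if self.OP_DSC_FIELD:
            summary += "(%s)" % getattr(self, self.OP_DSC_FIELD)
        return summary

class OpInstanceShutdown(OpCode):
    OP_ID = "OP_INSTANCE_SHUTDOWN"
    OP_DSC_FIELD = "instance_name"

    def __init__(self, instance_name):
        self.instance_name = instance_name

# OpInstanceShutdown("instance1.example.com").Summary()
# -> "INSTANCE_SHUTDOWN(instance1.example.com)"
```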
-
Iustin Pop authored
Unless we decide to change the job identifiers to integers, we should at least sort the list returned by _GetJobIDsUnlocked. Reviewed-by: imsnah
-
- Sep 28, 2008
-
Iustin Pop authored
The bootstrap code needs a pseudo-secret and this is currently generated inside the InitGanetiServerSetup function. Since more users will need it, move it to utils.py. Reviewed-by: ultrotter
-