- 17 Oct, 2008 2 commits
-
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
Plus update it with the real variable name Reviewed-by: iustinp
-
- 15 Oct, 2008 1 commit
-
-
Iustin Pop authored
A new multi-node call is added that sets/resets the drain flag. Reviewed-by: imsnah
-
- 14 Oct, 2008 3 commits
-
-
Iustin Pop authored
The backend.FinalizeExport function now uses the beparams instead of the instance attributes. A future enhancement should export and import/reuse the whole set of be/hv params. Reviewed-by: ultrotter
-
Iustin Pop authored
We have a problem with the current model of combining instance lists from multiple hypervisors: we don't allow duplicates, but "xm list" gives the same output for both pvm and hvm. This is a shortcoming of the actual Xen hypervisor implementation (the split between pvm and hvm), but for now we implement a weak workaround: identical instance params will be allowed, and merged. This breaks because there is a delta in listing, and should be treated as a temporary workaround only. Note that there are two cases of duplicate instances: the one above (Xen is the same, whether pvm or hvm), and the other case, the real error, when two different hypervisors report the same instance name. The latter case needs to be handled better (not by refusing to list the instances in the backend). Reviewed-by: ultrotter
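The workaround described above can be sketched as follows. This is a minimal illustration, not the Ganeti code: `merge_instance_lists`, `DuplicateInstanceError` and the shape of the per-hypervisor dictionaries are assumptions made for the example.

```python
class DuplicateInstanceError(Exception):
    """Two hypervisors report the same instance name with different data."""


def merge_instance_lists(per_hypervisor):
    """Merge {hypervisor_name: {instance_name: params}} into one dict.

    Hypothetical helper: entries that are identical across hypervisors
    are merged (the temporary workaround), entries that differ are
    treated as the real error case.
    """
    merged = {}
    for hv_name, instances in per_hypervisor.items():
        for name, params in instances.items():
            if name in merged and merged[name] != params:
                raise DuplicateInstanceError(
                    "instance %s reported differently by %s" % (name, hv_name))
            merged[name] = params
    return merged
```

Raising an exception for the conflicting case is only one option; the commit message itself notes that refusing to list instances is not the desired long-term handling.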
-
Iustin Pop authored
The newly-added node-specific ValidateParams hypervisor method is exported over RPC, using the semi-standard (success, message) return value. Multi-node call, so that we call on both primary and secondary at once. Reviewed-by: ultrotter
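A rough sketch of the (success, message) convention, assuming a hypothetical parameter check and a hypothetical result dictionary keyed by node name; none of these names come from the Ganeti source.

```python
def validate_hv_params(hvparams, available_kernels):
    """Node-side check returning the (success, message) pair.

    The 'kernel_path' key and the available_kernels argument are
    made up for this example.
    """
    kernel = hvparams.get("kernel_path")
    if kernel is not None and kernel not in available_kernels:
        return (False, "kernel %s not found on this node" % kernel)
    return (True, "")


def collect_failures(results):
    """Master-side helper: results maps node name -> (success, message)."""
    return ["%s: %s" % (node, msg)
            for node, (ok, msg) in sorted(results.items())
            if not ok]
```

With a multi-node call, the caller would get one pair per node (primary and secondary) and could report all failures in a single pass.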
-
- 12 Oct, 2008 1 commit
-
-
Iustin Pop authored
Currently, we check whether we have a given IP address (i.e. it's alive on one of our interfaces) by manually calling TcpPing(source=localhost). This works, but having it spread all over the code makes it hard to change the implementation. The patch abstracts this into a separate utils.OwnIpAddress(addr) function. We add an RPC call for it, which we use instead of the (single use of) call_node_tcp_ping. We leave node_tcp_ping in for now, as it seems useful; eventually it should be removed in a separate patch. Reviewed-by: imsnah
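The idea can be sketched like this, assuming that a TCP connection sourced from the loopback address can only reach addresses configured on the local host (true on typical Linux setups). The function names, the explicit `port` argument and the choice to count a refused connection as "reachable" are assumptions of this sketch, not the actual utils implementation.

```python
import errno
import socket


def tcp_ping(target, port, timeout=2.0, source=None):
    """Return True if `target` answers a TCP connect on `port`.

    A refused connection also counts as an answer: the address is
    there, only the port is closed (an assumption of this sketch).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source is not None:
            sock.bind((source, 0))
        sock.connect((target, port))
        return True
    except socket.error as err:
        return err.errno == errno.ECONNREFUSED
    finally:
        sock.close()


def own_ip_address(addr, port):
    """Check whether `addr` is ours by pinging it from localhost."""
    return tcp_ping(addr, port, source="127.0.0.1")
```

Keeping the check behind one function means the implementation could later be swapped (for example, for an interface lookup) without touching the callers.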
-
- 10 Oct, 2008 1 commit
-
-
Guido Trotter authored
Allow multiple API versions in an OS. This follows the OS API changes design doc, by which an OS can support multiple versions of the Ganeti API, and it will work if Ganeti supports one of them. Since the API up to version 5 mandates that an OS support only one version, this change is backward compatible with it and requires no version bump. Reviewed-by: iustinp
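A minimal sketch of the compatibility check, assuming the OS declares its versions one per line in a text file. The constant value, the file layout and the function names are illustrative assumptions.

```python
# Placeholder for the API version implemented by the running Ganeti.
SUPPORTED_API_VERSION = 5


def parse_api_versions(text):
    """Parse one integer version per non-empty line (assumed layout)."""
    return [int(line) for line in text.splitlines() if line.strip()]


def os_is_compatible(os_versions, supported=SUPPORTED_API_VERSION):
    """An OS works if any of its declared versions is the supported one."""
    return supported in os_versions
```

An OS definition that lists a single version is just the one-element case, which is why older definitions keep working unchanged.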
-
- 08 Oct, 2008 1 commit
-
-
Iustin Pop authored
This (big) patch moves the hypervisor type from the cluster to the instance level; the cluster attribute remains as the default hypervisor, and will be renamed accordingly in a later patch. The cluster also gains the ‘enable_hypervisors’ attribute, and instances can be created with any of the enabled ones (no provision yet for changing that attribute). The many changes in the rpc/backend layer are due to the fact that all backend code read the hypervisor from the local copy of the config, and now we have to send it (either in the instance object, or as a separate parameter) for each function. By default, the node list shows the node free/total memory for the default hypervisor; a new flag should be added to select another hypervisor. Instance list has a new field, hypervisor, that shows the instance hypervisor. Cluster verify runs for all enabled hypervisor types. The new FIXMEs are related to IAllocator, since now the node total/free/used memory counts are wrong (we can't reliably compute the free memory). Reviewed-by: imsnah
-
- 07 Oct, 2008 1 commit
-
-
Iustin Pop authored
Currently the call_instance_migrate call only passes the instance name; we need to pass the whole object for the hypervisor_type changes (all the other individual instance rpc calls already pass the instance object). Reviewed-by: imsnah
-
- 06 Oct, 2008 2 commits
-
-
Iustin Pop authored
Currently there are three functions in backend that need the cluster name in order to instantiate an SshRunner. The patch changes these to get the cluster name from the master in the rpc call; once the multi-hypervisor change is implemented, very few places in the backend will still need the SCR. Reviewed-by: killerfoxi, imsnah
-
Iustin Pop authored
More places actually use the SshRunner than just the gnt-cluster commands. Reviewed-by: ultrotter
-
- 01 Oct, 2008 3 commits
-
-
Michael Hanselmann authored
Get rid of ssconf and convert to configuration instead. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replacing ssconf with configuration. Reviewed-by: iustinp
-
Michael Hanselmann authored
Replacing ssconf with simpleconfig. Reviewed-by: iustinp
-
- 09 Sep, 2008 2 commits
-
-
Michael Hanselmann authored
Otherwise, corruption could occur in some corner cases. E.g. when LeaveNode is running in a child and is in the process of removing queue files, the main process gets killed, started again and gets a request to update the queue. This is a rather extreme corner case, but we should opt for safety. Reviewed-by: iustinp
-
Iustin Pop authored
The _GetMasterInfo() function needs to export the master name too to be useful in master safety checks. This patch makes it a public (no _) function and adds a third element in the return tuple. Its callers are modified too. Reviewed-by: imsnah
-
- 14 Aug, 2008 1 commit
-
-
Guido Trotter authored
It's handy to make the os scripts know which hypervisor the instance is going to run under. In order not to change the os API we pass this information in the environment, where the os scripts can access it if they're hypervisor-aware. Reviewed-by: imsnah
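As an illustration of passing the hypervisor through the environment rather than through new script arguments, here is a small sketch. The variable names `INSTANCE_NAME` and `HYPERVISOR` and both helper functions are hypothetical; the commit message does not state the real variable name.

```python
import os
import subprocess


def build_os_env(instance_name, hypervisor, base_env=None):
    """Return a copy of the environment extended for an OS script.

    The variable names used here are placeholders for this example.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["INSTANCE_NAME"] = instance_name
    env["HYPERVISOR"] = hypervisor
    return env


def run_os_script(argv, env):
    """Run an OS script with the prepared environment, return its stdout."""
    return subprocess.check_output(argv, env=env)
```

Scripts that are not hypervisor-aware simply ignore the extra variable, so the command-line interface of the OS scripts stays the same.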
-
- 08 Aug, 2008 7 commits
-
-
Michael Hanselmann authored
The lock should only be removed if ganeti-noded is going to quit. Otherwise it needs to be kept to prevent another process from creating it again while we're still holding the (removed) lock. This is due to POSIX filesystem semantics. Reviewed-by: iustinp
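The hazard can be reproduced with a short sketch, assuming the lock is an `fcntl.flock` lock on a lock file (the file name and helper names are made up for the example). Once the file is unlinked, a second opener creates a new inode under the same path and can lock it while the first lock is still held.

```python
import fcntl
import os
import tempfile


def acquire(path):
    """Open (creating if needed) and exclusively lock `path`, non-blocking."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


def demo_unlink_hazard():
    """Return True if two 'exclusive' locks end up held at the same time."""
    path = os.path.join(tempfile.mkdtemp(), "queue.lock")
    first = acquire(path)
    os.unlink(path)         # wrong: removing the file while holding the lock
    second = acquire(path)  # succeeds, because this is a brand new file
    both_held = os.fstat(first).st_ino != os.fstat(second).st_ino
    os.close(first)
    os.close(second)
    return both_held
```

Without the `os.unlink` call, the second `acquire` would fail, which is the behavior the lock is supposed to guarantee.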
-
Michael Hanselmann authored
The code cleaning the queue will make use of it. Reviewed-by: iustinp
-
Michael Hanselmann authored
This will be used to archive jobs. Reviewed-by: iustinp
-
Michael Hanselmann authored
Another function will need to check whether its parameters are job queue files. Reviewed-by: iustinp
-
Michael Hanselmann authored
The job queue is now updated through its own RPC functions. Reviewed-by: iustinp
-
Michael Hanselmann authored
jobqueue_update: Uploads a job queue file's content to a node. The most common operation is to upload something that we already have in a string. Unlike in the upload_file function, the file is not read again when distributing changes, but content has to be passed as a string. jobqueue_purge: Removes all queue related files from a node. Reviewed-by: iustinp
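A simplified sketch of the two operations, assuming the queue lives in a single flat directory. The path check, the atomic write through a temporary file and the explicit `queue_dir` argument are assumptions of this example rather than the backend implementation.

```python
import os
import tempfile


def _queue_path(queue_dir, file_name):
    """Return the real path of file_name; refuse files outside queue_dir."""
    queue_dir = os.path.realpath(queue_dir)
    path = os.path.realpath(file_name)
    if os.path.dirname(path) != queue_dir:
        raise ValueError("%s is not a job queue file" % file_name)
    return path


def jobqueue_update(queue_dir, file_name, content):
    """Write `content` (already a string) to a queue file atomically."""
    path = _queue_path(queue_dir, file_name)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        os.rename(tmp, path)
    except Exception:
        os.unlink(tmp)
        raise


def jobqueue_purge(queue_dir):
    """Remove every regular file in the queue directory."""
    for name in os.listdir(queue_dir):
        path = os.path.join(queue_dir, name)
        if os.path.isfile(path):
            os.unlink(path)
```

Taking the content as a string means the caller can send exactly what it just serialized, without the file being re-read from disk for each node.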
-
Michael Hanselmann authored
JobQueuePurge() will be used by an RPC function. Reviewed-by: iustinp
-
- 06 Aug, 2008 1 commit
-
-
Michael Hanselmann authored
Old job files shouldn't be left on nodes removed from a cluster. Reviewed-by: iustinp
-
- 31 Jul, 2008 1 commit
-
-
Michael Hanselmann authored
This is needed for job queue replication. Reviewed-by: iustinp
-
- 30 Jul, 2008 4 commits
-
-
Iustin Pop authored
This is mostly: whitespace fixes (space at EOL in some files, not all, broken indentation, etc.); variable names overriding others (one is a real bug in there); too-long lines; cleanup of most unused imports (not all). Reviewed-by: ultrotter
-
Iustin Pop authored
Reviewed-by: imsnah
-
Iustin Pop authored
This (big) patch reworks the master startup/shutdown and fixes the master failover. What does the patch do?
For master start/stop:
  - removes the old ganeti-master script and its associated man page
  - moves the ip start/stop directly into the backend.(Start|Stop)Master
  - adds start/stop of the master/rapi daemon into these functions, selectively based on the start/stop arguments
  - makes the master call via rpc StartMaster(start_daemons=False) to the local node so that the master IP is started
  - and finally changes the example init.d script to directly start and stop all three daemons, since they do the right thing (depending on master/not master role)
For master failover:
  - moves the code from LUMasterFailover into bootstrap.MasterFailover, since we need to start/stop the master during this operation and thus it can't be executed from the master
  - removes the LUMasterFailover and its associated opcode
Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not master' are not seen during startup on non-master nodes.
Reviewed-by: ultrotter
-
Iustin Pop authored
This patch adds a new, unused for now, parameter to the start and stop master operations in backend. The idea behind it is that we need to be able to control whether the IP (de)activation is coupled with daemon startup/shutdown. The callers are also modified to pass this parameter (even if unused for now). Reviewed-by: ultrotter
-
- 23 Jul, 2008 1 commit
-
-
Iustin Pop authored
This patch adds distribution of the queue serial file after each write to it (but before a new job is created and written with that ID, and before a response is returned, so we should be safe from crashes in between). Currently it only logs when a node cannot be contacted; it should abort if more than 50% of the nodes report errors. Reviewed-by: imsnah
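The suggested abort rule could look roughly like the sketch below. The `send` callable, the exception types and the function name are assumptions for illustration; the patch itself only logs failures.

```python
import logging


def distribute_serial(nodes, serial, send):
    """Send `serial` to every node; abort if most nodes fail.

    `send(node, serial)` is a hypothetical callable that raises
    OSError when a node cannot be contacted.
    """
    failed = []
    for node in nodes:
        try:
            send(node, serial)
        except OSError as err:
            logging.error("cannot update serial on %s: %s", node, err)
            failed.append(node)
    # Strictly more than half of the nodes failed: give up.
    if len(failed) * 2 > len(nodes):
        raise RuntimeError("serial update failed on %d of %d nodes"
                           % (len(failed), len(nodes)))
    return failed
```

Returning the list of failed nodes lets the caller decide whether to retry them later, while the majority check keeps a new job ID from being handed out when most of the cluster never saw it.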
-
- 11 Jul, 2008 3 commits
-
-
Iustin Pop authored
The patch also switches some of the exception logs to use logging.exception (and therefore the log message will have a different format). (Note that this might not be a good choice in all cases, though.) Reviewed-by: imsnah
-
Iustin Pop authored
This is the same fix as for GetVolumeList. I've checked manually and all other places that call lvm commands are already checking the output validity in terms of correct number of fields. Reviewed-by: ultrotter
-
Iustin Pop authored
Sometimes ‘lvs’ can spit error messages on stdout, even when one wants to parse the output: "... Inconsistent metadata copies found - updating to use version 2776 ..." So we need to validate the output to guard against such cases. The patch converts the split on the separator into a match against a regex, and extracts the fields via groups. The original separator choice is a bad one now :( Reviewed-by: imsnah
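A minimal sketch of the regex-based validation, assuming three fields (name, size, attributes) separated by `|`. The field layout, the separator and the function name are assumptions for this example and may not match what the patch uses.

```python
import re

# name | size | 6-character attribute string (assumed layout)
_LVS_LINE = re.compile(r"^ *([^|]+)\|([0-9.]+)\|([^|]{6})\|?$")


def parse_lvs_output(output):
    """Return {name: (size, attr)}, skipping lines that do not validate."""
    volumes = {}
    for line in output.splitlines():
        match = _LVS_LINE.match(line)
        if match is None:
            # Stray diagnostics printed on stdout end up here.
            continue
        name, size, attr = match.groups()
        volumes[name] = (float(size), attr)
    return volumes
```

Compared with a plain `split`, the regex rejects a line unless every field has the expected shape, so an error message that happens to contain the separator is far less likely to be parsed as a volume.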
-
- 27 Jun, 2008 2 commits
-
-
Guido Trotter authored
What could possibly go wrong? Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
- 20 Jun, 2008 1 commit
-
-
Iustin Pop authored
This patch adds rpc layer calls (in rpc.py and the equivalent in ganeti-noded) to close a list of block devices, and the wrapper in backend.py that takes a list of Disk objects, identifies them and returns correctly formatted results. The reason why this very basic call was missing until now from the rpc layer is that we usually don't care about device closes (though we should, and will do so in the future) as only drbd has a meaningful Close() operation; right now we directly do Shutdown(). The patch is clean enough that it's actually independent of the live migration implementation. Reviewed-by: imsnah
-
- 16 Jun, 2008 2 commits
-
-
Iustin Pop authored
This patch adds a wrapper over the block device grow operation that converts the input and output parameters as needed for the rpc layer. Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds the migration rpc call and its implementation in the backend. The patch does not deal with the correct activation of disks. Because of the new RPC, the protocol version is increased. Reviewed-by: imsnah
-