- Jan 23, 2009
-
-
Iustin Pop authored
When creating ‘fake’ results for offline nodes, we currently don't pass the call attribute. This complicates debugging, so even though this should not matter in practice, it's better to fix it. Reviewed-by: imsnah
-
Iustin Pop authored
This removes some constraints:
  - only two disks supported; this is no longer true, as the underlying functions can now compute the size for a variable number of disks
  - error when the hypervisor was not being passed
  - typo error
Reviewed-by: imsnah
-
- Jan 22, 2009
-
-
Iustin Pop authored
This is less of an actual issue for regular gnt-* clients, but it's easily reproducible with burnin and possible with RAPI (depending on how the program uses luxi.Client(s)). In the case of burnin, if we interrupt the client (^C) while it polls the job, it will abort and raise an error. After that, burnin issues a remove instance job; at this point we send the submit job (remove) call, but the first thing we read from the socket is the response to the previous poll job request, since the master had already queued it. To solve this, whenever we detect an error in Transport.Call(), we close that transport and create a new one, to start anew. The alternative would be to introduce a sequence number to the protocol, but this would be a design-level change and is not recommended at this stage. Reviewed-by: imsnah
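The recovery strategy described above can be sketched as follows. This is a minimal illustration, not the actual luxi code: `ReconnectingClient`, `transport_factory`, and the `call`/`close` methods on the transport are hypothetical names chosen for the example.

```python
class ReconnectingClient:
    """Drops its transport after any failed call (hypothetical sketch)."""

    def __init__(self, transport_factory):
        # transport_factory is an assumed zero-argument callable that
        # returns an object with call(request) and close() methods.
        self._factory = transport_factory
        self._transport = None

    def _get_transport(self):
        # Create the transport lazily, and again after every failure.
        if self._transport is None:
            self._transport = self._factory()
        return self._transport

    def call(self, request):
        transport = self._get_transport()
        try:
            return transport.call(request)
        except BaseException:
            # BaseException also covers KeyboardInterrupt (^C). A reply to
            # the interrupted request may still arrive on this connection,
            # so it must not be reused for the next request.
            try:
                transport.close()
            finally:
                self._transport = None
            raise
```

The next `call()` after a failure transparently opens a fresh connection, so a stale reply can never be matched to a new request.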
-
- Jan 21, 2009
-
-
Guido Trotter authored
When an instance fails to shut down we currently log its whole object, rather than just the instance name. Reviewed-by: iustinp
-
Guido Trotter authored
If the KVM live migration ends up in a 'failed' state, it has been aborted at the kvm level and the machine is still running locally. We also support the 'cancelled' state, even though there should be no way of reaching it without manual intervention. Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
The tcp port used for migrating KVM instances is selectable at ./configure time. We use a single port as nodes are locked anyway during a migration, so no two migrations can happen at the same time to the same node. Reviewed-by: iustinp
-
Guido Trotter authored
Throughout the kvm code we very often look for the instance pidfile name, read it, and check if the process is alive. Abstract this into a private function and use that one instead. This patch also changes RebootInstance to check whether the instance is alive before trying to reboot it. Reviewed-by: iustinp
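A helper of the kind described above might look like the following sketch. The function names and the return shape are assumptions made for illustration, not the names used in the kvm hypervisor code; the liveness check relies on POSIX signal 0 semantics.

```python
import os


def read_pid(pidfile):
    """Return the pid stored in pidfile, or 0 if it is missing or invalid."""
    try:
        with open(pidfile) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return 0


def is_process_alive(pid):
    """Check for a live process with signal 0 (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def instance_pid_alive(pidfile):
    """Return (pidfile, pid, alive) for a hypothetical instance pidfile."""
    pid = read_pid(pidfile)
    return pidfile, pid, is_process_alive(pid)
```

Callers such as a reboot routine can then test the third element before acting on the instance.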
-
Guido Trotter authored
RebootInstance was broken, because it just used to call StartInstance with wrong parameters. With this patch we still stop the instance, but use the saved kvm runtime to start it again. Reviewed-by: iustinp
-
Guido Trotter authored
When we ask the instance to shut down, sometimes the command won't work, especially if the instance isn't fully booted up. We'll wait for a bit, and give it a few chances before giving up. Reviewed-by: iustinp
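The retry behaviour can be sketched roughly as below. The callables `send_shutdown` and `is_alive`, and the default attempt count and delay, are assumptions for the example; the message above does not state the actual values.

```python
import time


def stop_with_retries(send_shutdown, is_alive, attempts=5, delay=1.0,
                      sleep=time.sleep):
    """Ask an instance to shut down, retrying a few times.

    Returns True if the instance stopped, False if we gave up.
    """
    for _ in range(attempts):
        if not is_alive():
            return True
        send_shutdown()  # may be ignored if the guest is still booting
        sleep(delay)     # give the guest time to react
    # Final check after the last attempt.
    return not is_alive()
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real waiting.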
-
Guido Trotter authored
These are used, for the xen hypervisor, to copy the xen config file to the remote node. This breaks migration for instances which have been migrated, but not restarted, with the old code, for which the config file was just lost. Reviewed-by: iustinp
-
Iustin Pop authored
This patch converts the DRBD minors reservation protocol from explicit release to automatic release on the success paths. On the error paths, a manual release is still needed. The patch doesn't bring much by itself, but is needed for a future patch which enhances the automatic verification of configuration consistency. Reviewed-by: ultrotter
-
Iustin Pop authored
Two are real errors (invalid names) and one is a style error (overriding a name from the outer scope). Reviewed-by: ultrotter
-
Iustin Pop authored
This was forgotten in the recent “switch to explicit ignore rules”. Reviewed-by: imsnah
-
Iustin Pop authored
Currently the rpc module logs the error description and target node when logging rpc calls, as such:
  2009-01-21 00:50:01,456: pid=1051/Thread-21 ERROR RPC error from node node1.example.com: Connection failed (111: Connection refused)
but this doesn't help to understand which call caused the error (here it's an offline node which should not be contacted at all). This patch adds logging of the call too, so cases like the above can be debugged more easily. Reviewed-by: imsnah, ultrotter
-
Iustin Pop authored
For historical reasons, the “should run or not” attribute of an instance was denoted by its “status” attribute having a string value of either ‘up’ or ‘down’. Checking this in code was done via hardcoded strings. This was long due for a redo, and this patch changes the attribute to “admin_up”, which has a boolean value. The patch is in fact shorter than I expected, and passes burnin. The patch also fixes an error in BuildInstanceHookEnvByObject where instance.os was passed as the status value. Reviewed-by: ultrotter
-
Guido Trotter authored
MigrationInfo, AcceptInstance and AbortMigration are implemented as hypervisor specific functions, and by default they do nothing (as they're not always necessary). This patch also converts hv_base.MigrateInstance docstring to epydoc, adds a missing @type to the GetInstanceInfo docstring, and removes an unneeded empty line. Reviewed-by: iustinp
-
Guido Trotter authored
At instance startup time we save the kvm runtime, and at stop time we delete it. This patch also includes a function to load the kvm runtime, which is not used yet. Reviewed-by: iustinp
-
Guido Trotter authored
Previously we generated the kvm command line and then just ran it. With this patch we split the generation from the time it is run, allowing us to save it and replay it at reboot. We must take special care with instance nics:
  - We can't include them in the saved command line, as they point to temporary files
  - We can't just generate them at exec time, because we would apply those changes, but not all the other ones, to a running instance, thus making it inconsistent (for example, if an instance had its memory increased and one more nic added, in a soft reboot we would add the nic, but not the memory)
So we'll just save the instance nic data at the time the kvm runtime data is generated, and transform it into actual parameters at execution time.
Reviewed-by: iustinp
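The split between generation and execution could look something like this sketch. The function names, the JSON serialization, the nic dictionary layout, and the `make_script` callback are all assumptions for illustration and are not taken from the patch; the `-net` arguments are shown only as an example of expanding saved nic data at exec time.

```python
import json


def save_runtime(kvm_cmd, nics):
    """Serialize the generated command line together with raw nic data."""
    return json.dumps({"cmd": kvm_cmd, "nics": nics})


def load_runtime(data):
    """Return (kvm_cmd, nics) as they were saved at generation time."""
    loaded = json.loads(data)
    return loaded["cmd"], loaded["nics"]


def build_exec_cmd(kvm_cmd, nics, make_script):
    """Expand saved nic data into arguments just before execution.

    make_script(index, nic) is an assumed callback that writes a temporary
    interface script and returns its path.
    """
    cmd = list(kvm_cmd)  # do not modify the saved command line
    for idx, nic in enumerate(nics):
        script = make_script(idx, nic)
        cmd += ["-net", "nic,macaddr=%s" % nic["mac"],
                "-net", "tap,script=%s" % script]
    return cmd
```

Because the nic data is frozen alongside the command line, a replay at reboot sees the same memory and nic configuration that was in effect when the runtime was generated.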
-
Guido Trotter authored
Currently the hypervisor is expected to do all the migration from the source side. With this patch we also add the option of passing some information to the target side, and starting some operation there. As a bonus, a function to clean up any started operation is included. Reviewed-by: iustinp
-
Iustin Pop authored
With the addition of minors, this needs to show them too. Reviewed-by: ultrotter
-
- Jan 20, 2009
-
-
Guido Trotter authored
Currently we keep pid files and control files. In the conf dir we'll also keep the data to start the instance anew, and the network interface scripts. These will then be copied to a separate area (since _CONF_DIR could be mounted 'noexec') and used to start the instance. This patch also adds comments to state what the various directories are used for. Reviewed-by: iustinp
-
Guido Trotter authored
Abstract the monitor and serial socket naming into two functions, and reuse them to clean up the files after shutdown. Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
When StopInstance raises a HypervisorError, report it in the logged message to ease debugging. Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Iustin Pop authored
(this is related to the master daemon log) Currently it's not possible to follow (in the non-debug runs) the logical execution thread of jobs. This is due to the fact that we don't log the thread name (so we lose the association of log messages to jobs) and we don't log the start/stop of job and opcode execution. This patch adds a new parameter to utils.SetupLogging that enables thread name logging, and promotes some log entries from debug to info. With this applied, it's easier to understand which log messages relate to which jobs/opcodes. The patch also moves the "INFO client closed connection" entry to debug level, since it's not a very informative log entry. Reviewed-by: ultrotter
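Optional thread-name logging can be sketched with the standard `logging` module as below. The `make_formatter` function and its `multithreaded` parameter are hypothetical; the actual parameter added to utils.SetupLogging is not named in the message above.

```python
import logging


def make_formatter(multithreaded=False):
    """Build a log formatter, optionally including the thread name."""
    fmt = "%(asctime)s: pid=%(process)d"
    if multithreaded:
        # With the thread name present, log messages can be associated
        # with the job that a given worker thread is executing.
        fmt += "/%(threadName)s"
    fmt += " %(levelname)s %(message)s"
    return logging.Formatter(fmt)
```

A handler configured with `make_formatter(True)` prefixes each entry with `pid=<pid>/<thread name>`, which is enough to follow one job through an interleaved log.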
-
Michael Hanselmann authored
This way newly added files will not be excluded by default. Also fixes a small whitespace error in utils.py. Reviewed-by: iustinp
-
Iustin Pop authored
This allows rename failures to show the output of OS scripts. Reviewed-by: ultrotter
-
Iustin Pop authored
As per Michael's comment, gitignore should not ignore a couple of real files from the autotools/ directory. Reviewed-by: ultrotter
-
Iustin Pop authored
ConfigWriter.AllocateDRBDMinor requires the instance name, not the instance object. LUSetInstanceParms wrongly passes the instance object, which can cause breakage. The patch also adds asserts to check for this mismatch in ConfigWriter. Reviewed-by: ultrotter
-
Iustin Pop authored
The urllib2 module has very bad error handling. This patch switches to urllib, which is simpler, and we derive a custom class from FancyURLopener. With this patch, burnin no longer keeps sockets in the CLOSE_WAIT state. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch adds support for verification of the drbd minors space in cluster verify: minors which belong to running instances and should be online but are not, and minors which do not belong to any instance but are in use. The patch requires exposing some methods from bdev.DRBD8 and config.ConfigWriter which were private until now. Reviewed-by: ultrotter
-
Iustin Pop authored
Reviewed-by: ultrotter
-
Iustin Pop authored
In order to prevent errors with old, in-use DRBD minors, we check and abort at create time if our minor is already in use. For this we need to also modify DRBD8Status to be able to parse cs:Unconfigured devices. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch adds a tail file function, to be used for parsing and returning in the job log OS installation failures. Reviewed-by: ultrotter
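A tail function of this kind can be sketched in a few lines. The name `tail_file`, its default of 20 lines, and the return type are assumptions for the example rather than the signature added by the patch.

```python
from collections import deque


def tail_file(path, lines=20):
    """Return the last `lines` lines of a text file as a list of strings."""
    # A bounded deque keeps only the most recent lines while the file is
    # read sequentially, so memory use stays small for large logs.
    with open(path, errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=lines)]
```

The resulting list can be appended to a job log entry when an OS installation script fails.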
-
Iustin Pop authored
This patch adds unified temporary file handling to the testutils.GanetiTestCase class, which adds easy creation and automated cleanup of temporary files. The patch allows a simpler handling in a couple of test cases but requires all child classes to call the parent setUp and tearDown methods. Reviewed-by: ultrotter
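The pattern can be illustrated with a small `unittest` sketch. The class and method names below are hypothetical and are not the ones used in testutils.GanetiTestCase; the point is the requirement that child classes call the parent setUp and tearDown.

```python
import os
import tempfile
import unittest


class TempFileTestCase(unittest.TestCase):
    """Base class that tracks and removes temporary files (sketch)."""

    def setUp(self):
        self._temp_files = []

    def tearDown(self):
        # Remove every file created through make_temp_file().
        while self._temp_files:
            try:
                os.unlink(self._temp_files.pop())
            except FileNotFoundError:
                pass  # the test already removed it

    def make_temp_file(self):
        fd, path = tempfile.mkstemp()
        os.close(fd)
        self._temp_files.append(path)
        return path


class ExampleTest(TempFileTestCase):
    def setUp(self):
        # Child classes must call the parent setUp, otherwise the
        # bookkeeping list does not exist.
        super().setUp()
        self.path = self.make_temp_file()

    def test_file_exists(self):
        self.assertTrue(os.path.exists(self.path))
```

A child class that overrides tearDown would likewise need to call `super().tearDown()` to keep the automatic cleanup.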
-
Iustin Pop authored
Reviewed-by: ultrotter
-