- Jan 21, 2009
-
-
Iustin Pop authored
Currently the rpc module logs only the error description and the target node when an RPC call fails, for example: 2009-01-21 00:50:01,456: pid=1051/Thread-21 ERROR RPC error from node node1.example.com: Connection failed (111: Connection refused) but this doesn't help to understand which call caused the error (here it's an offline node which should not be contacted at all). This patch adds the logging of the call too, so cases like the above can be debugged more easily. Reviewed-by: imsnah, ultrotter
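A minimal sketch of the idea, not the actual patch: the helper name `format_rpc_error` and the call name `instance_list` are assumptions made for illustration, and the log format only imitates the line quoted above.

```python
import logging


def format_rpc_error(node, call, error):
    """Hypothetical helper: name the failing RPC call next to the node."""
    return "RPC error in %s from node %s: %s" % (call, node, error)


# Format chosen to resemble the quoted log line (an assumption).
logging.basicConfig(
    format="%(asctime)s: pid=%(process)d/%(threadName)s %(levelname)s %(message)s")
logging.error(format_rpc_error(
    "node1.example.com", "instance_list",
    "Connection failed (111: Connection refused)"))
```

With the call name in the message, a refused connection to an offline node can be traced back to the code path that should not have contacted it.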
-
Iustin Pop authored
Due to historic reasons, the “should run or not” attribute of an instance was denoted by its “status” attribute having a string value of either ‘up’ or ‘down’. Checking this in code was done via hardcoded strings. This was long due for a redo, and this patch changes this attribute to “admin_up” having a boolean value. The patch is in fact shorter than I expected, and passes burnin. The patch also fixes an error in BuildInstanceHookEnvByObject where the instance.os was passed as the status value. Reviewed-by: ultrotter
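The change from a string to a boolean can be sketched roughly as follows. The `Instance` class and the `upgrade_status` helper are hypothetical stand-ins, not the real Ganeti objects.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical, stripped-down instance object."""
    name: str
    admin_up: bool  # replaces the old string-valued "status" attribute


def upgrade_status(status):
    """Map the legacy 'up'/'down' string to the new boolean."""
    if status not in ("up", "down"):
        raise ValueError("unknown instance status: %r" % (status,))
    return status == "up"


inst = Instance(name="instance1.example.com", admin_up=upgrade_status("up"))
if inst.admin_up:  # no more comparisons against hardcoded strings
    print("%s should be running" % inst.name)
```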
-
Guido Trotter authored
MigrationInfo, AcceptInstance and AbortMigration are implemented as hypervisor specific functions, and by default they do nothing (as they're not always necessary). This patch also converts hv_base.MigrateInstance docstring to epydoc, adds a missing @type to the GetInstanceInfo docstring, and removes an unneeded empty line. Reviewed-by: iustinp
-
Guido Trotter authored
At instance startup time we save the kvm runtime, and at stop time we delete it. This patch also includes a function to load the kvm runtime, which is not used yet. Reviewed-by: iustinp
-
Guido Trotter authored
Previously we generated the kvm command line and then just ran it. With this patch we split the generation from the time it is run, allowing us to save it and replay it at reboot. We must take special care with instance nics: - We can't include them in the saved command line, as they point to temporary files - We can't just generate them at exec time, because we would apply those changes, but not all the other ones, to a running instance, thus making it inconsistent (for example, if an instance had its memory increased and one more nic added, a soft reboot would add the nic but not the memory) So we'll just save the instance nic data at the time the kvm runtime data is generated, and transform it into actual parameters at execution time. Reviewed-by: iustinp
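The split between generation time and execution time might look roughly like this. It is a sketch under assumptions: the function names, the JSON serialization and the exact `-net` arguments are illustrative and not taken from the patch.

```python
import json


def generate_runtime(name, memory_mb, nics):
    """Generation time: build the base command and keep NIC data unexpanded."""
    cmd = ["kvm", "-name", name, "-m", str(memory_mb)]
    # NIC data is stored as plain data, not as command-line arguments,
    # because the arguments would point to temporary files.
    return json.dumps({"cmd": cmd, "nics": nics})


def build_exec_args(serialized, script_for_nic):
    """Execution time: turn the saved NIC data into actual parameters."""
    runtime = json.loads(serialized)
    args = list(runtime["cmd"])
    for idx, nic in enumerate(runtime["nics"]):
        args += ["-net", "nic,macaddr=%s" % nic["mac"]]
        args += ["-net", "tap,script=%s" % script_for_nic(idx)]
    return args


saved = generate_runtime("instance1", 512, [{"mac": "aa:00:00:00:00:01"}])
print(build_exec_args(saved, lambda idx: "/tmp/nic%d-ifup" % idx))
```

Because memory and NIC data are saved together, a soft reboot replays one consistent snapshot instead of mixing old and new settings.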
-
Guido Trotter authored
Currently the hypervisor is expected to do all the migration from the source side. With this patch we also add the option of passing some information to the target side, and starting some operation there. As a bonus, a function to clean up any started operation is included. Reviewed-by: iustinp
-
Iustin Pop authored
With the addition of minors, this needs to show them too. Reviewed-by: ultrotter
-
- Jan 20, 2009
-
-
Guido Trotter authored
Currently we keep pid files and control files. In the conf dir we'll also keep the data to start the instance anew, and the network interface scripts. These will then be copied to a separate area (since _CONF_DIR could be mounted 'noexec') and used to start the instance. This patch also adds comments to state what the various directories are used for. Reviewed-by: iustinp
-
Guido Trotter authored
Abstract the monitor and serial socket naming into two functions, and reuse them to clean up the files after shutdown. Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
When StopInstance raises a HypervisorError, report it in the logged message to ease debugging. Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Guido Trotter authored
Reviewed-by: iustinp
-
Iustin Pop authored
(this is related to the master daemon log) Currently it's not possible to follow (in the non-debug runs) the logical execution thread of jobs. This is due to the fact that we don't log the thread name (so we lose the association of log messages to jobs) and we don't log the start/stop of job and opcode execution. This patch adds a new parameter to utils.SetupLogging that enables thread name logging, and promotes some log entries from debug to info. With this applied, it's easier to understand which log messages relate to which jobs/opcodes. The patch also moves the "INFO client closed connection" entry to debug level, since it's not a very informative log entry. Reviewed-by: ultrotter
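A rough sketch of optional thread name logging using the standard `logging` module. The function `setup_logging` and its `multithreaded` parameter are assumptions for illustration and do not reproduce the real utils.SetupLogging signature.

```python
import io
import logging


def setup_logging(stream, debug=False, multithreaded=False):
    """Hypothetical setup: optionally add the thread name to each entry."""
    fmt = "%(asctime)s: pid=%(process)d"
    if multithreaded:
        fmt += "/%(threadName)s"  # ties each log entry to the thread running a job
    fmt += " %(levelname)s %(message)s"
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(fmt))
    logger = logging.getLogger("sketch")
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    return logger


buf = io.StringIO()
log = setup_logging(buf, multithreaded=True)
log.info("job started")    # visible in non-debug runs
log.debug("hidden entry")  # dropped unless debug=True
print(buf.getvalue(), end="")
```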
-
Michael Hanselmann authored
This way newly added files will not be excluded by default. Also fixes a small whitespace error in utils.py. Reviewed-by: iustinp
-
Iustin Pop authored
This allows the rename failures to show the output of OS scripts. Reviewed-by: ultrotter
-
Iustin Pop authored
As per Michael's comment, gitignore should not ignore a couple of real files from the autotools/ directory. Reviewed-by: ultrotter
-
Iustin Pop authored
The ConfigWriter.AllocateDRBDMinor requires the instance name, not the instance object. The LUSetInstanceParms is wrongly passing the instance object, which can cause breakage. The patch also adds asserts to check for this mismatch in ConfigWriter. Reviewed-by: ultrotter
-
Iustin Pop authored
The urllib2 module has very bad error handling. This patch changes to urllib, which is simpler, and we derive a custom class from FancyURLopener. With this patch, burnin no longer keeps sockets in CLOSE_WAIT state. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch adds support for verification of the drbd minors space in cluster verify: minors which belong to running instances and should be online but are not, and minors which do not belong to any instance but are in use. The patch requires exposing some methods from bdev.DRBD8 and config.ConfigWriter which were until now private methods. Reviewed-by: ultrotter
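The two checks described above can be sketched as a comparison between the configured minors and the minors actually in use on a node. The function name and data shapes below are assumptions, not the real cluster verify code.

```python
def verify_minors(expected, used):
    """Compare configured DRBD minors against those in use on one node.

    expected: dict mapping minor -> (instance name, should_be_active)
    used: set of minors currently in use on the node
    """
    # Minors of running instances that should be online but are not.
    missing = sorted(minor for minor, (_, active) in expected.items()
                     if active and minor not in used)
    # Minors in use that do not belong to any instance.
    unallocated = sorted(minor for minor in used if minor not in expected)
    return missing, unallocated


expected = {0: ("instance1", True), 1: ("instance2", False), 2: ("instance3", True)}
print(verify_minors(expected, used={0, 5}))
```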
-
Iustin Pop authored
Reviewed-by: ultrotter
-
Iustin Pop authored
In order to prevent errors with old, in-use DRBD minors, we check and abort at create time if our minor is already in use. For this we need to also modify DRBD8Status to be able to parse cs:Unconfigured devices. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch adds a tail file function, to be used for parsing and returning in the job log OS installation failures. Reviewed-by: ultrotter
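A minimal sketch of such a tail function. The name `tail_file` and the default of 20 lines are assumptions, not the values used in the patch.

```python
import collections
import os
import tempfile


def tail_file(path, lines=20):
    """Return the last `lines` lines of the file at `path`, without newlines."""
    with open(path, "r", errors="replace") as fd:
        # A bounded deque keeps only the most recent `lines` entries,
        # so the whole file is never held in memory.
        return [line.rstrip("\n") for line in collections.deque(fd, maxlen=lines)]


# Usage: write a small log file and read back its last three lines.
fd, log_path = tempfile.mkstemp()
with os.fdopen(fd, "w") as out:
    out.writelines("line %d\n" % i for i in range(30))
print(tail_file(log_path, lines=3))
os.remove(log_path)
```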
-
Iustin Pop authored
This patch adds unified temporary file handling to the testutils.GanetiTestCase class, which adds easy creation and automated cleanup of temporary files. The patch allows a simpler handling in a couple of test cases but requires all child classes to call the parent setUp and tearDown methods. Reviewed-by: ultrotter
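The pattern can be sketched as below. The class and method names are hypothetical and only illustrate why child classes must call the parent setUp and tearDown.

```python
import os
import tempfile
import unittest


class TempFileTestCase(unittest.TestCase):
    """Hypothetical base class with automated temporary file cleanup."""

    def setUp(self):
        self._temp_files = []

    def tearDown(self):
        while self._temp_files:
            try:
                os.remove(self._temp_files.pop())
            except FileNotFoundError:
                pass  # the test already removed the file itself

    def _create_temp_file(self):
        fd, path = tempfile.mkstemp()
        os.close(fd)
        self._temp_files.append(path)
        return path


class ExampleTest(TempFileTestCase):
    def setUp(self):
        super().setUp()  # required, otherwise _temp_files does not exist
        self.path = self._create_temp_file()

    def test_exists(self):
        self.assertTrue(os.path.exists(self.path))
```

A child class that overrides tearDown needs the same call to the parent, or its files will not be removed.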
-
Iustin Pop authored
Reviewed-by: ultrotter
-
Iustin Pop authored
This allows instance install and reinstall to return (hopefully) relevant log files from the OS create scripts. Reviewed-by: ultrotter
-
Iustin Pop authored
This will record the failure cause in starting up the instance in the job log (and thus to the user). Reviewed-by: ultrotter
-
- Jan 19, 2009
-
-
Iustin Pop authored
Commit 2302 only modified _CreateBlockDevOnPrimary to the new style result, but _CreateBlockDevOnSecondary was forgotten. After the merger of the two functions, _CreateBlockDevOnSecondary was taken as template so we checked against old-style values, thus completely breaking error handling. Reviewed-by: imsnah
-
Iustin Pop authored
Instead of having the default live in the gnt-cluster script, we move it to the constants file. The patch also fixes a typo in constants.py. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch replaces a few obvious uses of [instance.primary_node] + list(instance.secondary_nodes) (or similar usage) with the new instance.all_nodes. Reviewed-by: ultrotter
-
Iustin Pop authored
Commit 2294 introduced a new instance.all_nodes property, which unfortunately is working incorrectly for non-drbd instances. This patch fixes it by making sure the primary node is always added to the set, even before recursing over (any potential) children. Reviewed-by: imsnah
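A simplified sketch of the fixed property, with hypothetical `Disk` and `Instance` classes in place of the real objects: the primary node is added first, and the recursion over children only adds further nodes.

```python
class Disk:
    """Hypothetical disk: `nodes` is non-empty only for DRBD-like devices."""

    def __init__(self, nodes=(), children=()):
        self.nodes = tuple(nodes)
        self.children = list(children)


class Instance:
    def __init__(self, primary_node, disks):
        self.primary_node = primary_node
        self.disks = disks

    @property
    def all_nodes(self):
        # Always start with the primary node, so that non-DRBD instances
        # (whose disks name no nodes) still report it.
        nodes = {self.primary_node}

        def walk(devices):
            for dev in devices:
                nodes.update(dev.nodes)
                walk(dev.children)

        walk(self.disks)
        return nodes


plain = Instance("node1", [Disk()])
drbd = Instance("node1", [Disk(nodes=("node1", "node2"), children=[Disk(), Disk()])])
print(sorted(plain.all_nodes), sorted(drbd.all_nodes))
```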
-
Iustin Pop authored
We don't need to pre-create the node entries in lvmap, since they will be created at recursion time. Reviewed-by: ultrotter
-
Iustin Pop authored
Some callers of _CreateBlockDev need recursive behaviour, but not all. The replace secondary first creates (manually) new LVs to ensure storage is there, and then it creates the new DRBD. At this point, we need a non-recursive call so that the LVs are not needlessly re-created. This patch splits the single device creation into a separate function, so that LUReplaceDisks can use it. Reviewed-by: ultrotter
-
Iustin Pop authored
Since only two boolean parameters differ between these two functions, we combine them so as to have less code duplication. This will be needed in the future as we will need to split off the recursive part. Reviewed-by: ultrotter
-
Iustin Pop authored
This allows errors to be visible at the user level instead of just node daemon logs. Reviewed-by: ultrotter
-
Iustin Pop authored
For future propagation of error messages from backend to cmdlib and to the job log, just having True/False return from the disk creation function is not enough. This patch converts these functions (_CreateDisks, _CreateBlockDevOnXXX) to raise exception on errors, and otherwise the return value is None. Reviewed-by: ultrotter
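The conversion from boolean results to exceptions can be sketched as follows. `BlockDeviceError`, `create_disks` and the `(ok, message)` tuple returned by `create_one` are all assumptions for illustration.

```python
class BlockDeviceError(Exception):
    """Hypothetical error carrying the backend failure message."""


def create_disks(disks, create_one):
    """Create every disk; raise on the first failure, return None otherwise.

    create_one(disk) is assumed to return an (ok, message) tuple.
    """
    for disk in disks:
        ok, msg = create_one(disk)
        if not ok:
            # The message travels with the exception and can reach the job log,
            # which a bare False return value could not provide.
            raise BlockDeviceError("Can't create disk %s: %s" % (disk, msg))


try:
    create_disks(["disk0", "disk1"], lambda d: (d != "disk1", "no space left"))
except BlockDeviceError as err:
    print(err)
```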
-
Iustin Pop authored
Currently when creating LVM-based instances, we always get the extremely confusing message "ERROR Can't find LV /dev/xenvg/..." which is actually expected. This behaviour was introduced before we had UUID-style LV names, since at that point it was not unexpected to have such volumes lying around after a failed creation. Today, it's much more of an error to see existing volumes, and it's better to abort with a failure. Since the bdev.LogicalVolume.Create() method will raise an error if the volume exists, we can remove this check in backend before creating the device. The Create methods for DRBD and FileStorage currently don't raise exceptions, as behaviour is not very well defined here. We also change some exception types raised in bdev so that all exceptions raised by device creation are a subclass of GenericError. Reviewed-by: ultrotter
-
Iustin Pop authored
Currently we use a different UUID for the _data and _meta volumes of a DRBD disk. This is confusing as it's hard to associate the two in the output of “lvs” or “gnt-node volumes”. The patch changes this so that they use the same prefix. Reviewed-by: ultrotter
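A small sketch of the naming idea. The helper name and the exact name layout (`<uuid>.disk<N>_data`) are assumptions, not confirmed by the commit message.

```python
import uuid


def drbd_lv_names(disk_index):
    """Generate one UUID and reuse it as the prefix for both volumes."""
    prefix = "%s.disk%d" % (uuid.uuid4(), disk_index)
    return prefix + "_data", prefix + "_meta"


data_lv, meta_lv = drbd_lv_names(0)
print(data_lv)
print(meta_lv)
```

With a shared prefix, sorting the volume list places the data and metadata volumes of one disk next to each other.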
-
- Jan 16, 2009
-
-
Iustin Pop authored
Due to deficiencies in our block device implementation, SetDiskID must be called on disks before passing them to remote nodes. Since in export/import we don't touch the disks themselves, this was not needed before in this function. However, since the introduction of instance symlinks, the correct ID is needed here too, and with static minors it is mandatory. The missing call shows up as failed instance starts after migration and/or failover. Reviewed-by: ultrotter
-