- Nov 08, 2007
-
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
- Nov 07, 2007
-
-
Iustin Pop authored
This (big) patch does two things:
- add "local disk status" to the block device checks (BlockDevice.GetSyncStatus and the rpc calls that call this function, and therefore cmdlib._CheckDiskConsistency)
- improve the drbd8 secondary replace operation using the above functionality

The "local disk status" adds a new variable to the result of GetSyncStatus that shows the degradation of the local storage of the device. Of course, not all devices support this: for now, we only modify LogicalVolumes and DRBD8 to return degraded in some cases; other devices always return non-degraded. This variable should imply is_degraded: whenever this variable is true, is_degraded should also be true.

The drbd8 secondary replace uses this variable as we don't care if the primary drbd device is network-degraded, only if it has good local disk data (ldisk is False).

The patch also increases the protocol version (due to rpc changes).

Reviewed-by: imsnah
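The invariant described in this message (a degraded local disk implies is_degraded, but not the reverse) can be sketched as follows. This is only an illustration: the commit does not show the real shape of the GetSyncStatus result, so the tuple layout, `make_status`, and `primary_usable_for_secondary_replace` are assumptions; only the names `is_degraded` and `ldisk` come from the message.

```python
from typing import NamedTuple, Optional


class SyncStatus(NamedTuple):
    """Hypothetical result layout; only is_degraded and ldisk are named in the log."""
    sync_percent: Optional[float]
    estimated_time: Optional[int]
    is_degraded: bool
    ldisk: bool  # True when the local storage of the device is degraded


def make_status(sync_percent, estimated_time, net_degraded, ldisk):
    # Enforce the invariant: whenever ldisk is true, is_degraded is true as well.
    return SyncStatus(sync_percent, estimated_time, net_degraded or ldisk, ldisk)


def primary_usable_for_secondary_replace(status):
    # Network degradation is acceptable here; only bad local data is not.
    return not status.ldisk
```

A device that is merely disconnected from its peer is still accepted as the data source, while one with a failed local disk is rejected.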
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
- Nov 06, 2007
-
-
Michael Hanselmann authored
-
Michael Hanselmann authored
Replace --secondary-node option with an optional parameter for --node.
-
Iustin Pop authored
This patch adds enhanced reporting and much more checks to the disk replacement (when not switching the secondary). Reviewed-by: imsnah
-
Iustin Pop authored
Logical volumes can be 'degraded' in a similar way to mirrored devices, when their underlying storage has gone away (i.e. after a physical disk failure and 'vgreduce --removemissing'). If we can detect this, we can prevent mistaken replaces of disks that would use this LV (or its parent) as source data. This patch adds support for computing the degraded attribute and modifies gnt-instance to warn if the LV is virtual. Reviewed-by: imsnah
-
Iustin Pop authored
Currently, some LUs use logger.Error, others just feedback_fn, etc. This patch adds three functions to mcpu.Processor that can be used to log messages to both the log and to the user. These functions will be used to enhance the output of replace-disks for drbd8 (at least). Reviewed-by: imsnah
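A minimal sketch of the idea, assuming a processor that holds a user-facing callback. The commit message does not name the three functions, so `log_step`, `log_warning`, `log_info` and the message prefixes below are hypothetical.

```python
import logging


class Processor:
    """Illustrative stand-in for mcpu.Processor; method names are assumptions."""

    def __init__(self, feedback_fn):
        self._feedback_fn = feedback_fn

    def _emit(self, level, prefix, message):
        # Send every message to the log and to the user in one place.
        logging.log(level, message)
        self._feedback_fn(prefix + message)

    def log_step(self, current, total, message):
        self._emit(logging.INFO, "STEP %d/%d " % (current, total), message)

    def log_warning(self, message):
        self._emit(logging.WARNING, " - WARNING: ", message)

    def log_info(self, message):
        self._emit(logging.INFO, " - INFO: ", message)
```

With this shape, a logical unit calls one method and no longer has to choose between the log file and the feedback function.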
-
Iustin Pop authored
Currently, the mirror operations (add and remove children) test against the instance's attributes. This patch changes the checks to work against the actual status of the device (i.e. live data), which is more realistic. The changes are:
- allow add children if the device doesn't have local storage (even if we believe it has)
- early return from remove children if the device is already without local storage

Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds the following functionality:
- DRBD8 devices can assemble without local storage (done by allowing None in the list of children, and making DRBD8 ignore all children if any is None)
- DRBD8 devices can attach to (i.e. identify) a device which is not connected to backing storage but is connected to the correct network ports; this is a rare case in normal operation (it's what would happen if one manually detaches the local disk, and the backing LV still exists)

Reviewed-by: imsnah
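The "ignore all children if any is None" rule can be sketched as a small helper. The function name `effective_children` is hypothetical and the children are represented as plain values; the real DRBD8 class is not shown in this log.

```python
def effective_children(children):
    """Return the children to use for assembly.

    If any child is None, the device is treated as having no local storage,
    so every child is ignored (diskless assembly).
    """
    if any(child is None for child in children):
        return []
    return list(children)
```

For example, `effective_children(["data_lv", None])` yields an empty list, so the device would be assembled diskless rather than with a partial set of backing devices.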
-
Iustin Pop authored
This patch enables the bdev.DRBD8 class to report a degraded status if the local disk is missing. This allows `gnt-instance info` to report the actual situation in this case. Note that DRBD7 should also behave like this; however, the diskless case is less often met there and we also don't want to change behaviour. The patch also fixes some docstrings for the GetSyncStatus methods. Reviewed-by: imsnah
-
Iustin Pop authored
For some cases, we don't have to have access to the children of a device in order to remove them (e.g. md over lvs, or drbd over lvs). In order to ease the removal process, skip over finding the child if it provides a static dev path. This is needed in order to support removal of children when the underlying storage has gone away. Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds a function returning the device path if it is computable from the disk object (and we don't need to instantiate a bdev object on the target node in order to compute this). Only LVs support this. Reviewed-by: imsnah
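A sketch of such a function, under stated assumptions: the `Disk` class, the `"lvm"` type string, the `(vg_name, lv_name)` layout of `logical_id`, and the `/dev/<vg>/<lv>` path form are illustrative and not confirmed by this log.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Disk:
    """Hypothetical, simplified disk object."""
    dev_type: str
    logical_id: Tuple


def static_dev_path(disk: Disk) -> Optional[str]:
    """Return the device path if it is computable from the disk object alone.

    Only logical volumes qualify; for every other type the caller still has
    to instantiate the device on the target node.
    """
    if disk.dev_type == "lvm":
        vg_name, lv_name = disk.logical_id
        return "/dev/%s/%s" % (vg_name, lv_name)
    return None
```

Returning None for non-LV devices lets callers fall back to the existing lookup without a separate capability check.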
-
- Nov 05, 2007
-
-
Iustin Pop authored
Reviewed-by: ultrotter
-
Iustin Pop authored
This patch adds a check in the prereq of LUInitCluster for the existence of the init script. This allows a clean abort instead of a stack dump. Based on a report by admin@steibei.net Reviewed-by: ultrotter
-
Iustin Pop authored
The block device creation process is the following:
- device create
- device assembly (on primary or, depending on dev_type, on secondary too)
- set sync speed
- return

The problem is that device assembly after creation was not checked for errors, and as this is a very unusual case, we did not have problems with it (or we didn't detect them). The recent DevCacheManager however tripped on this case (because the dev_path of the device is None if the assembly fails) and the creation aborted with an unclear error message.

The patch adds a check for the assembly success and aborts the creation of the device in this case - the error is quite clear in the instance add, for example. The patch also changes DevCacheManager to log the cases when dev_path is None but not raise an error (keeping consistent with the goal that the cache manager should be transparent to the code).

For the record, this error case was detected with a mismatch between drbd kernel module and utilities.

Reviewed-by: imsnah
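The checked sequence can be sketched as below. Everything here is a stand-in: `FakeDevice`, `BlockDeviceError`, `record_in_cache`, and `create_device` are hypothetical names, and the cache is a plain dict rather than the real DevCacheManager.

```python
import logging


class BlockDeviceError(Exception):
    """Raised when a device cannot be brought up (hypothetical name)."""


class FakeDevice:
    """Stand-in for a real block device, for illustration only."""

    def __init__(self, assemble_ok):
        self._assemble_ok = assemble_ok
        self.dev_path = None
        self.sync_speed = None

    def assemble(self):
        if self._assemble_ok:
            self.dev_path = "/dev/fake0"
        return self._assemble_ok

    def set_sync_speed(self, kbps):
        self.sync_speed = kbps


def record_in_cache(cache, dev_path, owner):
    # The cache must stay transparent: log an unknown path, never raise.
    if dev_path is None:
        logging.error("not caching device with unknown path for %s", owner)
        return
    cache[dev_path] = owner


def create_device(device, owner, cache, sync_speed=1000):
    # Abort with a clear error if assembly after creation fails.
    if not device.assemble():
        raise BlockDeviceError("can't assemble device after creation for %s" % owner)
    device.set_sync_speed(sync_speed)
    record_in_cache(cache, device.dev_path, owner)
    return device.dev_path
```

The failure is reported at the assembly step, where its cause is known, instead of surfacing later as a confusing error from the cache code.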
-
Iustin Pop authored
This patch fixes some minor pylint warnings (unused variables, wrong indentation, etc.) and a real bug in the recovery for drbd8 rename procedure. Reviewed-by: imsnah
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
Guido Trotter authored
The OS cleanup patches change the wire protocol. Increment the protocol number by one. Reviewed-By: iustinp
-
Guido Trotter authored
In order to do this, for simplicity we leave the OSFromDisk function as-is and convert any resulting exception to an OS object in ganeti-noded. The unmangling gets simplified and so does the code for checking whether the OS is valid. Reviewed-By: iustinp
-
Guido Trotter authored
The functions in ganeti-noded and rpc.py still deal with the fact that an InvalidOS error could be returned by DiagnoseOS. As this is not the case anymore simplify their code for the current behavior. Reviewed-By: iustinp
-
Guido Trotter authored
Modify backend.py so that DiagnoseOS only returns OS objects rather than InvalidOS errors, and make sure gnt-os understands the new objects. Also delete the deprecated helper functions from gnt-os. Reviewed-By: iustinp
-
Guido Trotter authored
Add a new FromInvalidOS static function to objects.OS that makes it easy to create an object representing a broken OS starting from the relevant exception. Reviewed-By: iustinp
-
Guido Trotter authored
Until now the OS object has represented only a correct OS instance. Change it so it can represent a broken one too, by adding a "status" field: if this field is different from the OS_VALID_STATUS constant, the object is considered to be an invalid OS, the "status" field holds a debugging message, and the object's boolean value is false. Reviewed-By: iustinp
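A minimal sketch of an object whose truth value follows its status field. The constant name OS_VALID_STATUS comes from the message, but its value and the rest of the class below are assumptions.

```python
OS_VALID_STATUS = "VALID"  # assumed value; only the constant's name is given


class OS:
    """Simplified OS definition that can also describe a broken OS."""

    def __init__(self, name, status=OS_VALID_STATUS):
        self.name = name
        # For an invalid OS, status carries the debugging message instead.
        self.status = status

    def __bool__(self):
        return self.status == OS_VALID_STATUS
```

Callers can then write `if os_obj:` to test validity and read `os_obj.status` for the reason when the test fails.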
-
- Nov 04, 2007
-
-
Iustin Pop authored
This patch adds a '-n' option to burnin that takes a comma-separated list of nodes to perform the burnin on. Reviewed-by: ultrotter
-
Guido Trotter authored
call_os_get is never called with a real list of nodes, so there's no point in it being multi-node. Making it single-node till a usage for multi-node call is found. Reviewed-By: iustinp
-
Guido Trotter authored
Remove a wrong "i" and add a missing ")" to the DiagnoseOS function doc string. Reviewed-By: iustinp
-
- Nov 03, 2007
-
-
Iustin Pop authored
This patch adds a search command for locating tags on all objects of the cluster using a regex pattern. Reviewed-by: aat
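The search can be pictured as a regex scan over every tagged object. In this sketch the mapping from object paths to tag sets, the path strings, and the `search_tags` name are all hypothetical; the commit does not show the real data model or output format.

```python
import re


def search_tags(tagged_objects, pattern):
    """Return sorted (object_path, tag) pairs whose tag matches the pattern.

    tagged_objects maps an object path to an iterable of tag strings.
    """
    regex = re.compile(pattern)
    matches = []
    for path, tags in sorted(tagged_objects.items()):
        for tag in sorted(tags):
            if regex.search(tag):
                matches.append((path, tag))
    return matches
```

Using `regex.search` rather than `regex.match` means the pattern can hit anywhere in the tag unless the user anchors it explicitly.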
-
- Nov 02, 2007
-
-
Michael Hanselmann authored
Also check whether file contents are correct for both “gnt-cluster command” and “gnt-cluster copyfile”. Reviewed-by: iustinp
-
Iustin Pop authored
Currently, troubleshooting DRBD problems involves a manual process of going backwards from the DRBD device to the instance that owns it. This patch adds a weak (i.e. not guaranteed to be correct or up-to-date) cache mapping devices to instances.

In normal operation the cache should hold correct information, as the only times devices change paths are when they are started or stopped, and the code in backend.py adds cache updates to exactly these operations. The only drawback of this implementation is that we don't fully update the cache on renames of devices (we clean the old entries but we don't add new ones). Since the rename changes the path only for LVs (and not drbd and md), this is less of a problem, as the target of this code is debugging DRBD and MD issues.

The patch writes files named bdev_drbd<N> (or bdev_md<N>, bdev_xenvg_...) in /var/run/ganeti (more exactly, LOCALSTATEDIR/ganeti). The file names start with 'bdev_' and continue with the path of the device with the /dev/ prefix stripped. Each file contains the following values, space separated:
- instance name
- primary or secondary (depending on whether the device is on the primary or secondary node)
- instance visible name: sda or sdb or not_visible, the latter when the device is not the top-level device (i.e. remote_raid1 templates will have sd[ab] for the md, but not_visible for drbd and logical volumes)

The cache is designed to not raise any errors: if there is an I/O error, it will only be logged in the node daemon log file. This is in order to reduce the possible impact of the cache on the block device activation and shutdown code.

Reviewed-by: imsnah
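The file naming and the never-raise behaviour can be sketched as follows. The function names are hypothetical, and replacing "/" with "_" in the device path is an inference from the bdev_xenvg_... example rather than something the message states.

```python
import logging
import os


def cache_file_name(dev_path):
    # Strip the /dev/ prefix, then flatten the rest into one file name.
    prefix = "/dev/"
    if dev_path.startswith(prefix):
        dev_path = dev_path[len(prefix):]
    return "bdev_" + dev_path.replace("/", "_")


def format_entry(instance, on_primary, visible_name=None):
    # Space-separated: instance name, node role, instance-visible name.
    role = "primary" if on_primary else "secondary"
    return "%s %s %s" % (instance, role, visible_name or "not_visible")


def write_cache_entry(root_dir, dev_path, instance, on_primary, visible_name=None):
    """Write one cache file; log I/O errors instead of raising them."""
    path = os.path.join(root_dir, cache_file_name(dev_path))
    try:
        with open(path, "w") as handle:
            handle.write(format_entry(instance, on_primary, visible_name))
    except OSError as err:
        logging.error("cannot update device cache %s: %s", path, err)
        return False
    return True
```

Swallowing the error keeps a full or missing state directory from blocking device activation, at the cost of a stale or absent cache entry.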
-
Iustin Pop authored
Allow burnin to use the new drbd8 template (in which case one needs to disable replacement of disks, as burnin does not yet support that with drbd8). The patch also changes do-replace[12] to no-replace[12] as that is what they actually do. Reviewed-by: imsnah
-
Iustin Pop authored
When renaming a logical volume, we should change the dev_path (and other internal variables) in order to be consistent. Reviewed-by: imsnah
-
Iustin Pop authored
I forgot a pair of parentheses in that revision, which breaks the common case. This patch adds them. Reviewed-by: ultrotter
-
- Nov 01, 2007
-
-
Iustin Pop authored
If the device is unconfigured (SetDiskID has never been called for it), it might have a physical_id of None. This patch fixes that case. Reviewed-by: ultrotter
-
Guido Trotter authored
Ok, I've been battling with those for a while but it seems in the end I forgot to get rid of them! :( Doing it explicitly now. Reviewed-By: iustinp
-