- 14 Nov, 2007 1 commit
-
-
Guido Trotter authored
Right now an assembly error produces an exception but not a log message. This is bad because the exception suggests looking at the log, but the log itself contains many errors that are not really a problem and only some that really are. To make it clear where in the log the problem occurred, we log a message too, before raising the exception. Reviewed-by: iustinp
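The log-then-raise pattern described in this message can be sketched as follows. This is a minimal illustration, not the actual backend.py code: the logger name, the `BlockDeviceError` class and the `assemble_or_fail` helper are all hypothetical.

```python
import logging

logger = logging.getLogger("node-daemon-sketch")  # hypothetical logger name


class BlockDeviceError(Exception):
    """Hypothetical stand-in for the project's own exception type."""


def assemble_or_fail(assemble, device_name):
    """Call assemble(); on failure, log a marker line and then raise."""
    if not assemble():
        msg = "Can't assemble device %s" % device_name
        # Log first, so the real failure is easy to find among unrelated
        # (harmless) errors in the same log file.
        logger.error(msg)
        raise BlockDeviceError(msg)
    return True
```

The same text goes to the log and into the exception, so a reader who follows the exception's hint can search the log for that exact line.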
-
- 12 Nov, 2007 1 commit
-
-
Iustin Pop authored
We want to prevent sending too many 'None' children to a device. However, the test as it is today is wrong, as we want to test the situation after adding a new child, and not before. This patch fixes this by testing greater-or-equal instead of just greater. Reviewed-by: imsnah
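A small sketch of why the comparison has to be greater-or-equal when the check runs before the new child is added. The limit value, the exception class and the helper name are assumptions made for illustration only.

```python
MAX_NONE_CHILDREN = 2  # hypothetical limit, not taken from the patch


class TooManyNoneChildren(Exception):
    """Hypothetical error type for this sketch."""


def add_none_child(children, limit=MAX_NONE_CHILDREN):
    """Append a None child unless that would exceed the limit."""
    # The check runs before the append, so '>=' is needed: a plain '>'
    # would accept the call when the list is already at the limit and
    # leave limit + 1 None entries behind.
    if children.count(None) >= limit:
        raise TooManyNoneChildren("already %d None children" % limit)
    children.append(None)
    return children
```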
-
- 09 Nov, 2007 1 commit
-
-
Michael Hanselmann authored
Reviewed-by: schreiberal
-
- 07 Nov, 2007 1 commit
-
-
Iustin Pop authored
This (big) patch does two things:
  - add "local disk status" to the block device checks (BlockDevice.GetSyncStatus and the rpc calls that call this function, and therefore cmdlib._CheckDiskConsistency)
  - improve the drbd8 secondary replace operation using the above functionality

The "local disk status" adds a new variable to the result of GetSyncStatus that shows the degradation of the local storage of the device. Of course, not all devices support this: for now, we only modify LogicalVolumes and DRBD8 to return degraded in some cases; other devices always return non-degraded. This variable should be a subset of is_degraded: whenever this variable is true, is_degraded should also be true.

The drbd8 secondary replace uses this variable as we don't care if the primary drbd device is network-degraded, only if it has good local disk data (ldisk is False).

The patch also increases the protocol version (due to rpc changes).

Reviewed-by: imsnah
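The relationship between is_degraded and the new local disk flag can be illustrated with a short sketch. The `SyncStatus` tuple layout, its field names other than `is_degraded` and `ldisk`, and the `is_consistent` helper are assumptions; the real result of GetSyncStatus may be shaped differently.

```python
from collections import namedtuple

# Hypothetical layout of a sync status result, for illustration only.
SyncStatus = namedtuple("SyncStatus",
                        ["sync_percent", "est_time", "is_degraded", "ldisk"])


def is_consistent(status, ldisk_only=False):
    """Return True if the device is healthy under the requested check."""
    # Invariant from the commit message: ldisk implies is_degraded.
    if status.ldisk and not status.is_degraded:
        raise ValueError("ldisk set without is_degraded")
    if ldisk_only:
        # Only the local storage matters (e.g. replacing the secondary).
        return not status.ldisk
    return not status.is_degraded
```

A primary that has lost its network peer but still has good local data is degraded overall, yet passes the `ldisk_only` check. That is the case the secondary replace relies on.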
-
- 06 Nov, 2007 2 commits
-
-
Iustin Pop authored
This patch adds the following functionality:
  - DRBD8 devices can assemble without local storage (done by allowing None in the list of children, and making DRBD8 ignore all children if any is None)
  - DRBD8 devices can attach to (i.e. identify) a device which is not connected to backing storage but is connected to the correct network ports; this is a rare case in normal operation (it's what would happen if one manually detaches the local disk, and the backing LV still exists)

Reviewed-by: imsnah
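The "ignore all children if any is None" rule is simple enough to show directly. The helper name is hypothetical and the children are plain strings here, whereas the real code works on block device objects.

```python
def usable_children(children):
    """Return the children to use as backing storage.

    If any child is None, the device is treated as diskless and all
    children are ignored, as described in the commit message.
    """
    if any(child is None for child in children):
        return []
    return list(children)
```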
-
Iustin Pop authored
For some cases, we don't have to have access to the children of a device in order to remove them (e.g. md over lvs, or drbd over lvs). In order to ease the removal process, skip over finding the child if it provides a static dev path. This is needed in order to support removal of children when the underlying storage has gone away. Reviewed-by: imsnah
-
- 05 Nov, 2007 3 commits
-
-
Iustin Pop authored
The block device creation process is the following:
  - device create
  - device assembly (on primary or depending on dev_type, on secondary too)
  - set sync speed
  - return

The problem is that device assembly after creation was not checked for errors, and as this is a very unusual case, we did not have problems with it (or we didn't detect them). The recent DevCacheManager however tripped on this case (because the dev_path of the device is None if the assembly fails) and the creation aborted with an unclear error message.

The patch adds a check for the assembly success and aborts the creation of the device in this case - the error is quite clear in the instance add, for example. The patch also changes DevCacheManager to log the cases when dev_path is None but not raise an error (keeping consistent with the goal that the cache manager should be transparent to the code).

For the record, this error case was detected with a mismatch between drbd kernel module and utilities.

Reviewed-by: imsnah
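The creation steps listed in this message, with the added assembly check, might look roughly like this. `FakeDevice`, `CreationError`, `create_block_device` and the default sync speed are hypothetical names and values used only for this sketch.

```python
class CreationError(Exception):
    """Hypothetical error type for this sketch."""


class FakeDevice:
    """Hypothetical device whose assembly can be made to fail."""

    def __init__(self, assemble_ok):
        self.assemble_ok = assemble_ok
        self.dev_path = None  # stays None if assembly fails
        self.sync_speed = None

    def create(self):
        pass  # creation itself is assumed to succeed in this sketch

    def assemble(self):
        if self.assemble_ok:
            self.dev_path = "/dev/drbd0"
        return self.assemble_ok

    def set_sync_speed(self, speed):
        self.sync_speed = speed


def create_block_device(device, needs_assembly, sync_speed=1000):
    """Create, optionally assemble, set the sync speed, return the path."""
    device.create()
    if needs_assembly:
        # The new check: stop here with a clear error instead of letting
        # a None dev_path cause a confusing failure later on.
        if not device.assemble():
            raise CreationError("Can't assemble device after creation")
        # Assumption: sync speed is only set on an assembled device.
        device.set_sync_speed(sync_speed)
    return device.dev_path
```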
-
Iustin Pop authored
This patch fixes some minor pylint warnings (unused variables, wrong indentation, etc.) and a real bug in the recovery for drbd8 rename procedure. Reviewed-by: imsnah
-
Guido Trotter authored
Modify backend.py so that DiagnoseOS only returns OS objects rather than InvalidOS errors, and make sure gnt-os understands the new objects. Also delete the deprecated helper functions from gnt-os. Reviewed-By: iustinp
-
- 04 Nov, 2007 1 commit
-
-
Guido Trotter authored
Remove a wrong "i" and add a missing ")" to the DiagnoseOS function doc string. Reviewed-By: iustinp
-
- 02 Nov, 2007 1 commit
-
-
Iustin Pop authored
Currently, troubleshooting DRBD problems involves a manual process of going backwards from the DRBD device to the instance that owns it. This patch adds a weak (i.e. not guaranteed to be correct or up-to-date) cache of device to instance.

In normal operation the cache should hold correct information, as the only times when devices change paths are when they are started/stopped, and the code in backend.py adds cache updates to exactly these operations. The only drawback of this implementation is that we don't fully update the cache on renames of devices (we clean the old entries but we don't add new ones). Since the rename changes the path only for LVs (and not drbd and md), this is less of a problem as the target of this code is debugging DRBD and MD issues.

The patch writes files named bdev_drbd<N> (or bdev_md<N>, bdev_xenvg_...) in /var/run/ganeti (more exactly, LOCALSTATEDIR/ganeti). The file names start with 'bdev_' and continue with the path of the device under /dev/ (this prefix stripped). The files contain the following values, space separated:
  - instance name
  - primary or secondary (depending on whether the device is on the primary or secondary node)
  - instance visible name: sda or sdb or not_visible, the latter case when the device is not the top-level device (i.e. remote_raid1 templates will have sd[ab] for the md, but not_visible for drbd and logical volumes)

The cache is designed to not raise any errors; if there is an I/O error, it will only be logged in the node daemon log file. This is in order to reduce the possible impact of the cache on the block device activation and shutdown code.

Reviewed-by: imsnah
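A rough sketch of the cache file format described in this message. All function names and the logger name are hypothetical, and replacing "/" with "_" in the file name is an assumption inferred from the bdev_xenvg_... example. Returning early on a missing device path is likewise a choice of this sketch, not something stated here.

```python
import logging
import os

logger = logging.getLogger("devcache-sketch")  # hypothetical logger name


def cache_file_name(dev_path):
    """Map a device path under /dev/ to a 'bdev_' file name."""
    prefix = "/dev/"
    if dev_path.startswith(prefix):
        dev_path = dev_path[len(prefix):]
    # Assumption: path separators become underscores in the file name.
    return "bdev_" + dev_path.replace("/", "_")


def cache_line(instance, on_primary, iv_name):
    """Build the space-separated cache entry."""
    role = "primary" if on_primary else "secondary"
    visible = iv_name if iv_name else "not_visible"
    return "%s %s %s" % (instance, role, visible)


def update_cache(cache_dir, dev_path, instance, on_primary, iv_name):
    """Write one cache entry; log and return False instead of raising."""
    if dev_path is None:
        logger.error("No device path for instance %s, not caching", instance)
        return False
    target = os.path.join(cache_dir, cache_file_name(dev_path))
    try:
        with open(target, "w") as handle:
            handle.write(cache_line(instance, on_primary, iv_name) + "\n")
    except OSError as err:
        # I/O problems are only logged, so the cache can never break
        # device activation or shutdown.
        logger.error("Can't update cache file %s: %s", target, err)
        return False
    return True
```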
-
- 01 Nov, 2007 1 commit
-
-
Iustin Pop authored
Reviewed-by: ultrotter
-
- 29 Oct, 2007 3 commits
-
-
Iustin Pop authored
Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds three modes of disk replacement for drbd8: - replace the disk on the primary node - replace the disk on the secondary node - replace the secondary node It also adds some debugging code to backend.py and increments the protocol version for the recent changes of the rpc layer. Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds code for renaming a device; more precisely, for changing the unique_id of the device. This means:
  - logical volumes, rename the volume
  - drbd8, change the remote peer

This is needed in order to be able to replace disks for drbd8. Reviewed-by: imsnah
-
- 25 Oct, 2007 1 commit
-
-
Iustin Pop authored
The two calls mirror_addchild and mirror_removechild take only one child for addition/removal. While this is enough for our md usage, for local disk replacement in drbd8, we need to be able to specify both the data and metadata device. This patch modifies these two rpc calls (and their backend implementation and their usage in cmdlib) to take a list of children to add/remove. Reviewed-by: imsnah
-
- 19 Oct, 2007 1 commit
-
-
Iustin Pop authored
Currently, the disk types are defined using hardcoded values in the code. Convert those into constants so that we can easily find them and check their usage. Note that we don't rename the values of the constants as they are used in the configuration file, and as such it's best to leave them as they are. Reviewed-by: imsnah
-
- 17 Oct, 2007 1 commit
-
-
Alexander Schreiber authored
This patch series implements the reboot command for gnt-instance. It supports three types of reboot: soft (hypervisor reboot), hard (instance config rebuild and reboot) and full (full instance shutdown and startup again). This patch contains the backend and rpc part of the patch. Reviewed-by: iustinp
-
- 16 Oct, 2007 1 commit
-
-
Iustin Pop authored
The filenames of the node's ssh keys are now provided as constants; this should allow easier customization. Also, computing the user's ssh keys has been abstracted into ssh.py. Reviewed-by: imsnah
-
- 15 Oct, 2007 1 commit
-
-
Alexander Schreiber authored
Reviewed-by: iustinp
-
- 12 Oct, 2007 2 commits
-
-
Iustin Pop authored
This patch does the following:
  - add constants.GANETI_RUNAS = "root", which is used to compute the homedir (and thus the .ssh directory) instead of hardcoding "/root/.ssh" in backend.AddNode and backend.LeaveCluster
  - add constants.SSH_CONFIG_DIR (currently hardcoded to /etc/ssh) that is used in backend instead of hardcoding it (preparation for selecting that at ./configure time)
  - some more internal cleanup in backend.AddNode

Reviewed-by: imsnah
-
Iustin Pop authored
Since we remove only files from DATA_DIR and not from subdirectories, let's not walk the entire tree, a simple listdir suffices. Also switch to utils.RemoveFile from simple os.unlink. Reviewed-by: imsnah
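The listdir-based cleanup can be sketched as below. The function name is hypothetical, and the plain `os.unlink` wrapped in a try block is only a stand-in for utils.RemoveFile, whose exact behaviour is not shown in this message.

```python
import os


def remove_top_level_files(data_dir):
    """Remove regular files directly inside data_dir, not in subdirs."""
    removed = []
    # os.listdir only looks at one level, so no tree walk is needed.
    for name in sorted(os.listdir(data_dir)):
        full_path = os.path.join(data_dir, name)
        if not os.path.isfile(full_path):
            continue  # leave subdirectories and their contents alone
        try:
            os.unlink(full_path)  # stand-in for utils.RemoveFile
        except FileNotFoundError:
            continue  # already gone; nothing to do
        removed.append(name)
    return removed
```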
-
- 10 Oct, 2007 1 commit
-
-
Iustin Pop authored
Since modules are not directly executable, remove the shebang from them. This helps with lintian warnings. Also make the autogenerated _autoconf.py contain two comment lines at the beginning, like the other modules. Reviewed-by: ultrotter
-
- 08 Oct, 2007 2 commits
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
dot. Reviewed-by: iustinp
-
- 04 Oct, 2007 2 commits
-
-
Guido Trotter authored
This isdir() check leads to a broken error message. Even fixing it leaves some cases in which the error message is unclear, while removing it lets the _OSOndiskVersion checks handle this situation much better. Reviewed-by: iustinp
-
Guido Trotter authored
  - Document the expected change to errors.InvalidOS
  - Always pass the additional argument
  - Modify DiagnoseOS output to show the path

Reviewed-by: iustinp, imsnah
-
- 03 Oct, 2007 2 commits
-
-
Guido Trotter authored
Abstract the _OSSearch function, to look for an OS in the search path. Make OSFromDisk accept an optional base_dir, rather than the os_dir itself. Reviewed-by: iustinp
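Looking for an OS in a search path could be sketched like this. The function below is an illustration under assumptions, not the actual _OSSearch: its name, its signature, and the choice to return the matching base directory are all hypothetical.

```python
import os


def os_search(name, search_path):
    """Return the first base directory that contains an OS called name.

    Returns None when no directory in search_path has such an entry.
    """
    for base_dir in search_path:
        if os.path.isdir(os.path.join(base_dir, name)):
            return base_dir
    return None
```

Returning the base directory rather than the full OS directory fits the second half of the message, where OSFromDisk takes an optional base_dir.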
-
Guido Trotter authored
First part of the OS search path cleanup. _OSOndiskVersion is only ever called once, and with that argument set, so let's make it mandatory. Reviewed-by: iustinp
-
- 28 Sep, 2007 1 commit
-
-
Guido Trotter authored
directories which can contain OS scripts. The list defaults to the current one but can be changed at configure time. Reviewed-by: imsnah
-
- 25 Sep, 2007 2 commits
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
with the script named differently than Debian. Reviewed-by: ultrotter
-
- 17 Sep, 2007 3 commits
-
-
Iustin Pop authored
This uses the recently-added Instance.FindDisk() method instead of hard coded find-disk code. It also renames one parameter to AddNode from ssh to sshkey in order not to shadow the ganeti.ssh module. Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds support for instance rename operation at all remaining layers: RPC, OpCode/LU and CLI. Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds support for renaming at OS level. Because of this, we need to bump up the version of the OS api from 4 to 5. The patch also documents the new script interface in the ganeti-os-interface(7) man page and adds a section on upgrading the OS definitions to the new version. Reviewed-by: imsnah
-
- 13 Sep, 2007 1 commit
-
-
Iustin Pop authored
Explanation: since we use lists and not a string, every argument we give is passed unchanged to the remote shell. So, for example, when passing '/etc/init.d/ganeti restart' to the remote shell, it will try to run the path /etc/init.d/ganeti\ restart, with the space included. This breaks, for example, gnt-node add and gnt-cluster command.

The original problem with the backup routines that led to the "'" change is that they use a plain " ".join(list), but we don't need to quote the whole ssh remote command for this. We can simply use the existing utils.ShellQuoteCmd(list) which does the proper quoting of the ';' or '&&' metacharacters.

With this change, gnt-node add, gnt-cluster command and export/import all work. This also improves the error-handling behaviour of one cat command by making it conditional on the preceding mkdir.

Reviewed-by: ultrotter
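The difference between a plain " ".join(list) and per-argument quoting can be shown with the standard library. Here `shlex.quote` is used as a stand-in; it is not the project's utils.ShellQuoteCmd, and `shell_quote_args` is a hypothetical helper name.

```python
import shlex


def shell_quote_args(args):
    """Quote each argument separately, then join with spaces."""
    return " ".join(shlex.quote(arg) for arg in args)


# Two list elements stay two shell words.
two_words = shell_quote_args(["/etc/init.d/ganeti", "restart"])

# One element containing a space is quoted, so the remote shell sees a
# single word: the failure mode described in the message above.
one_word = shell_quote_args(["/etc/init.d/ganeti restart"])

# Metacharacters inside arguments are neutralised instead of being
# interpreted by the shell.
quoted_meta = shell_quote_args(["echo", "a;b", "&&"])
```

Splitting the quoted string again with `shlex.split` gives back the original list, which is a quick way to check that nothing was interpreted along the way.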
-
- 07 Sep, 2007 1 commit
-
-
Guido Trotter authored
This avoids forgetting some parameters, as is happening right now (the correct known hosts file is not being passed). In order to do so we split SSHCall into an auxiliary BuildSSHCmd, which builds the command but doesn't actually call it, and SSHCall itself, which runs RunCmd on top of BuildSSHCmd's result. BuildSSHCmd is then explicitly called by import/export, which has to build a more complex command to be run later.
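The build/run split can be sketched as follows. The lower-case function names, their parameters and the chosen ssh options are assumptions for illustration; they are not the signatures of BuildSSHCmd or SSHCall. `subprocess.run` stands in for RunCmd.

```python
import subprocess


def build_ssh_cmd(hostname, user, command, known_hosts_file):
    """Build the ssh argument list without running anything."""
    return [
        "ssh",
        "-oBatchMode=yes",
        # Built in one place, so no caller can forget the known hosts file.
        "-oUserKnownHostsFile=%s" % known_hosts_file,
        "%s@%s" % (user, hostname),
        command,
    ]


def ssh_call(hostname, user, command, known_hosts_file,
             runner=subprocess.run):
    """Run the command produced by build_ssh_cmd via the given runner."""
    return runner(build_ssh_cmd(hostname, user, command, known_hosts_file))
```

Callers that need to embed the ssh invocation in a larger pipeline, as import/export does, can call the builder alone and run the result later.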
-
- 30 Aug, 2007 1 commit
-
-
Iustin Pop authored
This changes a ';' to '&&' to make sure we run the create script from the correct directory. Reviewed-by: imsnah
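The effect of ';' versus '&&' is easy to reproduce on a POSIX shell. The directory name and the echoed marker below are made up for the demonstration, and `run_shell` is a hypothetical helper.

```python
import subprocess


def run_shell(command):
    """Run a command line through the shell and return its stdout."""
    result = subprocess.run(command, shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL,
                            universal_newlines=True)
    return result.stdout.strip()


missing_dir = "/nonexistent-dir-for-this-sketch"

# With ';' the second command runs even though the cd failed, i.e. from
# whatever directory the shell happened to be in.
with_semicolon = run_shell("cd %s; echo create-ran" % missing_dir)

# With '&&' the second command only runs if the cd succeeded.
with_and = run_shell("cd %s && echo create-ran" % missing_dir)
```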
-
- 24 Aug, 2007 1 commit
-
-
Iustin Pop authored
This changes:
  - cluster setup, we no longer edit /etc/ssh/ssh_known_hosts but our own file
  - node add, we no longer remove root's known_hosts (twice)
  - gnt-instance console, both the LU and the script: since now the ssh setup is not standard, we need to build the ssh cmdline in the LU (instead of manually building it in the script) with the correct parameters and use the command line as returned in the script
  - ssh.py, many changes, split options in module-level constants so that building the command line in different places is easier/more logical
  - backend.py, we no longer remove root's known_hosts in Add node, and we allow our own known_hosts file to be uploaded

Reviewed-by: imsnah
-
- 14 Aug, 2007 1 commit
-
-
Iustin Pop authored
This changes the raising of exceptions from "raise Exception, value" to "raise Exception(value)", as the first form will be removed in python-3000 and the second form is preferred now. The changes also involve a few cases of changing from raising standard exceptions to raising our own ones. The new version also fixes many pylint-generated warnings, especially in ganeti-noded where I changed many methods to @staticmethod. There is no functionality changed (barring any bugs).
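A short sketch of the preferred raise form, combined with the switch from a standard exception to a project-specific one. `GenericError` and `parse_port` are hypothetical names chosen for the example.

```python
class GenericError(Exception):
    """Hypothetical project-specific exception for this sketch."""


def parse_port(text):
    """Convert text to a port number, raising GenericError on bad input."""
    try:
        port = int(text)
    except ValueError:
        # Old form (a syntax error in Python 3):
        #   raise GenericError, "Invalid port: ..."
        # Preferred form: instantiate the exception explicitly.
        raise GenericError("Invalid port: %r" % (text,))
    return port
```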
-