- May 09, 2011
-
-
Marco Casavecchia authored
Add INSTANCE_PRIMARY_NODE and INSTANCE_SECONDARY_NODES. These new values are useful for OS scripts that need to know the nodes where the instance lives, or has lived. Signed-off-by:
Iustin Pop <iustin@google.com> [iustin@google.com: fixed small issue with SECONDARY_NODES] Reviewed-by:
Iustin Pop <iustin@google.com>
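
A minimal sketch of how an OS script might read these two values. The variable names come from the commit message; the whitespace-separated format of INSTANCE_SECONDARY_NODES and the helper name `instance_nodes` are assumptions made for illustration.

```python
import os


def instance_nodes(environ=None):
    """Return (primary, [secondaries]) read from the OS script environment."""
    env = os.environ if environ is None else environ
    primary = env.get("INSTANCE_PRIMARY_NODE", "")
    # Assumption: secondary nodes are passed whitespace-separated in one variable.
    secondaries = env.get("INSTANCE_SECONDARY_NODES", "").split()
    return primary, secondaries
```

An instance without secondary nodes would simply yield an empty list here.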
-
- May 02, 2011
-
-
Iustin Pop authored
Currently cluster verify doesn't check for bridge information; the only checks are done at instance create and failover/migrate time. This means a cluster that seems healthy will fail creation jobs. This patch implements a simple verification that all nodes (in the entire cluster, so doesn't work well for multi-group) have all the required bridges: the default one plus any instance bridge. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
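
The check described above can be sketched roughly as follows. This is not the code from the patch: the function name and the data shapes (a dict mapping node name to the bridges present on it) are hypothetical.

```python
def missing_bridges(default_bridge, instance_bridges, node_bridges):
    """Map each node to the sorted list of required bridges it lacks.

    The required set is the default bridge plus every bridge used by an
    instance, checked against all nodes in the cluster.
    """
    required = {default_bridge} | set(instance_bridges)
    problems = {}
    for node, present in node_bridges.items():
        missing = required - set(present)
        if missing:
            problems[node] = sorted(missing)
    return problems
```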
-
- Apr 20, 2011
-
-
Apollon Oikonomopoulos authored
Modify LUMigrateInstance and TLMigrateInstance to allow instance migrations for instances with DTS_EXT_MIRROR disk templates. Migrations of shared storage instances require either a target node, or an iallocator to determine the target node. If none is given, the cluster default iallocator is used. Locking behaviour: If the iallocator is used, then initially all nodes are locked and subsequently only the locks on the source node and the target node selected by the iallocator are retained. Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> [iustin@google.com: small changes in cmdlib.py] Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Apollon Oikonomopoulos authored
The bdev_sizes multi-node RPC call returns the sizes of the requested block devices on the desired nodes. Its intended use is to verify the existence of a block device on a given node for shared block storage support. Block device paths are expected to lie under constants.BLOCKDEV_DIR ("/dev/disk" by default), where persistent symlinks for block devices are assumed to exist. Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> [iustin@google.com: small changes in backend.py] Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
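
As an illustration only, the node-side part of such a call could look like the sketch below. The function name `device_sizes` and the choice to silently skip paths outside the base directory or paths that cannot be opened are assumptions, not the behaviour of the actual RPC.

```python
import os

BLOCKDEV_DIR = "/dev/disk"  # default named in the commit message


def device_sizes(paths, base_dir=BLOCKDEV_DIR):
    """Return {path: size_in_bytes} for readable devices under base_dir."""
    base = os.path.normpath(base_dir)
    sizes = {}
    for path in paths:
        norm = os.path.normpath(path)
        # Only accept absolute paths located below the base directory.
        if not os.path.isabs(norm) or os.path.commonpath([base, norm]) != base:
            continue
        try:
            fd = os.open(norm, os.O_RDONLY)
        except OSError:
            continue  # missing or unreadable device: left out of the result
        try:
            # Seeking to the end yields the size of the device (or file).
            sizes[path] = os.lseek(fd, 0, os.SEEK_END)
        finally:
            os.close(fd)
    return sizes
```

A caller could then treat a path that is absent from the result as "device not present on this node".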
-
Apollon Oikonomopoulos authored
This patch introduces core file storage support, consisting of the following: A configure-time switch for enabling/disabling shared file storage support and controlling the shared file storage location: --with-shared-file-storage-dir=. Shared file storage configuration is then available as _autoconf.ENABLE_SHARED_FILE_STORAGE and _autoconf.SHARED_FILE_STORAGE_DIR and there is a cluster-wide ssconf key named "shared_file_storage_dir" for changing the file location. A new disk template named "sharedfile" (DT_SHARED_FILE), using ganeti.bdev.FileStorage. Auxiliary functions in lib/config.py to handle shared file storage. Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> [iustin@google.com: small style fixes] Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 03, 2011
-
-
Michael Hanselmann authored
The new import/export infrastructure in Ganeti 2.2 and up handles compression differently. It no longer writes compressed files to the destination. Unfortunately changing this behaviour would be non-trivial, so in the meantime setting “compression = none” will hopefully avoid some confusion. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 28, 2011
-
-
Iustin Pop authored
This patch implements recreation of instance disk symlinks when the activate-disks operation is run. Until now, it was not possible to re-create these symlinks without stopping and starting or migrating an instance as the RPC call where this is done was in instance startup and migration. In order to do this, the blockdev_assemble rpc call needs the disk index too, which is added to the protocol. This is a change from 2.3 and makes instance startup incompatible (FYI). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 27, 2011
-
-
Iustin Pop authored
Currently, the validity of the hypervisor parameters is only checked at init/modification time, and not in cluster verify. This is bad, as it can lead to an inconsistent state that is only detected when the next modification (which can be unrelated) is made, leading to unexpected error messages. This patch adds both syntax verification (in masterd) and validity verification on remote nodes. The downside of the patch is that on clusters with many instances which have custom parameters, it will be slow. A possible improvement would be to detect duplicate, identical sets of parameters, and collapse these into a single verification, but that is left as a TODO (in case it becomes problematic). An additional change is in utils.ForceDict, where we said 'key', whereas this function is always used with parameter dicts, so I changed it to "Unknown parameter". Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
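
The improvement left as a TODO above (collapsing identical parameter sets) might be sketched like this. It is not part of the patch; the helper name is hypothetical and the sketch assumes parameter values are hashable.

```python
def group_by_params(instance_params):
    """Group instance names by identical parameter dicts.

    Returns a dict keyed by a canonical tuple of (name, value) pairs, so each
    distinct parameter set needs to be verified only once.
    """
    groups = {}
    for name, params in instance_params.items():
        key = tuple(sorted(params.items()))  # order-independent, hashable key
        groups.setdefault(key, []).append(name)
    return groups
```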
-
- Jan 26, 2011
-
-
Iustin Pop authored
The recent work on multi-VG support has converted LUClusterVerifyDisks into doing serialised calls to each node, as each node can have different VGs. This is suboptimal, especially for big clusters, where this LU is executed by the watcher very often. This patch changes the logic based on the observation that querying a node for its VGs and then requesting a LV list for those VGs is equivalent to simply asking for all LVs, without specifying the VG name(s). So backend.py needs changes to accept an empty VG list, and the LU itself partially reverts to the previous version. Additionally, we do two other fixes to this LU:
  - small improvement in getting the instance list from the config
  - MapLVsByNode works for all disk types, hence no need to restrict to the DRBD template, especially as today we can "recreate" disks for plain volumes too (the warning message in gnt-cluster is updated too)
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 20, 2011
-
-
Michael Hanselmann authored
With this patch, the exporting node will retry to connect a few times. The receiving node will make use of the master's increased timeout (see previous patch). Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 11, 2011
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Sorry I thought I did run commit-check but must not have paid attention to its output. There was a typo in the docstring. This patch fixes this. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 05, 2011
-
-
René Nussbaumer authored
This adds checks for out of band support. The helpers have to exist and they have to be executable. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Dec 21, 2010
-
-
Iustin Pop authored
As per issue 124, some Xen versions (or packaging) don't deal nicely with the colon being part of a disk name. Therefore we add a configure-time option for customising this. Note: setting the separator to interesting values like / is not handled by the code. This being a configure-time option (e.g. to be set by distribution packagers), we assume the person building the code knows what they are doing. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Dec 02, 2010
-
-
Iustin Pop authored
Currently, the Snapshot() function of LogicalVolume returns only the logical volume path, with the assumption that we only have one VG. But with the recent changes, it makes more sense to return the full data (vg and lv) from it, so as to not require computing it in the master. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Dmitry Chernyak authored
Changes to backend.GetVolumeList():
  - now accepts a list of VGs instead of one VG
  - returns LV names in the form "vg_name/lv_name"
Corresponding changes are done in: VerifyDisks, VerifyNode, LUCreateInstance (for both disk creation and adoption cases). Now the syntax "gnt-instance add ... --disk N:adopt=LV_NAME,vg=VG_NAME" as was described earlier in the man page works. Signed-off-by:
Dmitry Chernyak <dmi.chernyak@gmail.com> [iustin@google.com: QA changes for reserved LVs, style fixes and a few extra error checks, reviewed by hansmi/rn] Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Dec 01, 2010
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 29, 2010
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Nov 28, 2010
-
-
Iustin Pop authored
I have found a few regexes which are static and thus can be moved to load time, rather than run time, creation. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
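
The general technique is to compile a static pattern once at module import instead of on every call. The pattern and function name below are hypothetical and are not taken from the patch.

```python
import re

# Compiled once, when the module is loaded.
_VALID_NAME_RE = re.compile(r"^[a-zA-Z0-9._-]+$")


def is_valid_name(name):
    """Check a name against the precompiled pattern."""
    return bool(_VALID_NAME_RE.match(name))
```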
-
- Nov 26, 2010
-
-
Iustin Pop authored
Currently, the call_node_info RPC always checks both the VG free space and the hypervisor information. However, in ⅔ of the uses, we only care about one or the other. Therefore, we change it so that if any of the passed parameters is None, we don't perform the respective check. We also modify its callers to only pass in what they need. This also helps if the "default" hypervisor is broken and we want to create an instance for another hypervisor. With this patch, the duration of this RPC changes from 500ms to 90ms for a normal LVM+Xen PVM node, when we only require the LVM data; when we only require the hypervisor data, it doesn't change (as the “xm list” time is dominant). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
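
A rough sketch of the "skip the check when the parameter is None" idea. The callables `get_vg_free` and `get_hv_info` and the result keys are placeholders for illustration, not the real backend functions.

```python
def node_info(vg_name, hypervisor, get_vg_free, get_hv_info):
    """Collect only the information that was actually requested."""
    result = {}
    if vg_name is not None:
        # The VG query runs only when a VG name was passed in.
        result["vg_free"] = get_vg_free(vg_name)
    if hypervisor is not None:
        # Likewise, the hypervisor query is skipped for callers that pass None.
        result["hv_info"] = get_hv_info(hypervisor)
    return result
```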
-
- Nov 03, 2010
-
-
Michael Hanselmann authored
Tests have shown that the changes in commit b8d26c6e don't work as wanted. If any disk wasn't found on the node, all disks located on the same node would show as faulty. The cause was incorrect exception handling on the node. This patch changes the RPC call to return a per-disk success/error status, avoiding the problem. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Luca Bigliardi <shammash@google.com>
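
The per-disk status idea can be illustrated as below. `find_one` is a hypothetical lookup callable, and the `(success, payload)` tuple shape is an assumption for this sketch rather than the actual RPC format.

```python
def find_disks(disks, find_one):
    """Return one (success, payload) pair per disk.

    A failure on one disk is recorded for that disk only, so it no longer
    marks every other disk on the same node as faulty.
    """
    results = []
    for disk in disks:
        try:
            results.append((True, find_one(disk)))
        except Exception as err:  # broad on purpose: any error stays per-disk
            results.append((False, str(err)))
    return results
```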
-
- Oct 28, 2010
-
-
Iustin Pop authored
The method to make vm_capable integrate easily into cluster verify is as follows:
  - we add a new NV_VMNODES that represents *non*-vm-capable nodes
  - the LU populates this list (it's expected that non-vm_capable nodes are few compared to vm_capable nodes)
  - backend skips the checks that are related to VM hosting
  - in the LU, we reorder the VM-related checks so that they occur after the non-VM (generic) tests, and we only execute them conditionally
Additionally, we add some support to the instance checks to detect instances living on bad nodes. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 26, 2010
-
-
René Nussbaumer authored
This patch now uses dd entirely to wipe the disk, making it much easier to wipe in blocks so we can give interactive feedback about the status. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
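
Wiping in blocks amounts to walking the disk in fixed-size chunks and reporting progress after each one. The sketch below only computes the chunk boundaries; the units (MiB) and the function name are assumptions, and the actual dd invocation is left out.

```python
def wipe_chunks(total_mib, chunk_mib):
    """Yield (offset, size) pairs in MiB that cover the whole disk."""
    if chunk_mib <= 0:
        raise ValueError("chunk size must be positive")
    offset = 0
    while offset < total_mib:
        size = min(chunk_mib, total_mib - offset)  # last chunk may be shorter
        yield offset, size
        offset += size
```

After each chunk a caller could report `(offset + size) * 100 // total_mib` percent done.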
-
- Oct 25, 2010
-
-
Iustin Pop authored
Some parameters were missing (uuid, c/mtime). We simplify the export method; unfortunately we cannot simply iterate over __slots__ since the mapping is not 1:1. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 22, 2010
-
-
Iustin Pop authored
This allows serialization of updates to a given file, with respect to other cooperating writers. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
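
One common way to serialise updates among cooperating writers on POSIX systems is an advisory lock via `fcntl.flock`. This is a generic sketch of that technique, not the helper added by the commit; `append_locked` is a hypothetical name.

```python
import fcntl


def append_locked(path, line):
    """Append a line while holding an exclusive advisory lock on the file."""
    with open(path, "a") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until other writers release
        try:
            fh.write(line)
            fh.flush()
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

Advisory locks only protect against writers that also take the lock, which matches the "cooperating writers" wording above.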
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Sep 30, 2010
-
-
Iustin Pop authored
Currently, the computation of the 'pure' name or the variant is hardcoded and spread around the functions that need it. This is not nice, and in the future we'd spread it even more with more usage of variants/pure os names. This patch abstracts these functions into the OS class, and then replaces the hardcoded uses with the new functions. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
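
A small sketch of splitting a full OS name into its pure name and variant. The `+` separator and the function name are assumptions for illustration; the commit itself places this logic in methods of the OS class.

```python
def split_os_name(full_name, sep="+"):
    """Split 'name+variant' into (name, variant); variant is '' if absent."""
    name, _, variant = full_name.partition(sep)
    return name, variant
```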
-
- Sep 23, 2010
-
-
René Nussbaumer authored
This patch removes duplicate code found in backend which also needs to get VG infos. To make it simpler we moved to bdev.LogicalVolume.GetVGInfo. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 13, 2010
-
-
Vitaly Kuznetsov authored
This was introduced in efaa9b06. In OSCoreEnv, inst_os.name is the pure operating system name (without variant), as the variant is stripped in OSFromDisk(). So we always get variant = inst_os.supported_variants[0] (the first variant in the variants list). Adding an os_name argument with the full name (including variant) solves this problem. Signed-off-by:
Vitaly Kuznetsov <vitty@altlinux.ru> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> [modified by iustin to handle the call to OSCoreEnv from ValidateOS too] Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 07, 2010
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 03, 2010
-
-
Manuel Franceschini authored
Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 23, 2010
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Aug 20, 2010
-
-
Manuel Franceschini authored
This patch changes the StartMaster method to consult the cluster primary IP version when deciding whether to use arping or ndisc6 after activating the master IP. Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 19, 2010
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Manuel Franceschini authored
Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 18, 2010
-
-
Manuel Franceschini authored
This patch enables IPv6 name resolution by using socket.getaddrinfo instead of socket.gethostbyname_ex. It renames the HostInfo class to Hostname and unifies its use throughout the code. This is achieved by using static calls where no object is needed and removes some obsolete code. For now, we just resolve to IPv4 addresses, but this will change once it is needed. Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
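
For reference, resolving through `socket.getaddrinfo` looks roughly like this. The wrapper name `resolve` is hypothetical; the default of `AF_INET` mirrors the note above that only IPv4 addresses are resolved for now.

```python
import socket


def resolve(hostname, family=socket.AF_INET):
    """Return the list of addresses getaddrinfo reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the address string.
    return [info[4][0] for info in infos]
```

Passing `socket.AF_INET6` as the family would request IPv6 results instead.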
-
Manuel Franceschini authored
This patch unifies the netutils functions dealing with IP addresses into three classes:
  - IPAddress: Common IP address functionality
  - IPv4Address: IPv4-specific functionality
  - IPv6address: IPv6-specific functionality
Furthermore, it adds methods to check whether an address is a loopback address, replacing the .startswith("127") for IPv4 and adding IPv6 support. It also provides the basis for future IPv6 address handling. Methods to convert IP strings to their corresponding integer values will make it possible to canonicalize IPv6 addresses. Signed-off-by:
Manuel Franceschini <livewire@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
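
To illustrate the integer conversion and the loopback check, here is a standalone sketch built on `socket.inet_pton`. The function names are hypothetical and do not reproduce the methods of the classes listed above.

```python
import socket


def ip_to_int(address):
    """Convert an IPv4 or IPv6 address string to its integer value."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    return int.from_bytes(socket.inet_pton(family, address), "big")


def is_loopback(address):
    """True for 127.0.0.0/8 (IPv4) or ::1 (IPv6)."""
    value = ip_to_int(address)
    if ":" in address:
        return value == 1  # ::1 is the only IPv6 loopback address
    return value >> 24 == 127  # first octet decides for IPv4
```

Comparing integers means different spellings of the same IPv6 address (for example `::1` and `0:0:0:0:0:0:0:1`) are treated alike, unlike a string prefix test.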
-
- Jul 29, 2010
-
-
Iustin Pop authored
Since we don't support upgrades from 1.2.4 without restarting the instance, the 'not restarted since 1.2.5' check/error is wrong/misleading. Since live migration works without the links (it recreates them during the disk reconfiguration), we replace the check with a warning (to the node daemon log only, unfortunately). For 2.3, we'll need to change the symlink creation from instance start time to disk activation time (but that requires more RPC changes). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jul 26, 2010
-
-
Iustin Pop authored
Currently, backend.StartMaster (the function behind this RPC call) will activate the master IP and then, if the start_daemons parameter is true, it will also activate the master role. While this works, it has two issues:
  - first, it will activate the master IP unconditionally, even if this node will not start the master daemon due to missing votes
  - second, the activation of the IP is done twice if start_daemons is true, because the master daemon does its own activation too
This behaviour seems to be unmodified since Summer 2008, so any rationale for doing this in two places is probably forgotten. The patch changes this function so that it does *either* IP activation or master role activation, but not both. So the IP will be activated only once (from the master daemon or from LURenameCluster), and it will only be done if the masterd got enough votes for startup. I can see only one downside to this change: if masterd won't actually start (due to missing votes), RAPI will still start, and without the master IP activated. But this is no worse than before, when both RAPI was running and the IP was activated. Note that the behaviour of StopMaster remains the same, as no one else does the IP removal. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
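
The either/or behaviour described above reduces to a single branch. In this sketch the two callables stand in for the real actions and are purely hypothetical.

```python
def start_master(start_daemons, activate_ip, start_master_daemons):
    """Do either master role activation or IP activation, never both."""
    if start_daemons:
        # The master daemon activates the IP itself once it has enough votes.
        start_master_daemons()
    else:
        activate_ip()
```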
-