- Jun 22, 2011
Apollon Oikonomopoulos authored
A new ip keyword, "pool", is added to signify that LUCreateInstance should get the instance's IP from an IP pool (rather than manually or by DNS resolution). IP and link checks are re-ordered so that a NIC's link is available at the time of IP address validation.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
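As a rough illustration of the intended usage (a hedged sketch: the exact NIC parameter syntax, node, OS name and size are assumptions, not taken from this commit):

  gnt-instance add -t plain -n node1 -s 10G -o debootstrap \
    --net 0:ip=pool,link=br0 instance1.example.com

Here ip=pool asks LUCreateInstance to draw the address from the configured pool instead of requiring an explicit IP or DNS resolution.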
-
- May 13, 2011
Apollon Oikonomopoulos authored
Check during CheckPrereq that the instance is not being migrated to its current primary node. Otherwise the migration is aborted (because the instance is already running) and then cleaned up, and the cleanup kills the running instance.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
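The guard itself is small; a minimal sketch with illustrative names (not the actual cmdlib code):

  class PrereqError(Exception):
      pass  # stands in for errors.OpPrereqError

  def check_migration_target(instance_name, primary_node, target_node):
      # Fail early, in CheckPrereq, instead of aborting mid-migration and
      # letting the cleanup kill the running instance.
      if target_node == primary_node:
          raise PrereqError("Instance %s is already running on node %s" %
                            (instance_name, target_node))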
-
- May 11, 2011
Iustin Pop authored
There are multiple bugs in the code checking for N+1 failures on instance memory changes, and it needs significant work; in the meantime we can at least:
- change the warning message into an error (--force will skip the checks)
- only make the checks when the memory is increased
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
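A minimal, self-contained sketch of the interim behaviour described above (names and numbers are illustrative, not the actual cmdlib code):

  class PrereqError(Exception):
      pass  # stands in for errors.OpPrereqError

  def check_mem_change(old_mem, new_mem, peer_free_mem, force):
      # Shrinking (or keeping) the memory cannot introduce new N+1
      # failures, so only increases are checked.
      if force or new_mem <= old_mem:
          return
      extra = new_mem - old_mem
      short = [n for n, free in sorted(peer_free_mem.items()) if free < extra]
      if short:
          # Previously only a warning; now a hard error unless --force.
          raise PrereqError("Not enough free memory on node(s) %s for the"
                            " change" % ", ".join(short))

  # Growing 2048 MiB -> 4096 MiB with a peer that has only 1024 MiB free:
  try:
      check_mem_change(2048, 4096, {"node2": 1024}, force=False)
  except PrereqError as err:
      print(err)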
-
- May 10, 2011
Apollon Oikonomopoulos authored
Comply with changes introduced in f1ea1bef, as we forgot to completely remove self._migrater.target_node from TLMigrateInstance.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
-
- May 09, 2011
Iustin Pop authored
Currently, when converting an instance from plain to DRBD, the instance is blocked during the entire resync period. This patch adds a --no-wait-for-sync option so that the operation finishes as soon as the DRBD sync has started, without waiting for the entire sync. This makes the instance available much faster.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
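A plausible invocation after this change (the node and instance names are placeholders, and the -t/-n conversion flags are the usual ones rather than part of this patch):

  gnt-instance modify -t drbd -n node2 --no-wait-for-sync instance1

The command returns once DRBD has started syncing; the mirror then completes in the background.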
-
Iustin Pop authored
This patch introduces the option of changing an instance's nodes when doing the disk recreation. The rationale is that currently, if an instance lives on a node that has gone down and is marked offline, it's not possible to re-create the disks and reinstall the instance on a different node without hacking the config file. Additionally, the LU now locks the instance's nodes (which was not done before), as we most likely allocate new resources on them.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
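A sketch of the recovery flow this enables (the option spelling and node names are assumptions for illustration):

  gnt-instance recreate-disks -n node3:node4 instance1
  gnt-instance reinstall instance1

i.e. the disks are re-created on a fresh node pair and the instance is reinstalled there, without editing the configuration file by hand.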
-
- May 06, 2011
Iustin Pop authored
It makes no sense to show messages like:
  Fri May 6 02:04:01 2011 - INFO: Resolved given name 'instance18' to 'instance18'
so we'll skip the message if the resolved name is identical to the requested one.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
The original code would get all node information and their groups without first acquiring the necessary locks. With this patch the node information is only retrieved once all locks have been acquired. Groups are locked optimistically and verified after acquiring the node locks.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- May 05, 2011
Apollon Oikonomopoulos authored
Commit b9187ba2 erroneously incorporated parts of the code of TLMigrateInstance.CheckPrereq into TLMigrateInstance._RunAllocator. As a result, all migrations performed without the use of an iallocator would end up running non-live. This patch restores the original behaviour.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
-
- May 03, 2011
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This removes (count of instances + count of nodes) lock acquires/releases.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
- May 02, 2011
Iustin Pop authored
At least one generates an epydoc error :)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Currently cluster verify doesn't check bridge information; the only checks are done at instance creation and failover/migration time. This means a cluster that seems healthy can still fail creation jobs. This patch implements a simple verification that all nodes (in the entire cluster, so it doesn't work well for multi-group setups) have all the required bridges: the default one plus any instance bridge.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
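A rough sketch of the check's shape (helper and field names are hypothetical, not the actual LUClusterVerify code):

  def required_bridges(default_bridge, instance_nics):
      # Every node must carry the default bridge plus every bridge
      # referenced by an instance NIC.
      bridges = set([default_bridge])
      for nics in instance_nics.values():
          for nic in nics:
              if nic.get("link"):
                  bridges.add(nic["link"])
      return sorted(bridges)

  nics = {"instance1": [{"link": "br0"}],
          "instance2": [{"link": "br-vlan42"}]}
  print(required_bridges("xen-br0", nics))
  # ['br-vlan42', 'br0', 'xen-br0'] -- each node is then asked over RPC
  # whether all of these bridges exist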
-
- Apr 29, 2011
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
If an iallocator is used, “gnt-instance replace-disks” would acquire the locks of all nodes (since only the allocator decides which node to use). Unfortunately the unneeded locks were not released during the operation, causing unnecessary delays for other jobs. This patch changes the LU to release unneeded locks and adds assertions.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
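In spirit, the narrowing looks like this (a simplified sketch; the real LU goes through Ganeti's lock manager rather than plain sets):

  def narrow_locks(owned_node_locks, chosen_nodes):
      # Once the iallocator has picked the nodes, every other node lock
      # can be released so other jobs stop waiting on them.
      keep = set(chosen_nodes)
      release = sorted(set(owned_node_locks) - keep)
      return sorted(keep), release

  keep, release = narrow_locks(["node1", "node2", "node3", "node4"],
                               ["node1", "node3"])
  print(keep)     # ['node1', 'node3']
  print(release)  # ['node2', 'node4']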
-
- Apr 28, 2011
Iustin Pop authored
This is a simple change to allow specifying a different VG for the meta device during the creation of instances and addition of disks via gnt-instance modify.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This is a small change to make this function take a list of VG names, instead of a single one.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
- Apr 27, 2011
Iustin Pop authored
This patch enhances the multi-VG support in replace-disks by keeping the meta device in the same VG, as opposed to moving it to the data device's VG (note that we don't have a way to create the meta in a different VG in the first place, but at least we correctly handle a custom config).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Doug Dumitru authored
When converting an instance from 'plain' to 'drbd', the old code would create the DRBD volumes in the default VG and then the renames would fail. This fix pulls the plain VG names from the existing volumes and places them into the new disk template. Running 'replace-disks' has a similar issue, with the new disks going into the wrong VG and then the rename failing. There might be a similar issue with 'recreate-disks', but I actually have no idea what recreate-disks does, so did not look into it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
A few issues in the clarity of the error messages are fixed:
- "ERROR: node node3: OS API version lenny-image": no preposition between the parameter type and the OS name; changed to "for lenny-image"
- "API version lenny-image differs from reference node node1: 10, 5 vs. 10, 20, 5, 15": parameters not sorted in display
- "OS variants list lenny-image differs from reference node node1: vs. default, i386": empty sets are not clearly delimited; changed to add [] around the sets: "node node1: [] vs. [default, i386]"
- "OS parameters lenny-image differs from reference node node1: vs. (u'dhcp', u'Whether to enable (yes) or disable (dhcp)')": ugly formatting in the OS parameters list, as we used to just "%s" the tuple; now it is "reference node node1: [] vs. [dhcp: Whether to enable (yes) or disable (dhcp)]"
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This breaks Ganeti in multiple ways. If we don't make the check in gnt-node itself, then bootstrap.SetupNodeDaemon will restart the master daemon, making the operation fail:
  node1# gnt-node add --readd node1
  Cannot communicate with the master daemon.
  Is it running and listening for connections?
The check in cmdlib is more of a safety check, as we shouldn't reach it. If we do (via a bad client), then it will prevent breakage in the job queue/config handling.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
IIRC we don't use punctuation at the end of error messages.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
- Apr 20, 2011
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Apollon Oikonomopoulos authored
TLMigrateInstance._ExecMigration included two 10-second sleeps for unknown reasons. This patch removes them.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
-
Apollon Oikonomopoulos authored
Commit faaabe3c fixed failover behaviour for DTS_INT_MIRROR instances, but it broke migration for DTS_EXT_MIRROR instances by moving the iallocator and node checks from LUInstanceMigrate to TLMigrateInstance. This had the side-effect that the LU called the TL with None for both node and iallocator when the default iallocator was being used. This patch maintains the iallocator checks in TLMigrateInstance and fixes the LU-TL integration.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
-
Apollon Oikonomopoulos authored
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
-
Apollon Oikonomopoulos authored
It is now possible to adopt a disk at gnt-instance modify time, as follows:
  gnt-instance modify --disk add:adopt=/path/to/disk   (blockdev)
  gnt-instance modify --disk add:adopt=<lvname>        (plain)
We do the same checks as during instance creation.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
-
Iustin Pop authored
Patches db366d9a and aac4511a added support for EXT_MIRROR instances, but inadvertently introduced a bug: for INT_MIRROR cases, we don't need (and actually can't support) either an iallocator or a target node. To fix this, we move the iallocator/node checks to CheckPrereq (respectively to the tasklet's CheckPrereq), where we have access to the instance configuration, and additionally we check for and prevent passing either of these two for INT_MIRROR instances.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Apollon Oikonomopoulos authored
DTS_INT_MIRROR contrasts better with DTS_EXT_MIRROR.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
[iustin@google.com: updated patch for changed context]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Apollon Oikonomopoulos authored
Modify LUFailoverInstance to enable shared storage instances to fail over. Shared storage instance failover requires either a target node or an iallocator to determine the target node. If none is given, the cluster default iallocator is used. The hook environment variables {OLD,NEW}_SECONDARY will be blank for shared storage instances. Locking behaviour is the same as for instance migration.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
[iustin@google.com: revert the DTS_NET_MIRROR specific changes]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Apollon Oikonomopoulos authored
Modify LUNodeMigrate to provide node migration for nodes with instances using shared storage. gnt-node migrate has to be passed an iallocator for migration of shared storage instances to be performed. When using a shared storage backend, all cluster nodes are locked.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Apollon Oikonomopoulos authored
Modify LUMigrateInstance and TLMigrateInstance to allow instance migrations for instances with DTS_EXT_MIRROR disk templates. Migrations of shared storage instances require either a target node or an iallocator to determine the target node. If none is given, the cluster default iallocator is used. Locking behaviour: if the iallocator is used, then initially all nodes are locked and subsequently only the locks on the source node and the target node selected by the iallocator are retained.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
[iustin@google.com: small changes in cmdlib.py]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Apollon Oikonomopoulos authored
Make cmdlib.IAllocator shared-storage-aware. IAllocator requires secondary nodes only for DTS_NET_MIRROR disk templates and requires no secondaries for DTS_EXT_MIRROR templates.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Apollon Oikonomopoulos authored
This patch introduces basic shared block storage support. It introduces a new storage backend, bdev.PersistentBlockDevice, to use as a backend for shared block storage. The new bdev requires a new BLOCKDEV_DRIVER_MANUAL constant with the value "manual" and uses it as the first part of the block device unique_id. A new disk template, DT_BLOCK, is introduced as well and added to DTS_EXT_MIRROR and DTS_MAY_ADOPT. Also added is a DTS_MUST_ADOPT constant, used to check for the presence of the adopt keyword during LU invocation. We enforce the /dev/disk limitation upon adoption, but we allow block devices to reside anywhere under /dev. This is very basic support and includes no storage manipulation (provisioning, resizing, renaming), which will have to be implemented through a "driver" framework.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
[iustin@google.com: slight changes to bdev.py]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
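A plausible creation command for such an instance (hedged: the exact disk parameter syntax, node, OS name and device path are assumptions for illustration; per the above, only adoption of existing devices is supported):

  gnt-instance add -t blockdev -n node1 -o debootstrap \
    --disk 0:adopt=/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_disk1 \
    instance1.example.com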
-
Apollon Oikonomopoulos authored
This patch introduces core shared file storage support, consisting of the following:
- A configure-time switch for enabling/disabling shared file storage support and controlling the shared file storage location: --with-shared-file-storage-dir=. Shared file storage configuration is then available as _autoconf.ENABLE_SHARED_FILE_STORAGE and _autoconf.SHARED_FILE_STORAGE_DIR, and there is a cluster-wide ssconf key named "shared_file_storage_dir" for changing the file location.
- A new disk template named "sharedfile" (DT_SHARED_FILE), using ganeti.bdev.FileStorage.
- Auxiliary functions in lib/config.py to handle shared file storage.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
[iustin@google.com: small style fixes]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
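For example (the directory below is an illustrative choice, not mandated by the patch):

  ./configure --with-shared-file-storage-dir=/srv/ganeti/shared-file-storage

after which _autoconf.SHARED_FILE_STORAGE_DIR carries that path and the "shared_file_storage_dir" ssconf key can change it cluster-wide.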
-
- Apr 19, 2011
Iustin Pop authored
The current wipe_chunk_size computation is doing min(int_value, float_value). For small disks (below 10GiB), the actual formula will result in the float value being chosen. This results in very interesting behaviour:
  Wiping disk 0, offset 102.4, chunk 102.4
  Wiping disk 0, offset 204.8, chunk 102.4
  …
  Wiping disk 0, offset 921.6, chunk 102.4
  Wiping disk 0, offset 1024.0, chunk 1.13686837722e-13
Since these are passed to dd via %d, this results in the call to dd specifying offset 1024 and count 0, which fails. We just need to enforce conversion to int, in order to not get bitten by floating point rounding errors. The patch also reorders some logging messages in order to log the chunk size.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
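The failure mode is easy to reproduce in plain Python (the numbers are simplified; the real chunk size is derived from constants, but the float accumulation is the same):

  disk_size = 1024.0                       # MiB
  chunk = min(128.0, disk_size / 10.0)     # float arithmetic picks 102.4

  offset = 0.0
  while offset < disk_size:
      cur = min(chunk, disk_size - offset)
      print("offset %s, chunk %s" % (offset, cur))
      offset += cur
  # The final iteration prints a chunk of ~1.1e-13, which "%d" turns
  # into count=0 for dd.

  chunk = int(min(128, disk_size // 10))   # the fix: force integer math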
-
- Apr 14, 2011
Michael Hanselmann authored
Ganeti 2.3 introduced an optional feature to overwrite an instance's disks on creation. Unfortunately the code kept all locks while doing the wipe, slowing down the creation of multiple instances in parallel. This patch changes the code to wipe the disks only after releasing the locks.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Apr 13, 2011
Michael Hanselmann authored
Before this patch the message would look like “Some groups do not exist: [u'foo', u'bar']”; now it's “Some groups do not exist: foo, bar”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Apr 06, 2011
Michael Hanselmann authored
Until now LUInstanceQueryData always acquired locks for the instance(s) and nodes involved. In combination with long-running operations this prevented the use of “gnt-instance info”, even with the “--static” option. With this patch, locks are only acquired when explicitly requested in the opcode (as with all query operations).
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Apr 04, 2011
Iustin Pop authored
This changes the display from:
  Mon Apr 4 02:29:46 2011 * Verifying N+1 Memory redundancy
  Mon Apr 4 02:29:46 2011 - ERROR: node node2: not enough memory to accomodate instance failovers should node node1 fail
to:
  Mon Apr 4 02:32:50 2011 * Verifying N+1 Memory redundancy
  Mon Apr 4 02:32:50 2011 - ERROR: node node2: not enough memory to accomodate instance failovers should node node1 fail (33536MiB needed, 27910MiB available)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-