- 20 Jul, 2011 2 commits
-
-
Michael Hanselmann authored
An overview is available in the design document for this change, doc/design-chained-jobs.rst. When a job enters the job processor, the current opcode's dependencies are evaluated. If a referenced job has not yet reached the desired status, the current job is registered as a dependant. The job processor will continue to work on other pending tasks. When a job finishes it notifies any pending dependants by re-adding them to the workerpool. A per-job processor lock is necessary for rare cases where the same job can be re-added twice. There is no way to view waiting jobs at the moment, but I plan to export this information to “gnt-debug locks”. A so-called dependency manager takes care of managing waiting jobs and keeping track of their status. Unittests are included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
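The flow described in this commit message (check the dependency, register as a dependant if it is not yet satisfied, re-add dependants when the referenced job finishes) can be sketched roughly as follows. This is only an illustration and not the code from the patch: the class name, the method names, the status strings and the two callables are all hypothetical, and the per-job processor lock mentioned above is left out.

```python
from collections import defaultdict


class DependencyManager:
    """Hypothetical, simplified stand-in for the dependency manager."""

    def __init__(self, get_status, enqueue):
        # get_status: callable mapping a job ID to its current status
        # enqueue: callable that re-adds a job ID to the worker pool
        self._get_status = get_status
        self._enqueue = enqueue
        # Maps a job ID to the set of job IDs waiting for it
        self._waiters = defaultdict(set)

    def check_and_register(self, job_id, dep_job_id, wanted_statuses):
        """Return True if the dependency is satisfied.

        Otherwise the job is registered as a dependant and False is
        returned, so the caller can go on with other pending tasks.
        """
        if self._get_status(dep_job_id) in wanted_statuses:
            return True
        self._waiters[dep_job_id].add(job_id)
        return False

    def notify_finished(self, dep_job_id):
        """Re-add all jobs that were waiting for the finished job."""
        for job_id in sorted(self._waiters.pop(dep_job_id, ())):
            self._enqueue(job_id)
```

A re-added job simply evaluates its dependencies again when it next enters the processor, which is why the check and the registration live in the same call in this sketch.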
-
Michael Hanselmann authored
They're no longer necessary. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- 15 Jul, 2011 4 commits
-
-
Michael Hanselmann authored
Commit 66bd7445 added an assertion to ensure a finalized job has its “end_timestamp” attribute set. Unfortunately it didn't cover a case when the queue is recovering from an unclean master shutdown. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This will be useful for assertions. GanetiLockManager._is_owned is exported, too. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Stephen Shirley authored
The wrapper will connect to the console, and check in the background if the instance is paused, unpausing it as necessary. Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Stephen Shirley authored
The wrapper will connect to the console, and check in the background if the instance is paused, unpausing it as necessary. Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- 14 Jul, 2011 1 commit
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- 13 Jul, 2011 1 commit
-
-
Stephen Shirley authored
This fixes the lint error: E1120:1220:InstanceReboot: No value passed for parameter 'startup_paused' in function call Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- 12 Jul, 2011 3 commits
-
-
Michael Hanselmann authored
This patch allows commands to be run on and files to be copied to all nodes within a specific group. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This patch changes cli.GetOnlineNodes to use query2, which does the filtering in the master daemon, and adds a new parameter to filter by node group. Unittests were added for the old implementation and then adapted to ensure no functionality was lost. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Explicitly defining “__call__” silences a pylint warning when wrapped type check functions are used directly. I had no idea pylint is this intelligent. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- 11 Jul, 2011 3 commits
-
-
Michael Hanselmann authored
Places which receive floats can usually also deal with integers, e.g. OpTestDelay. Tests are added and the new check function is used for the aforementioned opcode and verifying query results. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Commit 009e73d0 (September 2009) changed the job queue to generate multiple job serials at once. Ever since it would return one more than requested. The “serial” file in the job queue directory is defined to contain the “last job ID used” (design-2.0). With the change above, the serial file would always contain the next serial number. The first value returned by the generating function was the one contained in the file, so during the switch in 2009 one job may have been overwritten. This patch changes the code to always return the exact number of serials, to keep the last used serial on disk and adds an assertion. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
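The invariant described in this commit message (the serial file holds the last job ID used, and exactly the requested number of serials is returned) can be sketched as below. The function name and signature are assumptions made for illustration; the real code also reads and writes the serial file, which is omitted here.

```python
def new_serials(last_used, count):
    """Return (serials, new_last_used) for a request of `count` serials.

    `last_used` is the value stored on disk, i.e. the last job ID used.
    The first serial handed out is therefore last_used + 1, never the
    stored value itself.
    """
    assert count >= 1, "At least one serial must be requested"
    serials = list(range(last_used + 1, last_used + count + 1))
    # Mirrors the assertion mentioned in the commit message: the caller
    # gets exactly as many serials as it asked for.
    assert len(serials) == count
    # The value to write back is the last serial handed out.
    return serials, serials[-1]
```

For example, with 10 stored on disk and three serials requested, this hands out 11, 12 and 13 and stores 13.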
-
Iustin Pop authored
This reverts commits 030a9cb8 and ae082df0. There are two problems:
  - Makefile.am breakage, which is trivial to revert
  - unittest breakage, which honestly I'm not sure how to fix and how serial consoles interact with the unpause helper
After the reset, the startup --paused still works but won't unpause the instance automatically (if I understood the code correctly). Furthermore, the code also fixes a style issue in hv_kvm.py (too long line) introduced by the next commit after the above two. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- 08 Jul, 2011 3 commits
-
-
Stephen Shirley authored
Creates the instance, but pauses execution before booting. This combined with 'gnt-instance console' unpausing instances means that the entire boot process can be viewed and monitored. Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Stephen Shirley authored
The wrapper will connect to the console, and check in the background if the instance is paused, unpausing it as necessary. Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Stephen Shirley authored
The wrapper will connect to the console, and check in the background if the instance is paused, unpausing it as necessary. Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- 06 Jul, 2011 1 commit
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- 05 Jul, 2011 6 commits
-
-
Michael Hanselmann authored
- Use constants and an assertion - Update documentation for node migration Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
LUNodeEvacStrategy has been replaced with LUNodeEvacuate. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
The change is not backwards compatible, see the updated NEWS file. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
By default it'll now evacuate all instances from the node, not just secondaries. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
This new opcode will replace LUNodeEvacStrategy, which used to return a list of instances and new secondary nodes. With the new opcode the iallocator (if available) is tasked to generate the necessary operations in the form of opcodes. This moves some logic from the client to the master daemon. At the same time support is added to evacuate primary instances, which are also evacuated by default. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Guido Trotter authored
Am I the only one to make that mistake 10 times a week? Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- 01 Jul, 2011 1 commit
-
-
Iustin Pop authored
There were some implicit assertions in the code that all node groups have nodes, which is not necessarily true. Additionally, the patch does a wrapping change. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- 28 Jun, 2011 2 commits
-
-
Iustin Pop authored
The new functionality in 2.4.2 for recreate-disks to change nodes is broken for DRBD instances: it simply changes the nodes without caring for the DRBD minors mapping, which will lead to conflicts in non-empty clusters. This patch changes the Exec() method of this LU significantly, to both fix the DRBD minor usage and make sure that we don't have partial modification to the instance objects:
  - the first half of the method makes all the checks and computes the needed configuration changes
  - the second half then performs the configuration changes and recreates the disks
This way, instances will either be fully modified or not at all; whether the disks are successfully recreated is another point, but at least we'll have the configuration sane. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
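The "check everything first, then modify" structure described in this commit message can be illustrated with the sketch below. It is not the LU's code: disks are modelled as plain dictionaries, the function names are hypothetical, and DRBD minor allocation is left out entirely.

```python
def plan_changes(disks, requested):
    """First half: validate all requests and compute the changes.

    Nothing is modified here, so a failure leaves `disks` untouched.
    `requested` maps a disk index to the new list of nodes.
    """
    changes = []
    for index, new_nodes in sorted(requested.items()):
        if not 0 <= index < len(disks):
            raise ValueError("Invalid disk index %s" % index)
        if len(new_nodes) != len(disks[index]["nodes"]):
            raise ValueError("Wrong number of nodes for disk %s" % index)
        changes.append((index, list(new_nodes)))
    return changes


def apply_changes(disks, changes):
    """Second half: perform the changes computed by plan_changes."""
    for index, new_nodes in changes:
        disks[index]["nodes"] = new_nodes
```

Because every check runs before the first assignment, a request that is partly invalid is rejected as a whole instead of leaving some disks modified and others not.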
-
Iustin Pop authored
Patch db8e5f1c removed the use of feedback_fn, hence pylint warns now. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- 27 Jun, 2011 2 commits
-
-
Apollon Oikonomopoulos authored
Commit 5d9bfd87 moved tap interface handling from KVM to Ganeti, partly to also solve the problem of routed interfaces getting configured too early during live migrations, causing network anomalies. In that direction, configuration of NICs of incoming instances was deferred to FinalizeMigration time. However, this causes minor issues with bridged interfaces; KVM sends out an ARP-like packet upon migration finish, which is lost because the tap interface is not yet configured. As a consequence, intermediate network equipment (i.e. switches) does not get notified about the topology change, until the instance transmits another packet after the bridge has been configured, or the switch's ARP cache expires. The proper solution to that is to support different phases in network configuration (pre/post migration), which also requires separate ifup scripts. Until then we fall back to configuring bridged interfaces on incoming instances at migration start, instead of finish. Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Currently the drbd8 replace-disks on the same node (i.e. -p or -s) has a bug in that it does modify the instance disk temporarily before changing it back to the same value. However, we don't need to, and shouldn't do that: what this operation does is simply change the LVM configuration on the node, but otherwise the instance disks keep the same configuration as before. In the current code, this change back-and-forth is fine *unless* we fail during attaching the new LVs to DRBD; in which case, we're left with a half-modified disk, which is entirely wrong. So we change the code in two ways:
  - use temporary copies of the disk children in the old_lvs var
  - stop updating disk.children
Which means that the instance should not be modified anymore (except maybe for SetDiskID, which is a legacy and unfortunate decision that will have to be cleaned up sometime). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- 23 Jun, 2011 3 commits
-
-
Guido Trotter authored
This function is a copy of bootstrap._InitFileStorage with the following differences:
  - check constants.ENABLE_SHARED_FILE_STORAGE and not constants.ENABLE_FILE_STORAGE
  - use different local variable names
  - one different error string
Thus:
  - move the constant check outside of the function call
  - change error string so it's clear where the error is
  - call the same function twice
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Guido Trotter authored
Under newer kvm this prevents the vm from starting. Ah, change! Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- 22 Jun, 2011 1 commit
-
-
Apollon Oikonomopoulos authored
When using the pool security model, _ExecuteKVMRuntime was storing the instance's UID using str(uid), which would result in storing the LockedUid.__repr__() result:

    $ cat /var/run/ganeti/kvm-hypervisor/uid/xxxxxxxxxxxxx
    <ganeti.uidpool.LockedUid object at 0x1f30610>

This patch restores the intended behaviour, by using LockedUid.AsStr(). Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
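The pitfall in this commit message is easy to reproduce in isolation: a class that defines no __str__ or __repr__ falls back to Python's default object representation. The class below is a simplified stand-in written for this illustration, not the real ganeti.uidpool.LockedUid, which also manages a lock file.

```python
class LockedUid:
    """Simplified stand-in; only carries a numeric user ID."""

    def __init__(self, uid):
        self._uid = uid

    def AsStr(self):
        # Explicit conversion, as used by the fix
        return "%s" % self._uid


uid = LockedUid(1234)

# No __str__ is defined, so this yields the default
# "<...LockedUid object at 0x...>" text rather than the number.
broken = str(uid)

# The explicit method returns the numeric ID as a string.
fixed = uid.AsStr()
```

Writing `broken` to the UID file gives exactly the kind of content shown in the commit message, while `fixed` holds the plain number.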
-
- 17 Jun, 2011 5 commits
-
-
Guido Trotter authored
This was left out during the fix/refactoring Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Guido Trotter authored
  - Move the calculation to the beginning of CheckPrereq, since it doesn't modify any state, but still keeps locks
  - Only perform the calculation if the actual disk template is file-based
  - Error out if there is no defined file storage dir
  - Only join the optional --file-storage-dir extra-path if one is passed
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Guido Trotter authored
As the manpage says, and the code does, self.op.file_storage_dir is an additional relative path under the cluster file storage dir. As such it should not be absolute. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
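The rules from this series of file storage commits (error out if no cluster file storage dir is defined, reject an absolute extra path, only join the extra path if one is passed) can be sketched as below. The function name, its parameters and the use of ValueError are assumptions made for illustration; the actual code lives in the LU and reports errors through Ganeti's own exception classes.

```python
import os.path


def join_storage_dir(cluster_dir, extra_path=None):
    """Compute an instance's file storage directory.

    `cluster_dir` is the cluster-wide file storage dir and `extra_path`
    the optional relative path given via --file-storage-dir.
    """
    if not cluster_dir:
        raise ValueError("Cluster file storage dir is not defined")
    if extra_path is None:
        # Nothing to join, use the cluster directory as-is
        return cluster_dir
    if os.path.isabs(extra_path):
        # os.path.join would silently discard cluster_dir for an
        # absolute second argument, so refuse it explicitly.
        raise ValueError("File storage directory must be a relative path")
    return os.path.join(cluster_dir, extra_path)
```

The explicit check matters because `os.path.join("/srv/ganeti/file-storage", "/etc")` returns `/etc`, which would place the instance's files outside the cluster file storage dir.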
-
- 15 Jun, 2011 1 commit
-
-
Michael Hanselmann authored
This patch removes all occurrences of the “multi-relocate” iallocator mode. Commit 25ee7fd8 updated the design document and introduced separate modes, “change-group” and “node-evacuate”. The constants aren't removed yet as they're still used by htools. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- 10 Jun, 2011 1 commit
-
-
Michael Hanselmann authored
Chained jobs need to look at previous jobs, including archived ones. A nice side-effect of this change is the ability to look at archived jobs using “gnt-job info <id>” as long as the ID is known. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-