- Jul 15, 2010
-
-
Michael Hanselmann authored
This new opcode and gnt-debug sub-command test some aspects of the job queue, including the status of a job. The bug fixed in commit 2034c70d was identified using this test. A future patch will run this test automatically from the QA scripts. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 08, 2010
-
-
Apollon Oikonomopoulos authored
Add a cluster parameter to hold the iallocator that will be used by default when required and no alternative (manually-specified iallocator or manually-specified node(s)) is given. Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
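
The fallback order described in this commit can be sketched as below. This is only an illustration: `choose_placement` and its parameter names are hypothetical and not taken from the Ganeti code, and the allocator name in the usage line is just an example value.

```python
def choose_placement(iallocator=None, nodes=None, default_iallocator=None):
    """Pick how to place an instance (hypothetical helper).

    Manual choices win; the cluster-wide default iallocator is used
    only when neither an iallocator nor nodes were specified.
    """
    if iallocator:
        return ("iallocator", iallocator)
    if nodes:
        return ("nodes", list(nodes))
    if default_iallocator:
        return ("iallocator", default_iallocator)
    raise ValueError("no iallocator or nodes given and no cluster default set")


# No manual choice given, so the cluster default is used.
placement = choose_placement(default_iallocator="hail")
```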
-
- Jul 06, 2010
-
-
Luca Bigliardi authored
Signed-off-by:
Luca Bigliardi <shammash@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 01, 2010
-
-
René Nussbaumer authored
This allows instance rename without a DNS check, as is already possible for instance add. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jun 23, 2010
-
-
Iustin Pop authored
All code has been switched to the new-style LU… time for cleanup. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
We move the instance OS rename checks earlier, as we need to run the validation against the new OS, if it has changed. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
We use _GetUpdatedParams in order to support removal too, and then validate the OS parameters if the OS exists. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This is not yet complete, as it lacks proper support for instance import. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Guido Trotter authored
If the repetition count is not passed or is passed as 0 we sleep exactly one time, otherwise we sleep "repeat" times and log in between. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
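
A minimal sketch of the repetition rule described in this commit, assuming a hypothetical `run_delay` helper (the real LU and its parameter handling are not shown in this log). The `sleep` and `log` callables are injectable purely so the logic can be exercised without waiting.

```python
import time


def run_delay(duration, repeat=0, sleep=time.sleep, log=print):
    """Sleep once if repeat is unset or 0, else sleep `repeat` times.

    Returns the number of sleeps performed (hypothetical helper).
    """
    if not repeat:
        sleep(duration)
        return 1
    for i in range(repeat):
        if i:
            # Logged between consecutive sleeps, never before the first.
            log("completed sleep %d of %d" % (i, repeat))
        sleep(duration)
    return repeat
```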
-
- May 18, 2010
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
To prepare a remote export, the X509 key and certificate need to be generated. A handshake value is also returned for an easier check whether both clusters share the same cluster domain secret. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Apr 16, 2010
-
-
Balazs Lecz authored
Signed-off-by:
Balazs Lecz <leczb@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Balazs Lecz authored
Signed-off-by:
Balazs Lecz <leczb@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Apr 12, 2010
-
-
Iustin Pop authored
When importing an instance, all the saved values will be used as explicitly specified values, overriding the cluster defaults. This means export+import will change the status (from default to explicitly specified) of parameters. This patch adds a new option that changes the behaviour to identify parameter values which are equal to the current cluster defaults and mark them as such. It does this for hv, be and nic parameters. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
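
The idea of identifying defaults can be sketched as follows. The function name and the dictionaries are hypothetical; the sketch only assumes that saved parameters and cluster defaults can both be viewed as plain key/value mappings.

```python
def strip_cluster_defaults(saved, cluster_defaults):
    """Keep only saved values that differ from the cluster defaults.

    Values equal to the current default are dropped, so they are
    treated as "default" again rather than as explicitly specified.
    """
    explicit = {}
    for key, value in saved.items():
        if key in cluster_defaults and cluster_defaults[key] == value:
            continue
        explicit[key] = value
    return explicit


# Illustrative backend parameters; only vcpus differs from the default.
explicit_be = strip_cluster_defaults(
    {"memory": 128, "vcpus": 2},
    {"memory": 128, "vcpus": 1},
)
```

The same filtering would apply separately to each parameter group the commit mentions (hv, be and nic).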
-
- Apr 08, 2010
-
-
Iustin Pop authored
This will be used to conditionally enable the watcher node maintenance feature. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Mar 17, 2010
-
-
Iustin Pop authored
This is a simple patch that adds the no-install mode for instance creation, allowing the actual OS to be imported from a foreign source (instead of requiring the preparation of data in a form expected by the import scripts). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This patch modifies LUSetInstanceParms to allow OS name changes, without reinstallation, in case an OS gets renamed on-disk. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Mar 15, 2010
-
-
Iustin Pop authored
This patch adds a new mode to instance modify, the changing of the disk template. For now only plain to drbd conversion is supported, and the new secondary node must be specified manually (no iallocator support). The procedure for conversion works as follows:
  - a completely new disk template is created, matching the count, size and mode of the instance's current disks
  - we create manually (not via _CreateDisks) all the missing volumes
  - we rename on the primary the LVs to the new name
  - we create manually the DRBD devices
Failures during the creation of volumes will leave orphan volumes. Failure during the rename might leave some disks renamed and some not, leading to an inconsistent instance. Once the disks are renamed, we update the instance information and wait for resync. Any failures of the DRBD sync must be manually handled (like a normal failure, e.g. by running replace-disks, etc.). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Mar 10, 2010
-
-
Iustin Pop authored
The upcoming Python 2.6.5 release has a change that makes delattr(obj, attr) fail for slots-enabled objects if the attr is not already set. To guard against this, we only run the delattr if the attribute is already set. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
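
The guard described in this commit can be illustrated with a small sketch. The `Opcode` class and `clear_attr` helper are hypothetical stand-ins, not the Ganeti code.

```python
class Opcode(object):
    """Hypothetical slots-enabled object, for illustration only."""
    __slots__ = ["name", "timeout"]


def clear_attr(obj, attr):
    """Delete attr only if it is currently set; return True if deleted."""
    if hasattr(obj, attr):
        delattr(obj, attr)
        return True
    return False


op = Opcode()
op.name = "test"
removed_name = clear_attr(op, "name")        # was set, so it is deleted
removed_timeout = clear_attr(op, "timeout")  # never set, delattr is skipped
```

Without the `hasattr` check, the second call would raise `AttributeError`, because the `timeout` slot was never assigned.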
-
- Mar 09, 2010
-
-
Iustin Pop authored
The current code in LUSetNodeParms regarding the demotion from master candidate role is complicated and duplicates the code in ConfigWriter, where such decisions should be made. Furthermore, we still cannot demote nodes (not even with force), if other regular nodes exist. This patch adds a new opcode attribute ‘auto_promote’, and changes the decision tree as follows:
  - if the node will be set to offline or drained or explicitly demoted from master candidate, and this parameter is set, then we lock all nodes in ExpandNames()
  - later, in CheckPrereq(), if the node is indeed a master candidate, and the future state (as computed via GetMasterCandidateStats with the current node in the exception list) has fewer nodes than it should, and we didn't lock all nodes, we exit with an exception
  - in Exec, if we locked all nodes, we do an AdjustCandidatePool() run, to ensure nodes are locked as needed (we do it before updating the node to remove a warning, and so that, if the LU fails between these, we're not left with an inconsistent state)
Note that in Exec we run the AdjustCP irrespective of any node state change (just based on lock status), so we might simplify the CheckPrereq even more by not checking the future state, basically requiring auto_promote/lock_all for master candidates, since the case where we have more than needed master candidates is rarer; OTOH, this would prevent manual promotion ahead of time of another node, which is why I didn't choose this way. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
This patch implements all modifications to support per-os-hypervisor parameters in the framework. Signed-off-by:
René Nussbaumer <rn@google.com> Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 22, 2010
-
-
Iustin Pop authored
We add this as a new opcode since we don't want to alter the behaviour of current opcodes/lus. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 12, 2010
-
-
Michael Hanselmann authored
This will be useful for instance moves. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 11, 2010
-
-
Iustin Pop authored
Also automatically fix opcodes which have this missing in the LU init routine. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 10, 2010
-
-
Iustin Pop authored
Commit 154b9580 changed (correctly) the __slots__ usage, but this broke dumpers/loaders since we relied directly on the own class __slots__ field. To compensate, we introduce a simple function for computing the slots across all parent classes (if any), and use this instead of __slots__ directly. Note: the _all_slots() function is duplicated between objects.py and opcodes.py, but the only other option is to introduce a lang.py for such very basic language items. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
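
One way to compute slots across all parent classes is sketched below. This is not the `_all_slots()` function from the commit, whose body is not shown here; the `all_slots` helper and the two classes are hypothetical, and the sketch assumes every `__slots__` is declared as a list or tuple rather than a single string.

```python
class Base(object):
    __slots__ = ["name"]


class Child(Base):
    __slots__ = ["size"]  # only the additional slots


def all_slots(cls):
    """Collect __slots__ declared by cls and every parent class.

    Each class's own __dict__ is inspected, so an inherited __slots__
    is not counted twice for classes that do not redeclare it.
    """
    slots = []
    for parent in reversed(cls.__mro__):
        slots.extend(parent.__dict__.get("__slots__", []))
    return slots


child_slots = all_slots(Child)
```

Reading `Child.__slots__` directly yields only `["size"]`, which is why a dumper that relies on it alone would miss `name`.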
-
- Feb 09, 2010
-
-
Iustin Pop authored
This patch adds an early_release parameter in the OpReplaceDisks and OpEvacuateNode opcodes, allowing earlier release of storage and more importantly of internal Ganeti locks. The behaviour of the early release is that any locks and storage on all secondary nodes are released early. This is valid for change secondary (where we remove the storage on the old secondary, and release the locks on the old and new secondary) and replace on secondary (where we remove the old storage and release the lock on the secondary node). Using this, on a three node setup:
  - instance1 on nodes A:B
  - instance2 on nodes C:B
it is possible to run in parallel a replace-disks -s (on secondary) for instances 1 and 2. Replace on primary will remove the storage, but not the locks, as we use the primary node later in the LU to check consistency. It is debatable whether to also remove the locks on the primary node, and thus making replace-disks keep zero locks during the sync. While this would allow greatly enhanced parallelism, let's first see how removal of secondary locks works. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jan 27, 2010
-
-
Balazs Lecz authored
According to http://docs.python.org/reference/datamodel.html#slots:
  * The action of a __slots__ declaration is limited to the class where it is defined. As a result, subclasses will have a __dict__ unless they also define __slots__ (which must only contain names of any /additional/ slots).
  * If a class defines a slot also defined in a base class, the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class). This renders the meaning of the program undefined. In the future, a check may be added to prevent this.
Signed-off-by:
Balazs Lecz <leczb@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Signed-off-by:
Iustin Pop <iustin@google.com>
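
The first rule quoted from the Python documentation can be demonstrated with a short sketch. The class names are hypothetical and chosen only for illustration.

```python
class Base(object):
    __slots__ = ["name"]


class WithoutSlots(Base):
    pass  # no __slots__ here, so instances get a __dict__


class WithSlots(Base):
    __slots__ = ["size"]  # only the additional slot is listed


loose = WithoutSlots()
loose.anything = 1  # accepted: stored in the instance __dict__

strict = WithSlots()
try:
    strict.anything = 1
    rejected = False
except AttributeError:
    rejected = True  # no __dict__, and "anything" is not a slot
```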
-
- Dec 16, 2009
-
-
Iustin Pop authored
This adds a new opcode parameter ‘name_check’ (similar to ip_check) that is not required to be present (to ease backwards compatibility for tools). It also adds a CheckArguments to LUCreateInstance and changes the workflow related to instance IP checks and NIC initialisation based on it. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Nov 03, 2009
-
-
Iustin Pop authored
A newer version of pylint, more warnings… Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Nov 02, 2009
-
-
Iustin Pop authored
Currently the repair storage has two issues:
  - down instances are aborting the operation, even though they should be ignored (it's not technically possible to know their disk status unless we would activate their disks)
  - if the VG is so broken that disks cannot be activated via gnt-instance activate-disks or gnt-instance startup, it's not possible to repair the VG at all
The patch makes the opcode skip down instances and also introduces an ``--ignore-consistency`` flag for forcing the execution of the LU. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 13, 2009
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Guido Trotter authored
All the LUs that shut down the instance need to be able to pass the timeout parameter as well. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 09, 2009
-
-
Guido Trotter authored
Using the new --timeout option:
  - gnt-instance shutdown is changed to accept a timeout
  - the opcode is changed to hold one
  - the LU is changed to optionally get one
  - the rpc is changed to carry one
  - the backend is changed to take it as a parameter rather than hardcoding it in the function
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Oct 05, 2009
-
-
Guido Trotter authored
These two opcodes need to know whether an unknown variant must be forced through or not. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Olivier Tharan <olive@google.com>
-
- Sep 17, 2009
-
-
Iustin Pop authored
One of the issues we have in ganeti is that it's very hard to test the error-handling paths; QA and burnin only test the OK code-path, since it's hard to simulate errors. LUVerifyCluster is special amongst the LUs in the fact that a) it has a lot of error paths and b) the error paths only log the error, they don't do any rollback or other similar actions. Thus, it's enough for this LU to separate the testing of the error condition from the logging of the error condition. This patch does this by replacing code blocks of the form:
    if x:
      log_error()
      [y]
into:
    log_error_if(x)
    [if x:
      y
    ]
After this change, it's simple enough to turn on logging of all errors by adding a special case inside log_error_if such that if the incoming opcode has a special ‘debug_simulate_errors’ attribute and it's true, it will log unconditionally the error. Surprisingly this also turns into an absolute code reduction, since some of the if blocks were simplified. The only downside to this patch is that the various _VerifyX() functions are now stateful (modifying an attribute on the LU instance) instead of returning a boolean result. Last note: yes, this discovered some error cases in the logging. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
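
The pattern from this commit can be sketched in a self-contained form. The `Verifier` class, its `simulate_errors` flag (standing in for the opcode's ‘debug_simulate_errors’ attribute) and the `verify_node` check are all hypothetical; the real LU is considerably larger.

```python
class Verifier(object):
    """Hypothetical stand-in for an LU whose error paths only log."""

    def __init__(self, simulate_errors=False):
        self.simulate_errors = simulate_errors
        self.bad = False      # state updated by the checks, not returned
        self.messages = []

    def log_error_if(self, cond, msg):
        # With simulation enabled, every error path logs unconditionally.
        cond = bool(cond) or self.simulate_errors
        if cond:
            self.messages.append(msg)
            self.bad = True
        return cond

    def verify_node(self, free_mb):
        # The condition and the logging are a single call, with no
        # separate "if x: log_error()" block.
        self.log_error_if(free_mb < 100, "node is low on memory")


healthy = Verifier()
healthy.verify_node(4096)

simulated = Verifier(simulate_errors=True)
simulated.verify_node(4096)
```

With the same healthy input, only the simulated run records the message, which is what makes the logging path testable without a real failure.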
-
Iustin Pop authored
Currently the output of cluster verify can be parsed for 'ERROR' messages, but that is the only indication we get (error or no error). In order to allow monitoring tools to separate different error conditions, this patch introduces a new output format (“gnt-cluster verify --error-codes”) that changes the output from human-friendly to machine-friendly. In this mode, an error line changes from:
    ERROR: node node1: drbd minor 1 of instance inst1 is not active
to:
    ERROR:ENODEDRBD:node:node1:drbd minor 1 of instance inst1 is not active
i.e. the error message is a ‘:’-separated list of fields, with ERROR in the first place, the error code in the second, the object type (cluster, node, instance) in the third, the name of the object (for nodes/instances) in the fourth, and then the text message. The patch also removes some of the verbosity of the operation (“Verifying instance X”, “Verifying node X”) since on big clusters these informational messages can quickly fill up an entire screen. The original behaviour can be restored via the ‘--verbose’ option. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
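
A monitoring tool could split the machine-friendly format roughly as sketched below. The `VerifyError` tuple and `parse_error_line` function are hypothetical names; the field order follows the description in the commit message, and the sketch assumes the name field is always present (possibly empty for cluster-level errors).

```python
from collections import namedtuple

VerifyError = namedtuple(
    "VerifyError", "level code obj_type obj_name message")


def parse_error_line(line):
    """Parse one --error-codes line, or return None if it does not match.

    The split is limited to four separators so that any ':' inside the
    free-text message is preserved.
    """
    fields = line.rstrip("\n").split(":", 4)
    if len(fields) != 5 or fields[0] != "ERROR":
        return None
    return VerifyError(*fields)


sample = "ERROR:ENODEDRBD:node:node1:drbd minor 1 of instance inst1 is not active"
parsed = parse_error_line(sample)
```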
-
- Aug 24, 2009
-
-
Iustin Pop authored
This patch adds a basic version of LUMoveInstance. It doesn't yet support iallocator-mode and it's implemented in old-style (non-TL) mode. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Aug 17, 2009
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-