- Aug 03, 2009
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 31, 2009
-
-
Michael Hanselmann authored
It migrates all primary instances from the node to their secondaries. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jul 22, 2009
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jul 17, 2009
-
-
Iustin Pop authored
This patch converts the opcode loading to a pre-built map (at import time) instead of iteration over the globals dict at each call. Microbenchmarks show that this should be around three times faster, and burnin still passes. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
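As an illustration of the change described above, here is a minimal sketch of a lookup map built once at import time. The class names, the OP_ID strings and the names OP_MAPPING and load_opcode are assumptions made for this example, not taken from the patch itself.

```python
class OpCode:
    """Base class; every concrete opcode sets a unique OP_ID."""
    OP_ID = None


class OpStartInstance(OpCode):
    OP_ID = "OP_INSTANCE_STARTUP"   # illustrative ID


class OpStopInstance(OpCode):
    OP_ID = "OP_INSTANCE_SHUTDOWN"  # illustrative ID


def _build_op_map():
    """Scan the module globals once and index opcode classes by OP_ID."""
    return {
        obj.OP_ID: obj
        for obj in list(globals().values())
        if isinstance(obj, type)
        and issubclass(obj, OpCode)
        and obj.OP_ID is not None
    }


# Built a single time at import, instead of walking globals() on every call.
OP_MAPPING = _build_op_map()


def load_opcode(op_id):
    """Return the opcode class for op_id via a plain dict lookup."""
    try:
        return OP_MAPPING[op_id]
    except KeyError:
        raise ValueError("Unknown opcode ID %r" % (op_id,))
```

Each call then costs one dictionary lookup rather than a scan over every module-level name, which is where a speed-up of the kind mentioned would come from.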
-
- Jun 19, 2009
-
-
Iustin Pop authored
This patch adds a new (global) opcode flag 'dry_run' which, when True, causes early exit from the LU workflow, returning a special value from the LU object (initialized in the parent LogicalUnit class, and None unless overridden in child LUs). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
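The dry-run behaviour described above could look roughly like the sketch below. The attribute name dry_run_result, the process() helper and the point at which the workflow exits (here, after the prerequisite check) are assumptions for illustration only.

```python
class OpCode:
    """Toy opcode carrying the global dry_run flag plus arbitrary fields."""

    def __init__(self, dry_run=False, **fields):
        self.dry_run = dry_run
        self.__dict__.update(fields)


class LogicalUnit:
    def __init__(self, op):
        self.op = op
        # Value returned on a dry run; None unless a child LU overrides it.
        self.dry_run_result = None

    def CheckPrereq(self):
        """Validate the request; a no-op in this sketch."""

    def Exec(self):
        raise NotImplementedError


class LUStartInstance(LogicalUnit):
    def __init__(self, op):
        super().__init__(op)
        self.dry_run_result = "would start %s" % op.instance_name
        self.started = False

    def Exec(self):
        self.started = True
        return "started %s" % self.op.instance_name


def process(lu):
    """Run the LU workflow, stopping before Exec when dry_run is set."""
    lu.CheckPrereq()
    if lu.op.dry_run:
        return lu.dry_run_result
    return lu.Exec()
```

With this shape, a dry run still exercises the checks but never reaches the code that changes cluster state.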
-
Iustin Pop authored
This simple patch makes every opcode extend the base opcode's __slots__ instead of defining its own list from scratch. This way we can add slots across all opcodes, for example 'dry-run'. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
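A minimal sketch of the slot-extension pattern follows. The field names instance_name and force, and the keyword-checking constructor, are assumptions for the example; only the idea of extending the base opcode's __slots__ comes from the message above.

```python
class BaseOpCode:
    __slots__ = []

    def __init__(self, **kwargs):
        # Only names listed in the class's (extended) __slots__ are accepted.
        for key, value in kwargs.items():
            if key not in self.__slots__:
                raise TypeError("%s has no parameter %r"
                                % (type(self).__name__, key))
            setattr(self, key, value)


class OpCode(BaseOpCode):
    # A slot added here becomes available on every opcode below.
    __slots__ = BaseOpCode.__slots__ + ["dry_run"]


class OpStartInstance(OpCode):
    # Extend the parent's list rather than replacing it.
    __slots__ = OpCode.__slots__ + ["instance_name", "force"]


op = OpStartInstance(instance_name="inst1", dry_run=True)
```

Had OpStartInstance listed only its own two fields, the constructor check above would reject dry_run, which is the problem the extension avoids.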
-
- Jun 08, 2009
-
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- May 27, 2009
-
-
Iustin Pop authored
This (somewhat big) patch adds support for remotely rebooting the nodes via whatever support the hypervisor has for such a concept. For KVM/fake (and containers in the future) this just uses sysrq plus a ‘reboot’ call if the sysrq method failed. For Xen, it first tries the above, and then Xen-hypervisor reboot (we first try sysrq since that just requires opening a file handle, whereas xen reboot means launching an external utility).

The user interface is:

  # gnt-node powercycle node5
  Are you sure you want to hard powercycle node node5?
  y/[n]/?: y
  Reboot scheduled in 5 seconds

The node reboots hopefully after sending the reply. In case the clock is broken, “time.sleep(5)” might take ages (but then I suspect SSL negotiation wouldn't work). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- May 19, 2009
-
-
Iustin Pop authored
This patch modifies the start instance script, opcode and logical unit to support temporary startup parameters. Unlike 1.2, where only the kernel arguments could be changed (which made the feature xen-pvm specific), this version supports changing all hypervisor and backend parameters (with appropriate checks). This is much more flexible, and allows for example:

  - start with a different, temporary kernel
  - start with a different memory size

Note: in later versions, this should be extended to cover disk parameters as well (e.g. start with drbd without flushes, start with drbd in async mode, etc.). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Feb 24, 2009
-
-
Iustin Pop authored
This patch removes the extra_args parameter and instead switches the instance to the HV_KERNEL_ARGS hypervisor option. This is a big change, but it's a needed cleanup: this extra parameter on all RPC calls is not generic, and we also need to have a persistent value here. Reviewed-by: imsnah
-
- Feb 10, 2009
-
-
Iustin Pop authored
This patch adds LU and cli-level support for modification of the node drained flag. It is similar to the offline changes. Reviewed-by: imsnah
-
- Feb 06, 2009
-
-
Iustin Pop authored
This patch fixes a couple of issues with the job listing:

  - in case of a non-existing job, nicely raise 404 instead of 500
  - in the job detail listing, also list the job log, the job timestamps, etc.
  - the opcode migrate instance was missing its description field

Reviewed-by: imsnah
-
- Feb 04, 2009
-
-
Iustin Pop authored
This patch adds the framework for, and enables, lockless OpQueryInstances. This means that instances will be shown in ERROR_up or ERROR_down state, even though this is not an error (but just an in-progress job).

The framework is implemented as follows:

  - the OpQueryInstances, OpQueryNodes and OpQueryExports opcodes take an additional “use_locking” flag which denotes whether to lock or not; this patch only implements this for LUQueryInstances
  - the luxi query functions take an additional argument use_locking which is passed to the master daemon, and then passed to the above opcodes
  - cli.py exports a new SYNC_OPT command line option which sets this flag to true
  - except for gnt-instance list, which uses this option, and for name-only queries (e.g. QueryNodes(fields=["names"])), all other callers set this flag to True
  - RAPI also sets the flag to True

The patch was tested with a continuous (0.2s sleep in-between) gnt-instance list during a burnin, and no problems were observed. Reviewed-by: ultrotter
-
- Jan 20, 2009
-
-
Iustin Pop authored
Reviewed-by: ultrotter
-
- Jan 13, 2009
-
-
Iustin Pop authored
This is a forward port via copy (rather than cherry-picking individual patches) of the latest migration-related code on the 1.2 branch. The changes compared to 1.2 are that we no longer need the IdentifyDisks step (the drbd rpc calls are independent now), plus the rpc module improvements. Reviewed-by: ultrotter
-
- Jan 12, 2009
-
-
Iustin Pop authored
This LU can be used to force a push of the config in case it's needed, for example after an upgrade to update the ssconf_release_version file. Reviewed-by: imsnah
-
- Dec 08, 2008
-
-
Iustin Pop authored
This patch changes gnt-node modify and the associated opcode/lu to allow modification of the node offline attribute. Setting a node into offline mode automatically demotes it from the master role. Reviewed-by: ultrotter
-
- Dec 02, 2008
-
-
Iustin Pop authored
This patch adds a new cluster parameter "candidate_pool_size" which tracks the desired size of the list of nodes with the master_candidate flag set. Reviewed-by: imsnah
-
Iustin Pop authored
This patch adds the OpCode, LogicalUnit and gnt-node command for modifying node parameters, more specifically the master candidate flag for a node. Reviewed-by: imsnah
-
- Nov 25, 2008
-
-
Iustin Pop authored
This big patch adds support for:

  - changing NIC/disks in the multi-device model
  - adding/removing NICs
  - adding/removing disks

The patch is big and not very nice; the error checking paths are not very clear. The biggest problem is that from a simple instance.ATTR=VAL change (which didn't throw errors before) now we are creating and removing disks in this LU. Reviewed-by: imsnah
-
- Nov 24, 2008
-
-
Guido Trotter authored
Since the hypervisor is instance dependent we'll get one on instance creation, and use the one in the instance config on relocation. Reviewed-by: iustinp
-
- Nov 20, 2008
-
-
Iustin Pop authored
This patch adds support for multi-disk/multi-nic in:

  - instance add
  - burnin

The start/stop/failover/cluster verify work as expected. Replace disk and grow disk are TODO. There's also a change to gnt-job to allow dictionaries to be listed in gnt-job info. Reviewed-by: imsnah
-
- Oct 16, 2008
-
-
Iustin Pop authored
This patch enables the cluster modify to change:

  - enabled hypervisor list
  - hvparams (per hypervisor)
  - beparams (only the default group)

Syntax:

  gnt-cluster modify -B vcpus=3 -H xen-pvm:no_initrd_path

Validation for parameters is somewhat missing: the individual hypervisors will be checked for syntax and validation, but beparams doesn't have validation yet (anywhere); it should be added here once we have a global method (will come soon). Reviewed-by: imsnah
-
- Oct 14, 2008
-
-
Iustin Pop authored
The patch adds a new ‘--no-wait-for-sync’ parameter to grow-disk similar to the one in instance add, and changes the default to wait. This is cleaner as at the moment when the command returns, we either have a fully synced disk or there is an error. This is a forward-port of rev 1183 on the 1.2 branch. Reviewed-by: ultrotter
-
Iustin Pop authored
This big patch changes the master code to use the beparams. Errors might have crept in, but it passes a small burnin. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch adds a new '-s' parameter to ‘gnt-instance info’ that makes it return only 'static' information. This is much faster, especially for drbd instances. This is a forward-port of rev 1570 on the ganeti-1.2 branch, resending due to some conflicts. Reviewed-by: imsnah
-
Iustin Pop authored
Reviewed-by: imsnah
-
Iustin Pop authored
This big patch changes instance create to the new hvparams structure. Old parameters are removed, so old jobs or old instances file will break current clusters. Reviewed-by: ultrotter
-
- Oct 08, 2008
-
-
Iustin Pop authored
This (big) patch moves the hypervisor type from the cluster to the instance level; the cluster attribute remains as the default hypervisor, and will be renamed accordingly in a next patch. The cluster also gains the ‘enable_hypervisors’ attribute, and instances can be created with any of the enabled ones (no provision yet for changing that attribute). The many many changes in the rpc/backend layer are due to the fact that all backend code read the hypervisor from the local copy of the config, and now we have to send it (either in the instance object, or as a separate parameter) for each function. The node list by default will list the node free/total memory for the default hypervisor, a new flag to it should exist to select another hypervisor. Instance list has a new field, hypervisor, that shows the instance hypervisor. Cluster verify runs for all enabled hypervisor types. The new FIXMEs are related to IAllocator, since now the node total/free/used memory counts are wrong (we can't reliably compute the free memory). Reviewed-by: imsnah
-
- Oct 01, 2008
-
-
Michael Hanselmann authored
This can be used to retrieve certain cluster config values from within clients. OpDumpClusterConfig was not used anywhere, hence I'm just reusing it. The way ConfigWriter.DumpConfig returned the configuration was not thread-safe, anyway (no deepcopy). Reviewed-by: iustinp
-
Iustin Pop authored
The watcher has one last use of ganeti commands as opposed to sending requests via luxi. The patch changes this to use the cli functions. The patch also has two other changes:

  - fix the docstring for OpVerifyDisks (found out while converting this)
  - enable stderr logging on the watcher when “-d” is passed

Reviewed-by: imsnah
-
- Sep 29, 2008
-
-
Iustin Pop authored
It is not currently possible to show a summary of the job in the output of “gnt-job list”. The closest is listing the whole opcode(s), but that is too verbose. Also, the default output (id, status) is not very useful, unless one looks for (and knows about) an exact job ID.

The patch adds a “summary” description of a job composed of the list of OP_ID of the individual opcodes. Moreover, if an opcode has a ‘logical’ target in a certain opcode field (e.g. start instance has the instance name as the target), then it is included in the formatting also. It's easier to explain via a sample output:

  gnt-job list
  ID Status  Summary
   1 error   NODE_QUERY
   2 success NODE_ADD(gnta2)
   3 success CLUSTER_QUERY
   4 success NODE_REMOVE(gnta2.example.com)
   5 error   NODE_QUERY
   6 success NODE_ADD(gnta2)
   7 success NODE_QUERY
   8 success OS_DIAGNOSE
   9 success INSTANCE_CREATE(instance1.example.com)
  10 success INSTANCE_REMOVE(instance1.example.com)
  11 error   INSTANCE_CREATE(instance1.example.com)
  12 success INSTANCE_CREATE(instance1.example.com)
  13 success INSTANCE_SHUTDOWN(instance1.example.com)
  14 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
  15 error   INSTANCE_CREATE(instance2.example.com)
  16 error   INSTANCE_CREATE(instance2.example.com)
  17 success INSTANCE_CREATE(instance2.example.com)
  18 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
  19 success INSTANCE_ACTIVATE_DISKS(instance2.example.com)
  20 success INSTANCE_SHUTDOWN(instance1.example.com)
  21 success INSTANCE_SHUTDOWN(instance2.example.com)

This is done by a simple change to the opcode classes, which allows an opcode to format itself. The additional function is small enough that it can go in opcodes.py, where it could also be used by a client if needed. Reviewed-by: imsnah
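One way an opcode could format itself for such a summary is sketched below. The attribute name OP_DSC_FIELD, the Summary() and job_summary() names, the "OP_" prefix on the IDs and the comma used to join several opcodes are assumptions for illustration; the message above only states that the summary is built from the OP_ID plus an optional target field.

```python
class OpCode:
    OP_ID = "OP_ABSTRACT"
    # Name of the field holding the opcode's logical target, if any.
    OP_DSC_FIELD = None

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def Summary(self):
        """Return the OP_ID without its OP_ prefix, plus the target if set."""
        text = self.OP_ID
        if text.startswith("OP_"):
            text = text[len("OP_"):]
        if self.OP_DSC_FIELD is not None:
            target = getattr(self, self.OP_DSC_FIELD, None)
            text = "%s(%s)" % (text, target)
        return text


class OpQueryNodes(OpCode):
    OP_ID = "OP_NODE_QUERY"


class OpCreateInstance(OpCode):
    OP_ID = "OP_INSTANCE_CREATE"
    OP_DSC_FIELD = "instance_name"


def job_summary(opcodes):
    """Summarise a job as the joined summaries of its opcodes."""
    return ",".join(op.Summary() for op in opcodes)
```

Because the formatting lives on the opcode class, both the master side and a client holding the same classes can produce identical summaries.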
-
- Sep 01, 2008
-
-
Guido Trotter authored
It was already allowed in gnt-instance modify, but ignored. It will be used to force skipping parameter checks. This is a forward-port from branches/ganeti-1.2. Original-Reviewed-by: imsnah Reviewed-by: iustinp
-
- Aug 29, 2008
-
-
Alexander Schreiber authored
Add HVM device type flags 2/3 Reviewed-by: ultrotter
-
- Aug 08, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
- Jul 30, 2008
-
-
Iustin Pop authored
This (big) patch reworks the master startup/shutdown and fixes the master failover. What does the patch do?

For master start/stop:

  - removes the old ganeti-master script and its associated man page
  - moves the ip start/stop directly into the backend.(Start|Stop)Master
  - adds start/stop of the master/rapi daemon into these functions, selectively based on the start/stop arguments
  - makes the master call via rpc StartMaster(start_daemons=False) to the local node so that the master IP is started
  - and finally changes the example init.d script to directly start and stop all three daemons, since they do the right thing (depending on master/not master role)

For master failover:

  - moves the code from LUMasterFailover into bootstrap.MasterFailover, since we need to start/stop the master during this operation and thus it can't be executed from the master
  - removes the LUMasterFailover and its associated opcode

Notes: Ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not master' are not seen during startup on non-master nodes. Reviewed-by: ultrotter
-
- Jul 15, 2008
-
-
Iustin Pop authored
Reviewed-by: imsnah
-
Iustin Pop authored
Since we don't have for now a job definition object anymore, we rename this class to BaseOpCode. It's still useful (and not merged with OpCode) since it holds all the 'pure' logic (no custom field handling, etc.) whereas OpCode holds opcode specific data (OP_ID handling, etc). The patch also fixes the module's docstring. Reviewed-by: imsnah
-
- Jul 09, 2008
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
- Jun 23, 2008
-
-
Iustin Pop authored
Since forking was disabled in the master daemon, the two ssh-based subcommands have not been working. However, there is no need at all for the commands to be run from the master daemon (permissions to read the cluster private ssh key notwithstanding); they can be run directly from the command line utilities. The patch removes the two opcodes OpRunClusterCommand and OpClusterCopyFile (and their associated LUs) and changes the code in ‘gnt-cluster’ to query the list of nodes and run the SshRunner directly over that list. As such, all forking is done from the gnt-cluster script, and the commands are working again. Reviewed-by: imsnah
-