- Jul 26, 2009
-
-
Guido Trotter authored
Signed-off-by:
Guido Trotter <ultrotter@google.com>
-
- Jul 25, 2009
-
-
Guido Trotter authored
With three ganeti daemons, and one or two more coming, the daemon's main function started becoming too much cut&pasted code. Collapsing most of it in a daemon.GenericMain function. Some more code could be collapsed between the two http-based daemons, but since the new daemons won't be http-based we won't do it right now. As a bonus a functionality for overriding the network port on the command line for all network based nodes is added. Signed-off-by:
Guido Trotter <ultrotter@google.com>
-
- Jul 24, 2009
-
-
Guido Trotter authored
The <DAEMON>_PID constants were created to reference a daemon pid file, but actually contain a daemon's name, because the various functions that work with pidfiles abstract the filename from the daemon name themselves. Removing the constants and using the actual daemon name constants in their place. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Guido Trotter authored
The original LOG_<DAEMON_NAME> constants for daemon logfiles are gone. In their place there is a DAEMONS_LOGFILES dict, indexed by daemon name. This is a minor change with the objective to uniform most of the daemon's main() functions code, which is very similar one to the other. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Guido Trotter authored
GetNodeDaemonPort is used to lookup the node daemon port in the services file, and if not found to return the default one. We make it a generic function, which accepts the daemon name in input, so that it can be used by confd as well, to lookup its own udp port. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jul 23, 2009
-
-
Guido Trotter authored
Various modules set it to True when called in debugging mode, but the utils module supports no such global. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jul 22, 2009
-
-
Guido Trotter authored
On machines without the ssl file noded exists '5'. Changing this to constants.EXIT_NOTCLUSTER. Also utils.GetNodeDaemonPort hasn't risen errors.ConfigurationError for a while, so removing that try/except block. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jul 08, 2009
-
-
Guido Trotter authored
When the parameter is set to True and start_daemons is also True, ganeti-masterd will be started with the new --no-voting --yes-do-it options. This new option is set to True only on masterfailover, when no_voting is used. This changed the behavior from 2.0, where we didn't start the master daemon at all, when this option was used. The manpage is also updated to remove the 2.0 only change. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jun 29, 2009
-
-
Iustin Pop authored
There are volume-related rpc calls. This patch renames the ‘volume_list’ call to ‘lv_list’ to make more clear its purpose. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jun 15, 2009
-
-
Iustin Pop authored
Since now all functions fail via _Fail, the return True, … is redundant as all normal return paths have it, and thus the True value can be added in the ganeti-noded handler. This means that all functions can now forget about the special result type, and instead return normally, but signal all failures via _Fail(). Only a few functions must be handled specially (the recursive ones). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
Since all rpc calls were converted, we can now: - enforce result type to (status, data) - convert all unhandled exceptions to (False, str(err)) This makes sure that all unhandled errors are reported to rpc users. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
Currently the OSes have a special, customized error handling: the OS object can represent either a valid OS, or an invalid OS. The associated function, instead of raising other exception or failing, create custom OS objects representing failed OSes. While this was good when no other RPC had failure handling, it's extremely different from how other function in backend.py expect failures to be signalled. This patch reworks this completely: - the OS object always represents valid OSes (the next patch will remove the valid/invalid field and associated constants) - the call_os_diagnose returns instead of a list of OS objects, a list of (name, path, status, diagnose_msg); the status is then used in cmdlib to determine validity and the status and diagnose_msg values are used in gnt-os for display - call_os_get returns either a valid OS or a RPC remote failure (with the error message) - the other functions in backend.py now just call backend.OSFromDisk() which will return either a valid OS object or raise an exception - the bulk of the OSFromDisk was moved to _TryOSFromDisk which returns status, value for the functions which don't want an exception raised The gnt-os list and diagnose commands still work after this patch. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This patch converts the job queue rpc calls to the new style result. It's done in a single patch as there are helper function (in both jqueue and backend) that are used by multiple rpcs and need synchronized change. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This also removes custom post-processing from rpc.py; since this call has only one user, it was simple to move it back to the caller. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This also cleans up its single use in cmdlib.py. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This patch converts this rpc call to the new style result, and also changes in the process the meaning of the QuitGanetiException's arguments and the node daemon rpc call exception handler. The problem with the exception handler is that we used a two-stage one, and the inner used to catch all exception (including this one), so in the logs we always had an exception logged, instead of the normal 'leaving cluster message'. The patch also adds logging of the exception's arguments, so that we have a trail in the logs about the shutdown mode. The exception's arguments were reversed from the normal RPC results style. While it makes somewhat more sense for this exception, we change them such that they match the rpc result format. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This should actually have a function in backend, but it's fine for now. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
Since backend.GetInstanceList() is used both as RPC endpoint and as internal function, it can't return (status, value). Instead it returns only valid instance info, and failures are denoted by exceptions; and the ganeti-noded function adds the (True,) status. The patch also fixes a typo. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This is a big change, because we need to cleanup its users too. The call and thus LUVerifyDisks LU used to differentiate between failure at node level and failure at LV level, by returning different types in the RPC result. This is way too complicated for our needs. The patch changes to new style result (easy change), and then: - changes LUVerifyDisks.Exec() to return a tuple of 3-elements instead of 4-elements; we collapse the «nodes not reachable» and «nodes with LVM errors» in a single dict - changes gnt-cluster to parse 3-element results and simplifies the different by-error handling code Note that the status is added in ganeti-noded, and not in the function itself, as the function is used in other places too. This was tested with down nodes and broken VGs. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This also removes some code from ganeti-noded and rpc.py, which should not do such processing of data (and be simply glue code). (Or alternatively they could, if we had better infrastructure). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jun 09, 2009
-
-
Iustin Pop authored
This patch adds a simple failure reporting tool, similar to bdev's _ThrowError. In backend, we move towards the new-style RPC results (of type (status, payload)) and thus functions which use this style can very easily log and return the error message using this new function. The exception is declared here and not in errors.py since it's local to the node-daemon/backend combination. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- May 27, 2009
-
-
Iustin Pop authored
This (somewhat big) patch adds support for remotely rebooting the nodes via whatever support the hypervisor has for such a concept. For KVM/fake (and containers in the future) this just uses sysrq plus a ‘reboot’ call if the sysrq method failed. For Xen, it first tries the above, and then Xen-hypervisor reboot (we first try sysrq since that just requires opening a file handle, whereas xen reboot means launching an external utility). The user interface is: # gnt-node powercycle node5 Are you sure you want to hard powercycle node node5? y/[n]/?: y Reboot scheduled in 5 seconds The node reboots hopefully after sending the reply. In case the clock is broken, “time.sleep(5)” might take ages (but then I suspect SSL negotiation wouldn't work). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- May 06, 2009
-
-
Guido Trotter authored
Sometimes reinstalls are slightly different than new installs. For example certain partitions may need to be preserved accross reinstalls. In order to do that on a per-os basis we pass in the INSTANCE_REINSTALL variable to inform the create script about when a reinstall is happening. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- May 05, 2009
-
-
Guido Trotter authored
This allows ganeti-noded to bind only on one interface rather than all the ones on the machine. The default behaviour doesn't change. Signed-off-by:
Guido Trotter <ultrotter@google.com>
-
- Feb 27, 2009
-
-
Guido Trotter authored
Some hypervisors (KVM) need RUN_GANETI_DIR to exist even at cluster init time. This patch creates it in InitCluster just before hv parameter checking. Since the code to make list of directories is already repeated twice in the code, and this would be the third time, we abstract it into an utils.EnsureDirs function and we call that one from ganti-noded, ganeti-masterd and bootstrap. Reviewed-by: iustinp
-
- Feb 24, 2009
-
-
Iustin Pop authored
This patch removes the extra_args parameter and instead switches the instance to the HV_KERNEL_ARGS hypervisor option. This is a big change, but it's a needed cleanup, this extra parameter on all RPC calls is not generic and we also need to have a persistent value here. Reviewed-by: imsnah
-
- Feb 12, 2009
-
-
Iustin Pop authored
This patch changes the return type from this RPC call to include status information and renames the backend method to match the RPC call name. The patch is a little bigger than the reboot one, since this call is used in more than one place. However, all the points of call have the same usage pattern, so the patch is trivial. Reviewed-by: ultrotter
-
Iustin Pop authored
This small patch changes the return type from this RPC call to include status information and renames the backend method to match the RPC call name. Reviewed-by: ultrotter
-
- Feb 11, 2009
-
-
Guido Trotter authored
We need this directory for locks, so if for any reason it's not there we'll create it. The permissions are the standard /var/lock permissions. Reviewed-by: iustinp
-
- Feb 09, 2009
-
-
Iustin Pop authored
Currently, the names of the functions in backend.py that are actually RPC procedures and are called from ganeti-noded are not corresponding to the RPC names. This makes it hard to actually see which functions are exported and which functions are internal to backend. This patch renames all blockdevice-related functions in backend.py match the name of the RPC call (without the ‘call’ or ‘perspective’ prefix). This should make it easier to grep for a given function called in cmdlib, without having to open and check in ganet-inoded what backend function it corresponds to. The patch also does two minor extra cleanups (rename a variable and change a logging level). Reviewed-by: ultrotter
-
Iustin Pop authored
This patch converts the call_blockdev_find - which searches for block devices and returns their status - to the (status, data) format. We also modify the backend function name to match the rpc call. Reviewed-by: ultrotter
-
Iustin Pop authored
This patch fixes the error handling in the add OS to instance function with regard to invalid OSes. Previously, we didn't handle any such errors, with the end result that the user would have to look in the node daemon log. The patch also renames the name of the function to match the RPC call name. Reviewed-by: ultrotter
-
- Jan 21, 2009
-
-
Guido Trotter authored
Currently the hypervisor is expected to do all the migration from the source side. With this patch we also add the option of passing some information to the target side, and starting some operation there. As a bonus, a function to cleanup any started operation is included. Reviewed-by: iustinp
-
- Jan 13, 2009
-
-
Iustin Pop authored
This is a modified forward-port of DrbdNetReconfig and their associated RPCs. In Ganeti 2.0, these functions will be used for two things: - live migration (as in 1.2) - and for other network reconfiguration tasks, since DRBD8.Attach() doesn't do them anymore Because of the Attach() changes, we can now implement the AttachNet/DisconnectNet functions as independent entities, and we don't need the cache anymore. Note these functions are copies of the latest 1.2 code, and not cherry-picks of the (many) patches that went into 1.2. Reviewed-by: ultrotter
-
- Jan 09, 2009
-
-
Iustin Pop authored
The current fork+close fds sequence has deficiencies which are hard to work around: - logging can start logging before we fork (e.g. if we need to emit messages related to master checking), and thus use FDs which we can't track nicely - the queue locks the queue file, and again this fd needs to be kept open which is hard from the main loop (and this error is currently hidden by the fact that we don't log it) Given the above, it's much simpler, in case we will fork later, to close file descriptors right at the beginning of the program, and in Daemonize only close/reopen the stdin/out/err fds. In addition, we also close() the handlers we remove in SetupLogging so that the cleanup is more thorough. Reviewed-by: imsnah
-
- Jan 08, 2009
-
-
Iustin Pop authored
This is a forward-port of commit 1194 on the 1.2 branch: This call will check whether an instance is up on its primary, and that it has been started with symlinks. We currently have no on-secondary checks, nor any hypervisor specific call. Reviewed-by: iustinp The difference from the original patch is that we don't include the cmdlib changes, since those will come as a copy from the 1.2 cmdlib.py, and not as individual patches. Original-Author: ultrotter
-
- Jan 07, 2009
-
-
Iustin Pop authored
This is an extract of commit 1166 on the 1.2 branch (Add a rpc call for drbd network reconfiguration), but only the blockdev_close part. The patch changes the blockdev_close call to take the instance so that it can remove the symlinks of the instance. Originally-Reviewed-by: imsnah
-
- Dec 18, 2008
-
-
Michael Hanselmann authored
Reviewed-by: ultrotter
-
- Dec 11, 2008
-
-
Iustin Pop authored
This patch should fix all outstanding epydoc parsing errors; as such, we switch epydoc into verbose mode so that any new errors will be visible. Reviewed-by: imsnah
-
- Dec 05, 2008
-
-
Iustin Pop authored
This patch adds a simple rpc which makes a backup of the config file and then removes it. This is done so that cluster verify doesn't complain immediately after demoting a node. Reviewed-by: imsnah
-