- Apr 28, 2008
-
-
Iustin Pop authored
Currently the iallocator execution takes place in the master, which is a violation of the current architecture, and will create problems with a threaded master daemon. This patch moves the execution to the backend, similar to the hooks runner, by:
  - introducing a new class that handles the execution in the backend (and could also be used for listing the allocators, etc.)
  - introducing a new rpc call
  - replacing the actual execution in IAllocator.Run() with an rpc call
This passes burnin with the dumb allocator.
Reviewed-by: imsnah
-
- Apr 10, 2008
-
-
Iustin Pop authored
This patch changes the definition of a job and introduces per-opcode results. First, the result and status fields of a job are condensed into a single 'status' attribute. Then, we introduce an opcode status and one result list, that allow jobs to return values. The gnt-job script is also modified to allow these new fields to be queried. Note that the patch changes the opcode field to op_list, and it changes its return value from string to a list of (serialized) opcodes. Reviewed-by: ultrotter
-
- Apr 05, 2008
-
-
Iustin Pop authored
This patch adds checks for the master role and daemonize support to ganeti-masterd. The patch modifies the startup/shutdown of the server because:
  - we want bind()/listen() to the master socket to occur before forking so that we can return a correct exit code and write messages to stderr
  - but we want thread startup to occur after fork(), otherwise python threading gets confused
The patch also has some small cleanups:
  - remove the unix socket after closing it, so we don't need to remove it manually
  - instead of just telling the threads to terminate via the new_queue, we also join() them so that the logs show which thread is clinging to life
  - the daemon logs to its own logfile now
  - there is command line parameter support :)
Reviewed-by: imsnah
-
Manuel Franceschini authored
Reviewed-by: ultrotter
-
- Apr 04, 2008
-
-
Iustin Pop authored
This patch adds a very basic gnt-job script that allows job querying. This goes on top of the previous master daemon patches. Currently, because the cmd lock has not been changed, you can't query the jobs as long as a job is running: you have to rm the cmd lock and then you can query the jobs. Reviewed-by: imsnah
-
Iustin Pop authored
Currently, in ganeti-noded we have the createDaemon function. Since we'll need the same in other daemons, we move this function to utils.py. With the move, a few changes were also made:
  - change the name to Daemonize()
  - add a parameter, logfile, as different daemons will want to log to different files
  - remove the try/except blocks around the fork calls, since they were only re-raising the OS exception with less data; unless we want to actually handle fork errors (not just re-raise them), these try blocks are not useful
  - change the return style at the end of the function
Reviewed-by: imsnah
-
- Apr 01, 2008
-
-
Iustin Pop authored
This patch adds a very in-progress master daemon. This needs to be launched manually, does not background itself, but can be used for opcode execution. Also parts of this code should be moved to luxi.py. Reviewed-by: ultrotter
-
- Mar 27, 2008
-
-
Iustin Pop authored
This patch just removes an extraneous \n from the log message, making it nicer to view. Reviewed-by: schreiberal
-
- Mar 19, 2008
-
-
Iustin Pop authored
Currently, in order to deal with a tmpfs /var/run, we create the BDEV_CACHE_DIR in the init script. However, that does not cover all the cases, and it's not a proper place to deal with it: for example, dealing with uninitialized clusters and the master node is more complicated. Therefore, this patch does the following:
  - make ganeti-noded create the directory automatically
  - make ganeti-noded error out if it can't create it or it's already there but not a directory
  - remove the creation from the init.d script
Reviewed-by: ultrotter
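The directory handling described above can be sketched roughly as follows. This is a minimal illustration in modern Python, not the code from the patch: the function name ensure_dir, the default mode and the choice of RuntimeError are all assumptions made for the example.

```python
import os


def ensure_dir(path, mode=0o755):
    """Create path if it is missing (hypothetical helper, not Ganeti's API).

    Raises RuntimeError if the path exists but is not a directory.
    Any other OSError from mkdir (e.g. permission denied) propagates,
    so the caller can error out at startup.
    """
    try:
        os.mkdir(path, mode)
    except FileExistsError:
        # Already there: acceptable only if it really is a directory.
        if not os.path.isdir(path):
            raise RuntimeError("%s exists but is not a directory" % path)
```

A daemon would call such a helper once at startup, before any code that writes into the directory, and exit with an error if it raises.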
-
- Mar 11, 2008
-
-
Iustin Pop authored
This patch modifies TcpPing and its callers to make the source address selection optional. Usually the kernel knows better which source address to use; only in some cases do we want to enforce a given source address, so it makes sense to make this optional. Reviewed-by: ultrotter
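As an illustration of an optional source address, here is a minimal sketch in modern Python. The function name tcp_ping, its parameter names and its defaults are assumptions for this example and are not the signature of Ganeti's TcpPing.

```python
import socket


def tcp_ping(target, port, timeout=10, source=None):
    """Return True if a TCP connection to (target, port) succeeds.

    If source is None the kernel picks the source address; otherwise
    the socket is bound to the given address before connecting.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source is not None:
            # Port 0 lets the kernel choose an ephemeral source port.
            sock.bind((source, 0))
        sock.connect((target, port))
        return True
    except OSError:
        # Covers refused connections, timeouts and bind failures.
        return False
    finally:
        sock.close()
```

Callers that do not care about the source address simply omit the last argument.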
-
- Feb 22, 2008
-
-
Iustin Pop authored
This patch switches the inter-node protocol from twisted to simple BaseHTTPServer/httplib. The patch has more deletions because we use no authentication and no encryption at all. As such, this is just for trunk, and only for testing. What it brings is the ability to use the rpc library from within multiple threads in parallel (or at least it should). Since the changes are very few and non-intrusive, they can be reverted without impacting the rest of the code. This passes burnin. QA was not tested. Reviewed-by: imsnah
-
- Feb 05, 2008
-
-
Iustin Pop authored
This can be used for testing purposes. Reviewed-by: ultrotter,imsnah
-
- Dec 12, 2007
-
-
Iustin Pop authored
This patch modifies the watcher to run the ‘gnt-cluster verify-disks’ command and to log its output (if any). Reviewed-by: imsnah
-
- Dec 03, 2007
-
-
Michael Hanselmann authored
  - When line wrapping is needed, move spaces to the next line.
  - Remove embedded line breaks from error messages.
Reviewed-by: schreiberal
-
- Nov 29, 2007
-
-
Iustin Pop authored
This patch adds logging of command failures to the debug log in case the user either started the command (gnt-*) or the node daemon with the debug flag. Reviewed-by: imsnah
-
- Nov 13, 2007
-
-
Michael Hanselmann authored
  - Use constants for keys.
  - Fix bug through which automatic instance restarts wouldn't be limited.
Reviewed-by: iustinp
-
- Nov 05, 2007
-
-
Guido Trotter authored
In order to do this, for simplicity we leave the OSFromDisk function as-is and we convert the eventual exception to an OS object in ganeti-noded. The unmangling gets simplified and so does the code for checking whether the OS is valid. Reviewed-By: iustinp
-
Guido Trotter authored
The functions in ganeti-noded and rpc.py still deal with the fact that an InvalidOS error could be returned by DiagnoseOS. As this is no longer the case, simplify their code for the current behavior. Reviewed-By: iustinp
-
- Nov 02, 2007
-
-
Iustin Pop authored
Currently, troubleshooting DRBD problems involves a manual process of going backwards from the DRBD device to the instance that owns it. This patch adds a weak (i.e. not guaranteed to be correct or up-to-date) cache of device to instance.
In normal operation the cache should hold correct information, as the only times devices change paths are when they are started or stopped, and the code in backend.py adds cache updates to exactly these operations. The only drawback of this implementation is that we don't fully update the cache on renames of devices (we clean the old entries but we don't add new ones). Since a rename changes the path only for LVs (and not for drbd and md), this is less of a problem, as the target of this code is debugging DRBD and MD issues.
The patch writes files named bdev_drbd<N> (or bdev_md<N>, bdev_xenvg_...) in /var/run/ganeti (more exactly, LOCALSTATEDIR/ganeti). The file names start with 'bdev_' and continue with the path of the device under /dev/ (this prefix stripped); the files contain the following values, space separated:
  - instance name
  - primary or secondary (depending on whether the device is on the primary or secondary node)
  - instance visible name: sda or sdb or not_visible, the latter when the device is not the top-level device (i.e. remote_raid1 templates will have sd[ab] for the md, but not_visible for drbd and logical volumes)
The cache is designed not to raise any errors; if there is an I/O error, it will only be logged in the node daemon log file. This is in order to reduce the possible impact of the cache on the block device activation and shutdown code.
Reviewed-by: imsnah
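The file naming and contents described in this commit message can be sketched as below. This is an illustrative reconstruction in modern Python, not the code from backend.py: the function names and parameters are hypothetical, and replacing the remaining slashes with underscores is an assumption inferred from the bdev_xenvg_... example.

```python
import logging
import os


def cache_file_name(cache_dir, dev_path):
    """Map a device path such as /dev/drbd0 to <cache_dir>/bdev_drbd0."""
    prefix = "/dev/"
    if dev_path.startswith(prefix):
        dev_path = dev_path[len(prefix):]
    # Assumption: nested paths (e.g. xenvg/lv1) are flattened with "_".
    return os.path.join(cache_dir, "bdev_" + dev_path.replace("/", "_"))


def update_cache(cache_dir, dev_path, instance, on_primary, visible_name=None):
    """Write 'instance role visible-name' for a device; never raise on I/O."""
    role = "primary" if on_primary else "secondary"
    if visible_name is None:
        visible_name = "not_visible"
    data = "%s %s %s\n" % (instance, role, visible_name)
    try:
        with open(cache_file_name(cache_dir, dev_path), "w") as fh:
            fh.write(data)
    except OSError as err:
        # The cache is best-effort: log the failure and carry on.
        logging.error("Can't update bdev cache for %s: %s", dev_path, err)
```

Swallowing the I/O error is deliberate here, matching the stated goal that the cache must not interfere with block device activation and shutdown.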
-
- Oct 29, 2007
-
-
Iustin Pop authored
This patch adds code for renaming a device; more precisely, for changing the unique_id of the device. This means:
  - logical volumes: rename the volume
  - drbd8: change the remote peer
This is needed for being able to replace disks for drbd8.
Reviewed-by: imsnah
-
- Oct 25, 2007
-
-
Iustin Pop authored
The two calls mirror_addchild and mirror_removechild take only one child for addition/removal. While this is enough for our md usage, for local disk replacement in drbd8, we need to be able to specify both the data and metadata device. This patch modifies these two rpc calls (and their backend implementation and their usage in cmdlib) to take a list of children to add/remove. Reviewed-by: imsnah
-
- Oct 17, 2007
-
-
Alexander Schreiber authored
This patch series implements the reboot command for gnt-instance. It supports three types of reboot: soft (hypervisor reboot), hard (instance config rebuild and reboot) and full (full instance shutdown and startup again). This patch contains the backend and rpc part of the patch. Reviewed-by: iustinp
-
- Oct 15, 2007
-
-
Iustin Pop authored
The creation of the log file for the node daemon lacks the mode parameter, so after applying the current umask, the file got 0700 permissions. Restrict this to the correct 0600. Reviewed-by: schreiberal
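The effect of the missing mode parameter can be illustrated with a short sketch in modern Python. The helper name open_log_file is hypothetical and the code is not taken from the node daemon; it relies only on os.open, whose mode argument defaults to 0o777 and is then reduced by the process umask.

```python
import os


def open_log_file(path):
    """Open (creating if needed) an append-only log file with mode 0600."""
    # Without the explicit third argument the requested mode would be
    # 0o777, which a umask of 077 turns into 0700 for a new file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    return os.fdopen(fd, "a")
```

Note that the mode only applies when the file is created; an existing log file keeps whatever permissions it already has.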
-
- Oct 10, 2007
-
-
Alexander Schreiber authored
This patch completely gets rid of fping:
  - replace all fping invocations with TcpPing calls
  - update documentation accordingly
  - associated cleanups (use constant for localhost IP, use more sensible defaults for TcpPing and _use_ those)
Reviewed-by: iustinp
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
  - Change format of watcher state file to JSON.
  - Move log path for watcher script to constants.py.
Reviewed-by: iustinp
-
- Oct 04, 2007
-
-
Michael Hanselmann authored
  - Add NEWS file with major changes between versions.
  - Bump RPC version number.
  - No longer serialize in RPC, but just convert to dict.
Old Pickle-based configuration files can be converted using the cfgupgrade utility.
Reviewed-by: iustinp, ultrotter
-
- Sep 21, 2007
-
-
Iustin Pop authored
We currently require that hostnames are FQDNs, not short names (node1.example.com instead of node1). We can allow short names as long as:
  - we always resolve the names as returned by socket.gethostname()
  - we rely on having a working resolver
These issues are not as big as they may seem, as we only did gethostname() in a few places in order to check for the master; we already required a working resolver all over the code for the other nodes' names (and thus requiring the same for the current node name is normal).
The patch moves some resolver calls from within the execution path to the checking path (which can abort without any problems). It is important that after this patch is applied, no name resolving is called from the execution path (LU.Exec() or other code that is called from within those methods), as in this case we get much better code flow.
This patch also changes the functions for doing name lookups and encapsulates all functionality in a single class.
The final change is that, by requiring a working resolver at all times, we can change the 'return None' into an exception and thus we don't have to check manually each time; only some special cases will check (ganeti-daemon and ganeti-watcher, which are not covered by the generalized exception handling in cli.py). The code is cleaner this way.
Reviewed-by: imsnah
-
Iustin Pop authored
The EXIT_NODESETUP_ERROR is a useful constant and ganeti-watcher could use it too. This patch moves it to constants.py and modifies the ganeti-master script to use it from there. Reviewed-by: imsnah
-
- Sep 17, 2007
-
-
Iustin Pop authored
This patch adds support for the instance rename operation at all remaining layers: RPC, OpCode/LU and CLI. Reviewed-by: imsnah
-
- Aug 20, 2007
-
-
Iustin Pop authored
This was forgotten when the init script was changed. Reviewed-by: imsnah
-
- Aug 14, 2007
-
-
Iustin Pop authored
This changes the raising of exceptions from "raise Exception, value" to "raise Exception(value)", as the first form will be removed in python-3000 and the second form is preferred now. The changes also involve a few cases of switching from raising standard exceptions to using our own ones. The new version also fixes many pylint-generated warnings, especially in ganeti-noded, where I changed many methods to @staticmethod. There is no functionality changed (barring any bugs).
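For readers unfamiliar with the two forms, a small sketch follows. The exception class and the function are hypothetical names invented for this example, not identifiers from the Ganeti code base.

```python
class ConfigurationError(Exception):
    """Project-specific exception (hypothetical name for this example)."""


def check_port(port):
    """Return port if it is a valid TCP port number, else raise."""
    if not 0 < port < 65536:
        # Old form, a SyntaxError in Python 3:
        #     raise ConfigurationError, "invalid port: %r" % port
        # Preferred form, valid in both Python 2 and Python 3:
        raise ConfigurationError("invalid port: %r" % (port,))
    return port
```

Using a project-specific exception class instead of a standard one lets callers catch expected failures without also swallowing unrelated errors.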
-
- Aug 03, 2007
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
- Jul 26, 2007
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
- Jul 25, 2007
-
-
Iustin Pop authored
handling, as it can be static and outside of ganeti. This also means we can get rid of a lot of infrastructure too:
  - the master/node config files checkers
  - one rpc function
-
Iustin Pop authored
Reviewed-by: imsnah
-
- Jul 24, 2007
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-
Michael Hanselmann authored
  - Create all --output options using a constant
  - Put node checking code from opcodes into a single function
  - Do the same for output fields
Reviewed-by: iustinp
-
- Jul 23, 2007
-
-
Iustin Pop authored
  - move the master node name from the ConfigWriter to SimpleStore (all nodes need this, and it was the only thing pulled in from the ConfigWriter on nodes)
  - fix mcpu.py and the testing w.r.t. this change; for testing, rename the fake_config.py to mocks.py and add a FakeSStore object
  - then add a ganeti-master script which can be run on any node at boot and which will not do anything if not master on start (on stop it will still try to remove the ip address)
  - also add a new cluster-wide variable (master_netdev) that determines on which interface we add this ip address; it's customizable at cluster init time
  - also remove the cluster name file which was separately handled from ssconf (not needed anymore)
  - remove the master init.d links from the list of config files as this is not our responsibility now
-
- Jul 18, 2007
-
-
Michael Hanselmann authored
Reviewed-by: iustinp
-