- Oct 05, 2011
-
-
Andrea Spadaccini authored
Treat the gnt-cluster verify errors identified by the error codes in --ignore-errors as warnings; just print a warning message for the user. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
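The demotion described in this commit can be pictured roughly as follows. This is a minimal sketch in modern Python, not the code from the patch: the helper names, the severity strings and the example error codes are assumptions made for illustration only.

```python
def classify_verify_finding(ecode, message, ignore_errors):
    """Return (severity, text) for one cluster-verify finding.

    Hypothetical helper: a finding whose error code was listed in
    --ignore-errors is reported as a warning instead of an error.
    """
    severity = "WARNING" if ecode in ignore_errors else "ERROR"
    return severity, "%s: %s: %s" % (severity, ecode, message)


def verify_succeeded(findings, ignore_errors):
    """A run fails only if at least one finding is still an error."""
    return all(
        classify_verify_finding(ecode, msg, ignore_errors)[0] == "WARNING"
        for ecode, msg in findings
    )
```

With this shape, an ignored code still produces a visible message for the user, but it no longer influences the overall result.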
-
Andrea Spadaccini authored
lib/cli.py - add IGNORE_ERROR_OPT
client/gnt_cluster.py - pass the ignore_errors parameter to the opcodes
lib/opcodes.py - update OpClusterVerifyConfig, OpClusterVerify and OpClusterVerifyGroup to accept the ignore_errors parameter
lib/cmdlib.py - pass the ignore_errors parameter to the opcodes that need it
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Andrea Spadaccini authored
- move the cluster verify error codes from cmdlib._VerifyErrors to constants;
- add to each of them the CV (Cluster Verify) prefix;
- add the CV_ALL_ECODES and CV_ALL_ECODES_STRINGS constants;
- wrap the lines that exceed 80 characters after changing the error code names to the new ones.
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Andrea Spadaccini authored
Change 5a8648eb changed the order of the return values of backend.GetMasterInfo(). This broke the users of the master_info RPC. This change restores the original order, and adds a comment in bootstrap.py about the new value added to the return values of master_info. Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Andrea Spadaccini authored
Add the master_netmask cluster parameter, which represents the netmask of the master IP, encoded as a CIDR suffix. This parameter can be set via the --master-netmask option of gnt-cluster init and gnt-cluster modify. The default stays consistent with the old behaviour (/32 for IPv4 and /128 for IPv6). Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
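The default described above (/32 for IPv4, /128 for IPv6) corresponds to the maximum prefix length of each address family. A minimal sketch using Python's standard ipaddress module follows. The function names are hypothetical and this is not the Ganeti implementation. The accepted range of 1 up to the maximum prefix length is also an assumption.

```python
import ipaddress


def default_master_netmask(master_ip):
    """Return the default CIDR suffix for the given master IP.

    Hypothetical helper: 32 for an IPv4 address, 128 for an IPv6 one.
    """
    return ipaddress.ip_address(master_ip).max_prefixlen


def is_valid_master_netmask(master_ip, netmask):
    """Check that a CIDR suffix fits the address family of master_ip.

    Assumption: suffixes from 1 up to the family's maximum are accepted.
    """
    return 0 < netmask <= ipaddress.ip_address(master_ip).max_prefixlen
```

For example, `default_master_netmask("192.0.2.1")` gives the IPv4 default, while an IPv6 address such as `"2001:db8::1"` gives the IPv6 default.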
-
Andrea Spadaccini authored
Add the following methods to netutils.IPAddress:
* ValidateNetmask
* GetClassFromIpVersion
* GetClassFromIpFamily
Also, add related tests to the test suite.
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Oct 04, 2011
-
-
Michael Hanselmann authored
If a cluster has any non-master-candidate nodes, those don't contain all files (e.g. config.data). With commit aef59ae7 (March 31st, 2011) the logic was changed and subsequently verifying a cluster with non-mc nodes would complain. This patch fixes this issue by changing the algorithm. It also adds an additional check for files which shouldn't exist on a machine. A newly added unittest is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
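The two checks mentioned in this commit (files that are missing, and files that should not exist on a machine) might look roughly like the sketch below. It is an illustration under stated assumptions, not the algorithm from the patch: the function name, its parameters and the file name `known_hosts` are hypothetical; only `config.data` comes from the commit message.

```python
def check_node_files(present, is_master_candidate, files_all, files_mc):
    """Return (missing, unexpected) file lists for a single node.

    present:   files the node reports having
    files_all: files every node is expected to have
    files_mc:  files only master candidates are expected to have
    """
    present = set(present)
    expected = set(files_all)
    if is_master_candidate:
        expected |= set(files_mc)

    missing = sorted(expected - present)
    # Assumption: only master-candidate-only files count as "should not
    # exist" when found on a node that is not a master candidate.
    if is_master_candidate:
        unexpected = []
    else:
        unexpected = sorted(present & set(files_mc))
    return missing, unexpected
```

A non-master-candidate node that lacks `config.data` is therefore not reported as broken, while one that still carries it is flagged.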
-
- Oct 03, 2011
-
-
Michael Hanselmann authored
This reverts commit 34aa8b7c. Writing error messages to stderr would also include backtraces, something we tried to avoid in the past. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Commit 64c7b383 changed the RPC call for verifying SSH connections. Unfortunately this case in adding nodes was missed. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Sep 30, 2011
-
-
Michael Hanselmann authored
When verifying a group the code would always check SSH to all nodes in the same group, as well as the first node for every other group. On big clusters this can cause issues since many nodes will try to connect to the first node of another group at the same time. This patch changes the algorithm to choose a different node every time. A unittest for the selection algorithm is included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
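One way to spread the cross-group SSH checks is to rotate through the nodes of each other group, so that consecutive nodes of the verified group get different targets. The sketch below uses a round-robin over sorted node names. It is an assumption about how such a selection could work, with hypothetical names, and is not taken from the patch itself.

```python
import itertools


def select_ssh_check_nodes(group_nodes, other_groups):
    """Map each node of the verified group to its cross-group SSH targets.

    group_nodes:  names of the nodes in the group being verified
    other_groups: dict mapping group name to its list of node names

    Each other group contributes one target per checking node, chosen
    round-robin so the load is not concentrated on its first node.
    """
    cyclers = [
        itertools.cycle(sorted(names))
        for _, names in sorted(other_groups.items())
        if names  # skip empty groups, cycling over nothing would fail
    ]
    return {
        node: sorted(next(cycler) for cycler in cyclers)
        for node in sorted(group_nodes)
    }
```

With three nodes in the verified group and two nodes in another group, the targets alternate between the two remote nodes instead of all pointing at the first one.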
-
Iustin Pop authored
In the case we submit many pending jobs (> 100) to the masterd, the JobExecutor 'spams' the master daemon with requests for the status of all the jobs, even though in the end it will only choose a single job for polling. This is very sub-optimal, because when the master is busy processing small/fast jobs, this query forces reading all the jobs from disk.

Restricting the 'window' of jobs that we query from the entire set to a smaller subset makes a huge difference (masterd only, 0s delay jobs, all jobs to tmpfs thus no I/O involved):
- submitting/waiting for 500 jobs:
  - before: ~21 s
  - after: ~5 s
- submitting/waiting for 1K jobs:
  - before: ~76 s
  - after: ~8 s

This is with a batch of 25 jobs. With a batch of 50 jobs, it goes from 8s to 12s. I think that choosing the 'best' job for nice output only matters with a small number of jobs, and that for more than that people will not actually watch the jobs. So changing from 'perfect job' to 'best job in the first 25' should be OK.

Note that most jobs won't execute as fast as 0 delay, but this is still a good improvement.
Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
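The 'best job in the first 25' idea can be sketched as follows. This is an illustrative outline, not the JobExecutor code: the function name, the `query_status` callback and the status strings are assumptions.

```python
def choose_job_to_poll(pending_ids, query_status, window=25):
    """Pick one job to poll, looking only at the first `window` jobs.

    pending_ids:  job IDs in submission order
    query_status: callable taking a list of IDs and returning their
                  statuses in the same order (stands in for the query
                  sent to the master daemon)
    """
    if not pending_ids:
        return None

    # Only this slice is sent to the master daemon, not the whole set.
    candidates = list(pending_ids[:window])
    statuses = query_status(candidates)

    # Prefer a job that is already doing something, so the output is
    # interesting; otherwise fall back to the oldest pending job.
    for wanted in ("running", "waiting"):
        for job_id, status in zip(candidates, statuses):
            if status == wanted:
                return job_id
    return candidates[0]
```

The cost of each selection is then bounded by the window size rather than by the total number of submitted jobs.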
-
Andrea Spadaccini authored
lib/client/gnt_cluster.py:
* Add activate-master-ip and deactivate-master-ip commands
man/gnt-cluster.rst:
* Document the new commands
lib/opcodes.py lib/cmdlib.py
* Add two opcodes and the LU that call the relevant RPCs
test/docs_unittest.py
* Silence an error about RAPI not implemented for the two new opcodes
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> (cherry picked from commit fb926117) Conflicts: test/docs_unittest.py - kept devel-2.5 version, without the RAPI opcode checks
-
Andrea Spadaccini authored
lib/backend.py
* split StartMaster() into ActivateMasterIp() and StartMasterDaemons()
* split StopMaster() into DeactivateMasterIp() and StopMasterDaemons()
lib/server/noded.py, lib/rpc.py
* adapt the call chains to the new functions, define new RPCs
lib/bootstrap.py, lib/cmdlib.py, lib/server/masterd.py
* use the new RPCs
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> (cherry picked from commit fb460cf7)
-
Andrea Spadaccini authored
lib/client/gnt_cluster.py:
* Add activate-master-ip and deactivate-master-ip commands
man/gnt-cluster.rst:
* Document the new commands
lib/opcodes.py lib/cmdlib.py
* Add two opcodes and the LU that call the relevant RPCs
test/docs_unittest.py
* Silence an error about RAPI not implemented for the two new opcodes
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Andrea Spadaccini authored
lib/backend.py
* split StartMaster() into ActivateMasterIp() and StartMasterDaemons()
* split StopMaster() into DeactivateMasterIp() and StopMasterDaemons()
lib/server/noded.py, lib/rpc.py
* adapt the call chains to the new functions, define new RPCs
lib/bootstrap.py, lib/cmdlib.py, lib/server/masterd.py
* use the new RPCs
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
When “gnt-cluster copyfile” failed it would only print “Copy of file … to node … failed”. A detailed message is written using logging.error. Writing error messages to stderr can be helpful in figuring out what went wrong (the messages also go to the log file, but not everyone might know about it). Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Sep 29, 2011
-
-
Andrea Spadaccini authored
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Andrea Spadaccini authored
* hypervisor/hv_kvm.py
  - parse the memory transfer status
* cmdlib.py
  - represent memory transfer info, if available
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Andrea Spadaccini authored
To add status reporting for the KVM migration, the instance_migrate RPC must be non-blocking. Moreover, there must be a way to represent the migration status and a way to fetch it.

* constants.py:
  - add constants representing the migration statuses
* objects.py:
  - add the MigrationStatus object
* hypervisor/hv_base.py
  - change the FinalizeMigration method name to FinalizeMigrationDst
  - add the FinalizeMigrationSource method
  - add the GetMigrationStatus method
* hypervisor/hv_kvm.py
  - change the implementation of MigrateInstance to be non-blocking (i.e. do not poll the status of the migration)
  - implement the new methods defined in BaseHypervisor
* backend.py, server/noded.py, rpc.py
  - add methods to call the new hypervisor methods
  - fix documentation of the existing methods to reflect the changes
* cmdlib.py
  - adapt the logic of TLMigrateInstance._ExecMigration to reflect the changes
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
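Once the migration call no longer blocks, the caller has to poll for the status itself. The loop below is a minimal sketch of that pattern in modern Python. The status constants, the fields of the status object and the function name are assumptions for illustration; the commit only states that a MigrationStatus object and a GetMigrationStatus method exist.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical status values; the real constants live in constants.py.
MIGRATION_ACTIVE = "active"
MIGRATION_COMPLETED = "completed"
MIGRATION_FAILED = "failed"


@dataclass
class MigrationStatus:
    """Illustrative stand-in for the status object (fields assumed)."""
    status: str
    transferred_ram: Optional[int] = None
    total_ram: Optional[int] = None


def wait_for_migration(get_status, report, interval=1.0, sleep=time.sleep):
    """Poll get_status() until the migration is no longer active.

    Progress is reported only when memory transfer figures are available.
    Returns the final status object (completed or failed).
    """
    while True:
        current = get_status()
        if current.status != MIGRATION_ACTIVE:
            return current
        if current.transferred_ram is not None and current.total_ram:
            percent = 100.0 * current.transferred_ram / current.total_ram
            report("memory transfer progress: %.0f%%" % percent)
        sleep(interval)
```

Passing `sleep` and `report` as parameters keeps the loop easy to exercise in a unit test without real delays.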
-
Andrea Spadaccini authored
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 28, 2011
-
-
Andrea Spadaccini authored
* hv_kvm.py, hv_xen.py
  - return the hypervisor version (if available) from GetNodeInfo
* cmdlib.py
  - if hypervisor version is available during the migration, and the versions differ, warn the user
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
The change to enforce boolean results for the cluster verify group opcode missed the HooksCallBack, which uses a very ugly 1/0 logic. Furthermore, the logic is wrong, since it unconditionally resets the verify result to true. The patch is changed to simply treat hook failures as failures, and do nothing for offline nodes. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
In the context of the lock monitor a “pending” item does not yet own the requested resource. Since these HTTP requests are already under way, they should be shown as owners. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
With this change a node name instead of the IP address can be shown for pending RPC requests:

Name                               Pending
rpc/node18.example.com/test_delay  thread:Jq1/Job692/TEST_DELAY

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Not all requests use an instance of RpcRunner yet and therefore won't show up (only instances have access to the global Ganeti context). Currently only the IP address is accessible. Another patch will add a nicer name for requests.

Example output (gnt-debug locks -o name,pending):

Name                       Pending
rpc/192.0.2.18/test_delay  thread:Jq12/Job683/TEST_DELAY

Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
This simplifies HttpClientPool.ProcessRequests significantly and will be handy for showing pending RPC requests in the lock monitor. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
This reverts to the old behaviour in Ganeti 2.4 and before. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Sep 27, 2011
-
-
Michael Hanselmann authored
Call dict.values once instead of N times. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
- Clearly separate node name to IP address resolution into separate functions
- Simplified code structure (one code path instead of several)
- Fully unittested
- Preparation for more RPC improvements
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
No need to keep it in the class. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Instead of having one RPC runner per mcpu processor this will keep only one instance as part of the masterd-wide Ganeti context. Upcoming patches will change the RPC runner to report pending requests to the lock manager. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Sep 26, 2011
-
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-