- Apr 14, 2011
-
-
Iustin Pop authored
So that we don't accidentally break this again and leave it broken without realising it. The patch also replaces one ' with ".
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
If the cluster was upgraded from 2.4 or earlier, this key won't exist (it's only set to a correct value on cluster init), so we need to properly set it to a null string (disabled).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
This was (AFAICS) completely missing from the QA suite.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
For whatever reason, my test cluster managed to acquire shared_file_storage_dir with a None value instead of an empty string. This is not flagged in masterd itself, but the node daemon will fail when writing the value to disk, as it calls len() on the received value. Since this is a bad case, we should detect it as soon as possible (we basically shouldn't be able to set it), but in the meantime we at least prevent ssconf writes with such values.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
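A minimal sketch of the kind of guard described above, not the actual patch. It assumes the ssconf values arrive as a plain dictionary; the helper name check_ssconf_values is hypothetical.

```python
def check_ssconf_values(values):
    """Reject non-string values before they are written to disk.

    `values` maps ssconf key names to their contents (an assumption made
    for this sketch). A None value would otherwise only fail later, when
    the writer calls len() on it.
    """
    for name, value in values.items():
        if not isinstance(value, str):
            raise TypeError(
                "Invalid value for %r: expected a string, got %r" % (name, value)
            )
    return values


# An empty string means "disabled" and is accepted.
check_ssconf_values({"shared_file_storage_dir": ""})
```

Failing early with the key name in the message makes the bad value easier to trace than a len() error in the node daemon.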
-
Iustin Pop authored
It tests node add/remove secondary, rather than cluster-level N+1 checks, but it's better than nothing.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>
-
Iustin Pop authored
This patch changes the add to secondary/remove from secondary code to not deduct/add the instance's memory if the instance is not auto_balanced.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>
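The accounting rule above can be sketched as follows. This is a simplified Python model and not the htools code: the Node and Instance classes, the peer_mem field and both functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    mem: int                 # memory in MiB
    auto_balance: bool = True


@dataclass
class Node:
    name: str
    # Memory this node would need per failing primary node (assumed model).
    peer_mem: dict = field(default_factory=dict)
    secondaries: list = field(default_factory=list)


def add_secondary(node, inst, primary):
    """Register `inst` as a secondary; count its memory only if auto_balance."""
    if inst.auto_balance:
        node.peer_mem[primary] = node.peer_mem.get(primary, 0) + inst.mem
    node.secondaries.append(inst.name)


def remove_secondary(node, inst, primary):
    """Undo add_secondary, again touching memory only if auto_balance."""
    if inst.auto_balance:
        node.peer_mem[primary] = node.peer_mem.get(primary, 0) - inst.mem
    node.secondaries.remove(inst.name)
```

Keeping the add and remove paths symmetric means an instance that is not auto-balanced never changes the reserved memory in either direction.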
-
Iustin Pop authored
This also means _another_ change in the text format; we really should move to json… The unittests are also updated for the new 9-column layout and slightly improved.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>
-
Iustin Pop authored
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>
-
Iustin Pop authored
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>
-
Iustin Pop authored
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>
-
- Apr 13, 2011
-
-
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Apr 12, 2011
-
-
Iustin Pop authored
This will mirror Ganeti's be/auto_balance one, which we need to use to properly match N+1 computations.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
I duplicate the BINARY= rule in the ghc invocation in order to be able to silence the if, which was confusing. Additionally, a new target for running just the htools unit-tests is provided.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
- Apr 07, 2011
-
-
René Nussbaumer authored
This patch just cleans up the htools codebase to make it more consistent with the naming of the Ganeti codebase.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
* devel-2.4:
  LUInstanceQueryData: Don't acquire locks unless requested
  Increase the lock timeouts before we block-acquire
  daemon.py: move startup log message before prep_fn
  Display the actual memory values in N+1 failures
  ssh.VerifyNodeHostname: remove the quiet flag
  Add error checking and merging for cluster params
  RAPI: Document need for Content-type header in requests
  Fix output for “gnt-job info”
  watcher: Fix misleading usage output
  Clarify --force-join parameter message
  locking: Fix race condition in lock monitor
  utils: Export NiceSortKey function
  Revert "Only merge nodes that are known to not be offline"
  cluster-merge: only operate on online nodes
  Only merge nodes that are known to not be offline
  Treat empty oob_program param as default
  Fix bug in instance listing with orphan instances
  Fix bug related to log opening failures
  Bump version for 2.4.1 release
  cfgupgrade: Fix critical bug overwriting RAPI users file
Conflicts:
  NEWS: Trivial
  lib/opcodes.py: Added parameter descriptions, used variable for "use_locking"
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
* stable-2.4:
  Add error checking and merging for cluster params
  Clarify --force-join parameter message
  Treat empty oob_program param as default
  Fix bug in instance listing with orphan instances
  Fix bug related to log opening failures
  Bump version for 2.4.1 release
  cfgupgrade: Fix critical bug overwriting RAPI users file
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Iustin Pop authored
And default to False, like in the Python codebase.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
-
Iustin Pop authored
This allows extracting values from a JSON object that might be missing but have a well-defined default value.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
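For illustration only, the same idea in Python (the commit itself concerns the htools code, which is not reproduced here). The helper name from_obj_with_default and the sample keys are assumptions.

```python
import json


def from_obj_with_default(obj, key, default):
    """Return obj[key], or `default` when the key is absent."""
    return obj.get(key, default)


# Sample data made up for this sketch.
data = json.loads('{"name": "inst1", "memory": 512}')

# "auto_balance" is not present in the object, so the default is used.
auto_balance = from_obj_with_default(data, "auto_balance", False)
memory = from_obj_with_default(data, "memory", 128)
```

A lookup with a default keeps older inputs, which predate a field, readable without special cases at each call site.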
-
Iustin Pop authored
First, fix hs-coverage on a non-pristine tree, where the index.html file already existed, and second, disallow compilation of htools binaries if configure, for some reason, didn't enable them.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
-
- Apr 06, 2011
-
-
René Nussbaumer authored
Previously, hbal decided on the fly whether an instance was migratable. Since we implemented failover fallback in commit d5cafd31, we can start to use that instead.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Until now LUInstanceQueryData always acquired locks for the instance(s) and nodes involved. In combination with long-running operations this prevented the use of “gnt-instance info”, even with the “--static” option. With this patch, locks are only acquired when explicitly requested in the opcode (like all query operations).
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
As the prerequisite-checking code for failover is almost identical, it's an easy task to switch it over to TLMigrateInstance. This allows us to fall back to failover if migrate fails its prereq check for some reason. Please note that everything from LUInstanceFailover.Exec is taken over unchanged into TLMigrateInstance._ExecFailover, adapted only in opcode fields and variable references, not in logic. Some effort is still needed to merge the logic with the migration (for example DRBD handling), but this should happen in a separate iteration.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
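The fallback described above can be sketched as follows. This is a standalone illustration, not the TLMigrateInstance code: PrereqError, migrate_or_failover and the callables passed in are all hypothetical.

```python
class PrereqError(Exception):
    """Raised when a prerequisite check fails (stand-in for illustration)."""


def migrate_or_failover(check_migrate, migrate, failover, allow_failover=True):
    """Run `migrate` if its prereq check passes, otherwise fall back.

    All three callables are supplied by the caller; the real tasklet works
    on opcode fields instead.
    """
    try:
        check_migrate()
    except PrereqError:
        if not allow_failover:
            raise
        return failover()
    return migrate()
```

Keeping the decision in one place is what lets a single code path serve both operations.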
-
Iustin Pop authored
This has been observed to cause problems on real clusters via the following mechanism:
  - a long job (e.g. a replace-disks) is keeping an exclusive lock on an instance
  - the watcher starts and submits its query instances opcode which wants shared locks for all instances
  - after about an hour, the watcher job falls back to blocking acquire, after having acquired all other locks
  - any instance opcode that wants an exclusive lock for an instance cannot start until the watcher has finished, even though there's no actual operation on that instance
In order to alleviate this problem, we simply increase the max timeout until lock acquires are sent back to either blocking acquire or priority increase. The timeout is computed such that we wait ~10 hours (instead of one) for this to happen, which should be within the maximum lifetime of a reasonable opcode on a healthy cluster. The timeout also means that priority increases will happen every half hour.
We also increase the max wait interval to 15 seconds, otherwise we'd have too many retries with the increased interval.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
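A rough sketch of a timeout schedule with the properties described above: per-attempt waits that grow up to a 15-second cap and add up to about 10 hours before a blocking acquire. The growth formula, the 1-second minimum and the function name are assumptions for illustration, not values taken from the patch.

```python
def lock_attempt_timeouts(total=10 * 3600, min_wait=1.0, max_wait=15.0):
    """Return per-attempt timeouts (seconds) summing to at least `total`."""
    timeouts = [min_wait]
    elapsed = timeouts[0]
    while elapsed < total:
        # Grow the previous timeout a little (illustrative growth factor),
        # then clamp it between the minimum and maximum wait.
        candidate = (timeouts[-1] * 1.05) ** 1.25
        candidate = max(min_wait, min(candidate, max_wait))
        timeouts.append(candidate)
        elapsed += candidate
    return timeouts


schedule = lock_attempt_timeouts()
```

With a larger total and an unchanged cap the number of attempts grows roughly in proportion, which is why the cap is raised together with the total.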
-
Michael Hanselmann authored
The intent of this function is to be able to provide a globbing operator or query filters. One should be able to say, for example, something to the effect of “gnt-instance shutdown '*.site'”. Also rename a variable in MatchNameComponent.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
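To illustrate the idea of turning a shell-style pattern into a regular expression, here is a sketch using Python's standard fnmatch.translate in place of the function this commit adds (whose name and exact behaviour are not shown here). The instance names are made up.

```python
import fnmatch
import re


def shell_pattern_to_regex(pattern):
    """Compile a shell-style glob into a regular expression object."""
    return re.compile(fnmatch.translate(pattern))


# Hypothetical instance names.
names = ["web1.site", "db1.site", "web1.lab"]
matcher = shell_pattern_to_regex("*.site")
selected = [name for name in names if matcher.match(name)]
```

Compiling the pattern once and reusing the regex keeps matching cheap when the same glob is applied to many names.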
-
Michael Hanselmann authored
Until now “gnt-cluster verify” (LUClusterVerify) would compute its own list of files to check for consistency. This list was not complete and certain inconsistencies were missed. With this patch the code is changed to use the list of files used by LUClusterRedistConf. The new check needs to be on a whole-cluster level, and no longer per node.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
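A simplified sketch of a whole-cluster consistency check of this kind. It assumes every file is expected on every node and that each node reports a checksum per file; the function name and the data layout are hypothetical, and the real check is more nuanced.

```python
from collections import defaultdict


def find_inconsistent_files(node_checksums):
    """Compare file checksums across all nodes at once.

    `node_checksums` maps node name -> {filename: checksum}; a file that
    a node does not have is simply absent from that node's mapping.
    """
    by_file = defaultdict(lambda: defaultdict(set))
    for node, files in node_checksums.items():
        for fname, checksum in files.items():
            by_file[fname][checksum].add(node)

    all_nodes = set(node_checksums)
    problems = {}
    for fname, variants in by_file.items():
        present = set().union(*variants.values())
        missing = all_nodes - present
        if len(variants) > 1 or missing:
            problems[fname] = {
                "variants": {c: sorted(n) for c, n in variants.items()},
                "missing": sorted(missing),
            }
    return problems
```

Grouping by file rather than by node is what makes differing copies and missing copies visible in one pass.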
-
Michael Hanselmann authored
… and change the logic in _RedistributeAncillaryFiles. Virtually the same list of files will be used to verify the files' consistency.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Apr 05, 2011
-
-
Michael Hanselmann authored
It'll be implemented using OP_REGEXP by the parser.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
So far this operator was not implemented. This patch adds an additional value preparation function to the function table for binary operators, used to compile the regular expression. Unittests are included.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
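A minimal sketch of a binary-operator table with an optional value preparation step, as described above. The table layout, the operator names "equal" and "regexp" and the function compile_binary are assumptions for illustration, not the query code itself.

```python
import operator
import re

# Operator name -> (value preparation function or None, comparison function).
# The preparation function runs once, when the filter is compiled.
BINARY_OPS = {
    "equal": (None, operator.eq),
    "regexp": (re.compile, lambda value, regex: regex.search(value) is not None),
}


def compile_binary(op, field, raw_value):
    """Return a predicate testing `field` of an item against `raw_value`."""
    prepare, compare = BINARY_OPS[op]
    value = prepare(raw_value) if prepare else raw_value
    return lambda item: compare(item[field], value)


is_web = compile_binary("regexp", "name", r"^web\d+\.site$")
```

Preparing the value at compile time means the regular expression is compiled once rather than for every item the filter is applied to.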
-
- Apr 04, 2011
-
-
Michael Hanselmann authored
Commit 75c7520f used the wrong constant. I double-checked all other changes made in the commit.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Before this, the output in the rapi daemon log was:
  2011-04-04 03:09:51,026: ganeti-rapi pid=17447 INFO Reading users file at /var/lib/ganeti/rapi/users
  2011-04-04 03:09:51,027: ganeti-rapi pid=17447 INFO ganeti-rapi daemon startup
Which is confusing, as it might look like the read of the users file is part of the previous run. This is because we log the 'daemon startup' message after the prepare_fn, which can log things on its own. The patch simply moves the 'daemon startup' message just before prepare_fn call.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
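The ordering change can be sketched as follows. The function run_daemon and its parameters are hypothetical stand-ins, not the daemon.py code.

```python
import logging


def run_daemon(daemon_name, prepare_fn, exec_fn):
    """Log the startup message first, then prepare, then run."""
    # Logged before prepare_fn so that anything prepare_fn logs is
    # clearly attributed to this run.
    logging.info("%s daemon startup", daemon_name)
    prep_result = prepare_fn()
    return exec_fn(prep_result)
```

With this order the startup line acts as a marker that separates one run's log entries from the previous run's.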
-
Iustin Pop authored
This changes the display from:
  Mon Apr 4 02:29:46 2011 * Verifying N+1 Memory redundancy
  Mon Apr 4 02:29:46 2011 - ERROR: node node2: not enough memory to accomodate instance failovers should node node1 fail
To:
  Mon Apr 4 02:32:50 2011 * Verifying N+1 Memory redundancy
  Mon Apr 4 02:32:50 2011 - ERROR: node node2: not enough memory to accomodate instance failovers should node node1 fail (33536MiB needed, 27910MiB available)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
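A small sketch of how such a message could be produced. The function name and its parameters are hypothetical; the message text and the sample numbers are copied from the output quoted above, including its spelling.

```python
def n1_failure_message(node, peer, needed_mib, available_mib):
    """Return an N+1 error message, or None if enough memory is available."""
    if needed_mib <= available_mib:
        return None
    return (
        "node %s: not enough memory to accomodate instance failovers"
        " should node %s fail (%dMiB needed, %dMiB available)"
        % (node, peer, needed_mib, available_mib)
    )


message = n1_failure_message("node2", "node1", 33536, 27910)
```

Including both values shows at a glance how far the node is from passing the check.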
-
Adeodato Simo authored
Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
-
- Apr 01, 2011
-
-
René Nussbaumer authored
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
-
- Mar 31, 2011
-
-
Iustin Pop authored
This is not needed for this function, and can interfere with debugging of ssh failures.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
The new wrapper makes moving legacy code to utils.Retry or adding retries in existing code simpler.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
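To show what a thin retry wrapper of this kind might look like, here is a standalone sketch. It does not use or reproduce utils.Retry; the name simple_retry, its parameters and the injectable sleep and clock arguments are assumptions for illustration.

```python
import time


def simple_retry(expected, fn, delay, timeout, args=(),
                 sleep=time.sleep, clock=time.monotonic):
    """Call fn(*args) until it returns `expected` or `timeout` elapses.

    Returns the last result either way, so legacy code that checks a
    return value can keep doing so.
    """
    deadline = clock() + timeout
    while True:
        result = fn(*args)
        if result == expected or clock() >= deadline:
            return result
        sleep(delay)
```

Returning the last result instead of raising lets existing call sites adopt retries without changing their error handling.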
-
Iustin Pop authored
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
This exports whether htools was enabled at configure-time, and adds a constant for our reference iallocator.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
-