- Mar 23, 2012
-
-
Iustin Pop authored
Fix a typo introduced in commit c85b15c1, which breaks epydoc. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Guido Trotter authored
* stable-2.5: LUNodeAdd: Verify version in Prereq Fix LV status parsing to accept newer LVM Bump version for 2.5.0~rc6 release Revert "Stop acquiring BGL for LUXI queries" LUClusterVerifyConfig: Share BGL, acquire all locks in shared mode KVM: don't add -nographic using spice Stop acquiring BGL for LUXI queries Fix type error in LUInstanceChangeGroup Conflicts: lib/hypervisor/hv_kvm.py - trivial, keep both changes Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
There are other ways to leave the cluster in a broken state than just the version check. However they are not very trivial to fix in 2.5. So leave it up to 2.6 for a nicer fix. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> (cherry picked from commit e2ea8de1)
-
René Nussbaumer authored
There are other ways to leave the cluster in a broken state than just the version check. However they are not very trivial to fix in 2.5. So leave it up to 2.6 for a nicer fix. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
LVM version 2.02.93 (or at least, sometimes after .88) has extend the lv_attr field with two more flag; we only care about the first digit, so let's change the "!= 6" check to "< 6". Thanks to Robin H Johnson <robbat2@gentoo.org> for finding this issue. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Mar 22, 2012
-
-
Michael Hanselmann authored
This requires acquiring the node group locks in shared mode. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
The “cur_group_uuid” parameter is optional to prepare for using the factorized code from LUInstanceQueryData. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
While debugging another issue we realized that LUClusterQuery forks. This turned out to be the “platform.architecture” function from the Python library. It uses the “file” command to determine the architecture of the Python binary. This patch adds two new functions to the “runtime” module to get this information once per process instead of doing it every single time LUClusterQuery is used. Forking is a no-go in a multi-threaded environment anyway. A future change will also have to change the terminology in “gnt-cluster info”: it reports the binary architecture simply as “architecture”, when it's actually the binaries' architecture. Kernel and userland can be different. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Michael Hanselmann authored
Don't notify for every released lock in shared mode. The last one is enough. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Michael Hanselmann authored
This was already a TODO since the implementation of lock priorities in September 2010. Under certain conditions a waiting acquire can be notified at a time when it can't actually get the lock. In this case it would try and fail to acquire the lock and then return to the caller before the timeout ends. While this is not bad (nothing breaks), it isn't nice either. A separate patch will prevent unnecessary notifications when shared locks are released. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Michael Hanselmann authored
While working on another SharedLock fix I realized timeouts on lock deletion don't work very well if the timeout actually expires. This patch fixes the issue and adds a new unittest. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Andrea Spadaccini authored
Signed-off-by:
Andrea Spadaccini <spadaccio@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com> (cherry picked from commit f8326fca)
-
Michael Hanselmann authored
This reverts commit 0fa753ba. Turns out there are more queries acquiring locks than we'd like. This patch goes to version 2.6 and a separate patch fixes the immediate issues in LUClusterVerifyConfig. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
Michael Hanselmann authored
Instead of acquiring the BGL in exclusive mode (which blocks all other operations), we acquire all locks for groups, nodes and instances in shared mode before verifying the configuration. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
- Mar 21, 2012
-
-
Guido Trotter authored
This fixes issue 222. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Mar 20, 2012
-
-
Michael Hanselmann authored
Short description: This fixes an issue whereby masterd would become unresponsive on the LUXI socket, leading to client timeouts. While made worse in 2.5, the underlying issue was already present in 2.4. Longer description: Until now all LUXI queries would acquire the BGL (big Ganeti lock) in shared mode. With the exception of OpNodeAdd and OpNodeRemove, this was also the case for all opcodes before version 2.5. In 2.5 we split OpClusterVerify into multiple opcodes, one of which (OpClusterVerifyConfig) now acquires the BGL in exclusive mode. Whether or not doing so is good is a separate discussion: OpNodeAdd and OpNodeRemove, as of this writing, still require an exclusive BGL. OpClusterVerifyConfig is run more often than OpNodeAdd or OpNodeRemove in normal clusters, which is why we only recognized this issue in 2.5. What would happen is that once OpClusterVerifyConfig tried to acquire its exclusive BGL while it was actually held by other opcodes (e.g. OpInstanceReplaceDisks), the locking code would not grant shared acquires for the BGL, even when the exclusive acquire is removed from the queue for a short amount of time after a timeout. This is necessary to prevent lock starvation. In this situation further LUXI queries requiring the BGL in shared mode, e.g. OpClusterQuery, would block and the client eventually time out. Over time they fill the client request workerpool's queue and at that point even requests not requiring the BGL stop working. Once the long-running operation(s) holding the BGL in shared mode finished, OpClusterVerifyConfig gets it in exclusive mode and everything returns to normal. LUXI recovers very soon too. I'd like to thank Bernardo Dal Seno for his contribution to this bugfix. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Bernardo Dal Seno <bdalseno@google.com>
-
- Mar 19, 2012
-
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
If a specific list of groups has been requested, then the code used that, without transforming it to a (frozen)set first, which results in: unsupported operand type(s) for &: 'list' and 'frozenset' Trivial fix is to do that in the 'then' branch. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Feb 23, 2012
-
-
Michael Hanselmann authored
* stable-2.5: Fix Makefile.am compatibility with automake 1.11.2 Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 20, 2012
-
-
Iustin Pop authored
Automake 1.11.2 made the following change: * Long-standing bugs: - Automake now warns about more primary/directory invalid combinations, such as "doc_LIBRARIES" or "pkglib_PROGRAMS". Unfortunately, this breaks our Makefile.am (issue 216) exactly because we were relying on pkglib_SCRIPTS. This patch works around this by adding a new myexeclibdir variable (exec so that it is intalled at `install-exec` time, the same as the pkglibdir), and switches to that. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Feb 15, 2012
-
-
Iustin Pop authored
Sorry, forgot this in previous commit. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
The vgreduce command has changed behaviour from when we initially wrote the code (2.02.02 versus 2.02.66, 4 years delta): - if there are LVs which will be impacted, it requires --force - otherwise refuses to proceed, but it still returns exit code 0 We handle this by looking to see if it returns "Wrote out consistent volume group" (behaviour unchanged), or if it complains about "--force"; in the case it didn't complete, we retry the operation. We improve a bit the checking of "vgs", as it uses to fail silently and we didn't detect it. New tests for this function should test, I believe, all the expected variations; at the least we now have data files with the expected output. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 07, 2012
-
-
Iustin Pop authored
This is a partial cherry-pick from 7530364d on master: Currently, noded requires PUT, even though the semantics of the RPC calls do not match a PUT. We change the code accept both PUT and POST, with the intention to remove the PUT support in a later version. Additionally, we add a message to the HttpBadRequest exception to make clear the failure mode (not seeing any error message was what made me send this patch…). This was the only description-less use of this exception, by the way. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> (cherry picked from commit 7530364d) What was not cherry-picked is the rpc change (to switch to PUT). The reason I want to backport this to devel-2.5 is that when upgrading to 2.6, having noded accept both makes for an easier upgrade path. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 01, 2012
-
-
Michael Hanselmann authored
* stable-2.5: Fix type check for OpQuery.filter Fix explanation of gnt-node evacuate --primaries-only Makefile.am: fix permissions for Python scripts on install devel/upload: Fix permissions for installed directories Fix cluster verification issues on multi-group clusters Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Jan 31, 2012
-
-
Michael Hanselmann authored
Just using ht.TListOf as a type check doesn't work correctly. The function must be called with the expected item type. In this specific case TListOf was always called with the filter as a value, and the result of that call evaluated to truth. Since filters can be quite complex there's no check yet, and therefore just “TList” is used. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Jan 26, 2012
-
-
Iustin Pop authored
Furthermore, correct the --help display on evacuate. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Bernardo Dal Seno authored
Some Python scripts in /usr/lib/ganeti/ were getting the wrong permissions (their 'x' bit was cleared). This patch fixes that behavior. This patch renames the variable 'dist_tools_PYTHON' to 'python_scripts'. Some Python scripts were listed in the 'dist_tools_PYTHON' variable, but as said scripts have no .py extension in their names, Automake treated the scripts as data files, and hence no 'x' bit. Now the Python scripts are processed by the rules created for the 'dist_tools_SCRIPTS' variable, and such rules don't depend on file name extensions. Signed-off-by:
Bernardo Dal Seno <bdalseno@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit cc120286)
-
Bernardo Dal Seno authored
Permissions for the directories created during install depended on the umask of the user running the script. Now umask is reset inside the script to remove such dependency. Signed-off-by:
Bernardo Dal Seno <bdalseno@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> (cherry picked from commit 0f796800)
-
- Jan 25, 2012
-
-
Michael Hanselmann authored
This patch attempts to fix a number of issues with “gnt-cluster verify” in presence of multiple node groups and DRBD8 instances split over nodes in more than one group. - Look up instances in a group only by their primary node (otherwise split instances would be considered when verifying any of their node's groups) - When gathering additional nodes for LV checks, just compare instance's node's groups with the currently verified group instead of comparing against the primary node's group - Exclude nodes in other groups when calculating N+1 errors and checking logical volumes Not directly related, but a small error text is also clarified. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 20, 2012
-
-
Guido Trotter authored
* stable-2.5: Migrate: don't check for free memory on cleanup Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Guido Trotter authored
Cleanup just updates the config with the correct location of the instance, or informs of its down status, but never starts it. As such there's no point in checking for enough free memory. Actually this check could prevent a perfectly safe cleanup operation if a node is busy. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Jan 19, 2012
-
-
Michael Hanselmann authored
This reverts commit 232aab3f. We shouldn't change the parsing of command line options in the middle of the 2.5.x series. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
* stable-2.5: Bump version to 2.5.0~rc5, update NEWS Add UnescapeAndSplit unittest for multi-escapes Fix a bug in command line option parsing code Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
This would have caught the bug in the first place. Argh, hand-generated test cases! Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Nikos Skalkotos authored
Fix bug affecting command line options of "keyval" type. Although escaping commands with \ is supported, it is is not applied to the input recursively. Signed-off-by:
Nikos Skalkotos <skalkoto@grnet.gr> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Jan 18, 2012
-
-
Michael Hanselmann authored
Python's “optparse” module does option name prefix matching by default. Since this can lead to confusing behaviour, e.g. by specifying “--force” for a command which only has a “--force-multi” option, this patch disables the prefix matching. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Jan 09, 2012
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-