1. 12 Apr, 2012 1 commit
  2. 11 Apr, 2012 6 commits
  3. 30 Mar, 2012 1 commit
  4. 29 Mar, 2012 1 commit
    • Fix a bug concerning TCP port release · 3b3b1bca
      Dimitris Aragiorgis authored
      Commit f396ad8c returns the TCP port used by DRBD disk back to the
      TCP/UDP port pool using AddTcpUdpPort().
      However, AddTcpUdpPort() writes the config on every invocation,
      using _WriteConfig(). This causes two problems:
       * it causes critical errors logged by VerifyConfig(), after the DRBD
         disk removal, and until the actual instance removal.
       * if the code following AddTcpUdpPort() fails, the port has already
         been returned to the pool, which leaves duplicate port entries
         (inconsistent config).
      AddTcpUdpPort() is invoked in three cases:
       * during InstanceRemove() through _RemoveDisks().
       * during InstanceSetParams() in case of disk removal.
       * during InstanceSetParams() through _ConvertDrbdToPlain().
      This commit fixes the problem by removing the _WriteConfig() call from
      AddTcpUdpPort(), delegating it to Update() via the
      TemporaryReservationManager, and ensuring AddTcpUdpPort() precedes
      Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
      [iustin@google.com: small comment adjustments]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
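The idea of deferring the config write can be illustrated with a minimal sketch. This is not Ganeti's code: the class `ConfigSketch`, its attributes, and the lower-case method names are hypothetical stand-ins, and the real change routes the release through the TemporaryReservationManager rather than a plain set.

```python
class ConfigSketch:
    """Toy stand-in for a cluster config with a pool of free ports.

    Hypothetical illustration only; not the Ganeti ConfigWriter API.
    """

    def __init__(self):
        self.free_ports = set()     # ports committed to the pool
        self.pending_ports = set()  # released, but not yet committed
        self.writes = 0             # counts simulated config writes

    def add_tcp_udp_port(self, port):
        # Only record the release; the config is NOT written here.
        self.pending_ports.add(port)

    def update(self):
        # Commit all pending releases and write the config once.
        self.free_ports |= self.pending_ports
        self.pending_ports.clear()
        self.writes += 1


cfg = ConfigSketch()
cfg.add_tcp_udp_port(11000)
# Nothing has been written yet, so a failure at this point would not
# leave the port visible in the pool.
assert cfg.writes == 0 and 11000 not in cfg.free_ports
cfg.update()
assert cfg.writes == 1 and 11000 in cfg.free_ports
```

In this model the port only becomes visible in the pool once the single write in `update()` happens, which is why the release has to come before the update call.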
  5. 28 Mar, 2012 4 commits
  6. 23 Mar, 2012 5 commits
  7. 22 Mar, 2012 10 commits
  8. 21 Mar, 2012 1 commit
  9. 20 Mar, 2012 1 commit
    • Stop acquiring BGL for LUXI queries · 0fa753ba
      Michael Hanselmann authored
      Short description: This fixes an issue whereby masterd would become
      unresponsive on the LUXI socket, leading to client timeouts. While made
      worse in 2.5, the underlying issue was already present in 2.4.
      Longer description: Until now all LUXI queries would acquire the BGL
      (big Ganeti lock) in shared mode. With the exception of OpNodeAdd and
      OpNodeRemove, this was also the case for all opcodes before version 2.5.
      In 2.5 we split OpClusterVerify into multiple opcodes, one of which
      (OpClusterVerifyConfig) now acquires the BGL in exclusive mode. Whether
      or not doing so is good is a separate discussion: OpNodeAdd and
      OpNodeRemove, as of this writing, still require an exclusive BGL.
      OpClusterVerifyConfig is run more often than OpNodeAdd or OpNodeRemove
      in normal clusters, which is why we only recognized this issue in 2.5.
      What would happen is that once OpClusterVerifyConfig tried to acquire
      its exclusive BGL while it was actually held by other opcodes (e.g.
      OpInstanceReplaceDisks), the locking code would not grant shared
      acquires for the BGL, even when the exclusive acquire is removed from
      the queue for a short amount of time after a timeout. This is necessary
      to prevent lock starvation.
      In this situation further LUXI queries requiring the BGL in shared mode,
      e.g. OpClusterQuery, would block and the client would eventually time out.
      Over time they fill the client request workerpool's queue and at that
      point even requests not requiring the BGL stop working. Once the
      long-running operation(s) holding the BGL in shared mode finished,
      OpClusterVerifyConfig gets it in exclusive mode and everything returns
      to normal. LUXI recovers very soon too.
      I'd like to thank Bernardo Dal Seno for his contribution to this bugfix.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
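The blocking behaviour described in this commit message can be modelled with a small, single-threaded sketch. It assumes a simplified shared/exclusive lock in which a waiting exclusive request blocks new shared acquires; the class `SharedLockModel` and its methods are hypothetical and do not reflect Ganeti's locking module, which also handles timeouts and queueing.

```python
class SharedLockModel:
    """Simplified shared/exclusive lock with anti-starvation behaviour.

    Hypothetical model: acquire methods return True/False instead of
    blocking, so the sequence of events can be followed step by step.
    """

    def __init__(self):
        self.shared_holders = 0
        self.exclusive_held = False
        self.exclusive_waiting = False

    def acquire_shared(self):
        # New shared acquires are refused while an exclusive request is
        # held or waiting; this is what prevents starvation.
        if self.exclusive_held or self.exclusive_waiting:
            return False
        self.shared_holders += 1
        return True

    def release_shared(self):
        self.shared_holders -= 1

    def acquire_exclusive(self):
        if self.shared_holders == 0 and not self.exclusive_held:
            self.exclusive_held = True
            self.exclusive_waiting = False
            return True
        self.exclusive_waiting = True
        return False

    def release_exclusive(self):
        self.exclusive_held = False


bgl = SharedLockModel()
assert bgl.acquire_shared()         # long-running opcode holds the lock shared
assert not bgl.acquire_exclusive()  # config verification has to wait
assert not bgl.acquire_shared()     # a query needing the lock is now blocked
bgl.release_shared()                # long-running opcode finishes
assert bgl.acquire_exclusive()      # verification proceeds
bgl.release_exclusive()
assert bgl.acquire_shared()         # shared acquires work again
```

In this model a query that never asks for the lock is unaffected by the waiting exclusive request, which is the effect the commit achieves for LUXI queries.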
  10. 19 Mar, 2012 2 commits
  11. 23 Feb, 2012 1 commit
  12. 20 Feb, 2012 1 commit
    • Fix Makefile.am compatibility with automake 1.11.2 · b8fe7ca6
      Iustin Pop authored
      Automake 1.11.2 made the following change:
      * Long-standing bugs:
        - Automake now warns about more primary/directory invalid combinations,
          such as "doc_LIBRARIES" or "pkglib_PROGRAMS".
      Unfortunately, this breaks our Makefile.am (issue 216) exactly because
      we were relying on pkglib_SCRIPTS.
      This patch works around this by adding a new myexeclibdir variable
      (exec so that it is installed at `install-exec` time, the same as the
      pkglibdir), and switches to that.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  13. 15 Feb, 2012 2 commits
    • Reconcile Makefile.am and test data files · 1a1e7ab3
      Iustin Pop authored
      Sorry, forgot this in previous commit.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Workaround changed LVM behaviour · 048eeb2b
      Iustin Pop authored
      The vgreduce command has changed behaviour from when we initially
      wrote the code (2.02.02 versus 2.02.66, 4 years delta):
      - if there are LVs which will be impacted, it requires --force
      - otherwise refuses to proceed, but it still returns exit code 0
      We handle this by looking to see if it returns "Wrote out consistent
      volume group" (behaviour unchanged), or if it complains about
      "--force"; in the case it didn't complete, we retry the operation.
      We also slightly improve the checking of "vgs", as it used to fail
      silently and we didn't detect it.
      New tests for this function should test, I believe, all the expected
      variations; at the least we now have data files with the expected
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
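The output check described above might look roughly like the following sketch. The function name `classify_vgreduce_output` and the three result labels are hypothetical. Only the two matched substrings come from the commit message, and the sample strings in the usage lines are made up for illustration rather than taken from real vgreduce output.

```python
def classify_vgreduce_output(output):
    """Classify vgreduce output by text, since the exit code can be 0
    even when the command refused to proceed.

    Hypothetical helper; returns "done", "retry" or "unknown".
    """
    if "Wrote out consistent volume group" in output:
        # Success message, unchanged between the LVM versions mentioned.
        return "done"
    if "--force" in output:
        # The command complained about --force and did not complete,
        # so the caller should retry the operation.
        return "retry"
    return "unknown"


# Made-up sample strings, used only to exercise the three branches.
assert classify_vgreduce_output("Wrote out consistent volume group vg0") == "done"
assert classify_vgreduce_output("refusing to proceed without --force") == "retry"
assert classify_vgreduce_output("") == "unknown"
```

Matching on text is fragile across LVM versions, which is presumably why the commit also adds data files covering the expected variations.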
  14. 07 Feb, 2012 1 commit
    • Accept both PUT and POST in noded · 5d0566de
      Iustin Pop authored
      This is a partial cherry-pick from
       on master:
      Currently, noded requires PUT, even though the semantics of the RPC
      calls do not match a PUT. We change the code to accept both PUT and POST,
      with the intention to remove the PUT support in a later version.
      Additionally, we add a message to the HttpBadRequest exception to make
      clear the failure mode (not seeing any error message was what made me
      send this patch…). This was the only description-less use of this
      exception, by the way.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
      (cherry picked from commit 7530364d
      What was not cherry-picked is the rpc change (to switch to PUT). The
      reason I want to backport this to devel-2.5 is that when upgrading to
      2.6, having noded accept both makes for an easier upgrade path.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
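A method check that accepts both verbs and reports a clear error could be sketched as follows. The exception class here is a local stand-in that only shares its name with the one mentioned in the commit message; `ALLOWED_METHODS`, `check_request_method`, and the message wording are assumptions, not noded's actual code.

```python
class HttpBadRequest(Exception):
    """Local stand-in for the exception named in the commit message."""


# Hypothetical constant: both verbs are accepted during the transition.
ALLOWED_METHODS = frozenset(["PUT", "POST"])


def check_request_method(method):
    """Return the normalised method, or raise with an explanatory message."""
    normalised = method.upper()
    if normalised not in ALLOWED_METHODS:
        # Always attach a description, so the failure mode is visible
        # to whoever reads the error.
        raise HttpBadRequest(
            "Only PUT and POST are supported, got %r" % method)
    return normalised


assert check_request_method("POST") == "POST"
assert check_request_method("put") == "PUT"
try:
    check_request_method("GET")
except HttpBadRequest as err:
    assert "GET" in str(err)
```

Accepting both verbs on the server side first means clients can switch later without both sides having to change at the same time.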
  15. 01 Feb, 2012 1 commit
  16. 31 Jan, 2012 1 commit
  17. 26 Jan, 2012 1 commit