1. 24 Jul, 2012 2 commits
    • Iustin Pop's avatar
      Fix boot=on flag for CDROMs · 24be50e0
      Iustin Pop authored
      This generalises commit 4304964a
      
       to cdroms too, since they have
      somewhat the same logic. We just abstract the needs_boot_flag into a
      separate variable, and then reuse it in the cdrom section.
      
      Note that the logic of what 'if=' type to pass to KVM was very
      convoluted, and (I think) incorrect; I went and cleaned it to be more
      consistent.
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarGuido Trotter <ultrotter@google.com>
      24be50e0
    • Iustin Pop's avatar
      KVM: only pass boot flag once · 2b846304
      Iustin Pop authored
      This addresses issue 230: passing two methods of booting to KVM can,
      depending on the KVM version, confuse it.
      
      Note that commit 4304964a
      
       introduced a partial fix for this (but only
      for disks, and keyed on KVM versions). However, it didn't fix cdrom
      booting, which still fails with the same error, so let's fix it more
      generically; we still leave the per-disk check since that is about
      -boot c versus -drive …,boot=on rather than two boot methods.
      
      Patch is based on the one submitted by Vladimir Mencl, many thanks!
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarGuido Trotter <ultrotter@google.com>
      2b846304
  2. 11 Jun, 2012 3 commits
  3. 06 Jun, 2012 1 commit
    • Iustin Pop's avatar
      Fix parallel build failures · a13d6911
      Iustin Pop authored
      
      
      This is the 2.5 version of the "fix build failures":
      
      - man/%.gen could be left over even in case of failure, due to
        automake bug
      - make man/%.gen runs RUN_IN_TEMPDIR, so let's depend on it, since
        that target has the proper dependencies (create needed dirs)
      - man/%.gen depends on a number of built sources, but the dependency
        was not declared
      
      Furthermore, wraps a long comment.
      
      Tested with -j4/-j16, after `make maintainer-clean'.
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarMichael Hanselmann <hansmi@google.com>
      a13d6911
  4. 05 Jun, 2012 1 commit
  5. 24 May, 2012 1 commit
  6. 11 May, 2012 10 commits
  7. 09 May, 2012 4 commits
    • Iustin Pop's avatar
      Add a default PATH variable to OS scripts env · 9a6ade06
      Iustin Pop authored
      In commit 896a03f6
      
       I cleaned up the environment for OS scripts,
      however I think that was a bit too extreme - it breaks our own
      instance-debootstrap hooks, because for example dpkg (called from the
      grub script) requires PATH to be set.
      
      Instead of requiring every OS to define a path, let's set a default
      PATH for the OS scripts, which should cover most common uses. A more
      specialised PATH can be set, if needed, in the OS scripts.
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarMichael Hanselmann <hansmi@google.com>
      9a6ade06
    • Andrea Spadaccini's avatar
      Move hooks PATH environment variable to constants · aa7b59ac
      Andrea Spadaccini authored
      
      
      Move the contents of the PATH environment variable for hooks to
      constants, and use its value in the code and in the hooks documentation.
      Signed-off-by: default avatarAndrea Spadaccini <spadaccio@google.com>
      Reviewed-by: default avatarGuido Trotter <ultrotter@google.com>
      (cherry picked from commit fe5ca2bb
      
      )
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarMichael Hanselmann <hansmi@google.com>
      aa7b59ac
    • Iustin Pop's avatar
      Add note to the install doc about bridge MAC issues · 12f9d75e
      Iustin Pop authored
      
      
      Thanks to Faidon Liambotis for explaining this on the external IRC
      channel.
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarFaidon Liambotis <paravoid@gmail.com>
      Reviewed-by: default avatarGuido Trotter <ultrotter@google.com>
      12f9d75e
    • Iustin Pop's avatar
      Fix exception re-raising in Python Luxi clients · 98dfcaff
      Iustin Pop authored
      Commit e687ec01
      
       (present in 2.5 since the 2.5 beta 3) did consistency
      fixes across the code-base. Unfortunately this was done without enough
      checks on the actual meaning of one of the fixes, which means error
      re-raising in lib/errors.py is broken.
      
      The problem is that:
      
        raise cls, args
      
      is different than:
      
        raise cls(args)
      
      And our unit-tests didn't catch this (this patch updates the tests).
      
      This breakage is usually trivial, like wrong error messages:
      
        $ gnt-instance remove no-such-instance
        Failure: prerequisites not met for this operation:
        ("Instance 'no-such-instance' not known", 'unknown_entity')
      
      versus:
      
        $ gnt-instance remove no-such-instance
        Failure: prerequisites not met for this operation:
        error type: unknown_entity, error details:
        Instance 'no-such-instance' not known
      
      or:
      
        $ gnt-instance add … no-such-instance
        Failure: prerequisites not met for this operation:
        ('The given name (no-such-instance) does not resolve: Name or service not known', 'resolver_error')
      
      versus:
      
        $ gnt-instance add … no-such-instance
        Failure: prerequisites not met for this operation:
        error type: resolver_error, error details:
        The given name (no-such-instance) does not resolve: Name or service not known
      
      But in some cases where we rely on a certain data representation
      (e.g. HooksAbort), this actually breaks because we try to iterate over
      the wrong type:
      
        File "/usr/lib/python2.6/dist-packages/ganeti/cli.py", line 1907, in FormatError
           for node, script, out in err.args[0]:
        ValueError: need more than 1 value to unpack
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarMichael Hanselmann <hansmi@google.com>
      98dfcaff
  8. 07 May, 2012 1 commit
  9. 11 Apr, 2012 6 commits
  10. 30 Mar, 2012 1 commit
  11. 29 Mar, 2012 1 commit
    • Dimitris Aragiorgis's avatar
      Fix a bug concerning TCP port release · 3b3b1bca
      Dimitris Aragiorgis authored
      Commit f396ad8c
      
       returns the TCP port used by DRBD disk back to the
      TCP/UDP port pool using AddTcpUdpPort().
      
      However, AddTcpUdpPort() writes the config on every invocation,
      using _WriteConfig(). This causes two problems:
      
       * it causes critical errors logged by VerifyConfig(), after the DRBD
         disk removal, and until the actual instance removal.
       * if the code following AddTcpUdpPort() fails, the port is already
         returned back the pool, which causes the port to have duplicates
         (inconsistent config).
      
      AddTcpUdpPort() is invoked in three cases:
      
       * during InstanceRemove() through _RemoveDisks().
       * during InstanceSetParams() in case of disk removal.
       * during InstanceSetParams() through _ConvertDrbdToPlain().
      
      This commit fixes the problem by removing the _WriteConfig() call from
      AddTcpUdpPort(), delegate it to Update() via the
      TemporaryReservationManager and ensure AddTcpUdpPort() precedes
      Update().
      Signed-off-by: default avatarDimitris Aragiorgis <dimara@grnet.gr>
      [iustin@google.com: small comments adjustements]
      Signed-off-by: default avatarIustin Pop <iustin@google.com>
      Reviewed-by: default avatarIustin Pop <iustin@google.com>
      3b3b1bca
  12. 28 Mar, 2012 1 commit
  13. 23 Mar, 2012 2 commits
  14. 22 Mar, 2012 3 commits
  15. 21 Mar, 2012 1 commit
  16. 20 Mar, 2012 1 commit
    • Michael Hanselmann's avatar
      Stop acquiring BGL for LUXI queries · 0fa753ba
      Michael Hanselmann authored
      
      
      Short description: This fixes an issue whereby masterd would become
      unresponsive on the LUXI socket, leading to client timeouts. While made
      worse in 2.5, the underlying issue was already present in 2.4.
      
      Longer description: Until now all LUXI queries would acquire the BGL
      (big Ganeti lock) in shared mode. With the exception of OpNodeAdd and
      OpNodeRemove, this was also the case for all opcodes before version 2.5.
      In 2.5 we split OpClusterVerify into multiple opcodes, one of which
      (OpClusterVerifyConfig) now acquires the BGL in exclusive mode. Whether
      or not doing so is good is a separate discussion: OpNodeAdd and
      OpNodeRemove, as of this writing, still require an exclusive BGL.
      OpClusterVerifyConfig is run more often than OpNodeAdd or OpNodeRemove
      in normal clusters, which is why we only recognized this issue in 2.5.
      
      What would happen is that once OpClusterVerifyConfig tried to acquire
      its exclusive BGL while it was actually held by other opcodes (e.g.
      OpInstanceReplaceDisks), the locking code would not grant shared
      acquires for the BGL, even when the exclusive acquire is removed from
      the queue for a short amount of time after a timeout. This is necessary
      to prevent lock starvation.
      
      In this situation further LUXI queries requiring the BGL in shared mode,
      e.g. OpClusterQuery, would block and the client eventually time out.
      Over time they fill the client request workerpool's queue and at that
      point even requests not requiring the BGL stop working. Once the
      long-running operation(s) holding the BGL in shared mode finished,
      OpClusterVerifyConfig gets it in exclusive mode and everything returns
      to normal. LUXI recovers very soon too.
      
      I'd like to thank Bernardo Dal Seno for his contribution to this bugfix.
      Signed-off-by: default avatarMichael Hanselmann <hansmi@google.com>
      Reviewed-by: default avatarBernardo Dal Seno <bdalseno@google.com>
      0fa753ba
  17. 19 Mar, 2012 1 commit