  1. Jul 26, 2010
    • masterd: move the IP activation from Exec to Check · 340f4757
      Iustin Pop authored
      
      Currently, the master IP activation is done in the Exec function. Since
      the original masterd process returns after forking, and Exec is run in
      the (grand)child process, this means that after 'ganeti-masterd' has
      returned there are still initialization tasks running.
      
      Normally this is not a problem, but in cases where one does quick master
      failovers, this creates a race condition which hits the QA scripts
      especially hard.
      
      To solve this, and make the startup process cleaner (the system is in
      steady state after the command has returned, even though masterd startup
      could still fail), we move the IP activation to Check(). This also
      allows error messages about the IP activation to be seen on the console.
      
      With this patch enabled, I can no longer reproduce the double-failover
      errors, which were occurring before in 4/5 cases.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
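The ordering described in this commit can be sketched as follows. This is a minimal illustration, not Ganeti's daemon code: `run_daemon`, `check_fn`, `exec_fn` and the injectable `fork` parameter are hypothetical names, and a single fork stands in for the real daemonization. The point is only that work placed in the check step finishes (or fails visibly) before the command returns.

```python
import os


def run_daemon(check_fn, exec_fn, fork=os.fork):
    """Run check_fn in the foreground, then fork and run exec_fn in the child.

    Returns the child's PID in the parent process. All names here are
    hypothetical; real daemonization involves more steps than one fork.
    """
    # Foreground phase: everything done here is complete before the
    # command returns, and any exception reaches the caller's console.
    check_fn()

    pid = fork()
    if pid == 0:
        # Child: only the long-running daemon work is left for this phase.
        status = 1
        try:
            exec_fn()
            status = 0
        finally:
            os._exit(status)
    return pid
```

With this layout, a step such as IP activation placed in `check_fn` cannot race with whatever the caller does right after the command returns.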
    • Move the UsesRPC decorator from cli to rpc · e0e916fe
      Iustin Pop authored
      
      This is needed because not just the cli scripts need this decorator, but
      the master daemon too (and it already duplicated the code once).
      
      In cli.py we just leave a stub, so that we don't have to modify all the
      scripts to import rpc.py.
      
      We then change the master daemon code to reuse this decorator, instead
      of duplicating it.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
    • watcher: smarter handling of instance records · f5116c87
      Iustin Pop authored
      
      This patch implements a few changes to the instance handling. First, old
      instances which no longer exist on the cluster are removed from the
      state file, to keep things clean.
      
      Second, the instance restart counters are reset every 8 hours, since
      some error cases might be transient (e.g. networking issues, or a machine
      temporarily down). Without the reset, if such a problem outlasts 5
      restart attempts without being permanent, watcher would never restart
      the instance again. The value of 8 hours is, I think, both conservative
      (so as not to hammer the cluster too often with restarts) and fast
      enough to clear semi-transient problems.
      
      And last, if an instance is not restarted due to exhausted retries, a
      warning should be logged; otherwise it's hard to understand why watcher
      doesn't want to restart an ERROR_down instance.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
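The three changes above (pruning, periodic reset, warning on exhausted retries) could look roughly like the sketch below. The record layout, the function names and the constant names are assumptions made for illustration; only the limit of 5 restarts and the 8-hour interval come from the commit message, and the real watcher state file is not modelled.

```python
import logging
import time

MAX_RESTARTS = 5            # limit mentioned in the commit message
RESET_INTERVAL = 8 * 3600   # 8 hours, in seconds


def update_records(records, cluster_instances, now=None):
    """Prune vanished instances and reset stale restart counters.

    records maps instance name -> {"restarts": int, "since": float}
    (a hypothetical layout).
    """
    now = time.time() if now is None else now
    # Drop records for instances that no longer exist on the cluster.
    for name in list(records):
        if name not in cluster_instances:
            del records[name]
    # Reset counters whose window is older than the reset interval.
    for rec in records.values():
        if now - rec["since"] >= RESET_INTERVAL:
            rec["restarts"] = 0
            rec["since"] = now


def should_restart(records, name, now=None):
    """Return True and count the attempt, or warn and return False."""
    now = time.time() if now is None else now
    rec = records.setdefault(name, {"restarts": 0, "since": now})
    if rec["restarts"] >= MAX_RESTARTS:
        logging.warning("Not restarting %s: retries exhausted (%d)",
                        name, rec["restarts"])
        return False
    rec["restarts"] += 1
    return True
```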
  2. Jul 16, 2010
    • Convert RPC client to PycURL · 33231500
      Michael Hanselmann authored
      
      Instead of using our custom HTTP client, using PycURL's multi
      interface allows us to get rid of the HTTP client threadpool.
      The majority of the code is still in the ganeti.http.client
      module.
      
      A simple per-thread HTTP client pool gives cURL a chance to
      cache and retain as much information as possible (e.g. SSL certs).
      Unused HTTP clients (e.g. due to removed nodes) are deleted after
      25 requests going through the pool.
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
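One way to picture the per-thread pool with expiry of unused clients is the sketch below. It does not use PycURL and is not the code from ganeti.http.client: `ClientPool`, `get_thread_pool` and the `factory` callable are hypothetical, and "unused for 25 requests" is interpreted here as "not requested during the last 25 calls to the pool".

```python
import threading

MAX_IDLE_REQUESTS = 25  # figure taken from the commit message


class ClientPool:
    """Keeps one client per key and drops clients that go unused."""

    def __init__(self, factory):
        self._factory = factory   # hypothetical: builds a client for a key
        self._clients = {}        # key -> client object
        self._last_used = {}      # key -> request count when last used
        self._requests = 0

    def get(self, key):
        self._requests += 1
        client = self._clients.get(key)
        if client is None:
            client = self._clients[key] = self._factory(key)
        self._last_used[key] = self._requests
        self._expire()
        return client

    def _expire(self):
        # Remove clients not requested within the last MAX_IDLE_REQUESTS calls.
        for key in list(self._clients):
            if self._requests - self._last_used[key] >= MAX_IDLE_REQUESTS:
                del self._clients[key]
                del self._last_used[key]


_local = threading.local()


def get_thread_pool(factory):
    """Return the calling thread's pool, creating it on first use."""
    pool = getattr(_local, "pool", None)
    if pool is None:
        pool = _local.pool = ClientPool(factory)
    return pool
```

Reusing the same client object per key is what would let an underlying library keep connection-level state between requests.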
  3. Jul 12, 2010
    • Confd IPv6 support · d8bcfe21
      Manuel Franceschini authored
      
      This patch series basically adds a new parameter 'family' to the constructors
      of daemon.AsyncUDPSocket and confd.client.ConfdUDPClient. This enables the
      users of these two classes to support IPv6.
      
      In ganeti-confd.ConfdAsyncUDPClient a method to check the address families of
      all peers is added.
      
      Furthermore it adds unittests for the added functionality.
      
      Signed-off-by: Manuel Franceschini <livewire@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
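The idea of a `family` parameter plus a check that all peers share one address family can be illustrated with the standard `socket` module. The helper names below are hypothetical and are not the methods added to daemon.AsyncUDPSocket or the confd client; this is only a sketch of the technique.

```python
import socket


def address_family(addr):
    """Return AF_INET or AF_INET6 for a literal IP address string."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, addr)
            return family
        except OSError:
            continue
    raise ValueError("not a valid IP address: %r" % (addr,))


def peers_family(peers):
    """Return the single address family shared by all peers, or raise."""
    if not peers:
        raise ValueError("no peers given")
    families = {address_family(peer) for peer in peers}
    if len(families) != 1:
        raise ValueError("peers mix IPv4 and IPv6 addresses")
    return families.pop()


def make_udp_socket(family=socket.AF_INET):
    """Create a UDP socket for the given family (IPv4 by default)."""
    return socket.socket(family, socket.SOCK_DGRAM)
```

A caller would typically compute `peers_family(peers)` once and pass the result as `family` when creating the socket.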
  4. Jul 09, 2010
  5. Jul 07, 2010
  6. Jul 06, 2010
  7. Jul 02, 2010
  8. Jul 01, 2010
    • RAPI client: Switch to pycURL · 2a7c3583
      Michael Hanselmann authored
      
      Currently the RAPI client uses the urllib2 and httplib modules from
      Python's standard library. They're used with pyOpenSSL in a very fragile
      way, and there are known issues when receiving large responses from a RAPI
      server.
      
      By switching to PycURL we leverage the power and stability of the
      widely-used curl library (libcurl). This brings us much more flexibility
      than before, and timeouts were easily implemented (something that would
      have involved a lot of work with the built-in modules).
      
      There's one small drawback: Programs using libcurl have to call
      curl_global_init(3) (available as pycurl.global_init) while exactly one
      thread is running (e.g. before other threads are started) and are
      supposed to call curl_global_cleanup(3) (available as
      pycurl.global_cleanup) upon exiting. See the manpages for details. A
      decorator is provided to simplify this.
      
      Unittests for the new code are provided, increasing the test coverage of
      the RAPI client from 74% to 89%.
      
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
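A decorator of the kind mentioned above might be shaped like this sketch. It is not the decorator shipped with the RAPI client: `uses_global_state` is a hypothetical name, and the init and cleanup hooks are passed in as parameters so the example runs without PycURL installed.

```python
import functools


def uses_global_state(global_init, global_cleanup):
    """Build a decorator that brackets a program's entry point.

    global_init and global_cleanup are caller-supplied callables taking no
    arguments (hypothetical parameters for this sketch).
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Must happen while only one thread is running.
            global_init()
            try:
                return fn(*args, **kwargs)
            finally:
                # Runs on normal exit and on exceptions alike.
                global_cleanup()
        return wrapper
    return decorator
```

With PycURL, the hooks could be `lambda: pycurl.global_init(pycurl.GLOBAL_ALL)` and `pycurl.global_cleanup`, applied to the program's `main` function so that initialization happens before any other thread is started.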
  9. Jun 30, 2010
  10. Jun 29, 2010
  11. Jun 23, 2010
  12. Jun 14, 2010
  13. Jun 11, 2010
  14. Jun 10, 2010
  15. Jun 08, 2010
  16. Jun 04, 2010
  17. Jun 03, 2010
  18. May 28, 2010
  19. May 25, 2010
  20. May 21, 2010
  21. May 18, 2010
  22. May 17, 2010
  23. May 14, 2010
    • ganeti-noded: add the --no-mlock option · bebf68d3
      Guido Trotter authored
      
      While mlock on noded is definitely good in most situations, there are
      some - namely my laptop - where it has no benefit, and uses precious
      non-swappable memory. To avoid this we make it optional, with a new
      --no-mlock option. Note that only the main node daemon and its http
      children are affected: the powercycle node child still uses mlock, which
      does no harm, since it is a short-lived process that runs just before
      the node reboots anyway. The manpage is updated.
      
      Signed-off-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Luca Bigliardi <shammash@google.com>
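An opt-out flag of this kind can be sketched with `argparse`. Only the `--no-mlock` option name comes from the commit; `build_parser`, `setup_memory`, the `mlock` destination and the injected `lock_fn` are assumptions for illustration, and the actual memory-locking call is left to the caller.

```python
import argparse


def build_parser():
    """Parser with locking on by default and a flag to turn it off."""
    parser = argparse.ArgumentParser(prog="noded-sketch")
    parser.add_argument(
        "--no-mlock", dest="mlock", action="store_false", default=True,
        help="do not lock the daemon's memory into RAM")
    return parser


def setup_memory(opts, lock_fn):
    """Call lock_fn unless --no-mlock was given; report whether it ran."""
    if opts.mlock:
        lock_fn()
        return True
    return False
```

Storing the option as a positive `mlock` boolean keeps the default behaviour (locking enabled) explicit and avoids double negatives in the code that consumes it.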