1. 03 Dec, 2008 1 commit
  2. 02 Dec, 2008 4 commits
    • Convert rpc results to a custom type · 781de953
      Iustin Pop authored
      For a long time we had the problem that both RPC-layer errors and
      results from the remote node share the same "valuespace". This is
      because we shouldn't raise an exception when only one node failed
      (and lose the results from the other nodes).
      
      This patch attempts to address this problem by returning a special
      object from RPC calls, which separates the rpc-layer status and the
      remote results into different attributes.
      
      All the users of rpc (mainly cmdlib, but also bootstrap and the
      HooksMaster in mcpu) have been converted to this new model. The code has
      changed from, e.g. for boolean return types:
      
        if not self.rpc.call_...
      
      to
      
        result = self.rpc.call_
        if result.failed or not result.data:
           ^ rpc-layer error    |
                                - result payload
      
      While this is slightly more complicated, it will allow cleaner checks in
      the future; right now the code is just a plain port, without
      optimizations.
      
      There's also a "result.Raise()" which raises an OpExecError if the
      rpc-layer had errors.
      
      One side-effect of the patch is that now all return types from the
      rpc.call_ functions are of either RpcResult (single-node) or dicts of
      (node name, RpcResult); previously, some functions were returning
      different object types based on error status.
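
As an illustration of the model described above, here is a minimal sketch of a result object that keeps the rpc-layer status and the remote payload in separate attributes. Only `failed`, `data`, `Raise()` and `OpExecError` come from the commit message; the constructor arguments, the `check_all` helper and the node names are assumptions made for this example, not the actual Ganeti code.

```python
class OpExecError(Exception):
    """Stand-in for the OpExecError named in the commit message."""


class RpcResult:
    """Separates the rpc-layer status from the payload of the remote node."""

    def __init__(self, data=None, failed=False, node=None, call=None):
        self.data = data      # payload returned by the remote node
        self.failed = failed  # True when the rpc layer itself failed
        self.node = node      # assumed field: name of the node that was called
        self.call = call      # assumed field: name of the rpc call

    def Raise(self):
        """Raise OpExecError if the rpc layer reported an error."""
        if self.failed:
            raise OpExecError("Call '%s' to node '%s' has failed"
                              % (self.call, self.node))


def check_all(results):
    """Hypothetical helper: names of nodes that failed or returned a false payload."""
    return sorted(name for name, res in results.items()
                  if res.failed or not res.data)


# Multi-node calls return a dict of (node name, RpcResult).
results = {
    "node1": RpcResult(data=True, node="node1", call="example"),
    "node2": RpcResult(failed=True, node="node2", call="example"),
    "node3": RpcResult(data=False, node="node3", call="example"),
}
bad_nodes = check_all(results)
```

Because every node gets its own result object, a failure on one node does not discard the payloads from the others.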
      
      The code passes burnin (after many retries :).
      
      Reviewed-by: imsnah
    • Use the new utils.CheckBEParams function · d4b72030
      Guido Trotter authored
      Where we used/forgot to validate beparams we now use the new common function.
      
      Reviewed-by: imsnah
    • Fix master failover · bbe19c17
      Iustin Pop authored
      The ssconf files were not updated by the master failover. We need to
      push them, and since we already have RPC initialized, we can use the
      standard ConfigWriter to do so - this will take care of both the config
      file and the ssconf files.
      
      Reviewed-by: imsnah
    • Prevent master failover to a non candidate node · 8135a2db
      Iustin Pop authored
      Reviewed-by: imsnah
  3. 01 Dec, 2008 1 commit
  4. 27 Nov, 2008 1 commit
    • Improve the node add operation · 87622829
      Iustin Pop authored
      Currently, the node add operation uses a job to query the node name and
      the bootstrap function directly reads the config file for the cluster
      name.
      
      This patch changes this so that both the cluster name lookup and the
      verification of the node are done via queries to the master.
      
      Reviewed-by: ultrotter
  5. 21 Nov, 2008 1 commit
  6. 12 Nov, 2008 2 commits
  7. 20 Oct, 2008 2 commits
    • Set default hypervisor at cluster init · 02691904
      Alexander Schreiber authored
      During cluster init, set the default hypervisor to be used for instances.
      Ensure that the default hypervisor belongs to the set of enabled hypervisors
      for this cluster. Also fix a small bug with setting the default enabled
      hypervisor list.
      
      Reviewed-by: imsnah
      
    • Remove --hypervisor-type from gnt-cluster. · 4342e89b
      Alexander Schreiber authored
      We no longer use a single, cluster-wide hypervisor, but configure the
      hypervisor to be used at the instance level.
      
      Reviewed-by: imsnah
      
  8. 18 Oct, 2008 1 commit
  9. 16 Oct, 2008 2 commits
    • Prevent master failover if we have wrong data · d5927e48
      Iustin Pop authored
      If we don't actually know the current master (as determined via voting),
      we prevent the failover.
      
      The patch also changes some messages (capitalization, typos).
      
      Reviewed-by: ultrotter
    • Improvements to the master startup checks · d7cdb55d
      Iustin Pop authored
      In order to account for future improvements to master failover, we move
      the actual data gathering capabilities from ganeti-masterd into
      bootstrap.py, and we leave only the verification into masterd.
      
      The verification procedure is then changed to retry multiple times (up
      to one minute) in case most nodes do not respond, and also the algorithm
      is changed to require at least half (but not half+1) votes, since our
      vote also should count (and we vote for ourselves).
      
      Example for consistent (config-wise) cluster:
        - 5 node cluster, 2 nodes down: still start
        - 4 node cluster, 2 nodes down: retry for one minute, abort
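
The two examples above can be reproduced with a small sketch of the "at least half" rule. This is one reading of the commit message, not the code in ganeti-masterd: the function names, the retry count and the delay are assumptions, and `other_votes` is taken to mean the number of other nodes that answered and agree that we are the master.

```python
import time


def has_enough_votes(total_nodes, other_votes):
    """True if the votes from other nodes plus our own form a majority.

    Requiring other_votes >= total_nodes // 2 is the same as requiring
    other_votes + 1 (our own vote) to be more than half of the cluster.
    """
    return other_votes >= total_nodes // 2


def wait_for_agreement(total_nodes, gather_votes, retries=6, delay=10.0,
                       sleep=time.sleep):
    """Retry the vote (here 6 tries, 10 seconds apart) before giving up."""
    for attempt in range(retries):
        if has_enough_votes(total_nodes, gather_votes()):
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False


# 5 node cluster, 2 nodes down: the 2 remaining peers vote for us.
five_nodes_ok = has_enough_votes(5, 2)
# 4 node cluster, 2 nodes down: only 1 peer can vote for us.
four_nodes_ok = has_enough_votes(4, 1)
```

With these inputs the first check passes and the second does not, which matches the "still start" and "retry for one minute, abort" outcomes listed in the commit message.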
      
      Reviewed-by: ultrotter
  10. 12 Oct, 2008 1 commit
    • Abstract checking own address into a function · caad16e2
      Iustin Pop authored
      Currently, we check if we have a given ip address (i.e. it's alive on
      one of our interfaces) by manually calling TcpPing(source=localhost).
      This works, but having it spread all over the code makes it hard to
      change the implementation.
      
      The patch abstracts this into a separate utils.OwnIpAddress(addr)
      function. We add an rpc call for it, which we use instead of the
      single use of call_node_tcp_ping. We leave node_tcp_ping in, as it seems
      useful; eventually it should be removed in a separate patch.
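
To show why a single helper makes the implementation easy to swap, here is a sketch of such a check. Note that it uses a different technique from the one in the commit message: instead of a TcpPing from localhost, it tries to bind a socket to the address, which normally succeeds only for addresses configured on a local interface. The function name and this approach are assumptions for illustration.

```python
import socket


def own_ip_address(address):
    """Best-effort check whether `address` is configured on this host.

    Binding to port 0 lets the kernel pick a free port, so the call only
    fails when the address itself is not local (or not valid). Wildcard
    addresses such as 0.0.0.0 also bind successfully, so callers should
    pass a concrete address.
    """
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((address, 0))
        return True
    except OSError:
        return False


loopback_is_ours = own_ip_address("127.0.0.1")
```

Callers only depend on the boolean result, so the body could be replaced by the TcpPing-based check without touching them.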
      
      Reviewed-by: imsnah
  11. 10 Oct, 2008 2 commits
    • Convert rpc module to RpcRunner · 72737a7f
      Iustin Pop authored
      This big patch changes the call model used in internode-rpc from
      standalone function calls in the rpc module to calls via an RpcRunner
      class that holds all the methods. This can be used in the future to enable
      smarter processing in the RPC layer itself (some quick examples are not
      setting the DiskID from cmdlib code, but only once in each rpc call,
      etc.).
      
      There are a few RPC calls that are made outside of the LU code, and
      these calls are left as staticmethods, so they can be used without a
      class instance (which requires a ConfigWriter instance).
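
The split between instance methods and staticmethods can be sketched as follows. The class name RpcRunner is from the commit message; the two call names, the dictionary standing in for a ConfigWriter, and `fake_transport` are hypothetical and exist only to make the example runnable.

```python
def fake_transport(node, procedure, args):
    """Hypothetical stand-in for the real inter-node transport."""
    return {"node": node, "procedure": procedure, "args": args}


class RpcRunner:
    """Holds the rpc calls as methods instead of module-level functions."""

    def __init__(self, cfg):
        # cfg stands in for a ConfigWriter instance
        self._cfg = cfg

    def call_instance_start(self, node, instance_name):
        """Instance call: the runner can add config data once, centrally."""
        disks = self._cfg["disks"][instance_name]
        return fake_transport(node, "instance_start", [instance_name, disks])

    @staticmethod
    def call_version(node):
        """Static call: usable without a runner instance or a config."""
        return fake_transport(node, "version", [])


runner = RpcRunner({"disks": {"inst1": ["disk0"]}})
started = runner.call_instance_start("node1", "inst1")
version = RpcRunner.call_version("node1")
```

Code that runs before a configuration exists can use the static call, while code inside the LUs goes through a runner instance and gets the config-aware behaviour.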
      
      Reviewed-by: imsnah
    • Small random fixes · 7b3a8fb5
      Iustin Pop authored
      Indentation in bootstrap was wrong and some names in cmdlib.py were not
      right.
      
      Reviewed-by: imsnah
  12. 08 Oct, 2008 2 commits
    • Sanitize the hypervisor names · 00cd937c
      Iustin Pop authored
      Since in 2.0 the user will possibly have more interaction with the
      hypervisor names, we sanitize them by removing the version numbers
      (the version can be a prerequisite for the ganeti installation, we
      shouldn't document it in variable names).
      
      Reviewed-by: schreiberal
    • Fix for gnt-cluster init. · 02f99608
      Oleksiy Mishchenko authored
      Reviewed-by: iustinp
  13. 06 Oct, 2008 1 commit
  14. 01 Oct, 2008 5 commits
  15. 28 Sep, 2008 1 commit
    • Move the pseudo-secret generation to utils.py · 33081d90
      Iustin Pop authored
      The bootstrap code needs a pseudo-secret and this is currently generated
      inside the InitGanetiServerSetup function. Since more users will need
      this, move it to utils.py
      
      Reviewed-by: ultrotter
  16. 15 Aug, 2008 1 commit
  17. 13 Aug, 2008 1 commit
    • Fix adding pristine nodes · 51144e33
      Michael Hanselmann authored
      If a node hasn't been part of the cluster before being added it'll not
      have the cluster's SSH key. This patch makes sure to accept those by
      not aliasing the machine name to the cluster name.
      
      Reviewed-by: ultrotter
  18. 30 Jul, 2008 4 commits
    • Fix cluster destroy · 140aa4a8
      Iustin Pop authored
      With the recent startup/shutdown changes (and with the master daemon in
      place), the cluster destroy needs some fixing.
      
      This patch moves the finalization of the destroy out from cmdlib into
      bootstrap, so we can nicely shutdown the rapi and master daemons.
      
      Reviewed-by: ultrotter
    • Fix cluster init · b3f1cf6f
      Iustin Pop authored
      With the recent changes, I forgot the extra parameter to this rpc call.
      Also the rpc call needs to be done after we setup the config data, for
      the master daemon to be able to start, so we move it after all other
      init steps.
      
      Reviewed-by: ultrotter
    • Fix some errors detected by pylint · 3b9e6a30
      Iustin Pop authored
      Reviewed-by: imsnah
    • Rework master startup/shutdown/failover · b1b6ea87
      Iustin Pop authored
      This (big) patch reworks the master startup/shutdown and fixes the
      master failover.
      
      What does the patch do?
      
      For master start/stop:
        - removes the old ganeti-master script and its associated man page
        - moves the ip start/stop directly into the backend.(Start|Stop)Master
        - adds start/stop of the master/rapi daemon into these functions,
          selectively based on the start/stop arguments
        - makes the master call via rpc StartMaster(start_daemons=False) to
          the local node so that the master IP is started
        - and finally changes the example init.d script to directly start and
          stop all three daemons, since they do the right thing (depending on
          master/not master role)
      
      For master failover:
        - moves the code from LUMasterFailover into bootstrap.MasterFailover,
          since we need to start/stop the master during this operation and
          thus it can't be executed from the master
        - removes the LUMasterFailover and its associated opcode
      
      Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not
      master' are not seen during startup on non-master nodes.
      
      Reviewed-by: ultrotter
  19. 27 Jun, 2008 1 commit
    • AddNode: move the initial setup to bootstrap · 827f753e
      Guido Trotter authored
      From the master node we can't start ssh and connect to the remote node,
      nor can we do it from ganeti-noded, as this ssh session will possibly ask
      for key confirmation and password. So the code to copy the ganeti-noded
      password and SSL key has been moved to bootstrap.py, and it's called by
      gnt-node before the AddNode opcode.
      
      Reviewed-by: iustinp
  20. 16 Jun, 2008 1 commit
    • Move SetKey to WritableSimpleStore and use it · 05f86716
      Guido Trotter authored
      Previously we could update SimpleStore by just calling SetKey; this
      feature has now moved to a separate class, which inherits from it. In this
      patch the new WritableSimpleStore class is also put to use, in the LUs that
      need it. Rather than making each LU instantiate it, we have a new LogicalUnit
      flag REQ_WSSTORE which defaults to False, but when declared to be True asks the
      LogicalUnit to be initialized with a writeable version of the SimpleStore.
      LUMasterFailover and LURenameCluster are then changed to use it.
      
      InitCluster is also changed to instantiate a WritableSimpleStore, rather
      than a normal one.
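
The relationship between the two store classes and the REQ_WSSTORE flag could look roughly like the sketch below. The class names, SetKey and REQ_WSSTORE are taken from the commit message; `GetKey`, the in-memory dictionary, the `make_lu` helper and the key name are assumptions added to keep the example self-contained.

```python
class SimpleStore:
    """Read-only view of the stored values (in memory for this sketch)."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def GetKey(self, key):  # assumed accessor name
        return self._data[key]


class WritableSimpleStore(SimpleStore):
    """Adds SetKey on top of the read-only store."""

    def SetKey(self, key, value):
        self._data[key] = value


class LogicalUnit:
    REQ_WSSTORE = False  # default: the LU only gets a read-only store

    def __init__(self, sstore):
        self.sstore = sstore


class LURenameCluster(LogicalUnit):
    REQ_WSSTORE = True  # this LU asks for the writable version


def make_lu(lu_class, data):
    """Hypothetical factory: pick the store class from the LU's flag."""
    store_class = WritableSimpleStore if lu_class.REQ_WSSTORE else SimpleStore
    return lu_class(store_class(data))


plain_lu = make_lu(LogicalUnit, {"cluster_name": "old"})
rename_lu = make_lu(LURenameCluster, {"cluster_name": "old"})
rename_lu.sstore.SetKey("cluster_name", "new")
```

An LU that does not declare the flag never receives an object with SetKey, so accidental writes fail early instead of silently changing the store.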
      
      Reviewed-by: imsnah
  21. 12 Jun, 2008 1 commit