1. 04 Dec, 2008 2 commits
  2. 02 Dec, 2008 1 commit
    • Iustin Pop's avatar
      Fix master failover · bbe19c17
      Iustin Pop authored
      The ssconf files were not updated by the master failover. We need to
      push them, and since we already have RPC initialized, we can use the
      standard ConfigWriter to do so - this will take care of both the config
      file and the ssconf files.
      
      Reviewed-by: imsnah
      bbe19c17
  3. 26 Nov, 2008 2 commits
  4. 25 Nov, 2008 3 commits
    • Guido Trotter's avatar
      Move the MASTER_SOCKET to SOCKET_DIR · 227647ac
      Guido Trotter authored
      Before it was in the abstract linux namespace, where unfortunately we
      couldn't easily check from python the credentials of the connecting
      clients. Now we also have to remove the file on exit and when starting.
      
      Reviewed-by: imsnah
      227647ac
    • Guido Trotter's avatar
      ganeti-masterd: create SOCKET_DIR · d823660a
      Guido Trotter authored
      If SOCKET_DIR doesn't exist we create it in the master daemon, before
      trying to put a socket inside it.
      
      Reviewed-by: imsnah
      d823660a
    • Michael Hanselmann's avatar
      Pass ssconf values from master to node · 03d1dba2
      Michael Hanselmann authored
      Instead of parsing the configuration on the node, we pass the ssconf
      values from the master.
      
      Reviewed-by: iustinp
      03d1dba2
  5. 21 Nov, 2008 5 commits
    • Michael Hanselmann's avatar
      Use SSL for master/node RPC · eafd8762
      Michael Hanselmann authored
      This patch enables SSL between masterd and noded.
      
      Reviewed-by: iustinp
      eafd8762
    • Michael Hanselmann's avatar
      Get rid of node daemon password · ec17d09c
      Michael Hanselmann authored
      With the new SSL client certificate stuff it's no longer needed.
      
      Reviewed-by: iustinp
      ec17d09c
    • Michael Hanselmann's avatar
      ganeti-masterd: Remove PID file at the end · 15486fa7
      Michael Hanselmann authored
      Removing the PID file should be the last thing done. This patch makes
      sure it's also removed when master.server_cleanup() throws an exception.
      
      Also initialize logging only after writing the PID file.
      
      Reviewed-by: iustinp
      15486fa7
    • Michael Hanselmann's avatar
      Reuse HTTP client pool for RPC · 4331f6cd
      Michael Hanselmann authored
      ganeti-masterd: Add initialization and shutdown of RPC pool. It needs
      to be shutdown before forking.
      
      ganeti.cli: Add decorator function to initialize and shutdown RPC pool.
      
      ganeti.rpc: Add functions to initialize and shutdown RPC pool. Throw
      exception when used without proper initialization.
      
      gnt-cluster, gnt-node: Use decorator function to initialize and shutdown
      RPC pool.
      
      Reviewed-by: iustinp
      4331f6cd
    • Michael Hanselmann's avatar
      Add RPC call to update ssconf files · 6ddc95ec
      Michael Hanselmann authored
      Reviewed-by: iustinp
      6ddc95ec
  6. 11 Nov, 2008 1 commit
    • Iustin Pop's avatar
      Abstract runtime creation of dirs into a function · 8adbffaa
      Iustin Pop authored
      Currently the dir creation in ganeti-noded is in the main function. This
      is not nice: we move it into a separate function and also add creation
      of the OS_LOG_DIR (with different permissions, but in the same way).
      This will permit cleanup of the creation of the OS_LOG_DIR from the
      backend module (it's done multiple places currently).
      
      Reviewed-by: imsnah
      8adbffaa
  7. 24 Oct, 2008 1 commit
  8. 23 Oct, 2008 1 commit
    • Iustin Pop's avatar
      Export the disk index in the import/export scripts · 74c47259
      Iustin Pop authored
      We want to export the disk index as some OSes will only want to export
      the first disk (or the second one, etc.), even if we have multiple
      disks.
      
      The patch also updates the backend.ExportSnapshot docstring.
      
      Reviewed-by: ultrotter
      74c47259
  9. 22 Oct, 2008 1 commit
    • Guido Trotter's avatar
      Convert ImportOSIntoInstance to OS API 10 · 6c0af70e
      Guido Trotter authored
      - Change ImportOSIntoInstance not to get any "os_disk" and "swap_disk"
        arguments but to accept multiple target images to import, and to
        return a list of booleans with the result of each import
      - Change the relevant rpc call and the only caller to conform
      - Pass arguments to the import script through the environment
      - Run one import os script for each disk image, passing an IMPORT_DEVICE
      
      Reviewed-by: iustinp
      6c0af70e
  10. 21 Oct, 2008 1 commit
  11. 20 Oct, 2008 2 commits
    • Iustin Pop's avatar
      Convert the job queue rpcs to address-based · 99aabbed
      Iustin Pop authored
      The two main multi-node job queue RPC calls (jobqueue_update,
      jobqueue_rename) are converted to address-based calls, in order to speed
      up queue changes. For this, we need to change the _nodes attribute on
      the jobqueue to be a dict {name: ip}, instead of a set.
      
      Reviewed-by: imsnah
      99aabbed
    • Iustin Pop's avatar
      Remove the logger.py module · 82d9caef
      Iustin Pop authored
      Since now we use only one function from the logger module
      (SetupLogging), we move it to utils.py (which is already imported by all
      users of this function), and we remove the module.
      
      Reviewed-by: imsnah
      82d9caef
  12. 17 Oct, 2008 2 commits
  13. 16 Oct, 2008 3 commits
    • Michael Hanselmann's avatar
      rapi: Convert to new HTTP server class · 16a8967d
      Michael Hanselmann authored
      Requests are no longer logged to a separate file.
      
      Reviewed-by: amishchenko
      16a8967d
    • Iustin Pop's avatar
      Improvements to the master startup checks · d7cdb55d
      Iustin Pop authored
      In order to account for future improvements to master failover, we move
      the actual data gathering capabilities from ganeti-masterd into
      bootstrap.py, and we leave only the verification into masterd.
      
      The verification procedure is then changed to retry multiple times (up
      to one minute) in case most nodes do not respond, and also the algorithm
      is changed to require at least half (but not half+1) votes, since our
      vote also should count (and we vote for ourselves).
      
      Example for consistent (config-wise) cluster:
        - 5 node cluster, 2 nodes down: still start
        - 4 node cluster, 2 nodes down: retry for one minute, abort
      
      Reviewed-by: ultrotter
      d7cdb55d
    • Iustin Pop's avatar
      Add an interface for the drain flag changes/query · 3ccafd0e
      Iustin Pop authored
      This adds the set/reset in the jqueue and luxi modules, and a way to
      query it in OpQueryConfigValues, and also the comand line interface for
      it:
      $ gnt-cluster queue info
      The drain flag is unset
      $ gnt-cluster queue drain
      $ gnt-cluster queue info
      The drain flag is set
      $ gnt-cluster queue undrain
      $ gnt-cluster queue info
      The drain flag is unset
      
      The choice of making the setting via luxi and not an opcode is that
      opcodes can't be executed when drained, but we don't query via luxi
      since in the future it might become a cluster property as opposed to a
      node one.
      
      Reviewed-by: imsnah
      3ccafd0e
  14. 15 Oct, 2008 3 commits
  15. 14 Oct, 2008 1 commit
    • Iustin Pop's avatar
      Export the hypervisor.ValidateParameters over RPC · 6217e295
      Iustin Pop authored
      The newly-added node-specific ValidateParams hypervisor method is
      exported over RPC, using the semi-standard (success, message) return
      value. Multi-node call, so that we call on both primary and secondary at
      once.
      
      Reviewed-by: ultrotter
      6217e295
  16. 13 Oct, 2008 1 commit
    • Iustin Pop's avatar
      Fix a few rpc-related errors · 16ad1a83
      Iustin Pop authored
      This fixes:
        - whitespace change, double lines between methods
        - duplication of call_upload_file, introduced by mistake in rev 1795
          and which went undetected because of the many changes in that ref
          (only diff -b shows it clearly)
        - call_instance_info didn't pass the hypervisor name parameter, but
          the backend requires it
      
      Reviewed-by: ultrotter
      16ad1a83
  17. 12 Oct, 2008 1 commit
    • Iustin Pop's avatar
      Abstract checking own address into a function · caad16e2
      Iustin Pop authored
      Currently, we check if we have a given ip address (i.e. it's alive on
      one of our interfaces) but manually calling TcpPing(source=localhost).
      This works, but having it spread all over the code makes it hard to
      change the implementation.
      
      The patch abstracts this into a separate utils.OwnIpAddress(addr)
      function. We add a rpc call for it, which we use instead of the
      (single-use of) call_node_tcp_ping. We leave node_tcp_ping in, as seems
      useful and eventually it should be removed in a separate patch.
      
      Reviewed-by: imsnah
      caad16e2
  18. 10 Oct, 2008 2 commits
    • Michael Hanselmann's avatar
      Convert ganeti-noded to new HTTP server class · cc28af80
      Michael Hanselmann authored
      Reviewed-by: iustinp
      cc28af80
    • Iustin Pop's avatar
      Convert rpc module to RpcRunner · 72737a7f
      Iustin Pop authored
      This big patch changes the call model used in internode-rpc from
      standalong function calls in the rpc module to via a RpcRunner class,
      that holds all the methods. This can be used in the future to enable
      smarter processing in the RPC layer itself (some quick examples are not
      setting the DiskID from cmdlib code, but only once in each rpc call,
      etc.).
      
      There are a few RPC calls that are made outside of the LU code, and
      these calls are left as staticmethods, so they can be used without a
      class instance (which requires a ConfigWriter instance).
      
      Reviewed-by: imsnah
      72737a7f
  19. 08 Oct, 2008 1 commit
    • Iustin Pop's avatar
      Move the hypervisor attribute to the instances · e69d05fd
      Iustin Pop authored
      This (big) patch moves the hypervisor type from the cluster to the
      instance level; the cluster attribute remains as the default hypervisor,
      and will be renamed accordingly in a next patch. The cluster also gains
      the ‘enable_hypervisors’ attribute, and instances can be created with
      any of the enabled ones (no provision yet for changing that attribute).
      
      The many many changes in the rpc/backend layer are due to the fact that
      all backend code read the hypervisor from the local copy of the config,
      and now we have to send it (either in the instance object, or as a
      separate parameter) for each function.
      
      The node list by default will list the node free/total memory for the
      default hypervisor, a new flag to it should exist to select another
      hypervisor. Instance list has a new field, hypervisor, that shows the
      instance hypervisor. Cluster verify runs for all enabled hypervisor
      types.
      
      The new FIXMEs are related to IAllocator, since now the node
      total/free/used memory counts are wrong (we can't reliably compute the
      free memory).
      
      Reviewed-by: imsnah
      e69d05fd
  20. 07 Oct, 2008 2 commits
    • Iustin Pop's avatar
      rpc.call_instance_migrate: pass the whole instance · 9f0e6b37
      Iustin Pop authored
      Currently the call_instance_migrate call only passes the instance name;
      we need to pass the whole object for the hypervisor_type changes (all
      the other individual instance rpc calls already pass the instance
      object).
      
      Reviewed-by: imsnah
      9f0e6b37
    • Iustin Pop's avatar
      Implement job 'waiting' status · e92376d7
      Iustin Pop authored
      Background: when we have multiple jobs in the queue (more than just a
      few), many of the jobs (up to the number of threads) will be in state
      'running', although many of them could be actually blocked, waiting for
      some locks. This is not good, as one cannot easily see what is
      happening.
      
      The patch extends the opcode/job possible statuses with another one,
      waiting, which shows that the LU is in the acquire locks phase. The
      mechanism for doing so is simple, we initialize (in the job queue) the
      opcode with OP_STATUS_WAITLOCK, and when the processor is ready to give
      control to the LU's Exec, it will call a notifier back into the
      _JobQueueWorker that sets the opcode status to OP_STATUS_RUNNING (with
      the proper queue locking). Because this mechanism does not save the job,
      all opcodes on disk will be in status WAITLOCK and not RUNNING anymore,
      so we also change the load sequence to consider WAITLOCK as RUNNING.
      
      With the patch applied, creating in parallel (via burnin) five instances
      on a five node cluster shows that only two are executing, while three
      are waiting for locks.
      
      Reviewed-by: imsnah
      e92376d7
  21. 06 Oct, 2008 2 commits
    • Iustin Pop's avatar
      Implement job auto-archiving · 07cd723a
      Iustin Pop authored
      This patch adds a new luxi call that implements auto-archiving of jobs
      older than a certain age (or -1 for all completed jobs), and the gnt-job
      command that makes use of this (with 'all' for -1).
      
      Reviewed-by: imsnah
      07cd723a
    • Iustin Pop's avatar
      backend.py change to get cluster name from master · 62c9ec92
      Iustin Pop authored
      Currently there are three function in backend that need the cluster name
      in order to instantiate an SshRunner. The patch changes these to get the
      cluster name from the master in the rpc call; once the multi-hypervisor
      change is implemented, then very few places in which we need the SCR
      remain in the backend.
      
      Reviewed-by: killerfoxi, imsnah
      62c9ec92
  22. 01 Oct, 2008 2 commits