1. 17 Oct, 2008 2 commits
  2. 15 Oct, 2008 1 commit
  3. 14 Oct, 2008 3 commits
    • Change the backend to use the beparams · 51de46bf
      Iustin Pop authored
      The backend.FinalizeExport function is changed to use the beparams
      instead of the instance attributes. A future enhancement should
      export and import/reuse the whole be/hv params.
      
      Reviewed-by: ultrotter
    • Temporary fix for dual hvm/pvm instances · f23b5ae8
      Iustin Pop authored
      We have a problem with the current model of combining instance lists
      from multiple hypervisors: we don't allow duplicates, but "xm list"
      gives the same output for both pvm and hvm. This is a shortcoming of the
      current xen hypervisor implementation/split between pvm and hvm, but for
      now we implement a weak workaround: identical instance params will be
      allowed, and merged. This breaks because there is a delta in listing, and
      should be treated as a temporary workaround only.
      
      Note that there are two cases of duplicate instances: the above one (xen
      is the same, whether pvm or hvm), and the other case, the real error,
      when we have two different hypervisors reporting the same instance name.
      The latter case needs to be handled better (not by refusing to list the
      instances in the backend).
      
      Reviewed-by: ultrotter
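
The workaround described in this commit message can be illustrated with a minimal sketch. It is not the actual backend code: the function name, the exception class, and the shape of the per-hypervisor data are assumptions made for illustration only.

```python
class DuplicateInstanceError(Exception):
    """Two hypervisors reported the same name with different parameters."""


def merge_instance_lists(per_hypervisor):
    """Merge {hypervisor_name: {instance_name: params}} into a single dict.

    Hypothetical helper: identical duplicates (the pvm/hvm case) are merged,
    while conflicting duplicates (the "real error" case) raise an exception.
    """
    merged = {}
    for hv_name, instances in per_hypervisor.items():
        for inst_name, params in instances.items():
            if inst_name in merged and merged[inst_name] != params:
                raise DuplicateInstanceError(
                    "instance %r reported with different parameters by %s"
                    % (inst_name, hv_name))
            merged[inst_name] = params
    return merged
```

Raising on a conflict is only one possible policy; the message itself notes that the conflicting case should eventually be handled without refusing to list the instances.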
    • Export the hypervisor.ValidateParameters over RPC · 6217e295
      Iustin Pop authored
      The newly-added node-specific ValidateParams hypervisor method is
      exported over RPC, using the semi-standard (success, message) return
      value. It is a multi-node call, so that we can call both the primary and
      the secondary at once.
      
      Reviewed-by: ultrotter
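
As a rough illustration of the (success, message) convention used with a multi-node call, the sketch below folds per-node results into one overall result. The function name and the dictionary layout are assumptions, not the real rpc layer.

```python
def summarize_validation(node_results):
    """Combine {node_name: (success, message)} into one (success, message).

    Hypothetical helper: the overall call succeeds only if every node
    (for example the primary and the secondary) reported success.
    """
    failures = [
        "%s: %s" % (node, message)
        for node, (success, message) in sorted(node_results.items())
        if not success
    ]
    if failures:
        return (False, "; ".join(failures))
    return (True, "")
```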
  4. 12 Oct, 2008 1 commit
    • Abstract checking own address into a function · caad16e2
      Iustin Pop authored
      Currently, we check if we have a given ip address (i.e. it's alive on
      one of our interfaces) by manually calling TcpPing(source=localhost).
      This works, but having it spread all over the code makes it hard to
      change the implementation.
      
      The patch abstracts this into a separate utils.OwnIpAddress(addr)
      function. We add an rpc call for it, which we use instead of the
      (single use of) call_node_tcp_ping. We leave node_tcp_ping in, as it
      seems useful; eventually it should be removed in a separate patch.
      
      Reviewed-by: imsnah
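
The value of such an abstraction is that callers only depend on one function, so the check behind it can change freely. The sketch below shows one possible implementation; note that it deliberately uses a different technique (trying to bind a socket to the address) rather than the TcpPing-based check the commit describes, and the function name is an assumption. It handles IPv4 only.

```python
import socket


def own_ip_address(address):
    """Return True if the IPv4 address is configured on a local interface.

    Illustrative only: binding succeeds for local addresses and normally
    fails for foreign ones. The wildcard address "0.0.0.0" also binds,
    so callers should pass a concrete address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, 0))  # port 0 lets the OS pick any free port
        return True
    except OSError:
        return False
    finally:
        sock.close()
```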
  5. 10 Oct, 2008 1 commit
    • OS API: support for multiple versions in an OS · 082a7f91
      Guido Trotter authored
      Allow multiple api versions in an OS. This follows the OS API changes
      design doc, by which an OS can support multiple versions of the Ganeti
      API, and it will work if Ganeti supports at least one of them. Since the
      API up to version 5 mandates that an OS support only one version, this
      change is backward compatible and requires no version bump.
      
      Reviewed-by: iustinp
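
The compatibility rule can be sketched as a set intersection. The function names below are hypothetical, and the version numbers in the usage example are purely illustrative.

```python
def compatible_api_versions(os_versions, ganeti_versions):
    """Return the sorted API versions supported by both the OS and Ganeti."""
    return sorted(set(os_versions) & set(ganeti_versions))


def os_is_usable(os_versions, ganeti_versions):
    """An OS works if at least one of its API versions is supported."""
    return bool(compatible_api_versions(os_versions, ganeti_versions))
```

An old-style OS that declares a single version is simply the one-element case, for example `os_is_usable([5], [5, 10])`, which is why no version bump is needed.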
  6. 08 Oct, 2008 1 commit
    • Move the hypervisor attribute to the instances · e69d05fd
      Iustin Pop authored
      This (big) patch moves the hypervisor type from the cluster to the
      instance level; the cluster attribute remains as the default hypervisor,
      and will be renamed accordingly in a later patch. The cluster also gains
      the ‘enable_hypervisors’ attribute, and instances can be created with
      any of the enabled ones (no provision yet for changing that attribute).
      
      The many many changes in the rpc/backend layer are due to the fact that
      all backend code read the hypervisor from the local copy of the config,
      and now we have to send it (either in the instance object, or as a
      separate parameter) for each function.
      
      By default, the node list shows the node free/total memory for the
      default hypervisor; a new flag should be added to select another
      hypervisor. Instance list has a new field, hypervisor, that shows the
      instance hypervisor. Cluster verify runs for all enabled hypervisor
      types.
      
      The new FIXMEs are related to IAllocator, since now the node
      total/free/used memory counts are wrong (we can't reliably compute the
      free memory).
      
      Reviewed-by: imsnah
  7. 07 Oct, 2008 1 commit
    • rpc.call_instance_migrate: pass the whole instance · 9f0e6b37
      Iustin Pop authored
      Currently the call_instance_migrate call only passes the instance name;
      we need to pass the whole object for the hypervisor_type changes (all
      the other individual instance rpc calls already pass the instance
      object).
      
      Reviewed-by: imsnah
  8. 06 Oct, 2008 2 commits
    • backend.py change to get cluster name from master · 62c9ec92
      Iustin Pop authored
      Currently there are three functions in backend that need the cluster name
      in order to instantiate an SshRunner. The patch changes these to get the
      cluster name from the master in the rpc call; once the multi-hypervisor
      change is implemented, very few places in the backend will still need
      the SCR.
      
      Reviewed-by: killerfoxi, imsnah
    • Fix SshRunner breakage from the changed API · 6b0469d2
      Iustin Pop authored
      More places actually use the SshRunner than just the gnt-cluster
      commands.
      
      Reviewed-by: ultrotter
  9. 01 Oct, 2008 3 commits
  10. 09 Sep, 2008 2 commits
    • Never remove job queue lock in node daemon · 1bc59f76
      Michael Hanselmann authored
      Otherwise, corruption could occur in some corner cases. E.g. when
      LeaveNode is running in a child and is in the process of removing
      queue files, the main process gets killed, started again and gets
      a request to update the queue. This is a rather extreme corner case,
      but we should opt for safety.
      
      Reviewed-by: iustinp
    • Change backend._GetMasterInfo to return more data · bd1e4562
      Iustin Pop authored
      The _GetMasterInfo() function needs to export the master name too to be
      useful in master safety checks. This patch makes it a public (no _)
      function and adds a third element in the return tuple. Its callers are
      modified too.
      
      Reviewed-by: imsnah
  11. 14 Aug, 2008 1 commit
    • Pass hypervisor type to the OS scripts · 4f0afaf5
      Guido Trotter authored
      It's handy to make the os scripts know which hypervisor the instance is
      going to run under. In order not to change the os API we pass this
      information in the environment, where the os scripts can access it if
      they're hypervisor-aware.
      
      Reviewed-by: imsnah
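
Passing the information through the environment can be sketched as follows. The variable name `HYPERVISOR` and both helper functions are assumptions made for illustration; the commit message does not state the actual name.

```python
import os
import subprocess
import sys


def build_os_script_env(hypervisor, base_env=None):
    """Return a copy of the environment with the hypervisor type added.

    The "HYPERVISOR" variable name is an assumption made for this sketch.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HYPERVISOR"] = hypervisor
    return env


def run_os_script(argv, hypervisor):
    """Run a (hypothetical) OS script with the extended environment."""
    return subprocess.run(
        argv,
        env=build_os_script_env(hypervisor),
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # A hypervisor-aware script reads the variable; others just ignore it.
    script = "import os; print(os.environ.get('HYPERVISOR', 'unknown'))"
    print(run_os_script([sys.executable, "-c", script], "xen-pvm").stdout)
```

Because the script's arguments are unchanged, scripts that do not know about the variable keep working, which is the point of not changing the OS API.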
  12. 08 Aug, 2008 7 commits
  13. 06 Aug, 2008 1 commit
  14. 31 Jul, 2008 1 commit
  15. 30 Jul, 2008 4 commits
    • Fix pylint-detected issues · 38206f3c
      Iustin Pop authored
      This is mostly:
        - whitespace fix (space at EOL in some files, not all, broken
          indentation, etc)
        - variable names overriding others (one is a real bug in there)
        - too-long-lines
        - cleanup of most unused imports (not all)
      
      Reviewed-by: ultrotter
    • Fix some errors detected by pylint · 3b9e6a30
      Iustin Pop authored
      Reviewed-by: imsnah
    • Rework master startup/shutdown/failover · b1b6ea87
      Iustin Pop authored
      This (big) patch reworks the master startup/shutdown and fixes the
      master failover.
      
      What does the patch do?
      
      For master start/stop:
        - removes the old ganeti-master script and its associated man page
        - moves the ip start/stop directly into the backend.(Start|Stop)Master
        - adds start/stop of the master/rapi daemon into these functions,
          selectively based on the start/stop arguments
        - makes the master call via rpc StartMaster(start_daemons=False) to
          the local node so that the master IP is started
        - and finally changes the example init.d script to directly start and
          stop all three daemons, since they do the right thing (depending on
          master/not master role)
      
      For master failover:
        - moves the code from LUMasterFailover into bootstrap.MasterFailover,
          since we need to start/stop the master during this operation and
          thus it can't be executed from the master
        - removes the LUMasterFailover and its associated opcode
      
      Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not
      master' are not seen during startup on non-master nodes.
      
      Reviewed-by: ultrotter
    • Add a new parameter to backend.(Start|Stop)Master · 1c65840b
      Iustin Pop authored
      This patch adds a new, unused for now, parameter to the start and stop
      master operations in backend. The idea behind it is that we need to be
      able to control whether the IP (de)activation is coupled with daemon
      startup/shutdown.
      
      The callers are also modified to pass this parameter (even if unused for
      now).
      
      Reviewed-by: ultrotter
  16. 23 Jul, 2008 1 commit
    • Distribute the queue serial file after each update · c3f0a12f
      Iustin Pop authored
      This patch adds distribution of the queue serial file after each write
      to it (but before a new job is created and written with that ID, and
      before a response is returned, so we should be safe from crashes in
      between).
      
      Currently it only logs if a node cannot be contacted; it should abort if
      more than 50% errors are seen.
      
      Reviewed-by: imsnah
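
The proposed abort rule (not implemented by this commit, which only logs) could look roughly like the sketch below. The function name, the result layout, and the choice of exception are all assumptions.

```python
import logging


def check_distribution(results):
    """Check {node_name: success_flag} after distributing the serial file.

    Hypothetical helper: logs every failed node and raises if more than
    50% of the nodes failed. Returns the sorted list of failed nodes.
    """
    failed = sorted(node for node, ok in results.items() if not ok)
    for node in failed:
        logging.error("Cannot update queue serial file on node %s", node)
    if results and len(failed) * 2 > len(results):
        raise RuntimeError(
            "serial file distribution failed on %d of %d nodes"
            % (len(failed), len(results)))
    return failed
```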
  17. 11 Jul, 2008 3 commits
    • Convert backend.py to the logging module · 18682bca
      Iustin Pop authored
      The patch also switches some of the exception logs to use
      logging.exception (and therefore the log message will have a different
      format).
      
      (Note that this might not be a good choice in all cases, though)
      
      Reviewed-by: imsnah
    • Fix backend.NodeVolumes handling of LVM output · a17a7623
      Iustin Pop authored
      This is the same fix as for GetVolumeList.
      
      I've checked manually and all other places that call lvm commands are
      already checking the output validity in terms of correct number of
      fields.
      
      Reviewed-by: ultrotter
    • Fix backend.GetVolumeList handling of LVM output · df4c2628
      Iustin Pop authored
      Sometimes ‘lvs’ can spit error messages on stdout, even when one wants
      to parse the output:
      ...
      Inconsistent metadata copies found - updating to use version 2776
      ...
      
      So we need to validate the output to guard against such cases.
      
      The patch converts the split on the separator to a regex match and
      extracts the fields via groups. The original separator choice is a
      bad one now :(
      
      Reviewed-by: imsnah
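
The idea of validating each line with a regex instead of blindly splitting can be sketched as below. The `|` separator, the three-field layout (name, size, attributes), and the function name are assumptions for illustration; the commit message does not give the actual format.

```python
import re

# Assumed line layout: "<name>|<size>|<attributes>", optionally indented.
_LV_LINE_RE = re.compile(r"^\s*([^|]+)\|([0-9]+(?:\.[0-9]+)?)\|([^|]+)\|?\s*$")


def parse_lvs_output(output):
    """Parse lvs-like output into {name: (size, attributes)}.

    Lines that do not match the expected layout, such as warnings that
    lvs prints on stdout, are skipped instead of being mis-parsed.
    """
    volumes = {}
    for line in output.splitlines():
        match = _LV_LINE_RE.match(line)
        if match is None:
            continue
        name, size, attributes = match.groups()
        volumes[name] = (float(size), attributes)
    return volumes
```

With a plain split, a stray message line would either yield the wrong number of fields or silently produce garbage; matching the whole line makes the check explicit.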
  18. 27 Jun, 2008 2 commits
  19. 20 Jun, 2008 1 commit
    • Add a rpc call for BlockDev.Close() · d61cbe76
      Iustin Pop authored
      This patch adds rpc layer calls (in rpc.py and the equivalent in
      ganeti-noded) to close a list of block devices, and the wrapper in
      backend.py that takes a list of Disk objects, identifies them and
      returns correctly formatted results.
      
      The reason why this very basic call was missing until now from the rpc
      layer is that we usually don't care about device closes (though we
      should, and will do so in the future) as only drbd has a meaningful
      Close() operation; right now we directly do Shutdown().
      
      The patch is clean enough that it's actually independent of the live
      migration implementation.
      
      Reviewed-by: imsnah
  20. 16 Jun, 2008 2 commits
    • Expose block device grow in backend.py · 594609c0
      Iustin Pop authored
      This patch adds a wrapper over the block device grow operation that
      converts the input and output parameters as needed for the rpc layer.
      
      Reviewed-by: imsnah
    • Add migration support at the rpc layer · 2a10865c
      Iustin Pop authored
      This patch adds the migration rpc call and its implementation in the
      backend. The patch does not deal with the correct activation of disks.
      
      Because of the new RPC, the protocol version is increased.
      
      Reviewed-by: imsnah