1. 16 Apr, 2008 1 commit
    • Allocator framework, 1st part: allocator input generation · d61df03e
      Iustin Pop authored
      In preparation for the introduction of the automatic instance allocator,
      this patch adds an allocator simulation opcode that, based on the input
      parameters, will return either the input message to the allocator
      (implemented) or the result of the allocator run (not yet implemented).
      
      This allows algorithm tests against simulated allocations and the
      current cluster state.
      
      The patch adds the following:
        - a function that generates the generic cluster information for the
          allocator
        - a function that generates the 'new instance' information
        - a function that generates the 'replace_secondary' information
      
      These three functions will be used by the allocator framework later to
      generate the actual information for the external algorithms. Currently
      we just return the json-serialized text.
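
      A minimal sketch of how the generic cluster information and a
      'new instance' request could be combined and returned as
      json-serialized text. All field names, function names and sample
      values below are assumptions made for illustration; the commit message
      does not show the actual message layout.

```python
import json


def build_cluster_info(nodes):
    """Generic cluster part of the allocator input (hypothetical fields)."""
    return {"version": 1, "nodes": nodes}


def build_new_instance_request(name, memory_mb, disk_sizes_mb):
    """Request part for a 'new instance' allocation (hypothetical fields)."""
    return {
        "type": "allocate",
        "name": name,
        "memory": memory_mb,
        "disks": [{"size": size} for size in disk_sizes_mb],
    }


def build_allocator_input(nodes, request):
    """Combine both parts and return the json-serialized text."""
    data = build_cluster_info(nodes)
    data["request"] = request
    return json.dumps(data, sort_keys=True)


# Sample data, invented for the example.
nodes = {"node1.example.com": {"free_memory": 4096, "free_disk": 102400}}
request = build_new_instance_request("inst1.example.com", 512, [10240])
text = build_allocator_input(nodes, request)
```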
      
      Reviewed-by: imsnah
  2. 10 Apr, 2008 2 commits
  3. 05 Apr, 2008 1 commit
    • Implement forking/master role checking in masterd · c1f2901b
      Iustin Pop authored
      This patch adds checks for the master role and daemonize support to
      ganeti-masterd.
      
      The patch modifies the startup/shutdown of the server because:
        - we want bind()/listen() to the master socket to occur before forking
          so that we can return a correct exit code and write messages to
          stderr
        - but we want thread startup to occur after fork(), otherwise python
          threading gets confused
      
      The patch also has some small cleanups:
        - remove the unix socket after closing it, so we don't need to remove
          it manually
        - instead of just telling the threads to terminate via the new_queue,
          we also join() them so that the logs show which thread is clinging
          to life
        - the daemon logs to its own logfile now
        - there is command line parameter support :)
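
      The ordering described above (bind/listen first, then fork, then
      start threads, and remove the socket on shutdown) can be sketched
      roughly as follows. The function names are hypothetical and this is
      not the ganeti-masterd code, only an illustration of the sequence.

```python
import os
import socket
import sys
import threading


def bind_master_socket(path):
    """Bind and listen before forking, so failures are visible to the caller."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(5)
    return sock


def daemonize():
    """Single fork; the parent exits once the socket is known to be usable."""
    if os.fork() > 0:
        os._exit(0)
    os.setsid()


def start_workers(count, target):
    """Start threads only after fork(); threads do not carry over a fork."""
    threads = [threading.Thread(target=target, name="worker%d" % i)
               for i in range(count)]
    for thread in threads:
        thread.start()
    return threads


def shutdown(sock, path, threads):
    """Close and remove the socket, then join the workers one by one."""
    sock.close()
    os.unlink(path)
    for thread in threads:
        thread.join()
        sys.stderr.write("thread %s finished\n" % thread.name)
```

      The intended call order is bind_master_socket(), daemonize(),
      start_workers() and, at exit, shutdown().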
      
      Reviewed-by: imsnah
  4. 01 Apr, 2008 1 commit
    • Add submit function to lib/cli.py · ceab32dd
      Iustin Pop authored
      This patch adds to lib/cli.py functions that submit jobs or queries over
      the unix socket interface. They will be used by the scripts instead of
      the SubmitOpCode function.
      
      Reviewed-by: ultrotter
  5. 31 Mar, 2008 1 commit
    • Add DEFAULT_VG and DTS_NOT_LVM to constants.py · d63e148a
      Manuel Franceschini authored
      DTS_NOT_LVM:
      This constant is needed when checking whether an instance can be created
      with the given disk template if no lvm-storage is available, i.e. the
      ganeti cluster does not have a volume group.
      
      DEFAULT_VG:
      Previously, 'xenvg' was hardcoded.
      
      Reviewed-by: iustinp
  6. 25 Mar, 2008 1 commit
  7. 19 Mar, 2008 1 commit
  8. 18 Mar, 2008 1 commit
  9. 20 Jan, 2008 2 commits
    • Make backend._GetVGInfo check the validity of 'vgs' · f4d377e7
      Iustin Pop authored
      Currently, the function backend._GetVGInfo only checks for errors via
      the exit code of the 'vgs' command. However, the command can fail in
      other ways, so we also need to check for valid output before parsing.
      
      Furthermore, exit code failures were reported via a 'raise LVMError',
      but this exception is not handled anywhere, so the remote caller will
      not get reasonable data.
      
      This patch does two main things:
        - change the calling protocol for this function to not raise an error,
          and instead always return the same type of value (a dict) with the
          requested keys but with the values set to None; this allows the
          parent rpc call node_info to return valid memory information
          together with an "error" value for disk space if there's an error
          with the disks
        - check the validity of the output so that in case we fail to parse
          it, we don't abort with a backtrace in the node daemon but instead
          return the default result value (containing errors), and log these
          cases in the node daemon log file
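
      A rough sketch of the "always return a dict, never raise" protocol
      described above. The key names, the colon-separated input format and
      the function name are assumptions for illustration; the commit message
      does not specify them.

```python
import logging


def parse_vgs_output(output):
    """Parse 'size:free:pv_count' text; always return the same dict keys."""
    # Default result: every requested key is present, with value None.
    result = dict.fromkeys(["vg_size", "vg_free", "pv_count"])
    fields = output.strip().split(":")
    if len(fields) != 3:
        logging.error("vgs output has invalid format: %r", output)
        return result
    try:
        parsed = {
            "vg_size": int(round(float(fields[0]))),
            "vg_free": int(round(float(fields[1]))),
            "pv_count": int(fields[2]),
        }
    except ValueError as err:
        # Unparseable values are logged, not propagated to the caller.
        logging.error("cannot parse vgs output %r: %s", output, err)
        return result
    return parsed
```

      With this shape, a caller can test each value for None instead of
      wrapping the call in exception handling.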
      
      We also bump the protocol version to 11.
      
      Reviewed-by: ultrotter
    • Fix run directory for the fake hypervisor · 1ed70996
      Iustin Pop authored
      Currently the fake hypervisor has hardcoded ‘/var/run’ as a base
      directory for its store. This patch adds a constant RUN_DIR that is used
      for both the fake hypervisor and for BDEV_CACHE_DIR.
      
      Reviewed-by: ultrotter
  10. 11 Jan, 2008 1 commit
  11. 08 Jan, 2008 1 commit
    • Add support for modifying the kernel/initrd path · 973d7867
      Iustin Pop authored
      This patch adds support in ‘gnt-instance modify’ to set the kernel and
      initrd paths. The user can pass either 'default' or 'none' (none is not
      valid for kernel).
      
      Reviewed-by: imsnah
  12. 07 Jan, 2008 1 commit
    • Improve verify-disks: broken/missing LV detection · b63ed789
      Iustin Pop authored
      This patch improves the ‘gnt-cluster verify-disks’ command by adding
      support for detecting broken volume groups and missing logical volume
      names.
      
      As such, we no longer try to activate disks for instances where
      activation is unlikely to succeed anyway, and instead report them.
      
      Reviewed-by: schreiberal
  13. 20 Dec, 2007 2 commits
  14. 18 Dec, 2007 1 commit
  15. 17 Dec, 2007 1 commit
  16. 11 Dec, 2007 1 commit
    • Return more data in rpc.call_volume_list · cb2037a2
      Iustin Pop authored
      Currently, the volume_list call returns only the volume size. However,
      it is useful to also have two other things: the 'inactive' state of the
      volume (which might trigger a ‘vgchange -a y’ on the volume group) and
      the online state (which shows if the volume is in use or not).
      
      Since this modifies an RPC call, we also bump the protocol version,
      although the single user of the call didn't care about the dictionary
      values, only about the keys.
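
      One way the extra per-volume data could be derived is from the
      lv_attr string reported by LVM, as sketched below. The field order,
      separator and function names are assumptions for illustration; the
      commit message only says that size, 'inactive' and 'online' are
      returned.

```python
def parse_lvs_line(line, separator="|"):
    """Return (full_name, (size, inactive, online)) for one output line.

    Assumed field order: vg_name, lv_name, lv_size, lv_attr.
    """
    vg_name, lv_name, size, attr = [f.strip() for f in line.split(separator)]
    inactive = attr[4] == "-"  # fifth lv_attr character: state ('a' = active)
    online = attr[5] == "o"    # sixth lv_attr character: device open
    return "%s/%s" % (vg_name, lv_name), (float(size), inactive, online)


def volume_list(lines):
    """Map each volume name to its (size, inactive, online) tuple."""
    return dict(parse_lvs_line(line) for line in lines if line.strip())
```

      A caller that only cares about which volumes exist can keep iterating
      over the dictionary keys, as the existing user of the call does.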
      
      Reviewed-by: imsnah
  17. 29 Nov, 2007 1 commit
    • Replace hardcoded lock dir · 3aecd2c7
      Iustin Pop authored
      This patch replaces the hardcoded ‘/var/lock/’ directory with one based on
      LOCALSTATEDIR.
      
      Reviewed-by: imsnah
  18. 09 Nov, 2007 1 commit
  19. 07 Nov, 2007 1 commit
    • Enhance secondary node replace for drbd8 · 0834c866
      Iustin Pop authored
      This (big) patch does two things:
        - add "local disk status" to the block device checks
          (BlockDevice.GetSyncStatus and the rpc calls that call this
          function, and therefore cmdlib._CheckDiskConsistency)
        - improve the drbd8 secondary replace operation using the above
          functionality
      
      The "local disk status" adds a new variable to the result of
      GetSyncStatus that shows the degradation of the local storage of the
      device. Of course, not all devices support this; for now, we only
      modify LogicalVolumes and DRBD8 to return degraded in some cases, while
      other devices always return non-degraded. This variable should be a
      subset of is_degraded: whenever this variable is true, is_degraded
      should also be true.
      
      The drbd8 secondary replace uses this variable since we don't care
      whether the primary drbd device is network-degraded, only whether it
      has good local disk data (ldisk is False).
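
      The relationship between the two flags can be sketched as below. The
      tuple layout and helper names are hypothetical; only is_degraded and
      ldisk are named in the commit message.

```python
from collections import namedtuple

# Hypothetical result shape; the real layout of the GetSyncStatus result
# is not shown in the commit message.
SyncStatus = namedtuple("SyncStatus", ["sync_percent", "is_degraded", "ldisk"])


def is_consistent(status):
    """ldisk must imply is_degraded (ldisk is a subset of is_degraded)."""
    return status.is_degraded or not status.ldisk


def primary_usable_for_secondary_replace(status):
    """Network degradation is acceptable; local disk degradation is not."""
    return not status.ldisk
```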
      
      The patch also increases the protocol version (due to rpc changes).
      
      Reviewed-by: imsnah
  20. 05 Nov, 2007 2 commits
    • Bump protocol version up · 8ee53a06
      Guido Trotter authored
      The OS cleanup patches change the wire protocol. Increment the protocol number
      by one.
      
      Reviewed-By: iustinp
      
    • Make the OS object able to represent broken OSes · 37482e7b
      Guido Trotter authored
      Until now the OS object could only represent a correct OS instance. Change
      it so it can represent a broken one too, by adding a "status" field: if
      this field differs from the OS_VALID_STATUS constant, the object is
      considered to be an invalid OS, the "status" field is taken to be a
      debugging message, and the object's boolean value is false.
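
      A minimal sketch of such an object, assuming a simple class with a
      name and a status. The value of OS_VALID_STATUS and the constructor
      signature are assumptions; only the constant name and the "status"
      field come from the commit message.

```python
OS_VALID_STATUS = "VALID"  # assumed value; the commit only names the constant


class OS(object):
    """OS definition that can also represent a broken OS."""

    def __init__(self, name, status=OS_VALID_STATUS):
        self.name = name
        # For a broken OS, status holds a debugging message instead.
        self.status = status

    def __bool__(self):
        # Any status other than OS_VALID_STATUS marks the object as invalid.
        return self.status == OS_VALID_STATUS

    __nonzero__ = __bool__  # Python 2 spelling of the same hook
```

      Callers can then write "if not os_obj:" and report os_obj.status as the
      reason.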
      
      Reviewed-By: iustinp
  21. 02 Nov, 2007 1 commit
    • Implement device to instance mapping cache · 3f78eef2
      Iustin Pop authored
      Currently, troubleshooting DRBD problems involves a manual process of going
      backwards from the DRBD device to the instance that owns it.
      
      This patch adds a weak (i.e. not guaranteed to be correct or up-to-date)
      cache mapping devices to instances. In normal operation the cache should
      hold correct information, as the only times devices change paths are
      when they are started or stopped, and the code in backend.py adds cache
      updates to exactly these operations.
      
      The only drawback of this implementation is that we don't fully update
      the cache on renames of devices (we clean the old entries but we don't
      add new ones). Since the rename changes the path only for LVs (and not
      drbd and md), this is less of a problem as the target of this code is
      debugging DRBD and MD issues.
      
      The patch writes files named bdev_drbd<N> (or bdev_md<N>,
      bdev_xenvg_...) in /var/run/ganeti (more exactly, LOCALSTATEDIR/ganeti).
      The files start with 'bdev_' and continue with the path of the device
      under /dev/ (this prefix stripped), and contain the following values,
      space separated:
        - instance name
        - primary or secondary (depending on whether the device is on the
          primary or secondary node)
        - instance visible name: sda or sdb or not_visible, the latter case
          when the device is not the top-level device (i.e. remote_raid1
          templates will have sd[ab] for the md, but not_visible for drbd and
          logical volumes)
      
      The cache is designed to not raise any errors; if there is an I/O error,
      it will only be logged in the node daemon log file. This is in order to
      reduce the possible impact of the cache on the block device activation
      and shutdown code.
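
      The file naming and the "log, don't raise" behaviour described above
      could look roughly like this. The function names are hypothetical, and
      replacing '/' with '_' in the device path is an assumption inferred
      from the bdev_xenvg_... example.

```python
import logging
import os


def cache_file_name(cache_dir, dev_path):
    """Map /dev/drbd0 to <cache_dir>/bdev_drbd0 (the /dev/ prefix stripped)."""
    if dev_path.startswith("/dev/"):
        dev_path = dev_path[len("/dev/"):]
    # Assumption: nested paths such as xenvg/lv1 are flattened with '_'.
    return os.path.join(cache_dir, "bdev_" + dev_path.replace("/", "_"))


def update_cache(cache_dir, dev_path, instance, on_primary, visible_name):
    """Write 'instance role visible_name'; log I/O errors instead of raising."""
    role = "primary" if on_primary else "secondary"
    name = visible_name if visible_name is not None else "not_visible"
    try:
        with open(cache_file_name(cache_dir, dev_path), "w") as handle:
            handle.write("%s %s %s\n" % (instance, role, name))
    except EnvironmentError as err:
        logging.error("cannot update bdev cache for %s: %s", dev_path, err)
```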
      
      Reviewed-by: imsnah
  22. 29 Oct, 2007 1 commit
    • Implement replace-disks for drbd8 devices · a9e0c397
      Iustin Pop authored
      This patch adds three modes of disk replacement for drbd8:
        - replace the disk on the primary node
        - replace the disk on the secondary node
        - replace the secondary node
      
      It also adds some debugging code to backend.py and increments the
      protocol version for the recent changes of the rpc layer.
      
      Reviewed-by: imsnah
  23. 24 Oct, 2007 1 commit
    • Initial implementation of drbd8 template type · a1f445d3
      Iustin Pop authored
      This is a partially working drbd8 template type. It does:
        - add/remove
        - startup/failover/shutdown
      
      Replace disks is not working yet; it needs custom code for this template.
      
      Reviewed-by: imsnah
  24. 19 Oct, 2007 2 commits
    • Some tiny style fixes · aa4260ca
      Iustin Pop authored
      Reviewed-by: imsnah
    • Abstract more strings values into constants · fe96220b
      Iustin Pop authored
      Currently, the disk types are defined using literal strings in the code.
      Convert those into constants so that we can easily find them and check
      their usage.
      
      Note that we don't rename the values of the constants as they are used
      in the configuration file, and as such it's best to leave them as they
      are.
      
      Reviewed-by: imsnah
  25. 17 Oct, 2007 1 commit
    • Patch series for reboot feature, part 1 · 007a2f3e
      Alexander Schreiber authored
      This patch series implements the reboot command for gnt-instance. It
      supports three types of reboot: soft (hypervisor reboot), hard (instance
      config rebuild and reboot) and full (full instance shutdown and startup
      again).
      
      This patch contains the backend and rpc part of the patch.
      
      
      Reviewed-by: iustinp
      
  26. 16 Oct, 2007 1 commit
    • Replace more ssh paths with proper constants · 70d9e3d8
      Iustin Pop authored
      The node's ssh keys filenames are now provided as constants; this should
      allow easier customization.
      
      Also, computing the user's ssh keys has been abstracted into ssh.py.
      
      Reviewed-by: imsnah
  27. 12 Oct, 2007 1 commit
    • Remove some hardcoded names/paths from backend.py · 7900ed01
      Iustin Pop authored
      This patch does the following:
        - add constants.GANETI_RUNAS = "root", which is used to compute
          the homedir (and thus the .ssh directory) instead of hardcoding
          "/root/.ssh" in backend.AddNode and backend.LeaveCluster
        - add constants.SSH_CONFIG_DIR (currently hardcoded to /etc/ssh) that
          is used in backend instead of hardcoding it (preparation for
          selecting that at ./configure time)
        - some more internal cleanup in backend.AddNode
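
      Computing the .ssh directory from the run-as user, rather than
      hardcoding "/root/.ssh", might look like the sketch below. The helper
      name user_ssh_dir is hypothetical; the two constants and their values
      are taken from the description above.

```python
import os
import pwd

GANETI_RUNAS = "root"
SSH_CONFIG_DIR = "/etc/ssh"


def user_ssh_dir(user=GANETI_RUNAS):
    """Derive the .ssh directory from the user's home in the passwd database."""
    return os.path.join(pwd.getpwnam(user).pw_dir, ".ssh")
```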
      
      Reviewed-by: imsnah
  28. 11 Oct, 2007 1 commit
    • Implement post-configuration-update hook · 6a4aa7c1
      Iustin Pop authored
      This patch adds a special hook: the post-configuration update hook. This
      hook has only a post phase that runs after a top-level LU that modified
      the configuration.
      
      Since the hook is a post-phase one, no error checking is done on the
      results. The hook runs only on the master.
      
      Reviewed-by: imsnah
  29. 10 Oct, 2007 5 commits
    • Remove fping as a dependency for Ganeti. · 16abfbc2
      Alexander Schreiber authored
      This patch completely gets rid of fping:
       - replace all fping invocations with TcpPing calls
       - update documentation accordingly
       - associated cleanups (use constant for localhost IP, use more sensible
         defaults for TcpPing and _use_ those)
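
      A TCP-based reachability check of the kind that can stand in for
      fping is sketched below. This is not the actual TcpPing signature,
      which the commit message does not show; the function name, parameters
      and default timeout here are assumptions.

```python
import socket


def tcp_ping(target, port, timeout=10):
    """Return True if a TCP connection to target:port succeeds in time."""
    try:
        sock = socket.create_connection((target, port), timeout=timeout)
    except (socket.timeout, socket.error):
        # Refused, unreachable or timed out: treat the target as down.
        return False
    sock.close()
    return True
```

      Unlike an ICMP ping, this needs no external binary, but it only
      reports success when something is listening on the given port.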
      
      Reviewed-by: iustinp
      
    • Implement gnt-node evacuate · a5bc662a
      Iustin Pop authored
      This patch adds a new 'evacuate' subcommand to gnt-node. The command
      runs a replace disks for every instance that has the given node as its
      secondary, with the given target node as the new secondary.
      
      The syntax is:
        gnt-node evacuate src_node target_node
      
      The command by itself doesn't do any resource checks, and instead relies
      on the LUFailoverInstance code to do that.
      
      Reviewed-by: imsnah
    • Make Xen DomU kernel and initrd configurable at build time. · f00b46bc
      Michael Hanselmann authored
      Reviewed-by: iustinp
      
    • Remove the shebang from modules · 2f31098c
      Iustin Pop authored
      Since modules are not directly executable, remove the shebang from
      them. This helps with lintian warnings.
      
      Also make the autogenerated _autoconf.py contain two comment lines at
      the beginning, like the other modules.
      
      Reviewed-by: ultrotter
    • Detect node restarts and reactivate disks. · 5a3103e9
      Michael Hanselmann authored
      - Change format of watcher state file to JSON.
      - Move log path for watcher script to constants.py.
      
      Reviewed-by: iustinp
      
  30. 04 Oct, 2007 1 commit
  31. 28 Sep, 2007 1 commit