1. 25 Jun, 2008 2 commits
    • Cleanup old DRBD 0.7.x code · 00fb8246
      Michael Hanselmann authored
      Apparently there were still some leftovers. While removing an instance,
      I got the message "unhandled exception 'module' object has no attribute
      'LD_MD_R1'".
      
      Reviewed-by: iustinp
    • Cleanup LV status computation · 99e8295c
      Iustin Pop authored
      Currently, when checking whether an LV is degraded (i.e. a virtual
      volume), we first attach to the device (which runs lvdisplay), then run
      lvs in order to display the lv_attr. This generates two external
      commands that do (almost) the same thing.
      
      This patch changes the Attach() method for LVs to call lvs and display
      both the major/minor (needed for attach) and the lv_status (needed for
      GetSyncStatus). Thus, later in GetSyncStatus, we don't need to run lvs
      again, and instead just return the value computed in Attach().
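
      A minimal sketch of the single-call idea, assuming lvs is invoked
      roughly as "lvs --noheadings --separator=, -olv_attr,lv_kernel_major,lv_kernel_minor".
      The helper name parse_lvs_line and the returned dictionary are
      hypothetical and are not taken from the patch itself.

```python
def parse_lvs_line(line, sep=","):
    """Parse one lvs output line holding lv_attr, kernel major and minor.

    Hypothetical helper: the field order is an assumption matching the
    command line named in the text above.
    """
    fields = line.strip().split(sep)
    if len(fields) != 3:
        raise ValueError("unexpected lvs output: %r" % line)
    attr, major, minor = fields
    if len(attr) < 6:
        raise ValueError("unexpected lv_attr value: %r" % attr)
    return {
        "major": int(major),
        "minor": int(minor),
        # A first lv_attr character of 'v' marks a virtual volume.
        "degraded": attr[0] == "v",
    }


info = parse_lvs_line("  -wi-ao,253,4\n")
print(info["major"], info["minor"], info["degraded"])
```

      Both values come from one parse, so a later status query can reuse the
      stored result instead of spawning a second command.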
      
      Reviewed-by: imsnah
  2. 23 Jun, 2008 5 commits
  3. 22 Jun, 2008 1 commit
    • Add a ‘tags’ field to instance and node listing · 130a6a6f
      Iustin Pop authored
      Currently there isn't any easy way to list all nodes or instances and
      their tags; you have to query each node in turn, or list all the tags
      via something like “gnt-cluster search-tags '.*'”. Of course, this is
      not optimal.
      
      The patch adds a new field to “gnt-instance list” and “gnt-node list”
      called ‘tags’, which lists the tags of the object in comma-separated
      form. This field will be empty if there are no tags (when using a
      separator this output can still be parsed by other scripts).
      
      At opcode level, there is a new field called ‘tags’ that returns a
      (python) list of the object tags.
      
      Reviewed-by: ultrotter
  4. 21 Jun, 2008 1 commit
    • Implement handling of luxi errors in cli.py · 03a8dbdc
      Iustin Pop authored
      Currently the generic handling of ganeti errors in cli.py (GenericMain
      and FormatError) only handles the core ganeti errors, and not the client
      protocol errors (which live in a separate hierarchy).
      
      This patch adds handling of luxi errors too, and also adds another luxi
      error for the case when the master is not running. This gives us a nice:
      
        gnta1:~# gnt-node list
        Cannot communicate with the master daemon.
        Is it running and listening on '/var/run/ganeti-master.sock'?
      
      error message instead of a traceback.
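
      A rough sketch of handling two separate exception hierarchies in one
      formatter. All class and function names below are hypothetical
      stand-ins, not the actual ganeti.errors or luxi classes; only the
      message text is taken from the example above.

```python
class GenericError(Exception):
    """Stand-in for the core error hierarchy (hypothetical)."""


class ProtocolError(Exception):
    """Stand-in for the client protocol hierarchy (hypothetical)."""


class NoMasterError(ProtocolError):
    """Raised when the master socket cannot be reached (hypothetical)."""


def format_error(err):
    """Return (exit_code, message) for an exception from either hierarchy."""
    if isinstance(err, NoMasterError):
        return 1, ("Cannot communicate with the master daemon.\n"
                   "Is it running and listening on '%s'?" % err.args[0])
    if isinstance(err, ProtocolError):
        return 1, "Unhandled protocol error: %s" % err
    if isinstance(err, GenericError):
        return 1, "Unhandled core error: %s" % err
    return 1, "Unhandled exception: %s" % err


code, msg = format_error(NoMasterError("/var/run/ganeti-master.sock"))
print(msg)
```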
      
      Reviewed-by: amishchenko
  5. 20 Jun, 2008 1 commit
    • Add a rpc call for BlockDev.Close() · d61cbe76
      Iustin Pop authored
      This patch adds rpc layer calls (in rpc.py and the equivalent in
      ganeti-noded) to close a list of block devices, and the wrapper in
      backend.py that takes a list of Disk objects, identifies them and
      returns correctly formatted results.
      
      The reason why this very basic call was missing until now from the rpc
      layer is that we usually don't care about device closes (though we
      should, and will do so in the future) as only drbd has a meaningful
      Close() operation; right now we directly do Shutdown().
      
      The patch is clean enough that it's actually independent of the live
      migration implementation.
      
      Reviewed-by: imsnah
  6. 19 Jun, 2008 1 commit
  7. 18 Jun, 2008 5 commits
    • Rework the DRBD8 device status computation · 6b90c22e
      Iustin Pop authored
      Currently, we compute the status of a drbd8 device in GetSyncStatus and
      return only the values that we need (and that fit in the framework of
      GetSyncStatus). However, the full status details are useful (and needed)
      in other places, so the patch attempts to improve this situation.
      
      We abstract the status of a device into a separate class that knows how
      to parse contents from /proc/drbd and set easily accessible attributes.
      We then simplify GetSyncStatus to use this and return the values that it
      needs, and add a separate method that returns the full status object.
      
      The move to a separate class cleans up the old sync-progress
      computation from GetSyncStatus a little, but it is still a lot of
      regexes.
      
      The patch also adds unittests for a few statuses, and modifies one
      BaseDRBD call to accept a custom filename instead of '/proc/drbd' to
      ease unittests.
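
      A simplified sketch of such a status class, assuming device lines of
      the form " 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---".
      The class name, attribute names and read_statuses helper are
      hypothetical; the real parser handles more fields (such as sync
      progress) than shown here.

```python
import re


class DrbdStatus(object):
    """Parsed view of one device line from /proc/drbd (hypothetical)."""

    # Assumed layout: minor, connection state, roles, disk states.
    LINE_RE = re.compile(
        r"^\s*(\d+):\s+cs:(\S+)\s+st:([^/\s]+)/(\S+)\s+ds:([^/\s]+)/(\S+)")

    def __init__(self, line):
        match = self.LINE_RE.match(line)
        if match is None:
            raise ValueError("cannot parse DRBD status line: %r" % line)
        self.minor = int(match.group(1))
        (self.cstatus, self.lrole, self.rrole,
         self.ldisk, self.rdisk) = match.groups()[1:]
        # Easily accessible boolean attributes derived from the raw fields.
        self.is_connected = self.cstatus == "Connected"
        self.is_primary = self.lrole == "Primary"
        self.is_disk_uptodate = self.ldisk == "UpToDate"


def read_statuses(filename="/proc/drbd"):
    """Parse every device line in the file; other lines are skipped.

    The filename parameter lets tests point at a fixture file.
    """
    statuses = []
    with open(filename) as handle:
        for line in handle:
            if DrbdStatus.LINE_RE.match(line):
                statuses.append(DrbdStatus(line))
    return statuses
```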
      
      Reviewed-by: imsnah
    • ganeti-watcher: Replace custom exceptions with ganeti.error.* · 7bca53e4
      Michael Hanselmann authored
      Reviewed-by: iustinp
    • Add more parameters to utils.WriteFile · 71714516
      Michael Hanselmann authored
      - Make closing file optional: Required by ganeti-watcher to keep
        file open after writing it. Changes return value of utils.WriteFile
        if "close" parameter evaluates to True.
      - Pre- and post-write functions: Can be used to lock files. This
        will be used by ganeti-watcher to lock the temporary file before
        renaming.
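
      A minimal sketch of a write helper with these two features. The
      function name write_file, its signature, and the choice to return the
      open descriptor when the file is left open are assumptions made for
      illustration; they are not the actual utils.WriteFile interface.

```python
import os
import tempfile


def write_file(file_name, data, close=True, prewrite=None, postwrite=None):
    """Write data to a temporary file, then rename it over file_name.

    Hypothetical helper. Returns None if close is true, otherwise the
    still-open file descriptor of the written file.
    """
    dir_name, base_name = os.path.split(file_name)
    fd, tmp_name = tempfile.mkstemp(prefix=base_name + ".new",
                                    dir=dir_name or ".")
    try:
        if prewrite is not None:
            prewrite(fd)  # e.g. take a lock on the temporary file
        payload = data.encode("utf-8")
        while payload:
            written = os.write(fd, payload)
            payload = payload[written:]
        if postwrite is not None:
            postwrite(fd)
        os.fsync(fd)
        os.rename(tmp_name, file_name)
    except BaseException:
        # Clean up the descriptor and any leftover temporary file.
        os.close(fd)
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    if close:
        os.close(fd)
        return None
    return fd
```

      A caller that keeps the descriptor is responsible for closing it
      later with os.close().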
      
      Reviewed-by: iustinp
    • Replace custom logging code in watcher with logging module · 438b45d4
      Michael Hanselmann authored
      - Log timestamp for all messages
      - Write everything to logfile and optionally to stderr
      - Log messages are no longer buffered, allowing a user to see progress
      
      Reviewed-by: ultrotter
    • Make sure serialized data ends with EOL character · e91ffe49
      Michael Hanselmann authored
      Also fix the regular expression to not remove newlines. The simplejson
      module puts whitespace at line endings when using indentation. Remove
      unnecessary import of ConfigParser module.
      
      Reviewed-by: ultrotter
  8. 17 Jun, 2008 5 commits
    • Allow disk objects to set their own physical ID · 0402302c
      Iustin Pop authored
      Currently, the way to customize a DRBD disk from (node name 1, node name
      2, port) to (ip1, port, ip2, port) is to use the ConfigWriter method
      SetDiskID. However, since this needs a ConfigWriter object, it can be
      run only on the master, and therefore disk objects can't be passed to
      more than one node unchanged. This, coupled with the rpc layer
      limitation that all nodes in a multi-node call receive the same
      arguments, prevents any kind of multi-node operation that has disks as
      an argument.
      
      This patch takes the SetDiskID method from ConfigWriter and ports it to
      the disk object itself, and instead of the full node configuration it
      uses a simple {node_name: replication_ip} mapping for all the nodes
      involved in the disk tree (currently we only pass primary and secondary
      node since we don't support nested drbd devices).
      
      This allows us to send disks to both the primary and secondary nodes at
      once and perform synchronized drbd activation on primary/secondary
      nodes.
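
      A minimal sketch of the conversion, assuming the logical ID is the
      tuple (node name 1, node name 2, port) and that the physical ID puts
      the target node's own IP first. The function name set_physical_id and
      the local-first ordering are assumptions for illustration.

```python
def set_physical_id(logical_id, target_node, nodes_ip):
    """Build (local_ip, port, remote_ip, port) for target_node.

    nodes_ip is a {node_name: replication_ip} mapping covering the nodes
    involved in the disk. Hypothetical helper, not the Disk method itself.
    """
    pnode, snode, port = logical_id
    if target_node not in (pnode, snode):
        raise ValueError("node %r does not hold this disk" % target_node)
    try:
        pnode_ip = nodes_ip[pnode]
        snode_ip = nodes_ip[snode]
    except KeyError as err:
        raise ValueError("no replication IP known for node %s" % err)
    if target_node == pnode:
        return (pnode_ip, port, snode_ip, port)
    return (snode_ip, port, pnode_ip, port)


ips = {"node1": "192.0.2.1", "node2": "192.0.2.2"}
print(set_physical_id(("node1", "node2", 11000), "node2", ips))
```

      Because only the small mapping is needed, each node can derive its
      own view from the same arguments.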
      
      Note that while for the 1.2 branch this will not change old methods, it
      is worth investigating and possibly moving all such calls from the
      master to the nodes themselves for the 2.0 branch.
      
      Reviewed-by: ultrotter
    • Fix an error-handling case · c7cdfc90
      Iustin Pop authored
      There is a mistake in handling grow-disk for an invalid disk. This patch
      fixes it.
      
      Reviewed-by: imsnah
    • Implement disk grow at LU level · 8729e0d7
      Iustin Pop authored
      This patch adds a new opcode and LU for growing an instance's disk.
      
      The opcode allows growing only one disk at a time, and will throw an
      error if the operation fails midway (e.g. on the primary node after it
      has been increased on the secondary node). As such, it might actually
      leave different sized LVs on different nodes, but this will not create
      problems.
      
      Reviewed-by: imsnah
    • Add method to update a disk object size · acec9d51
      Iustin Pop authored
      This patch adds a method that implements updating of a disk
      (object.Disk) size, together with its children.
      
      While this will not track the exact disk size, it allows at least an
      approximate size to be recorded in the configuration (and queried).
      
      Reviewed-by: imsnah
    • Implement block device grow at the rpc layer · 4c8ba8b3
      Iustin Pop authored
      This simple patch exposes the block device grow operation at the rpc
      layer. It does not increase the protocol version as it has been recently
      changed by the live failover rpc call.
      
      Reviewed-by: imsnah
  9. 16 Jun, 2008 5 commits
    • Expose block device grow in backend.py · 594609c0
      Iustin Pop authored
      This patch adds a wrapper over the block device grow operation that
      converts the input and output parameters as needed for the rpc layer.
      
      Reviewed-by: imsnah
    • bdev: implement disk resize for lvm/drbd8 · 1005d816
      Iustin Pop authored
      This patch implements disk resize at the bdev level for the LVM and
      DRBD8 disk types. It is not implemented for DRBD7 and MD since the way
      MD works with its underlying devices makes it harder and this
      combination is also deprecated.
      
      The LVM resize operation is tried three times, with different allocation
      policies:
        - contiguous first, since this is best for allocation purposes (it
          won't fragment the PV too much)
        - cling, which is supported only by more recent LVM versions, will try
          to place the new extents on the same PV as the rest of the LV
        - and finally normal, which is the default
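
      The retry order above can be sketched as follows, assuming the resize
      is done with "lvextend --alloc POLICY -L +SIZEm DEVICE". The function
      name grow_lv and the injectable run parameter are hypothetical; the
      latter exists only so the logic can be exercised without LVM.

```python
import subprocess

# Tried in this order; the first policy that succeeds wins.
ALLOC_POLICIES = ("contiguous", "cling", "normal")


def grow_lv(dev_path, amount_mb, run=subprocess.call):
    """Grow an LV by amount_mb, returning the policy that succeeded.

    run must accept an argument list and return the command's exit code.
    """
    for policy in ALLOC_POLICIES:
        cmd = ["lvextend", "--alloc", policy,
               "-L", "+%dm" % amount_mb, dev_path]
        if run(cmd) == 0:
            return policy
    raise RuntimeError("cannot grow %s with any allocation policy" % dev_path)
```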
      
      Reviewed-by: imsnah
    • Move SetKey to WritableSimpleStore and use it · 05f86716
      Guido Trotter authored
      Previously we could update SimpleStore by just calling SetKey; this
      feature is now moved to a separate class, which inherits from it. This
      patch also puts the new WritableSimpleStore class to use in the LUs that
      need it. Rather than making each LU instantiate it, we have a new
      LogicalUnit flag REQ_WSSTORE which defaults to False, but when set to
      True asks for the LogicalUnit to be initialized with a writable version
      of the SimpleStore. LUMasterFailover and LURenameCluster are then
      changed to use it.
      
      InitCluster is also changed to instantiate a WritableSimpleStore, rather
      than a normal one.
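
      A simplified sketch of the flag-driven selection. The class names
      and REQ_WSSTORE come from the description above, but the in-memory
      store, the GetKey method and the make_lu factory are hypothetical
      stand-ins for the real file-based implementation.

```python
class SimpleStore(object):
    """Read-only store (in-memory stand-in for illustration)."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def GetKey(self, key):
        return self._data[key]


class WritableSimpleStore(SimpleStore):
    """Adds the update operation on top of the read-only store."""

    def SetKey(self, key, value):
        self._data[key] = value


class LogicalUnit(object):
    REQ_WSSTORE = False  # most LUs only need to read

    def __init__(self, sstore):
        self.sstore = sstore


class LURenameCluster(LogicalUnit):
    REQ_WSSTORE = True  # this LU must be able to update keys


def make_lu(lu_class, data):
    """Give the LU a writable store only if it declares the need."""
    store_class = WritableSimpleStore if lu_class.REQ_WSSTORE else SimpleStore
    return lu_class(store_class(data))


lu = make_lu(LURenameCluster, {"cluster_name": "old.example.com"})
lu.sstore.SetKey("cluster_name", "new.example.com")
print(lu.sstore.GetKey("cluster_name"))
```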
      
      Reviewed-by: imsnah
    • Add migration support at the rpc layer · 2a10865c
      Iustin Pop authored
      This patch adds the migration rpc call and its implementation in the
      backend. The patch does not deal with the correct activation of disks.
      
      Because of the new RPC, the protocol version is increased.
      
      Reviewed-by: imsnah
    • hypervisor: add live migration support · 6e7275c0
      Iustin Pop authored
      This is just the hypervisor-level migration (e.g. “xm migrate”) not the
      whole node coordination work.
      
      Reviewed-by: ultrotter
  10. 15 Jun, 2008 3 commits
    • Activate down instances' disks on replace-disks · 22985314
      Guido Trotter authored
      When replacing disks or evacuating nodes with instances that are
      administratively down, ganeti fails because the instance disks are not
      active. This patch activates them, performs the replacement, and shuts
      them down again. Changing this also fixes the same issue on gnt-node
      evacuate.
      
      Reviewed-by: iustinp
    • FailoverInstance: replace AddInstance with Update · b6102dab
      Guido Trotter authored
      We're not adding a new instance, just making configuration changes to
      the one we're working on.
      
      Reviewed-by: imsnah
    • Fix an error message in instance add · 3e91897b
      Iustin Pop authored
      There is a mistake in the error message generated when we can't reach a
      node to check for available disk space. Without this patch, the error
      message is:
      Failure: prerequisites not met for this operation:
      Cannot get current information from node '{u'gnte2.lab.k1024.org':
      {'cpu_total': 1, 'memory_free': 480, 'vg_size': 131068, 'memory_total':
      504, 'bootid': '2176dd3b-2f96-42f0-8b6e-2873ecaf5f9c', 'memory_dom0':
      134, 'vg_free': 130172}, u'gnte1.lab.k1024.org': False}'
      
      instead of the expected:
      Failure: prerequisites not met for this operation:
      Cannot get current information from node 'gnte2.lab.k1024.org'
      
      Reviewed-by: imsnah
  11. 13 Jun, 2008 2 commits
  12. 12 Jun, 2008 6 commits
  13. 11 Jun, 2008 1 commit
    • Remove SimpleStore cache · c9673d92
      Guido Trotter authored
      SimpleStore is instantiated anew most of the times it's used, so having
      a cache inside it serves no purpose. Removing it.
      
      Reviewed-by: iustinp
  14. 06 Jun, 2008 1 commit
  15. 31 May, 2008 1 commit
    • Add check for node memory in instance creation · 49ce1563
      Iustin Pop authored
      Currently the check for enough memory is done only on instance start
      command and failover command. But we also start an instance in instance
      create, therefore we need to check this instead of failing to start in
      the hypervisor phase.
      
      The patch adds a check for node memory in the case the creation command
      specifies that the instance should be started. It is allowed for the
      memory to be less than needed if the instance will not be started, in
      order to allow migration and other such cases.
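
      A minimal sketch of such a prerequisite check, assuming the node
      reports a 'memory_free' value in megabytes as in the node information
      dictionaries used elsewhere. The function and exception names are
      hypothetical.

```python
class PrereqError(Exception):
    """Stand-in for a prerequisite failure (hypothetical)."""


def check_node_free_memory(node, node_info, requested_mb, start):
    """Fail early if the instance would be started without enough memory."""
    if not start:
        # Instances that are only created may exceed the free memory.
        return
    if not node_info:
        raise PrereqError("Cannot get current information from node '%s'"
                          % node)
    free_mb = node_info.get("memory_free")
    if not isinstance(free_mb, int):
        raise PrereqError("Cannot compute free memory on node '%s'" % node)
    if requested_mb > free_mb:
        raise PrereqError("Not enough memory on node '%s': needed %d MiB,"
                          " available %d MiB" % (node, requested_mb, free_mb))


check_node_free_memory("node1", {"memory_free": 480}, 256, start=True)
```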
      
      Reviewed-by: imsnah