1. 08 Jun, 2009 2 commits
      Enable striped LVs · fecbe9d5
      This patch enables striped LVs, falling back to non-striped if the
      striped creation fails. If the configure-time lvm-stripecount is 1,
      this patch becomes a noop (with an insignificant python-level overhead,
      but no extra lvm calls).
      The effect of this patch is that new instances will get striped LVs
      from the start, whereas old instances will have their LVs striped as
      soon as replace-disks is run for them.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
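The fallback described in fecbe9d5 can be sketched as follows. This is a minimal illustration and not the patch itself: `create_lv` and its `run` parameter are hypothetical names, and the only external interface assumed is lvcreate's `-L`, `-n` and `-i` (stripe count) options.

```python
import subprocess


def create_lv(vg, name, size_mb, stripes, run=subprocess.call):
    """Create an LV, trying a striped layout first and a linear one second.

    Returns the stripe count that succeeded; raises RuntimeError if every
    attempt fails. `run` must return the command's exit code.
    """
    base = ["lvcreate", "-L%dm" % size_mb, "-n", name]
    # With stripes == 1 the set collapses to a single attempt, so the
    # fallback costs no extra lvm call.
    for count in sorted({stripes, 1}, reverse=True):
        cmd = base + ["-i%d" % count, vg]
        if run(cmd) == 0:
            return count
    raise RuntimeError("lvcreate failed for %s/%s" % (vg, name))
```

Passing the runner in as a parameter keeps the sketch testable without a volume group at hand.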
      Add more constants for DRBD and change sync tests · 3c003d9d
      This patch adds constants for the connection status, peer roles and disk
      status, and it changes the rules for when the disk is considered as
      “resyncing” - previously it was only for syncsource/synctarget, but
      there are many other transient statuses which could be misinterpreted as
      ‘degraded’ (because they were not considered as resyncing, but the disk
      is not consistent in these statuses).
      Furthermore, cmdlib.py:WaitForSync determines whether a device is syncing
      based on sync_percent not being None. Not all DRBD resync statuses
      offer a percent done, so if we are syncing but don't have a sync
      percent, we'll report a zero sync percent (and no time estimate).
      The patch also removes a few unused variables (is_sync_target,
      peer_sync_target, is_resync) whose value doesn't make sense anymore with
      the new sync rules.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
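A rough sketch of the sync rule from 3c003d9d: any transient resync status counts as syncing, and a missing percentage is reported as zero. The state names are those printed by DRBD 8 in /proc/drbd; which of them belong to the set, and the names `CS_SYNC_STATES` and `sync_percent`, are assumptions of this sketch rather than the patch's actual constants.

```python
# Connection states during which DRBD is resyncing or about to resync.
# Treating exactly this set as "resyncing" is an assumption of the sketch.
CS_SYNC_STATES = frozenset([
    "WFReportParams", "StartingSyncS", "StartingSyncT", "WFBitMapS",
    "WFBitMapT", "WFSyncUUID", "SyncSource", "SyncTarget",
    "PausedSyncS", "PausedSyncT",
])


def sync_percent(cstatus, percent):
    """Return None when not resyncing, otherwise a float percentage.

    `percent` is the value parsed from the status output, or None when
    the current state does not report one.
    """
    if cstatus not in CS_SYNC_STATES:
        return None
    # Syncing without a reported percentage: say 0.0 so that callers
    # testing "is not None" still see the device as syncing.
    return 0.0 if percent is None else float(percent)
```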
  2. 03 Jun, 2009 1 commit
      Assemble DRBD using the known size · f069addf
      This patch changes DRBD disk attachment to force the wanted size, as opposed to
      letting the device auto-discover its size.
      This should make the disks more resilient with regard to small differences in
      size (e.g. due to LVM rounding). This still works with regard to disk
      growth, but the instance needs to be fully restarted (including disks)
      in that case.
      This passes a full burnin without problems, but it's still a tricky
      change - if the config.data is not synced with the reality, we might
      tell DRBD a wrong size. At least this will fail outright (and not
      introduce silent errors), as DRBD (per a quick check at the sources)
      tracks the size in the meta-dev and also does not allow shrinking
      consistent devices.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  3. 28 May, 2009 1 commit
      Change the bdev init signatures · 464f8daf
      This patch changes all the bdev.BlockDev constructors to take an
      additional ‘size’ parameter, changes all the backend functions that call
      those constructors to pass it, and changes backend.BlockdevCreate() to
      use disk.size directly instead of the size passed via the rpc call (this
      is the only way it's called).
      Note that this patch doesn't do anything with this parameter, just
      stores it on the blockdev objects.
      With the patch, we actually have a more uniform init sequence (before,
      create had the parameter, but the other functions did not).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  4. 05 May, 2009 2 commits
      Fix compatibility with DRBD 8.3 · 01e2ce3a
      DRBD 8.3 changes two more things compared to 8.2:
        - /proc/drbd format changed in multiple ways; the part we're
          interested in is the ‘st:’ to ‘ro:’ change (named in the changelog
          as “Renamed 'state' to 'role'”)
        - “drbdsetup /dev/drbdN show” changed the ‘device’ stanza from:
            device "/dev/drbd0";
          to:
            device                  minor 0;
      This patch fixes both of these and adds data files and unittests for DRBD
      Signed-off-by: Iustin Pop <iustin@google.com>
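The two format changes handled by 01e2ce3a can be illustrated with a pair of tolerant regular expressions. This is only a sketch: the function names are hypothetical, the patterns are not taken from the patch, and the sample status lines in the usage below are abbreviated.

```python
import re

# Accept both the 8.2 "st:" and the 8.3 "ro:" spelling of the role field.
_ROLE_RE = re.compile(r"\s(?:st|ro):(\w+)/(\w+)\s")

# Accept both forms of the "device" stanza printed by "drbdsetup show".
_DEVICE_RE = re.compile(r'device\s+(?:"/dev/drbd(\d+)"|minor\s+(\d+))\s*;')


def parse_roles(line):
    """Return (local_role, peer_role) from a /proc/drbd status line."""
    match = _ROLE_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def parse_minor(stanza):
    """Return the minor number from either form of the device stanza."""
    match = _DEVICE_RE.search(stanza)
    if match is None:
        return None
    # Exactly one of the two groups is set, depending on the format.
    return int(match.group(1) or match.group(2))
```

For example, `parse_roles(" 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----")` and the same line with `st:` both yield `("Primary", "Secondary")`.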
      Fix compatibility with DRBD 8.2 · 34e71fea
      This patch adds (and suppresses) the extra ipv4/ipv6 words before the
      actual address that newer DRBD versions add.
      [iustin@google.com: slightly changed the patch to conform to style
      guide, and changed the commit message]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
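One way to accept the optional address-family word mentioned in 34e71fea is to make it an optional, non-captured part of the pattern. The sketch below is an assumption about the shape of the `address` line and uses a plain regular expression; `parse_address` is a hypothetical name.

```python
import re

# The address family word (ipv4/ipv6) is optional and is discarded.
_ADDR_RE = re.compile(r"address\s+(?:(?:ipv4|ipv6)\s+)?(\S+):(\d+)\s*;")


def parse_address(line):
    """Return (host, port) from an 'address ...;' line, or None."""
    match = _ADDR_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```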
  5. 16 Feb, 2009 1 commit
      Convert IOErrors for /proc/drbd into our errors · f6eaed12
      If /proc/drbd can't be opened, this raises an IOError, but all the
      error-handling behaviour in backend treats only BlockDeviceErrors. This
      creates a plain failure in cluster verify and in other RPC calls.
      This patch simply converts EnvironmentErrors into BlockDeviceErrors, and
      also changes the RPC result for NV_DRBDLIST and its handling to be able
      to show the error. The other RPC calls work by default now, due to the
      existing error handling.
      Reviewed-by: ultrotter
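The conversion described in f6eaed12 amounts to catching the low-level error where the file is read and re-raising it as the error type the callers already handle. A minimal sketch, with `BlockDeviceError` defined locally as a stand-in for the project's class and `read_proc_drbd` as a hypothetical helper name:

```python
class BlockDeviceError(Exception):
    """Stand-in for the project's block device error class."""


def read_proc_drbd(filename="/proc/drbd"):
    """Return the lines of the DRBD status file.

    Any EnvironmentError (which covers IOError and OSError) is converted
    into BlockDeviceError so that callers need only one except clause.
    """
    try:
        with open(filename) as fd:
            return fd.read().splitlines()
    except EnvironmentError as err:
        raise BlockDeviceError("Can't read %s: %s" % (filename, err))
```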
  6. 11 Feb, 2009 1 commit
      FileStorage: abort creating over an existing file · aed77cea
      In FileStorage there is a TODO:
       decide whether we should check for existing files and
       abort or not
      After Ganeti ate my instance data I decided. Let's abort.
      In general there is no reason we should overwrite existing files, and
      doing it can be very harmful for preexisting files on the host.
      Reviewed-by: iustinp
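The "abort if the file exists" decision from aed77cea can be sketched with an exclusive create, which refuses to touch a preexisting path. Whether the patch uses an explicit existence check or `O_EXCL` is not stated here, so the approach and the names `create_backing_file` and `BlockDeviceError` are assumptions.

```python
import os


class BlockDeviceError(Exception):
    """Stand-in for the project's block device error class."""


def create_backing_file(path, size_mb):
    """Create a sparse backing file, refusing to overwrite existing data."""
    try:
        # O_EXCL makes the open fail if the path already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        raise BlockDeviceError("File already exists: %s" % path)
    with os.fdopen(fd, "wb") as backing:
        backing.truncate(size_mb * 1024 * 1024)
```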
  7. 10 Feb, 2009 5 commits
      Some error message cleanups · 33bc6f01
      Reviewed-by: imsnah
      Cleanup of DRBD8._CheckMetaSize · 9c793cfb
      This patch converts the _CheckMetaSize method to raise exceptions
      instead of logging and returning False. This now fits the new rpc
      return types, so it's a cheap change.
      Reviewed-by: ultrotter
    • Iustin Pop authored
      functions to raise exceptions instead of returning False. This is a big
      patch, since the assembly functions touch other functions: add children,
      creation, etc. However, the patch does not add much new code, rather it
      reworks existing code.
      One of the biggest changes is in the rework of the DRBD8._SlowAssemble()
      method (one of the most complicated/ugly ones). Hopefully the new
      version is a little bit more readable.
      Reviewed-by: ultrotter
    • Iustin Pop authored
      This doesn't permit any error detail reporting.
      This patch changes the return type to None for success, and raises
      BlockDeviceError in case of failure. This permits the details to be
      passed up the stack.
      The patch also slightly simplifies the Remove method of file-based
      devices (no stat first, just try unlink).
      Reviewed-by: ultrotter
    • Iustin Pop authored
      This doesn't permit any error detail reporting.
      This patch changes the return type to None for success, and raises
      BlockDeviceError in case of failure. This permits the details to be
      passed up the stack.
      For LVM and file-backed devices, this is a simple change. For DRBD, we
      first remove the shutdown of disks in case of network activation
      failures (since with static minors the minor is used anyway, we don't
      gain anything by clearing it), and then we simply change _ShutdownAll()
      to raise an exception.
      Reviewed-by: ultrotter
  8. 09 Feb, 2009 1 commit
      bdev: add and use two utility functions · 82463074
      This patch adds two utility functions for raising BlockDeviceError
      exceptions and for running functions while ignoring this error. Most of
      the manual “raise errors.BlockDeviceError” cases are converted to
      _ThrowError, as this makes the code clearer.
      We also change most of the DRBD error messages to include the minor
      number because with the parallel execution of commands it's no longer
      possible to identify the failed DRBD just from the timestamp, and the
      minor number can be mapped back to the instance more easily.
      Reviewed-by: ultrotter
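The two helpers introduced in 82463074 might look roughly like this. Only the name `_ThrowError` appears in the commit message; the lowercase names, the signatures and the locally defined `BlockDeviceError` are hypothetical stand-ins.

```python
import logging


class BlockDeviceError(Exception):
    """Stand-in for errors.BlockDeviceError."""


def throw_error(msg, *args):
    """Format the message, log it and raise BlockDeviceError."""
    if args:
        msg = msg % args
    logging.error(msg)
    raise BlockDeviceError(msg)


def ignore_error(fn, *args, **kwargs):
    """Run fn and report success as a boolean.

    Only BlockDeviceError is swallowed; other exceptions propagate.
    """
    try:
        fn(*args, **kwargs)
        return True
    except BlockDeviceError as err:
        logging.warning("Ignoring block device error: %s", err)
        return False
```

Including the minor in the message is then a matter of formatting, for example `throw_error("drbd%d: can't attach local disk", minor)`.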
  9. 23 Jan, 2009 1 commit
      Remove checking of DRBD metadata for validity · 3b559640
      Currently the DRBD code checks that the metadata devices are valid
      before creation, initial disk attachment and add children.
      However, the process for checking validity requires a free DRBD minor,
      and this conflicts with parallel checking.
      There are at least three possible solutions:
        - serialize all checks, which means we reduce parallelism and need
          extra locks
        - don't pass a valid minor number, but one like “/dev/drbd256” (which
          is invalid); this works for the current version of DRBD, but since
          it's not guaranteed to remain so it doesn't look nice
        - don't do the checking at all, and rely on “drbdsetup ... disk ...”
          to fail by itself
      The reason for checking metadata was that in 1.2, this was much cheaper
      than trying to activate devices (and the subsequent iteration over the
      minors). However, in 2.0, they have the same cost, so we can choose
      option 3: just remove the explicit checking and rely on drbdsetup and
      the kernel to fail.
      Since DRBD8._InitMeta still requires a minor number, the two places
      where this is run are handled as follows:
        - Create: we just use our own (unused currently) minor number
        - AddChildren: we keep using FindUnusedMinor, with the caveat that
          this function (used by replace-disks -n ...) cannot be yet
      Reviewed-by: ultrotter
  10. 20 Jan, 2009 2 commits
      Make cluster-verify check the drbd minors space · 6d2e83d5
      This patch adds support for verification of drbd minors space in cluster
      verify: minors which belong to running instances and should be online
      but are not, and minors which do not belong to any instance but are in
      The patch requires exposing some methods from bdev.DRBD8 and
      config.ConfigWriter which were until now private methods.
      Reviewed-by: ultrotter
    • Iustin Pop authored
      abort at create time if our minor is already in use. For this we need to
      also modify DRBD8Status to be able to parse cs:Unconfigured devices.
      Reviewed-by: ultrotter
  11. 19 Jan, 2009 1 commit
      Block device creation cleanup · 6c626518
      Currently when creating LVM-based instances, we always get the
      extremely confusing message "ERROR Can't find LV /dev/xenvg/..." which
      is actually expected. This behaviour was introduced before we had
      UUID-style LV names, since at that point it was not unexpected to have
      such volumes lying around after a failed creation.
      Today, it's much more of an error to see existing volumes, and it's
      better to abort with a failure. Since bdev.LogicalVolume.Create() method
      will raise an error in case it exists, we can remove this check in
      backend before creating the device.
      The Create methods for DRBD and FileStorage currently don't raise an
      exception, as behaviour is not very well defined here.
      We also change some exception types raised in bdev so that all
      exceptions raised by device creation are a subclass of GenericError.
      Reviewed-by: ultrotter
  12. 13 Jan, 2009 4 commits
      Forward-port DrbdNetReconfig · 6b93ec9d
      This is a modified forward-port of DrbdNetReconfig and their associated
      RPCs. In Ganeti 2.0, these functions will be used for two things:
        - live migration (as in 1.2)
        - and for other network reconfiguration tasks, since DRBD8.Attach()
          doesn't do them anymore
      Because of the Attach() changes, we can now implement the
      AttachNet/DisconnectNet functions as independent entities, and we don't
      need the cache anymore.
      Note these functions are copies of the latest 1.2 code, and not
      cherry-picks of the (many) patches that went into 1.2.
      Reviewed-by: ultrotter
    • Iustin Pop authored
      backend function to show that the intent is to fully assemble the device
      (and it's always allowed to modify the device).
      Reviewed-by: ultrotter
    • Iustin Pop authored
      alter the device state. This is suboptimal, and it has been worked
      around in 1.2 via a special cache in the node daemon so that we don't
      need to call Attach() again in migration, for example.
      Since in 2.0 we have static minors, we can change these functions so
        - Attach() does not affect the device in any way, and only checks if
          the minor is already in use or not
        - Assemble() has two logic paths, one for startup from unused minor
          (the old Assemble, now renamed _FastAssemble) and one for
          re-checking/fixing an in-use minor (the old Attach, now renamed
      Basically Attach was renamed to _SlowAttach, Assemble to _FastAssemble,
      and we have a new, simple Assemble that calls one or the other based on
      the result of the new Attach.
      The LUReplaceDisks (with new secondary) is relying on the special
      semantics of Attach modifying the device and is broken until the end of
      the patch series.
      Reviewed-by: ultrotter
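The split between a read-only Attach and a dispatching Assemble, as described above, can be sketched in a few lines. Everything in this example is a hypothetical stand-in (class, method names, the set of in-use minors); it only shows the control flow, not how DRBD is actually configured.

```python
class Drbd8Sketch:
    """Toy model of the Attach/Assemble split with static minors."""

    def __init__(self, minor, minors_in_use):
        self.minor = minor
        self._minors_in_use = minors_in_use
        self.path_taken = None

    def attach(self):
        """Read-only check: is our statically assigned minor in use?"""
        return self.minor in self._minors_in_use

    def _fast_assemble(self):
        # Start-up from an unused minor.
        self.path_taken = "fast"

    def _slow_assemble(self):
        # Re-check and fix a minor that is already in use.
        self.path_taken = "slow"

    def assemble(self):
        """Pick the assembly path based on the result of attach()."""
        if self.attach():
            self._slow_assemble()
        else:
            self._fast_assemble()
```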
    • Iustin Pop authored
      explicit recursion over all the children of the device, with better
      error reporting. As such, we don't need this repeated assembly inside
      the base BlockDev class.
      Reviewed-by: ultrotter
  13. 09 Jan, 2009 1 commit
      Work around a DRBD sync speed race condition · 7d585316
      This is a modified forward-port of commit 1544 on the 1.2 branch:
        When DRBD is doing its dance to establish a connection with its
        peer, it also sends the synchronization speed over the wire. In
        some cases setting the sync speed only after setting up both
        sides can race with DRBD connecting, hence we set it here before
        telling DRBD anything about its peer.
        Reviewed-by: iustinp
      The modification we make is that we split SetSyncSpeed in two so that we
      don't need to modify our minor temporarily, and the fact that we call
      this function from within _AssembleNet (right before enabling network),
      instead of Assemble()/Attach().
      Original-Author: imsnah
  14. 08 Jan, 2009 1 commit
      bdev: forward-port ReAttachNet/DisconnectNet · cf8df3f3
      This is a plain copy of the 1.2 ReAttachNet and DisconnectNet methods on
      the DRBD8 device, with the logger to logging module changes and the
      ReAttachNet method renamed to AttachNet.
      These methods are not used anywhere right now, but will be used for
      migration and a simpler disk-replace.
      The code was originally committed on the 1.2 branch as revision numbers
      1165 and 1204.
      Originally-Reviewed-by: imsnah, ultrotter
  15. 11 Dec, 2008 1 commit
      Fix epydoc format warnings · c41eea6e
      This patch should fix all outstanding epydoc parsing errors; as such, we
      switch epydoc into verbose mode so that any new errors will be visible.
      Reviewed-by: imsnah
  16. 27 Nov, 2008 1 commit
      Fix file-based block devices · ecb091e3
      A while ago we changed the protocol for opening block devices, but
      FileStorage was not updated. This patch makes it work again.
      Reviewed-by: imsnah
  17. 29 Sep, 2008 3 commits
      Move a hardcoded constant to constants.py · 3c03759a
      For now we only use the ‘C’ protocol so we can put it in constants.py
      instead of hardcoding it.
      Reviewed-by: imsnah
    • Iustin Pop authored
      (hardcoded in constants.py) the md5 digest algorithm.
      For making this more flexible, either we implement a cluster parameter
      (once the new model is in place), or we can make it ./configure-time
      Reviewed-by: imsnah
    • Iustin Pop authored
      attribute), extends the logical and physical id of the DRBD disks with a
      shared secret attribute. This is generated at disk creation time and
      saved in the config file.
      The generation of the secret is done so that we don't have duplicates in
      the configuration (otherwise the goal of preventing cross-connection
      will not be reached), so we add to config.py more than just a simple
      call to utils.GenerateSecret().
      The patch does not yet enable the use of the secrets.
      Reviewed-by: imsnah
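The "no duplicate secrets" requirement above is essentially generate-and-retry against the set of secrets already in the configuration. The sketch below uses Python's standard `secrets` module in place of utils.GenerateSecret(), and `generate_unique_secret` is a hypothetical name.

```python
import secrets


def generate_unique_secret(existing, nbytes=20):
    """Return a new hex secret that is not yet in `existing`.

    The new secret is added to `existing`, so repeated calls against the
    same set can never hand out the same value twice.
    """
    while True:
        candidate = secrets.token_hex(nbytes)
        if candidate not in existing:
            existing.add(candidate)
            return candidate
```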
  18. 23 Sep, 2008 1 commit
      Switch to static minors for DRBD · a1578d63
      With some todos remaining, this patch switches the DRBD devices to use
      the passed minors, and the cmdlib code (add instance and replace disks)
      to request and assign minors to the DRBD disks.
        - look at the disk RPC calls to see which can be optimized away, since
          we now know the minor beforehand
        - remove the _FindUnusedMinor usage from the few places it's still
          used (not for actual disks, but for temporary use in meta devs) and
          eventually replace with _CheckMinorUnused or such
      Of course, this and/or the previous two patches break existing clusters.
      Reviewed-by: imsnah
  19. 22 Sep, 2008 1 commit
      Extend DRBD disks with minors attribute · ffa1c0dc
      This patch converts the DRBD disks to contain also a minor (per each
      node) attribute. This minor is not yet used and is always initialized
      with None, so the patch does not have any real-world impact - except for
      automatically upgrading config files (it adds the minors as None, None).
      Reviewed-by: imsnah
  20. 09 Jul, 2008 2 commits
      Reduce duplicate Attach() calls in bdev · cb999543
      Currently, the 'public' functions of bdev (FindDevice and
      AttachOrAssemble) will call the Attach() method right after class
      But the constructor itself calls this function, and therefore we have
      duplicate Attach() calls (which are not cheap at all).
      The patch introduces a new 'attached' instance attribute that tells if
      the last Attach() was successful. The public functions reuse this so
      that we only do the minimum required number of calls.
      Reviewed-by: imsnah
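The caching idea in cb999543 can be shown with a toy class that counts its own Attach() calls. The class and function names are hypothetical; the point is only that the public function reads the `attached` attribute instead of attaching a second time.

```python
class BlockDevSketch:
    """Toy block device that remembers whether Attach() succeeded."""

    def __init__(self):
        self.attach_calls = 0
        self.attached = False
        # The constructor already attaches once.
        self.attach()

    def attach(self):
        self.attach_calls += 1
        self.attached = True  # a real device would probe the system here
        return self.attached


def find_device():
    """Return the device if it could be attached, else None."""
    dev = BlockDevSketch()
    # Reuse the constructor's result rather than calling attach() again.
    return dev if dev.attached else None
```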
    • Iustin Pop's avatar
      Iustin Pop authored
      new module.
      Reviewed-by: imsnah
  21. 25 Jun, 2008 1 commit
      Cleanup LV status computation · 99e8295c
      Currently, when seeing if a LV is degraded or not (i.e. virtual volume),
      we first attach to the device (which does an lvdisplay), then do a lvs
      in order to display the lv_attr. This generates two external commands to
      do (almost) the same thing.
      This patch changes the Attach() method for LVs to call lvs and display
      both the major/minor (needed for attach) and the lv_status (needed for
      GetSyncStatus). Thus, later in GetSyncStatus, we don't need to run lvs
      again, and instead just return the value computed in Attach().
      Reviewed-by: imsnah
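Fetching everything in one `lvs` call means parsing a single line that carries both the device numbers and the attribute string. The sketch assumes output produced with `--noheadings --separator=,` and the fields `lv_attr,lv_kernel_major,lv_kernel_minor`, and it assumes that a leading `v` in lv_attr marks a virtual volume; `parse_lvs_line` is a hypothetical helper and the sample lines in the test are illustrative.

```python
def parse_lvs_line(line):
    """Parse 'lv_attr,lv_kernel_major,lv_kernel_minor' from one lvs line."""
    attr, major, minor = line.strip().split(",")
    return {
        "major": int(major),
        "minor": int(minor),
        # The first attribute character is the volume type.
        "degraded": attr[0] == "v",
    }
```

The returned dictionary can be stored at attach time, so a later status query does not need to run `lvs` again.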
  22. 18 Jun, 2008 1 commit
      Rework the DRBD8 device status computation · 6b90c22e
      Currently, we compute the status of a drbd8 device in GetSyncStatus and
      return only the values that we need (and fit in the framework of
      GetSyncStatus). However, the full status details are useful (and needed)
      in other places, so the patch attempts to improve this situation.
      We abstract the status of a device outside in a separate class, that
      knows how to parse contents from /proc/drbd and set easily accessible
      attributes. We then simplify the GetSyncStatus to use this and return
      the values that it needs, and add a separate method that returns the
      full status object.
      The move to a separate class cleans up a little bit the old
      sync-progress computation from GetSyncStatus, but it's still many
      The patch also adds unittests for a few statuses, and modifies one
      BaseDRBD call to accept a custom filename instead of '/proc/drbd' to
      ease unittests.
      Reviewed-by: imsnah
  23. 16 Jun, 2008 1 commit
      bdev: implement disk resize for lvm/drbd8 · 1005d816
      This patch implements disk resize at the bdev level for the LVM and
      DRBD8 disk types. It is not implemented for DRBD7 and MD since the way
      MD works with its underlying devices makes it harder and this
      combination is also deprecated.
      The LVM resize operation is tried three times, with different allocation
      policies:
        - contiguous first, since this is best for allocation purposes (it
          won't fragment the PV too much)
        - cling, which is supported only by more recent LVM versions, will try
          to place the new extents on the same PV as the rest of the LV
        - and finally normal, which is the default
      Reviewed-by: imsnah
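The three-step resize from 1005d816 can be sketched as a loop over allocation policies that stops at the first success. The function name `grow_lv` and the injectable `run` parameter are hypothetical; the command line assumes lvextend's `--alloc` and `-L` options.

```python
import subprocess


def grow_lv(dev_path, amount_mb, run=subprocess.call):
    """Grow an LV by amount_mb, trying stricter allocation policies first.

    Returns the policy that succeeded; raises RuntimeError if none did.
    `run` must return the command's exit code.
    """
    for policy in ("contiguous", "cling", "normal"):
        cmd = ["lvextend", "--alloc", policy,
               "-L", "+%dm" % amount_mb, dev_path]
        if run(cmd) == 0:
            return policy
    raise RuntimeError("Can't grow LV %s" % dev_path)
```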
  24. 30 May, 2008 1 commit
      Complete removal of md/drbd 0.7 code · abdf0113
      This patch removes the last of the md and drbd 0.7 code. Clusters which
      have the old device types will be broken if they have this applied.
      Reviewed-by: imsnah
  25. 15 May, 2008 2 commits
      Fix drbd show parser to handle valueless keywords · 63012024
      It turns out in some cases there can exist keywords without an
      associated value exported by drbdsetup show. This patch makes the value
      part optional in our parser, so that if it's not present the parsing
      result will contain an array with just the keyword in it. This is not a
      problem since we check all keyword names before accessing their values,
      so we won't mistakenly try to access the value of a valueless keyword.
      Reviewed-by: iustinp
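The effect of making the value optional can be illustrated with a small regular-expression parser. This is a stand-in written for illustration and not the project's parser; `parse_statements` is a hypothetical name, and the sample keywords in the test are only examples of the `keyword value;` and `keyword;` shapes.

```python
import re

# A statement is a keyword, an optional value, and a terminating ';'.
_STMT_RE = re.compile(r"([\w-]+)(?:\s+([^;{}]+?))?\s*;")


def parse_statements(text):
    """Return one list per statement: [keyword, value] or just [keyword]."""
    result = []
    for match in _STMT_RE.finditer(text):
        keyword, value = match.groups()
        result.append([keyword] if value is None else [keyword, value])
    return result
```

Callers that look up a value must check the list length first, which matches the commit's note that keyword names are checked before their values are accessed.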
      Split drbd command creation and execution · 333411a7
      Make _AssembleDisk more similar to _AssembleNet by splitting the
      generation of the drbdsetup command and its execution. While this does
      not change any behaviour, it makes it easier to manipulate the command
      in certain cases only, which we'll need to do in the future.
      Reviewed-by: iustinp
  26. 12 May, 2008 1 commit
      bdev: always log command output if it failed · 6c896e2f
      Currently many error handling code paths in bdev.py log only
      result.fail_reason (i.e. exit code or signal that killed the command)
      but not its output. This makes debugging very hard.
      The patch changes all places where we only log fail_reason to also log
      Reviewed-by: ultrotter