1. 16 Jul, 2009 1 commit
    • Use full-stripe size in LVM growth · 38256320
      Iustin Pop authored
      
      
      LVM has issues when growing striped volumes, so it's best to specify
      the growth in exact multiples of the full stripe size (as precisely as
      possible). For this we need to make a couple of changes:
        - in LVM Attach(), we additionally query the VG extent size and the LV
          stripe count; since this makes lvs return a (possibly) multi-line
          output, we now split it into lines and only take the last one
        - in LVM Grow(), we round up the increase to a multiple of the full
          stripe size
      
      The patch also sets the correct target size in DRBD growth.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Olivier Tharan <olive@google.com>
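The rounding described in this commit can be sketched as follows. This is a minimal illustration, not the actual bdev.py code: the function names, the megabyte units and the comma separator in the lvs output are assumptions made for the example.

```python
def parse_last_line(lvs_output, sep=","):
    """Return the fields of the last non-empty line of (assumed) lvs output."""
    lines = [line for line in lvs_output.splitlines() if line.strip()]
    return lines[-1].strip().split(sep)


def round_up_growth(amount_mb, pe_size_mb, stripes):
    """Round a requested growth up to a whole number of full stripes.

    A full stripe is assumed here to be one extent on each stripe,
    i.e. pe_size_mb * stripes.
    """
    full_stripe_mb = pe_size_mb * stripes
    rest = amount_mb % full_stripe_mb
    if rest:
        amount_mb += full_stripe_mb - rest
    return amount_mb
```

With 4 MB extents and 3 stripes, a requested growth of 10 MB would become 12 MB under these assumptions.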
  2. 01 Jul, 2009 1 commit
  3. 08 Jun, 2009 2 commits
    • Enable striped LVs · fecbe9d5
      Iustin Pop authored
      
      
      This patch enables striped LVs, falling back to non-striped if the
      striped creation fails. If the configure-time lvm-stripecount is 1,
      this patch becomes a no-op (with an insignificant Python-level overhead,
      but no extra lvm calls).
      
      The effect of this patch is that new instances will get striped LVs
      from the start, whereas old instances will have their LVs striped as
      soon as replace-disks is run for them.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
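The fallback behaviour could look roughly like the sketch below. It assumes a hypothetical `run` callable that executes a command line and returns its exit code; `create_lv` and its parameters are illustrative names, not the actual Ganeti functions.

```python
def create_lv(run, vg_name, lv_name, size_mb, stripes):
    """Try a striped lvcreate first, then fall back to a single stripe.

    `run` is a hypothetical callable taking an argv list and returning
    the command's exit code. With stripes == 1 only one call is made.
    """
    base = ["lvcreate", "-L%dm" % size_mb, "-n", lv_name]
    candidates = [stripes, 1] if stripes > 1 else [1]
    for count in candidates:
        if run(base + ["-i%d" % count, vg_name]) == 0:
            return count
    raise RuntimeError("lvcreate failed for %s/%s" % (vg_name, lv_name))
```

The return value is the stripe count that was actually used, so a caller can tell whether the fallback was taken.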
    • Add more constants for DRBD and change sync tests · 3c003d9d
      Iustin Pop authored
      
      
      This patch adds constants for the connection status, peer roles and disk
      status, and it changes the rules for when the disk is considered as
      “resyncing” - previously it was only for syncsource/synctarget, but
      there are many other transient statuses which could be misinterpreted as
      ‘degraded’ (because they were not considered as resyncing, but the disk
      is not consistent in these statuses).
      
      Furthermore, cmdlib.py:WaitForSync determines if a device is syncing or
      not based on sync_percent being not None. Not all DRBD resync statuses
      offer a percent done, so if we are syncing but don't have a sync
      percent, we'll report a zero sync percent (and no time estimate).
      
      The patch also removes a few unused variables (is_sync_target,
      peer_sync_target, is_resync) whose value doesn't make sense anymore with
      the new sync rules.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
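A minimal sketch of the new sync rule is shown below. The constant names and the exact set of transient statuses are assumptions for illustration; the commit message only says that more statuses than SyncSource/SyncTarget now count as resyncing.

```python
# DRBD connection statuses treated as "resyncing" in this sketch
# (the membership of this set is an assumption, not taken from the patch).
CS_SYNCSOURCE = "SyncSource"
CS_SYNCTARGET = "SyncTarget"
CS_WFBITMAPS = "WFBitMapS"
CS_WFBITMAPT = "WFBitMapT"
CS_PAUSEDSYNCS = "PausedSyncS"
CS_PAUSEDSYNCT = "PausedSyncT"

RESYNC_STATES = frozenset([
    CS_SYNCSOURCE, CS_SYNCTARGET,
    CS_WFBITMAPS, CS_WFBITMAPT,
    CS_PAUSEDSYNCS, CS_PAUSEDSYNCT,
])


def sync_percent(cstatus, percent=None):
    """Return None when not resyncing, else a percentage (0.0 if unknown)."""
    if cstatus not in RESYNC_STATES:
        return None
    if percent is None:
        # Resyncing, but /proc/drbd offered no progress figure.
        return 0.0
    return percent
```

Returning 0.0 rather than None keeps a caller that tests `sync_percent is not None` from treating such a device as fully synced.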
  4. 03 Jun, 2009 1 commit
    • Assemble DRBD using the known size · f069addf
      Iustin Pop authored
      
      
      This patch changes DRBD disk attachment to force the wanted size, as opposed to
      letting the device auto-discover its size.
      
      This should make the disks more resilient with regard to small differences in
      size (e.g. due to LVM rounding). This still works with regard to disk
      growth, but the instance needs to be fully restarted (including disks)
      in that case.
      
      This passes a full burnin without problems, but it's still a tricky
      change - if config.data is not in sync with reality, we might
      tell DRBD a wrong size. At least this will fail outright (and not
      introduce silent errors), as DRBD (per a quick check at the sources)
      tracks the size in the meta-dev and also does not allow shrinking
      consistent devices.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  5. 28 May, 2009 1 commit
    • Change the bdev init signatures · 464f8daf
      Iustin Pop authored
      
      
      This patch changes all the bdev.BlockDev constructors to take an
      additional ‘size’ parameter, changes all the backend functions that call
      those constructors to pass it, and changes backend.BlockdevCreate() to
      use disk.size directly instead of the size passed via the RPC call (this
      is the only way it's called).
      
      Note that this patch doesn't do anything with this parameter; it just
      stores it on the blockdev objects.
      
      With the patch, we actually have a more uniform init sequence (previously,
      create had the parameter but the other functions did not).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  6. 05 May, 2009 2 commits
    • Fix compatibility with DRBD 8.3 · 01e2ce3a
      Iustin Pop authored
      
      
      DRBD 8.3 changes two more things compared to 8.2:
        - the /proc/drbd format changed in multiple ways; the part we're
          interested in is the ‘st:’ to ‘ro:’ change (named in the changelog
          as “Renamed 'state' to 'role'”)
        - “drbdsetup /dev/drbdN show” changed the ‘device’ stanza from:
            device "/dev/drbd0";
          to:
            device                  minor 0;
      
      This patch fixes both of these and adds data files and unittests for DRBD
      8.3.1.
      Signed-off-by: Iustin Pop <iustin@google.com>
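One way to accept both formats is sketched below with plain regular expressions. This is an illustration only: the helper names are hypothetical and the actual parser in bdev.py may be structured quite differently.

```python
import re

# Accept both the 8.2 'st:' and the 8.3 'ro:' role field of /proc/drbd.
_ROLE_RE = re.compile(r"\b(?:st|ro):(\w+)/(\w+)")

# Accept both forms of the 'device' stanza printed by "drbdsetup ... show".
_DEVICE_RE = re.compile(r'device\s+(?:"/dev/drbd(\d+)"|minor\s+(\d+))\s*;')


def parse_roles(line):
    """Return (local_role, peer_role) from a /proc/drbd line, or None."""
    match = _ROLE_RE.search(line)
    return match.groups() if match else None


def parse_device_minor(text):
    """Return the minor number from either 'device' stanza form, or None."""
    match = _DEVICE_RE.search(text)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))
```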
    • Fix compatibility with DRBD 8.2 · 34e71fea
      Karsten Keil authored
      
      
      This patch adds (and suppresses) the extra ipv4/ipv6 words before the
      actual address that newer DRBD versions add.
      
      [iustin@google.com: slightly changed the patch to conform to style
      guide, and changed the commit message]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
  7. 16 Feb, 2009 1 commit
    • Convert IOErrors for /proc/drbd into our errors · f6eaed12
      Iustin Pop authored
      If /proc/drbd can't be opened, this raises an IOError, but all the
      error-handling behaviour in backend handles only BlockDeviceErrors. This
      causes a plain failure in cluster verify and in other RPC calls.
      
      This patch simply converts EnvironmentErrors into BlockDeviceErrors, and
      also changes the RPC result for NV_DRBDLIST and its handling to be able
      to show the error. The other RPC calls work by default now, due to the
      existing error handling.
      
      Reviewed-by: ultrotter
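The conversion can be sketched as below. `BlockDeviceError` is redefined locally so the example is self-contained, and `read_proc_drbd` is a hypothetical helper name.

```python
class BlockDeviceError(Exception):
    """Stand-in for the project's block device error class."""


def read_proc_drbd(filename="/proc/drbd"):
    """Return the lines of the DRBD status file.

    Any EnvironmentError (IOError/OSError) is re-raised as a
    BlockDeviceError so that callers need to handle one error type only.
    """
    try:
        with open(filename) as handle:
            return handle.read().splitlines()
    except EnvironmentError as err:
        raise BlockDeviceError("Can't read %s: %s" % (filename, err))
```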
  8. 11 Feb, 2009 1 commit
    • FileStorage: abort creating over an existing file · aed77cea
      Guido Trotter authored
      In FileStorage there is a TODO:
       decide whether we should check for existing files and
       abort or not
      After Ganeti ate my instance data I decided. Let's abort.
      In general there is no reason we should overwrite existing files, and
      doing it can be very harmful for preexisting files on the host.
      
      Reviewed-by: iustinp
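A sketch of "abort if the file exists" follows. It uses an exclusive open (O_EXCL), which is a technique chosen for this example and not necessarily how the patch performs the check; the function name and error class are likewise illustrative.

```python
import os


class BlockDeviceError(Exception):
    """Stand-in for the project's block device error class."""


def create_file_storage(path, size_mb):
    """Create a sparse file of size_mb megabytes, refusing to overwrite."""
    try:
        # O_EXCL makes the open fail if the path already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        raise BlockDeviceError("File already exists: %s" % path)
    with os.fdopen(fd, "wb") as handle:
        handle.truncate(size_mb * 1024 * 1024)
```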
  9. 10 Feb, 2009 5 commits
    • Some error message cleanups · 33bc6f01
      Iustin Pop authored
      Reviewed-by: imsnah
    • Cleanup of DRBD8._CheckMetaSize · 9c793cfb
      Iustin Pop authored
      This patch converts the _CheckMetaSize method to raise exceptions
      instead of logging and returning False. This now fits the new RPC
      return types, so it's a cheap change.
      
      Reviewed-by: ultrotter
    • Change the disk assembly to raise exceptions · 1063abd1
      Iustin Pop authored
      This big patch converts the bdev Assemble() methods and the supporting
      functions to raise exceptions instead of returning False. This is a big
      patch, since the assembly functions touch other functions: add children,
      creation, etc. However, the patch does not add much new code, rather it
      reworks existing code.
      
      One of the biggest changes is in the rework of the DRBD8._SlowAssemble()
      method (one of the most complicated/ugly ones). Hopefully the new
      version is a little bit more readable.
      
      Reviewed-by: ultrotter
    • Change BlockDev.Remove() failure result · 0c6c04ec
      Iustin Pop authored
      Currently, the Remove() methods of block devices return True/False.
      This doesn't permit any error detail reporting.
      
      This patch changes the return type to None for success, and raises
      BlockDeviceError in case of failure. This permits the details to be
      passed up the stack.
      
      The patch also slightly simplifies the Remove method of file-based
      devices (no stat first, just try unlink).
      
      Reviewed-by: ultrotter
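The "no stat first, just try unlink" approach combined with the new return convention might look like this. Treating an already-missing file as success is an assumption of this sketch, as are the names used.

```python
import errno
import os


class BlockDeviceError(Exception):
    """Stand-in for the project's block device error class."""


def remove_file_device(path):
    """Remove a file-based device: None on success, exception on failure."""
    try:
        os.remove(path)
    except OSError as err:
        # A file that is already gone is not treated as an error here.
        if err.errno != errno.ENOENT:
            raise BlockDeviceError("Can't remove file '%s': %s" % (path, err))
```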
    • Change BlockDev.Shutdown() failure result · 746f7476
      Iustin Pop authored
      Currently, the Shutdown() methods of block devices return True/False.
      This doesn't permit any error detail reporting.
      
      This patch changes the return type to None for success, and raises
      BlockDeviceError in case of failure. This permits the details to be
      passed up the stack.
      
      For LVM and file-backed devices, this is a simple change. For DRBD, we
      first remove the shutdown of disks in case of network activation
      failures (since with static minors the minor is used anyway, we don't
      gain anything by clearing it), and then we simply change _ShutdownAll()
      to raise an exception.
      
      Reviewed-by: ultrotter
  10. 09 Feb, 2009 1 commit
    • bdev: add and use two utility functions · 82463074
      Iustin Pop authored
      This patch adds two utility functions for raising BlockDeviceError
      exceptions and for running functions while ignoring this error. Most of
      the manual “raise errors.BlockDeviceError” cases are converted to
      _ThrowError, as this makes the code clearer.
      
      We also change most of the DRBD error messages to include the minor
      number because with the parallel execution of commands it's no longer
      possible to identify the failed DRBD just from the timestamp, and the
      minor number can be mapped back to the instance more easily.
      
      Reviewed-by: ultrotter
  11. 23 Jan, 2009 1 commit
    • Remove checking of DRBD metadata for validity · 3b559640
      Iustin Pop authored
      Currently the DRBD code checks that the metadata devices are valid
      before creation, initial disk attachment and add children.
      
      However, the process for checking validity requires a free DRBD minor,
      and this conflicts with parallel checking.
      
      There are at least three possible solutions:
        - serialize all checks, which means we reduce parallelism and need
          extra locks
        - don't pass a valid minor number, but one like “/dev/drbd256” (which
          is invalid); this works for the current version of DRBD, but since
          it's not guaranteed to remain so it doesn't look nice
        - don't do the checking at all, and rely on “drbdsetup ... disk ...”
          to fail by itself
      
      The reason for checking metadata was that in 1.2, this was much cheaper
      than trying to activate devices (and the subsequent iteration over the
      minors). However, in 2.0, they have the same cost, so we can choose
      option 3: just remove the explicit checking and rely on drbdsetup and
      the kernel to fail.
      
      Since DRBD8._InitMeta still requires a minor number, the two places
      where this is run are handled as follows:
        - Create: we just use our own (unused currently) minor number
        - AddChildren: we keep using FindUnusedMinor, with the caveat that
          this function (used by replace-disks -n ...) cannot be yet
          parallelized
      
      Reviewed-by: ultrotter
  12. 20 Jan, 2009 2 commits
    • Make cluster-verify check the drbd minors space · 6d2e83d5
      Iustin Pop authored
      This patch adds support for verification of drbd minors space in cluster
      verify: minors which belong to running instances and should be online
      but are not, and minors which do not belong to any instance but are in
      use.
      
      The patch requires exposing some methods from bdev.DRBD8 and
      config.ConfigWriter which were until now private methods.
      
      Reviewed-by: ultrotter
    • DRBD: check for in-use minor during Create · 767d52d3
      Iustin Pop authored
      In order to prevent errors with old, in-use DRBD minors, we check and
      abort at create time if our minor is already in use. For this we need to
      also modify DRBD8Status to be able to parse cs:Unconfigured devices.
      
      Reviewed-by: ultrotter
  13. 19 Jan, 2009 1 commit
    • Block device creation cleanup · 6c626518
      Iustin Pop authored
      Currently when creating LVM-based instances, we always get the
      extremely-confusing message "ERROR Can't find LV /dev/xenvg/..." which
      is actually expected. This behaviour was introduced before we had
      UUID-style LV names, since at that point it was not unexpected to have
      such volumes lying around after a failed creation.
      
      Today, it's much more of an error to see existing volumes, and it's
      better to abort with a failure. Since bdev.LogicalVolume.Create() method
      will raise an error in case it exists, we can remove this check in
      backend before creating the device.
      
      The Create methods for DRBD and FileStorage currently don't raise
      exceptions, as behaviour is not very well defined here.
      
      We also change some exception types raised in bdev so that all
      exceptions raised by device creation are a subclass of GenericError.
      
      Reviewed-by: ultrotter
  14. 13 Jan, 2009 4 commits
    • Forward-port DrbdNetReconfig · 6b93ec9d
      Iustin Pop authored
      This is a modified forward-port of DrbdNetReconfig and its associated
      RPCs. In Ganeti 2.0, these functions will be used for two things:
        - live migration (as in 1.2)
        - and for other network reconfiguration tasks, since DRBD8.Attach()
          doesn't do them anymore
      
      Because of the Attach() changes, we can now implement the
      AttachNet/DisconnectNet functions as independent entities, and we don't
      need the cache anymore.
      
      Note these functions are copies of the latest 1.2 code, and not
      cherry-picks of the (many) patches that went into 1.2.
      
      Reviewed-by: ultrotter
    • backend: rename AttachOrAssemble to Assemble · f96e3c4f
      Iustin Pop authored
      Since the Assemble function now differs from Attach, we rename this
      backend function to show that the intent is to fully assemble the device
      (and it's always allowed to modify the device).
      
      Reviewed-by: ultrotter
    • drbd: change the semantics of Attach vs. Assemble · 2d0c8319
      Iustin Pop authored
      Currently, both the Attach and Assemble methods for DRBD8 devices will use and
      alter the device state. This is suboptimal, and it has been worked
      around in 1.2 via a special cache in the node daemon so that we don't
      need to call Attach() again in migration, for example.
      
      Since in 2.0 we have static minors, we can change these functions so
      that:
        - Attach() does not affect the device in any way, and only checks if
          the minor is already in use or not
        - Assemble() has two logic paths, one for startup from unused minor
          (the old Assemble, now renamed _FastAssemble) and one for
          re-checking/fixing an in-use minor (the old Attach, now renamed
          _SlowAttach)
      
      Basically Attach was renamed to _SlowAttach, Assemble to _FastAssemble,
      and we have a new, simple Assemble that calls one or the other based on
      the result of the new Attach.
      
      LUReplaceDisks (with new secondary) relies on the special semantics of
      Attach modifying the device, and is broken until the end of the patch
      series.
      
      Reviewed-by: ultrotter
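The split between a read-only Attach() and a dispatching Assemble() can be illustrated with the toy class below. It only models the control flow: the set of in-use minors stands in for the real kernel state, and the class name and return values are invented for the example.

```python
class Drbd8Sketch(object):
    """Toy model of the Attach/Assemble split described in the commit."""

    def __init__(self, minor, used_minors):
        self.minor = minor
        self._used_minors = used_minors  # stand-in for kernel state
        self.attached = False

    def Attach(self):
        """Only check whether our minor is in use; never modify the device."""
        self.attached = self.minor in self._used_minors
        return self.attached

    def Assemble(self):
        """Pick the slow or fast path based on the result of Attach()."""
        if self.Attach():
            return self._SlowAttach()
        return self._FastAssemble()

    def _FastAssemble(self):
        # Start-up from an unused minor.
        self._used_minors.add(self.minor)
        return "fast"

    def _SlowAttach(self):
        # Re-check and fix a minor that is already in use.
        return "slow"
```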
    • bdev: Do not call Assemble() on children · f87548b5
      Iustin Pop authored
      The caller of dev.Assemble() (backend._RecursiveAssembleBD) is doing an
      explicit recursion over all the children of the device, with better
      error reporting. As such, we don't need this repeated assembly inside
      the base BlockDev class.
      
      Reviewed-by: ultrotter
  15. 09 Jan, 2009 1 commit
    • Work around a DRBD sync speed race condition · 7d585316
      Iustin Pop authored
      This is a modified forward-port of commit 1544 on the 1.2 branch:
      
        When DRBD is doing its dance to establish a connection with its
        peer, it also sends the synchronization speed over the wire. In
        some cases setting the sync speed only after setting up both
        sides can race with DRBD connecting, hence we set it here before
        telling DRBD anything about its peer.
      
        Reviewed-by: iustinp
      
      We make two modifications: we split SetSyncSpeed in two so that we
      don't need to modify our minor temporarily, and we call this function
      from within _AssembleNet (right before enabling the network), instead
      of from Assemble()/Attach().
      
      Original-Author: imsnah
  16. 08 Jan, 2009 1 commit
    • bdev: forward-port ReAttachNet/DisconnectNet · cf8df3f3
      Iustin Pop authored
      This is a plain copy of the 1.2 ReAttachNet and DisconnectNet methods on
      the DRBD8 device, with the logger to logging module changes and the
      ReAttachNet method renamed to AttachNet.
      
      These methods are not used anywhere right now, but will be used for
      migration and a simpler disk-replace.
      
      The code was originally committed on the 1.2 branch as revision numbers
      1165 and 1204.
      
      Originally-Reviewed-by: imsnah, ultrotter
  17. 11 Dec, 2008 1 commit
    • Fix epydoc format warnings · c41eea6e
      Iustin Pop authored
      This patch should fix all outstanding epydoc parsing errors; as such, we
      switch epydoc into verbose mode so that any new errors will be visible.
      
      Reviewed-by: imsnah
  18. 27 Nov, 2008 1 commit
    • Fix file-based block devices · ecb091e3
      Iustin Pop authored
      We changed the protocol for opening block devices a while ago, but
      FileStorage was not changed. This patch makes it work again.
      
      Reviewed-by: imsnah
  19. 29 Sep, 2008 3 commits
    • Move a hardcoded constant to constants.py · 3c03759a
      Iustin Pop authored
      For now we only use the ‘C’ protocol so we can put it in constants.py
      instead of hardcoding it.
      
      Reviewed-by: imsnah
    • Enable the use of shared secrets · 2899d9de
      Iustin Pop authored
      This patch enables the use of the shared secrets for DRBD8 disks, using
      (hardcoded in constants.py) the md5 digest algorithm.
      
      To make this more flexible, we can either implement a cluster parameter
      (once the new model is in place) or make it ./configure-time
      selectable.
      
      Reviewed-by: imsnah
    • Extend DRBD disks with shared secret attribute · f9518d38
      Iustin Pop authored
      This patch, which is similar to r1679 (Extend DRBD disks with minors
      attribute), extends the logical and physical id of the DRBD disks with a
      shared secret attribute. This is generated at disk creation time and
      saved in the config file.
      
      The generation of the secret is done so that we don't have duplicates in
      the configuration (otherwise the goal of preventing cross-connection
      will not be reached), so we add to config.py more than just a simple
      call to utils.GenerateSecret().
      
      The patch does not yet enable the use of the secrets.
      
      Reviewed-by: imsnah
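Generating a secret that is guaranteed not to duplicate an existing one could be sketched as below. The function name, the retry limit and the use of Python's `secrets` module are assumptions for this example; the patch itself builds on utils.GenerateSecret() inside config.py.

```python
import secrets


def generate_unique_secret(existing, attempts=64):
    """Return a new hex secret not present in `existing`, and record it.

    `existing` is a set of secrets already used in the configuration.
    """
    for _ in range(attempts):
        candidate = secrets.token_hex(20)
        if candidate not in existing:
            existing.add(candidate)
            return candidate
    raise RuntimeError("Can't generate a unique DRBD secret")
```

Recording the secret in the same step as generating it is what prevents two disks created back to back from sharing one.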
  20. 23 Sep, 2008 1 commit
    • Switch to static minors for DRBD · a1578d63
      Iustin Pop authored
      With some todos remaining, this patch switches the DRBD devices to use
      the passed minors, and the cmdlib code (add instance and replace disks)
      to request and assign minors to the DRBD disks.
      
      Todos:
        - look at the disk RPC calls to see which can be optimized away, since
          we now know the minor beforehand
        - remove the _FindUnusedMinor usage from the few places it's still
          used (not for actual disks, but for temporary use in meta devs) and
          eventually replace with _CheckMinorUnused or such
      
      Of course, this and/or the previous two patches break existing clusters.
      Again.
      
      Reviewed-by: imsnah
  21. 22 Sep, 2008 1 commit
    • Extend DRBD disks with minors attribute · ffa1c0dc
      Iustin Pop authored
      This patch converts the DRBD disks to also contain a minor attribute (one
      per node). This minor is not yet used and is always initialized
      with None, so the patch does not have any real-world impact - except for
      automatically upgrading config files (it adds the minors as None, None).
      
      Reviewed-by: imsnah
  22. 09 Jul, 2008 2 commits
    • Reduce duplicate Attach() calls in bdev · cb999543
      Iustin Pop authored
      Currently, the 'public' functions of bdev (FindDevice and
      AttachOrAssemble) will call the Attach() method right after class
      instantiation.
      
      But the constructor itself calls this function, and therefore we have
      duplicate Attach() calls (which are not cheap at all).
      
      The patch introduces a new 'attached' instance attribute that tells if
      the last Attach() was successful. The public functions reuse this so
      that we only do the minimum required number of calls.
      
      Reviewed-by: imsnah
    • Convert bdev.py to the logging module · 468c5f77
      Iustin Pop authored
      This does not enhance in any way the messages; it just switches to the
      new module.
      
      Reviewed-by: imsnah
  23. 25 Jun, 2008 1 commit
    • Cleanup LV status computation · 99e8295c
      Iustin Pop authored
      Currently, when seeing if a LV is degraded or not (i.e. virtual volume),
      we first attach to the device (which does an lvdisplay), then do a lvs
      in order to display the lv_attr. This generates two external commands to
      do (almost) the same thing.
      
      This patch changes the Attach() method for LVs to call lvs and display
      both the major/minor (needed for attach) and the lv_status (needed for
      GetSyncStatus). Thus, later in GetSyncStatus, we don't need to run lvs
      again, and instead just return the value computed in Attach().
      
      Reviewed-by: imsnah
  24. 18 Jun, 2008 1 commit
    • Rework the DRBD8 device status computation · 6b90c22e
      Iustin Pop authored
      Currently, we compute the status of a drbd8 device in GetSyncStatus and
      return only the values that we need (and fit in the framework of
      GetSyncStatus). However, the full status details are useful (and needed)
      in other places, so the patch attempts to improve this situation.
      
      We abstract the status of a device into a separate class that
      knows how to parse contents from /proc/drbd and set easily accessible
      attributes. We then simplify the GetSyncStatus to use this and return
      the values that it needs, and add a separate method that returns the
      full status object.
      
      The move to a separate class cleans up the old sync-progress
      computation from GetSyncStatus a little, but it still uses many
      regexes.
      
      The patch also adds unittests for a few statuses, and modifies one
      BaseDRBD call to accept a custom filename instead of '/proc/drbd' to
      ease unittests.
      
      Reviewed-by: imsnah
  25. 16 Jun, 2008 1 commit
    • bdev: implement disk resize for lvm/drbd8 · 1005d816
      Iustin Pop authored
      This patch implements disk resize at the bdev level for the LVM and
      DRBD8 disk types. It is not implemented for DRBD7 and MD since the way
      MD works with its underlying devices makes it harder and this
      combination is also deprecated.
      
      The LVM resize operation is tried three times, with different allocation
      policies:
        - contiguous first, since this is best for allocation purposes (it
          won't fragment the PV too much)
        - cling, which is supported only by more recent LVM versions, will try
          to place the new extents on the same PV as the rest of the LV
        - and finally normal, which is the default
      
      Reviewed-by: imsnah
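The three-step retry over allocation policies could be sketched like this. `run` is a hypothetical callable that executes a command and returns its exit code, and `grow_lv` is an illustrative name; only the policy order comes from the commit message.

```python
def grow_lv(run, dev_path, amount_mb):
    """Try lvextend with successively less strict allocation policies.

    Returns the policy that worked, or raises if all three fail.
    """
    for policy in ("contiguous", "cling", "normal"):
        argv = ["lvextend", "--alloc", policy,
                "-L", "+%dm" % amount_mb, dev_path]
        if run(argv) == 0:
            return policy
    raise RuntimeError("Can't grow LV %s by %dm" % (dev_path, amount_mb))
```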
  26. 30 May, 2008 1 commit
    • Complete removal of md/drbd 0.7 code · abdf0113
      Iustin Pop authored
      This patch removes the last of the md and drbd 0.7 code. Clusters which
      have the old device types will be broken if they have this applied.
      
      Reviewed-by: imsnah
  27. 15 May, 2008 1 commit
    • Fix drbd show parser to handle valueless keywords · 63012024
      Guido Trotter authored
      It turns out in some cases there can exist keywords without an
      associated value exported by drbdsetup show. This patch makes the value
      part optional in our parser, so that if it's not present the parsing
      result will contain an array with just the keyword in it. This is not a
      problem since we check all keyword names before accessing their values,
      so we won't mistakenly try to access the value of a valueless keyword.
      
      Reviewed-by: iustinp
      
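The idea of an optional value part can be shown with a simplified, regex-based stand-in for the parser. This is not the project's actual parser; the statement grammar here (a keyword, an optional value, a closing semicolon) and the sample keywords are assumptions for illustration.

```python
import re

# keyword, then an optional value, then the terminating semicolon
_STATEMENT_RE = re.compile(r"^\s*([\w-]+)(?:\s+([^;]*?))?\s*;")


def parse_statement(line):
    """Parse one statement line of (assumed) "drbdsetup show" output.

    Returns [keyword, value], or just [keyword] for a valueless keyword,
    or None if the line is not a statement (e.g. a section header).
    """
    match = _STATEMENT_RE.match(line)
    if match is None:
        return None
    keyword, value = match.groups()
    return [keyword] if not value else [keyword, value]
```

Callers that check the keyword name before indexing the value, as the commit describes, are unaffected by the one-element result.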