1. 04 Mar, 2008 8 commits
    • Codestyle updates for locking code · cdb08f44
      Michael Hanselmann authored
      Reviewed-by: ultrotter
    • LockSet: make acquire() able to get the whole set · 3b7ed473
      Guido Trotter authored
      This new functionality makes it possible to acquire a whole set, by passing
      "None" to the acquire() function as the list of elements. This will avoid new
      additions to the set, and then acquire all the current elements. The list of
      all elements acquired will be returned at the end.
      Deletions can still happen during the acquire process and we'll deal with it by
      just skipping the deleted elements: it's effectively as if they were deleted
      before we called the function. After we've finished though we hold all the
      elements, so no more deletes can be performed before we release them.
      Any call to release() will then first of all release the "set-level" lock if
      we're holding it, and then all or some of the locks we have.
      Some new tests check that this feature works as intended.
      Reviewed-by: imsnah
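
The whole-set behaviour described above can be sketched roughly as follows. This is not Ganeti's code: `ToyLockSet` and all its attributes are hypothetical, it uses plain exclusive `threading.Lock` objects instead of SharedLocks, and it tracks ownership for a single caller only.

```python
import threading


class ToyLockSet:
    """Hypothetical, simplified stand-in for a LockSet (exclusive locks only)."""

    def __init__(self, names):
        # "Set-level" lock: while it is held, no new names can be added.
        self._set_lock = threading.Lock()
        self._locks = {name: threading.Lock() for name in names}
        self._owned = set()
        self._owns_set_lock = False

    def add(self, name):
        # Additions wait for the set-level lock.
        with self._set_lock:
            self._locks[name] = threading.Lock()

    def acquire(self, names=None):
        whole_set = names is None
        if whole_set:
            # Block additions first, then take a snapshot of the current names.
            self._set_lock.acquire()
            self._owns_set_lock = True
            names = list(self._locks)
        acquired = set()
        for name in sorted(names):
            lock = self._locks.get(name)
            if lock is None:
                if whole_set:
                    continue  # deleted meanwhile: treat as deleted before the call
                raise KeyError(name)
            lock.acquire()
            acquired.add(name)
        self._owned |= acquired
        return acquired

    def release(self, names=None):
        # The set-level lock is always released first, if we hold it.
        if self._owns_set_lock:
            self._set_lock.release()
            self._owns_set_lock = False
        to_release = set(self._owned) if names is None else set(names)
        for name in to_release:
            self._locks[name].release()
            self._owned.discard(name)
```

Calling `acquire()` with no argument returns the names that were actually locked, and any later `release()` drops the set-level lock before releasing some or all of the element locks.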
    • LockSet: encapsulate acquire() in try-except · 806e20fd
      Guido Trotter authored
      This patch adds a try/except block around most of the acquire() code
      (everything after the initial condition checks). Since the except: clause
      contains just a 'raise', nothing really changes except the indentation of
      the code.
      This is done in a separate commit to isolate and make clearer the real code
      changes made in the upcoming patch.
      Reviewed-by: imsnah
    • Make LockSet.__names() return a list, not a set · 0cf257c5
      Guido Trotter authored
      Previously the private version of the __names function returned a set directly.
      We'll keep this in the public interface but change the private version to a
      list in order to be able to sort() its result and then loop on it, even though
      we'll need to do this with the usual care that some keys may disappear in
      Reviewed-by: imsnah
    • LockSet: improve remove() api · 3f404fc5
      Guido Trotter authored
      LockSet's remove() function used to return a list of the locks it failed to
      remove. Instead we now return a list of the removed locks, which is more
      similar to how acquire() behaves. This patch also fixes the relevant unit tests.
      Reviewed-by: imsnah
    • LockSet: make acquire() return the set of names · 0cc00929
      Guido Trotter authored
      In a LockSet, acquire() returned True on success. This patch changes that to
      return a set containing the names of the elements acquired. This is still a
      true value if we acquired any lock, but is slightly more useful (because if
      needed one has access to this data without querying for it). The only
      behavioral change happens when acquiring no locks (the returned empty set is
      false), but that usage should not normally happen because it has no
      practical use.
      The patch also changes some tests to check that the new format is respected.
      Reviewed-by: imsnah
    • LockSet: invert try/for nesting in acquire() · 8b68f394
      Guido Trotter authored
      This patch does not change the functionality of a LockSet. Rather than
      wrapping the whole for loop in one try, we wrap each of its steps. This
      opens the way to handling a single failure differently.
      Reviewed-by: imsnah
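
As an illustration of the nesting change, the two shapes might look like the sketch below. The function names, the `DemoLock` helper and the release-on-failure cleanup are assumptions made for the example; the commit message only says that behaviour stays the same.

```python
class DemoLock:
    """Hypothetical lock used only for this illustration."""

    def __init__(self, fail=False):
        self.fail = fail
        self.held = False

    def acquire(self):
        if self.fail:
            raise RuntimeError("cannot acquire")
        self.held = True

    def release(self):
        self.held = False


def acquire_loop_in_try(locks):
    """Old shape: a single try wraps the whole for loop."""
    acquired = []
    try:
        for lock in locks:
            lock.acquire()
            acquired.append(lock)
    except Exception:
        for held in acquired:
            held.release()
        raise
    return acquired


def acquire_try_in_loop(locks):
    """New shape: each step of the loop has its own try."""
    acquired = []
    for lock in locks:
        try:
            lock.acquire()
            acquired.append(lock)
        except Exception:
            # Same behaviour for now, but a failure of one particular lock
            # could later be handled here without aborting the others.
            for held in acquired:
                held.release()
            raise
    return acquired
```

Both functions behave identically today; the second simply provides a per-lock place to react to a failure.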
    • Initial GanetiLockManager implementation · 7ee7c0c7
      Guido Trotter authored
      Includes some locking-related constants and explanations on how the
      LockManager should be used, the class itself and its test cases.
      The class includes:
        - a basic constructor
        - functions to acquire and release lists of locks at the same level
        - functions to add and remove lists of locks at modifiable levels
        - dynamic checks against out-of-order acquisitions and other illegal ops
      Its testing library checks that the LockManager behaves correctly and that the
      external assumptions it relies on are respected.
      Reviewed-by: imsnah
  2. 29 Feb, 2008 3 commits
  3. 28 Feb, 2008 5 commits
    • LockSet: make acquire() fail faster on wrong locks · e6c200d6
      Guido Trotter authored
      This patch makes acquire() first look up all the locks in the dict and only
      then try to acquire them. The advantage is that if a lock name is wrong from
      the beginning, we don't need to queue for and acquire other locks first to
      find this out.
      Of course since there is no locking between the two steps a delete() could
      still happen in between, but SharedLocks are safe in this regard and will just
      make the .acquire() operation fail if this unfortunate condition happens.
      Since the right way to check that an instance/node exists, and to make sure
      it won't stop existing afterwards, is to acquire its lock, this improves the
      common case (checking for an incorrect name) without penalizing correctness,
      or performance as would happen if we held a lock for the whole process.
      Reviewed-by: iustinp
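
A minimal sketch of the two-step approach, assuming a plain dict of `threading.Lock` objects rather than Ganeti's SharedLocks. `acquire_named` and the `ValueError` it raises are hypothetical names chosen for the example.

```python
import threading


def acquire_named(lockdict, names):
    """Resolve every name first, then acquire; returns the set of names."""
    # Step 1: look up all the locks before touching any of them, so a wrong
    # name is reported without having queued for or acquired anything.
    try:
        to_acquire = [(name, lockdict[name]) for name in sorted(names)]
    except KeyError as err:
        raise ValueError("unknown lock name: %s" % err.args[0]) from None

    # Step 2: acquire in sorted order. Nothing protects the dict between the
    # two steps; the real code relies on a deleted SharedLock failing here.
    acquired = set()
    for name, lock in to_acquire:
        lock.acquire()
        acquired.add(name)
    return acquired
```

With this shape a misspelled name fails immediately, while valid requests still pay no extra locking cost.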
    • LockSet implementation and unit tests · aaae9bc0
      Guido Trotter authored
      A LockSet represents locking for a set of resources of the same type. A thread
      can acquire multiple resources at the same time, and release some or all of
      them, but cannot acquire more resources incrementally at different times
      without releasing all of them in between.
      Internally a LockSet uses a SharedLock for each resource to be able to grant
      both exclusive and shared acquisition. It also supports safe addition and
      removal of resources at runtime. Acquisitions are ordered alphabetically to
      guarantee that they are deadlock-free. A lot of assumptions about how the
      code interacts are made in order to guarantee both safety and speed; to
      document all of them the code features fairly lengthy comments.
      The test suite tries to catch the most common interactions but cannot really
      test tight race conditions, for which we still need to rely on human checking.
      This is the second basic building block for the Ganeti Lock Manager. Instance
      and Node locks will be put in LockSets to manage their acquisition and release.
      Reviewed-by: imsnah
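
The reason alphabetical ordering avoids deadlocks can be shown with a small sketch: two threads ask for the same two resources in opposite order, but both sort the names before acquiring, so neither can hold one lock while waiting for the other in the wrong order. The resource names and the `worker` function are made up for the example, and plain `threading.Lock` objects stand in for SharedLocks.

```python
import threading

# Hypothetical resources; in Ganeti these would be instance or node names.
locks = {name: threading.Lock() for name in ("inst1", "inst2")}
results = []


def worker(wanted):
    # Always acquire in sorted order, whatever order the caller asked for.
    ordered = sorted(wanted)
    for name in ordered:
        locks[name].acquire()
    results.append(ordered)
    for name in reversed(ordered):
        locks[name].release()


threads = [
    threading.Thread(target=worker, args=(wanted,))
    for wanted in (["inst1", "inst2"], ["inst2", "inst1"])
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join(timeout=5)
```

Without the `sorted()` call, the two threads could each take one lock and then wait forever for the other.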
    • Fix the gnt-cluster init man page · f3b100e1
      Guido Trotter authored
      Some options were missing in the gnt-cluster init man page. This patch adds
      them, removes an empty line, and clarifies some requirements a bit more.
      Reviewed-by: schreiberal
    • Don't allow renaming to an existing instance · 7bde3275
      Guido Trotter authored
      Even if the target instance is down or we are not checking for IP conflicts,
      changing an instance name to one that already exists in the cluster is
      doomed to fail, because in a lot of places (including the minds of most
      users/admins) instance names are assumed to be unique.
      Reviewed-by: imsnah
    • Clarify online help for xc-instance reinstall. · 5336d63d
      Alexander Schreiber authored
      Reviewed-by: imsnah
  4. 27 Feb, 2008 2 commits
  5. 26 Feb, 2008 2 commits
  6. 25 Feb, 2008 2 commits
  7. 23 Feb, 2008 1 commit
    • Improve ganeti example cron file · 0d349b3a
      Guido Trotter authored
      The cron file in ganeti's example directory is currently static, and executes
      ganeti-watcher in /usr/local/sbin no matter where it's really installed. With
      this patch we generate it at build time substituting the right value of
      @SBINDIR@ from ganeti.cron.in. We also make sure ganeti-watcher exists and is
      executable before running it.
      This is targeted at 1.2 as well.
      Reviewed-by: iustinp
  8. 22 Feb, 2008 3 commits
    • Small comment fix. · 6c8af3d0
      Manuel Franceschini authored
    • Manuel Franceschini's avatar
    • Break trunk by removing twisted · 81010134
      Iustin Pop authored
      This patch switches from the twisted usage for inter-node protocol to
      simple BaseHTTPServer/httplib. The patch has more deletions because we
      use no authentication, no encryption at all.
      As such, this is just for trunk, and only for testing. What it brings is
      the ability to use the rpc library from within multiple threads in
      parallel (or at least it should).
      Since the changes are very few and non-intrusive, they can be reverted
      without impacting the rest of the code.
      This passes burnin. QA was not tested.
      Reviewed-by: imsnah
  9. 21 Feb, 2008 1 commit
    • Add a few SharedLock delete() tests · 84152b96
      Guido Trotter authored
      - Check that even a shared acquire() fails on a deleted lock
      - Check that delete() fails on a lock you share (must own it or nothing)
      These are assumptions I build on in future code, so better check for them.
      Currently no code change is necessary for them to be valid.
      Reviewed-by: iustinp
  10. 20 Feb, 2008 2 commits
    • SharedLock: fix a wrong unit-test helper code · 4354ab03
      Guido Trotter authored
      The _doItDelete helper code was supposed to be used to dispatch threads that
      deleted the SharedLock. It actually just acquired it exclusively. This remained
      unnoticed as the helper thread is just used to test interaction, not the delete
      code by itself, and delete requires an exclusive acquire anyway.
      Reviewed-by: imsnah
    • Add another 1.1->1.2 compatibility alias · 00ce8b29
      Guido Trotter authored
      gnt-instance replace-disks used to be called replace_disks.
      Reviewed-by: iustinp
  11. 19 Feb, 2008 1 commit
    • Add the delete() operation to SharedLock · a95fd5d7
      Guido Trotter authored
      This new operation lets a lock be cleanly deleted. The lock will be exclusively
      held before deletion, and after it pending and future acquires will raise an
      exception. Other SharedLock operations are modified to deal with delete() and
      to avoid code duplication.
      This patch also adds unit testing for the new function and its interaction with
      the other lock features. The helper threads are slightly modified to handle and
      report the condition of a deleted lock. As a bonus, an unrelated unit test
      checking that non-blocking mode is not supported yet has been added as well.
      This feature will be used by the LockSet in order to support deadlock-free
      delete of resources. This in turn will be useful to gracefully handle the
      removal of instances and nodes from the cluster dealing with the fact that
      other operations may be pending on them.
      Reviewed-by: iustinp
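
The delete() semantics described above can be sketched with a condition variable. This is a simplified illustration, not the SharedLock implementation: `DeletableLock` and `DeletedLockError` are hypothetical names, only exclusive acquisition is modelled, and delete() assumes the caller does not already hold the lock.

```python
import threading


class DeletedLockError(Exception):
    """Hypothetical exception raised when using a deleted lock."""


class DeletableLock:
    """Exclusive-only toy lock supporting a clean delete()."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False
        self._deleted = False

    def acquire(self):
        with self._cond:
            # Wait until the lock is free or has been deleted meanwhile.
            while self._held and not self._deleted:
                self._cond.wait()
            if self._deleted:
                raise DeletedLockError("lock was deleted")
            self._held = True

    def release(self):
        with self._cond:
            self._held = False
            self._cond.notify_all()

    def delete(self):
        # The lock is held exclusively before deletion...
        self.acquire()
        with self._cond:
            # ...and afterwards pending and future acquires raise.
            self._deleted = True
            self._held = False
            self._cond.notify_all()
```

Threads blocked in `acquire()` wake up when `delete()` notifies them and see the deleted flag, which is how pending operations on a removed resource can fail cleanly instead of hanging.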
  12. 18 Feb, 2008 5 commits
  13. 16 Feb, 2008 1 commit
    • Fix gnt-instance info i1 i2 ... · 515207af
      Guido Trotter authored
      Due to an indentation error only the last instance queried was returned by
      LUQueryInstanceData. Moving the append() call inside the for loop fixes this.
      This is a one-liner targeted at 1.2.3.
      Reviewed-by: iustinp
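
The class of bug described here is easy to reproduce in isolation. The sketch below is not the LUQueryInstanceData code; the function names and the dictionary contents are invented to show how one level of indentation changes the result.

```python
def query_buggy(names):
    """append() sits outside the loop, so only the last item is kept."""
    result = []
    for name in names:
        info = {"name": name}
    result.append(info)  # runs once, after the loop has finished
    return result


def query_fixed(names):
    """append() sits inside the loop, so every item is kept."""
    result = []
    for name in names:
        info = {"name": name}
        result.append(info)
    return result
```

Called with `["i1", "i2"]`, the buggy version returns a single entry for `i2`, while the fixed version returns one entry per instance.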
  14. 15 Feb, 2008 2 commits
  15. 14 Feb, 2008 2 commits
    • Alter the device activation code · 40a03283
      Iustin Pop authored
      This tiny patch fixes the breakage introduced by the previous activation
      patch, by removing the Close() call after activation.
      The initial reason for that call was that if the device is already
      active and open, but we need it closed, we close it automatically.
      This however conflicts with the 2-step open in the case the instance is
      already open.
      It makes sense to remove the call since in the current Ganeti setup,
      just doing Close() is not enough to change the device from (e.g.)
      primary to secondary, as some devices (e.g. md) might need Shutdown not
      It also gets rid of a Close() in the CreateBlockDevice function, due to
      the same reasoning (although in Create the child should not have a
      different status anyway).
      Reviewed-by: imsnah
    • Two small improvements to burnin · d7b47a77
      Iustin Pop authored
      This tiny patch makes the verbose option actually work. When creating
      instances it now also logs the secondary node (this doesn't apply to plain
      templates, but it doesn't cause an error).
      Reviewed-by: imsnah