  1. Jul 21, 2010
    • Preliminary support for coverage during live-test · dc61c50b
      Iustin Pop authored
      While this doesn't work correctly yet (hpc sum seems to only take common
      modules, not the sum of modules?), it prepares for gathering coverage
      data during live-test (as an alternative to unittest coverage data).
    • Add some more imports to QC.hs · 223dbe53
      Iustin Pop authored
      This is needed so that in the coverage report we list all modules, even
      the ones we don't test at all, such that we get the complete results.
    • Change the meaning of the N+1 fail metric · c3c7a0c1
      Iustin Pop authored
      Currently, this metric tracks the nodes failing the N+1 check. While
      this helps (in some cases) to evacuate such nodes, it's not a good
      metric since it will rarely change during a step (only when the last
      instance moves away). Therefore we replace it with the count of
      instances living on such nodes, which is much better because:
      - moving an instance away while the node is still N+1 failing will still
        reflect in the score as an optimization
      - moving the last instance causing an N+1 failure will result in a heavy
        decrease of this score, thus giving the right bonus to clear this
        status
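
      The difference between the two metrics can be sketched as follows. This
      is only an illustration: the Node record and both function names are
      hypothetical, and the real node type carries far more data.

```haskell
-- Hypothetical, simplified node record (not the project's real Node type).
data Node = Node
  { nodeName  :: String
  , failN1    :: Bool      -- does the node fail the N+1 check?
  , instances :: [String]  -- instances living on the node
  }

-- Old metric: number of nodes failing the N+1 check.
failingNodes :: [Node] -> Int
failingNodes = length . filter failN1

-- New metric: number of instances living on nodes failing the N+1 check.
instancesOnFailingNodes :: [Node] -> Int
instancesOnFailingNodes = sum . map (length . instances) . filter failN1
```

      With one failing node holding two instances, moving a single instance
      away leaves the old metric unchanged, while the new one drops by one.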
    • Introduce per-metric weights · 8a3b30ca
      Iustin Pop authored
      Currently all metrics have the same weight (we just sum them together).
      However, for the hard constraints (N+1 failures, offline nodes, etc.)
      we should handle the metrics differently based on their meaning. For
      example, an instance whose primary node is offline is worse than an
      instance whose secondary node is offline, which in turn is worse than
      an instance whose secondary node is failing N+1.
      
      To express this case in our code, we introduce a table of weights for
      the metrics, with which we can influence their relative importance.
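
      A weight table of this kind could look roughly like the sketch below.
      The metric names and the weight values are invented for illustration
      only; the patch itself does not state them here.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical metric names and weights, chosen only to show relative
-- importance (offline primary > offline secondary > secondary failing N+1).
metricWeights :: [(String, Double)]
metricWeights =
  [ ("offline_primary",   16)
  , ("offline_secondary",  4)
  , ("n1_secondary",       1)
  ]

-- Weight of a metric; metrics missing from the table default to 1.
weightOf :: String -> Double
weightOf name = fromMaybe 1 (lookup name metricWeights)

-- Cluster score as a weighted sum instead of a plain sum.
clusterScore :: [(String, Double)] -> Double
clusterScore metrics = sum [weightOf name * value | (name, value) <- metrics]
```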
    • Allow balancing moves to introduce N+1 errors · 2cae47e9
      Iustin Pop authored
      This patch switches the applyMove function to the extended versions of
      Node.addPri and addSec, and passes the override flag based on the state
      of the node that we're moving away from.
    • Introduce a relaxed add instance mode · 3e3c9393
      Iustin Pop authored
      In case an instance is living on an offline node, it doesn't make sense
      to refuse moving it because that would create N+1 failures; failing N+1
      is still much better than not running at all. Similarly, if the
      secondary node of an instance is offline, meaning the instance doesn't
      have any redundancy, we have a worse case than having a secondary that
      is N+1 failing: such a node could not accept the instance as primary,
      but it still provides redundancy for it.
      
      To allow this, we rename Node.addPri to addPriEx and introduce an extra
      parameter (addPri is a partial application of addPriEx and keeps the
      same signature). Node.addSec gets the same treatment.
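
      The renaming pattern can be sketched as below. The Node record, the
      memory-based N+1 check, the argument order and the error strings are all
      assumptions made for the example; only the names addPri and addPriEx
      and the partial-application idea come from the patch description.

```haskell
-- Hypothetical, heavily simplified node: free and reserved memory only.
data Node = Node { freeMem :: Int, reservedMem :: Int }
  deriving (Show, Eq)

-- Extended version: when the first argument is True, the N+1 check is
-- skipped (the hard "enough memory" check is always enforced).
addPriEx :: Bool -> Node -> Int -> Either String Node
addPriEx force node instMem
  | instMem > freeMem node                  = Left "not enough memory"
  | not force && newFree < reservedMem node = Left "would fail N+1"
  | otherwise                               = Right node { freeMem = newFree }
  where newFree = freeMem node - instMem

-- Strict version: a partial application that keeps the old signature.
addPri :: Node -> Int -> Either String Node
addPri = addPriEx False
```

      Existing callers of addPri are unaffected, while a caller that knows the
      source node is offline can pass True to addPriEx.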
  2. Jul 19, 2010
  3. Jul 18, 2010
    • Allow '+' in node list fields · 6dfa04fd
      Iustin Pop authored
      When the field list is prefixed with a plus sign, this will extend the
      default field list, instead of replacing it entirely.
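
      A minimal sketch of this behaviour, assuming a comma-separated field
      list; the default fields and the function names are hypothetical.

```haskell
-- Hypothetical default field list.
defaultFields :: [String]
defaultFields = ["name", "pcnt", "scnt"]

-- Split a comma-separated string into its fields.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnComma rest

-- A leading '+' extends the defaults; otherwise the list replaces them.
selectFields :: String -> [String]
selectFields ('+' : extra) = defaultFields ++ splitOnComma extra
selectFields fields        = splitOnComma fields
```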
    • Update the node list fields · 16f08e82
      Iustin Pop authored
      This patch renames the pri/sec to pcnt/scnt, and adds the real primary
      and secondary instance lists, the peermap and the index of a node as
      selectable options.
    • Cleanup a node's peer map when possible · 124b7cd7
      Iustin Pop authored
      If the last secondary instance of a peer is deleted (detected by the new
      peer memory value being equal to zero), then the pair (pdx, 0) should be
      deleted completely. This is not an optimization per se, but rather cleanup
      (the speedup is at most a percent, and only in some corner cases).
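
      As a sketch, assuming the peer map is an association list from peer node
      index to reserved memory (the type alias and the function name are
      hypothetical):

```haskell
-- (peer node index, memory reserved for that peer's instances)
type PeerMap = [(Int, Int)]

-- Set the memory for a peer; a value of zero removes the pair entirely
-- instead of leaving a (pdx, 0) entry behind.
setPeer :: Int -> Int -> PeerMap -> PeerMap
setPeer pdx 0   pm = filter ((/= pdx) . fst) pm
setPeer pdx mem pm = (pdx, mem) : filter ((/= pdx) . fst) pm
```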
  4. Jul 16, 2010
  5. Jun 21, 2010
  6. Jun 08, 2010
    • Optimise the Luxi.recvMsg function · 95f490de
      Iustin Pop authored
      Since the current buffer cannot contain (during network reads) an EOM,
      we should look for the EOM only in the newly-received string.  While
      this shouldn't make much difference, in some tests it cuts the recvMsg
      total time by around half.
      
      On entering recvMsg, however, we do have to search the old buffer for a
      message, since we could have received two Luxi messages on the last
      network query; this is a one-off cost, compared to continuously looking
      for the EOM in the old string (at each receive loop).
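
      The idea can be sketched as a pure function over strings. This is not
      the actual recvMsg code: the feed function is hypothetical, and the EOM
      character used here is an assumption.

```haskell
-- End-of-message marker (assumed to be ASCII ETX for this sketch).
eom :: Char
eom = '\ETX'

-- Feed one newly received chunk. The buffer is assumed to contain no EOM,
-- so only the chunk is searched. Left carries the still-incomplete buffer;
-- Right carries a complete message plus any leftover data after the EOM.
feed :: String -> String -> Either String (String, String)
feed buffer chunk =
  case break (== eom) chunk of
    (_, [])            -> Left (buffer ++ chunk)
    (msgEnd, _ : rest) -> Right (buffer ++ msgEnd, rest)
```

      The leftover returned on success is the part that may already hold a
      second message, which is why it has to be searched once on the next
      entry.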
  7. Jun 07, 2010
    • Complete the client Luxi implementation · 04282772
      Iustin Pop authored
      All current Luxi calls are supported after this patch. A bug in
      ArchiveJob is also fixed (Ganeti's job IDs are strings).
    • Add support for more LUXI calls · 9622919d
      Iustin Pop authored
      While not all are directly useful, having them will open some possibilities
      (e.g. polling for job changes in hbal's -X mode, and auto-archiving the
      jobs once they are successful).
  8. Jun 02, 2010
    • Fix some lint errors in the unit tests · 4a007641
      Iustin Pop authored
    • Change the Luxi operations structure · 683b1ca7
      Iustin Pop authored
      Currently, we define the LuxiOp type as a simple enumeration, and leave
      the arguments structure to the users of the Ganeti.Luxi module. This is
      suboptimal for a couple of reasons: first, we decouple the operation
      type from operation arguments, and that means we don't use the type
      system for validation of the arguments; second, the clients themselves
      have to know about the JSON encoding of the protocol.
      
      For the above reasons, we change the operation type to contain the
      arguments too, and then the entire conversion/serialization is
      restricted to the Ganeti.Luxi module. Also, the removal of the JSON
      encoding from the clients results in an overall simplification of the
      code.
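
      A rough sketch of the resulting shape is shown below. The two
      constructors, their argument layout and the string rendering (a stand-in
      for the real JSON encoding) are assumptions made for illustration.

```haskell
-- Each constructor carries its own arguments, so an ill-typed call is a
-- compile-time error rather than a malformed request.
data LuxiOp
  = QueryNodes [String] [String]  -- node names, fields (hypothetical layout)
  | ArchiveJob String             -- job ID
  deriving Show

-- Method name sent on the wire.
opName :: LuxiOp -> String
opName QueryNodes{} = "QueryNodes"
opName ArchiveJob{} = "ArchiveJob"

-- Argument encoding lives in one place; `show` stands in for JSON here.
opArgs :: LuxiOp -> String
opArgs (QueryNodes names fields) = show [names, fields]
opArgs (ArchiveJob jid)          = show jid
```

      Clients then build values such as ArchiveJob "42" and never touch the
      encoding themselves.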
  9. Jun 01, 2010
  10. May 30, 2010
    • Modify the test runner to show test exceptions · 8c5652f6
      Iustin Pop authored
      QuickCheck's batch driver (at least v1) doesn't show the test aborts,
      but simply discards the specific exception and increases the abort
      count. This makes it hard to debug the tests, so we modify our own test
      wrapper (which so far only tracked total failures) to show any
      exceptions.
  11. May 28, 2010
    • Reduce the warnings during the unittests · 9e35522c
      Iustin Pop authored
      Since the unittests are not 'clean' from the p.o.v. of type
      declarations, and cannot be made clean in all respects (e.g. orphan
      instances), we silence some warnings for the test target, to have a
      cleaner output.
  12. May 27, 2010
    • Improve the test driver · 06fe0cea
      Iustin Pop authored
      The tests are moved to a separate data structure, and we can select a
      subset of tests to run.
    • Introduce OpCode unittests · 88f25dd0
      Iustin Pop authored
    • Introduce support for optional keys in JObjects · f36a8028
      Iustin Pop authored
      Some keys are optional in the Ganeti opcodes (e.g. ‘node’ in the
      OpReplaceDisks), and as such we need to transform them into a Maybe value,
      instead of failing.
      
      The patch reworks a bit fromObj and adds maybeFromObj which parses such
      optional values. It then uses it in the opcode reading.
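
      The distinction between the two readers can be sketched with toy types.
      The JSValue and JSObject definitions below are stand-ins rather than the
      JSON library's real API, and the signatures are simplified to strings.

```haskell
-- Toy stand-ins for the JSON library types (assumptions for this sketch).
data JSValue = JSString String | JSInt Int
  deriving (Show, Eq)

type JSObject = [(String, JSValue)]

readString :: JSValue -> Either String String
readString (JSString s) = Right s
readString v            = Left ("not a string: " ++ show v)

-- Mandatory key: a missing key is an error.
fromObj :: String -> JSObject -> Either String String
fromObj key obj = case lookup key obj of
  Nothing -> Left ("key '" ++ key ++ "' not found")
  Just v  -> readString v

-- Optional key: a missing key yields Nothing; a present but malformed
-- value is still an error.
maybeFromObj :: String -> JSObject -> Either String (Maybe String)
maybeFromObj key obj = case lookup key obj of
  Nothing -> Right Nothing
  Just v  -> fmap Just (readString v)
```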
    • Replace fromJResult with annotateJResult · c96d44df
      Iustin Pop authored
      This patch replaces all old uses of fromJResult with the annotated
      version, and removes the non-annotated version. All JSON parsing points
      should now have annotated errors.
    • Add annotations to loadJSArray · c8b662f1
      Iustin Pop authored
      This allows, for example, the RAPI backend to detail which information
      (instance or node data) fails to parse.
    • Change fromObj error messages · 50d26669
      Iustin Pop authored
      Currently fromObj doesn't detail what we're trying to read, which can
      lead to cryptic messages: "Cannot read Int". The patch changes this
      function to annotate the error messages with the key/value we're trying
      to convert, by using a new version of fromJResult.
      
      Since the display of the key in tryFromObj is now redundant (it was
      already redundant in the 'not found' case), we remove it.
      
      The new version of fromJResult (annotateJResult) simply prepends a
      description string to the actual error message.
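
      In sketch form, with a toy Result type standing in for the JSON
      library's own; the signature and the ": " separator are assumptions.

```haskell
-- Toy stand-in for the JSON library's result type.
data Result a = Ok a | Error String
  deriving (Show, Eq)

-- Prepend a description of what was being read to any error message.
annotateJResult :: String -> Result a -> Either String a
annotateJResult _    (Ok v)    = Right v
annotateJResult desc (Error e) = Left (desc ++ ": " ++ e)
```

      An error such as "Cannot read Int" then arrives prefixed with the key
      being converted, which is what makes it traceable.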
  13. May 26, 2010
  14. May 25, 2010
  15. May 20, 2010
    • Add more unit tests for allocation/balance · 3fea6959
      Iustin Pop authored
      The patch adds some simple unit-tests for both the allocation function
      (we can allocate small instances on an empty cluster, we can allocate in
      tiered mode starting from any size) and the balancing functions (one
      single instance is placed optimally, a full cluster plus an empty node
      can be rebalanced). The coverage has increased greatly, since this is
      the bulk of the algorithm/code.
      
      Also, the cluster tests are now being run with different options, since
      they are much slower.
    • Move two functions from hspace to Cluster.hs · 3ce8009a
      Iustin Pop authored
      This is done so we can test a longer pipeline.
    • Make CStats an instance of Show · 8423f76b
      Iustin Pop authored
      This helps debugging via ghci.
    • Clarify options related to name passing · ada2fc6d
      Iustin Pop authored
      After the name patches, we can pass in either the short or the full
      name, so update the hbal man page accordingly.
    • Another haddock fix… · 381be58a
      Iustin Pop authored
    • Accept both full and short names in CLI · c854092b
      Iustin Pop authored
      This patch introduces some new functionality in the base Element type
      and in Container which supports searching for all 'known' names of an
      element, such that both short and full names are accepted for various
      options like '-O' and '--excluded-instances'.
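
      The lookup can be sketched as follows; the Element record and the
      function names are hypothetical simplifications of the base Element type
      and Container mentioned above.

```haskell
import Data.List (find)

-- Hypothetical element with its two known names.
data Element = Element { fullName :: String, shortName :: String }
  deriving (Show, Eq)

-- All names under which an element is known.
allNames :: Element -> [String]
allNames e = [fullName e, shortName e]

-- Find the first element matching the given name, short or full.
findByName :: [Element] -> String -> Maybe Element
findByName elems name = find ((name `elem`) . allNames) elems
```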