  1. 20 Dec, 2010 1 commit
    • Change the Node.group attribute · 10ef6b4e
      Iustin Pop authored
      
      
      Currently, the Node.group attribute is the UUID of the group, as until
      recently Ganeti didn't export the node group properties. Since it does
      so now, we make the following changes (again apologies for a big
      patch):
      
      - we change the group attribute to be an index, similar to the way the
        Instance.pnode and snode attributes point to the parent node(s)
      - on load, we read the group.uuid attribute and use it to look up
        the actual group index from the previously loaded group info
      - this means that we now first read groups, then read nodes using the
        group info, and then read instances using the node info
      
      This patch leaves a few functions showing the group index (ugly, since
      it's internal to htools); these will be converted later.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Balazs Lecz <leczb@google.com>
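
The load order described in this commit (groups first, then nodes resolved against them) can be sketched as follows. This is a minimal illustration, not the htools code: the `Group` and `Node` records, `groupIndex`, `loadNode` and `loadNodes` are hypothetical names, and the real types carry many more fields.

```haskell
import qualified Data.Map as M

-- Hypothetical, heavily reduced records (assumption, not the htools types).
type Idx = Int

data Group = Group { groupName :: String, groupUuid :: String }
  deriving (Show, Eq)

data Node = Node { nodeName :: String, nodeGroup :: Idx }
  deriving (Show, Eq)

-- | Assign consecutive indices to the loaded groups, keyed by UUID.
groupIndex :: [Group] -> M.Map String Idx
groupIndex gs = M.fromList (zip (map groupUuid gs) [0 ..])

-- | Build a node from (name, group UUID), failing on an unknown UUID.
loadNode :: M.Map String Idx -> (String, String) -> Either String Node
loadNode gmap (name, uuid) =
  case M.lookup uuid gmap of
    Nothing  -> Left ("Unknown group UUID " ++ uuid ++ " for node " ++ name)
    Just idx -> Right (Node name idx)

-- | Groups must be loaded before nodes; stops at the first failure.
loadNodes :: [Group] -> [(String, String)] -> Either String [Node]
loadNodes gs = mapM (loadNode (groupIndex gs))
```

Instances would then be loaded the same way against the node list, which is why the order groups, nodes, instances matters.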
  2. 09 Dec, 2010 4 commits
  3. 01 Dec, 2010 5 commits
  4. 09 Nov, 2010 1 commit
    • Fix tag exclusion weight · 306cccd5
      Iustin Pop authored
      Currently, the tag exclusion metric has a weight of one, which means
      there might be cases where we won't move instances around because it
      upsets the cluster metrics. However, we do want to make a higher effort
      for cleaning up tag collisions, so we increase the weight to an
      empirically-determined value of 2.
  5. 03 Sep, 2010 1 commit
  6. 30 Aug, 2010 1 commit
  7. 27 Jul, 2010 1 commit
  8. 21 Jul, 2010 3 commits
    • Change the meaning of the N+1 fail metric · c3c7a0c1
      Iustin Pop authored
      Currently, this metric tracks the nodes failing the N+1 check. While
      this helps (in some cases) to evacuate such nodes, it's not a good
      metric, since it rarely changes during a step (only when the last
      instance moves away). Therefore we replace it with the count of
      instances living on such nodes, which is much better because:
      - moving an instance away while the node is still N+1 failing will still
        reflect in the score as an optimization
      - moving the last instance causing an N+1 failure will result in a heavy
        decrease of this score, thus giving the right bonus to clear this
        status
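
To make the difference between the two metrics concrete, here is a small sketch. The `Node` record and both function names are assumptions chosen for illustration; only the idea (count instances on failing nodes instead of counting failing nodes) comes from the commit message.

```haskell
-- Hypothetical, reduced node: only what the two metrics need.
data Node = Node
  { failsN1   :: Bool -- whether the node currently fails the N+1 check
  , instances :: Int  -- number of instances living on the node
  } deriving (Show)

-- | Old metric: number of nodes failing the N+1 check.
failingNodes :: [Node] -> Int
failingNodes = length . filter failsN1

-- | New metric: number of instances living on N+1-failing nodes.
instancesOnFailing :: [Node] -> Int
instancesOnFailing = sum . map instances . filter failsN1
```

Moving one of three instances off a failing node that stays failing leaves `failingNodes` unchanged at 1, while `instancesOnFailing` drops from 3 to 2, so the move registers as progress.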
    • Introduce per-metric weights · 8a3b30ca
      Iustin Pop authored
      Currently all metrics have the same weight (we just sum them together).
      However, for the hard constraints (N+1 failures, offline nodes, etc.)
      we should handle the metrics differently based on their meaning. For
      example, an instance living on a primary offline node is worse than an
      instance having its secondary node offline, which in turn is worse than
      an instance having its secondary node failing N+1.
      
      To express this case in our code, we introduce a table of weights for
      the metrics, with which we can influence their relative importance.
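
A weight table of this kind could look roughly like the sketch below. The metric names and most weight values are illustrative assumptions, not the values used in htools; the only figure taken from this log is the tag exclusion weight of 2 mentioned in a later commit.

```haskell
-- Hypothetical weight table; names and values are illustrative only.
weights :: [(String, Double)]
weights =
  [ ("mem_cv", 1)
  , ("n1_instances", 1)
  , ("tag_exclusion", 2)
  , ("offline_pri", 16)
  ]

-- | Weighted sum of the metric values; values are matched to weights by position.
score :: [Double] -> Double
score = sum . zipWith (*) (map snd weights)
```

With equal weights this reduces to the plain sum used before the patch; raising a single weight makes the corresponding violation dominate the score.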
    • Allow balancing moves to introduce N+1 errors · 2cae47e9
      Iustin Pop authored
      This patch switches the applyMove function to the extended versions of
      Node.addPri and addSec, and passes the override flag based on the state
      of the node that we're moving away from.
  9. 19 Jul, 2010 2 commits
    • hbal: print short names in steps list · 14c972c7
      Iustin Pop authored
      This was a regression from the name handling changes, as we started
      using the original names for the solution list (which is not designed
      for parsing/feeding back into ganeti).
    • Remove an obsolete function · fb33aaaf
      Iustin Pop authored
      printSolution is no longer used, as we print the solution iteratively
      now.
  10. 18 Jul, 2010 1 commit
    • Allow '+' in node list fields · 6dfa04fd
      Iustin Pop authored
      When the field list is prefixed with a plus sign, this will extend the
      default field list, instead of replacing it entirely.
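
The '+' handling can be sketched like this. The default field names and the helper names (`defaultFields`, `splitComma`, `selectFields`) are assumptions for illustration; the source does not list the actual defaults or say how the field string is parsed.

```haskell
-- Hypothetical default field list (the real defaults are not given here).
defaultFields :: [String]
defaultFields = ["name", "pmem", "pdsk"]

-- | Split a comma-separated string into its non-empty fields.
splitComma :: String -> [String]
splitComma = filter (not . null) . go
  where
    go s = case break (== ',') s of
      (f, [])       -> [f]
      (f, _ : rest) -> f : go rest

-- | A leading '+' extends the defaults; otherwise the list replaces them.
selectFields :: String -> [String]
selectFields ('+' : extra) = defaultFields ++ splitComma extra
selectFields fields        = splitComma fields
```

For example, `selectFields "+cpu,load"` keeps the three defaults and appends `cpu` and `load`, while `selectFields "name,cpu"` yields only those two fields.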
  11. 20 May, 2010 4 commits
    • Add more unit tests for allocation/balance · 3fea6959
      Iustin Pop authored
      The patch adds some simple unit-tests for both the allocation function
      (we can allocate small instances on an empty cluster, we can allocate in
      tiered mode starting from any size) and the balancing functions (one
      single instance is placed optimally, a full cluster plus an empty node
      can be rebalanced). The coverage has increased greatly, since this is
      the bulk of the algorithm/code.
      
      Also, the cluster tests are now being run with different options, since
      they are much slower.
    • Move two functions from hspace to Cluster.hs · 3ce8009a
      Iustin Pop authored
      This is done so we can test a longer pipeline.
    • Make CStats instance of show · 8423f76b
      Iustin Pop authored
      This helps debugging via ghci.
    • Stop modifying names for internal computations · 3e4480e0
      Iustin Pop authored
      Currently the name used internally is modified and holds the shortened
      name of the nodes/instances. This has caused issues before, since we
      always have to strip the suffix from input data and reapply it if we
      need to send data back to Ganeti.
      
      This patch changes the code such that the names are never modified, only
      the alias, and all the internal computations can forget about the common
      suffix addition/removal.
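
One way to picture the name/alias split is the sketch below: the full name is stored untouched and only the alias has the common domain suffix removed. The `Element` record and both functions are hypothetical, and cutting the suffix at the first '.' is an assumption made here so that host names are never truncated mid-label; the actual htools logic may differ.

```haskell
-- Hypothetical element: the name is never modified, only the alias is shortened.
data Element = Element { name :: String, alias :: String }
  deriving (Show, Eq)

-- | Longest suffix shared by all names, cut back to start at a '.'.
commonSuffix :: [String] -> String
commonSuffix [] = ""
commonSuffix ns =
  dropWhile (/= '.') (reverse (foldr1 commonPrefix (map reverse ns)))
  where
    commonPrefix (x : xs) (y : ys) | x == y = x : commonPrefix xs ys
    commonPrefix _ _ = []

-- | Keep each full name and derive the alias by dropping the common suffix.
setAliases :: [String] -> [Element]
setAliases ns = [Element n (take (length n - length sfx) n) | n <- ns]
  where
    sfx = commonSuffix ns
```

Internal computations and anything sent back to Ganeti would use `name`; only display code would read `alias`.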
  12. 18 May, 2010 1 commit
    • Remove the noLimit values and always use limits · f4c0b8c5
      Iustin Pop authored
      This patch moves away from allowing no-limits for disk/cpu ratios and always
      uses a real limit. For disk, it's simple since we use 0, which means no
      reservations for disks. For CPU, we set an (arbitrary) limit of 64 v/p,
      which should be reasonable as a default limit (it can be changed via the
      command line).
  13. 04 May, 2010 1 commit
    • Fix hspace's KM metrics · e2436511
      Iustin Pop authored
      We returned the KM_POOL_* metrics as the final state, not as the delta
      between the final and the initial state.
  14. 15 Apr, 2010 2 commits
    • Add a new function to compute allocation deltas · 9b8fac3d
      Iustin Pop authored
      Given two cluster states, the new function can answer the following
      questions:
      
      - how many resources are currently allocated
      - how many resources are finally allocated (the delta from the above is
        how much we can actually allocate on the cluster)
      - unallocable resources (whatever is left free after the previous step)
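
The three quantities above can be sketched for a single resource as follows. The `Stats` and `AllocDelta` records and the function names are assumptions for illustration; the real function works on the full cluster statistics rather than on memory alone.

```haskell
-- Hypothetical, reduced cluster stats: a single resource (memory).
data Stats = Stats { totalMem :: Int, freeMem :: Int }
  deriving (Show, Eq)

data AllocDelta = AllocDelta
  { currentlyAllocated :: Int -- allocated in the initial state
  , finallyAllocated   :: Int -- allocated in the final state
  , unallocable        :: Int -- still free in the final state
  } deriving (Show, Eq)

-- | Compare the initial and the final cluster state.
allocDelta :: Stats -> Stats -> AllocDelta
allocDelta ini fin = AllocDelta
  { currentlyAllocated = totalMem ini - freeMem ini
  , finallyAllocated   = totalMem fin - freeMem fin
  , unallocable        = freeMem fin
  }

-- | How much can actually still be allocated on the cluster.
allocatable :: AllocDelta -> Int
allocatable d = finallyAllocated d - currentlyAllocated d
```

With 100 units in total, 60 free initially and 10 free after filling the cluster, this gives 40 currently allocated, 90 finally allocated, 50 allocatable and 10 unallocable.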
    • Introduce total vcpu tracking in CStats · 86ecce4a
      Iustin Pop authored
      We add a new field that tracks the available virtual cpus (expressed as
      node cpus times the vcpu ratio).
  15. 25 Feb, 2010 1 commit
  16. 23 Feb, 2010 1 commit
  17. 22 Feb, 2010 4 commits
  18. 14 Jan, 2010 2 commits
    • Move instance relocation test upper in the chain · a804261a
      Iustin Pop authored
      Currently we test each instance for relocation in checkMove; however, it
      is a little clearer if we pass only the relocatable instances to
      checkMove. The patch also slightly rewrites (indentation/style) the
      second half of the checkMove function.
    • Split the balancing function in two parts · 5ad86777
      Iustin Pop authored
      Currently in the balancing function we do two things:
      
      - decide whether to do a new balancing round or not
      - actually compute the balancing round
      
      This is not nice, as the two parts are conceptually separate, so this
      patch splits the decision on whether to descend or not into a new
      function.
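
The separation of "decide" from "compute" can be sketched as below. Everything here is an assumption made for illustration: the cluster state is reduced to a bare score, the stopping criteria (a round limit and a minimum score) are guesses at plausible inputs, and `shouldContinue` and `balance` are hypothetical names rather than the functions touched by the patch.

```haskell
-- Hypothetical stand-in for the cluster state: just its score (lower is better).
type Score = Double

-- | Decision only: is another balancing round worth attempting?
-- A negative round limit means "unlimited" (an assumption of this sketch).
shouldContinue :: Int -> Int -> Score -> Score -> Bool
shouldContinue maxRounds done minScore score =
  (maxRounds < 0 || done < maxRounds) && score > minScore

-- | Computation only: run rounds with a caller-supplied step function,
-- collecting the score after each round that improved it.
balance :: (Score -> Maybe Score) -> Int -> Score -> Score -> [Score]
balance step maxRounds minScore = go 0
  where
    go done score
      | shouldContinue maxRounds done minScore score =
          case step score of
            Just next | next < score -> next : go (done + 1) next
            _                        -> []
      | otherwise = []
```

Keeping the predicate separate means the stopping policy can be tested or changed without touching the code that computes a round.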
  19. 11 Dec, 2009 3 commits
  20. 17 Nov, 2009 1 commit
    • Use conflicting primaries count in cluster score · d844fe88
      Iustin Pop authored
      This small patch adds the number of conflicting primaries in the cluster
      score. This is different from the other non-CV metrics where we usually
      compute the percentage of failing instances (for that metric); but for a
      somewhat big cluster, 1-2% failing instances is too small a value
      to cause the relocation of conflicting instances (future patches will
      also switch other non-CV metrics to this method).