Commits · 34c00528ac07cadc45cf805e210bd09ae1e66ce4 · itminedu / snf-ganeti

Dec 30, 2010

Convert Loader.RqType to ClusterData · 34c00528

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

34c00528

Add a new type ClusterData · 7b6e99b3

Iustin Pop authored 14 years ago


This will be used to hold all the disparate uses of the cluster data:
we have either tuples with these four elements, or functions taking
these four arguments, etc.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

7b6e99b3
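
As a rough illustration of the ClusterData commit (7b6e99b3) above, a record can replace the four-element tuples and four-argument functions the message mentions. The field names and the placeholder element types below are assumptions made for this sketch; the commit message only states that there are four elements.

```haskell
-- Placeholder types standing in for the real htools containers (assumed).
type GroupList    = [(Int, String)]
type NodeList     = [(Int, String)]
type InstanceList = [(Int, String)]

-- Hypothetical shape of the new type: one named field per former tuple slot.
data ClusterData = ClusterData
  { cdGroups    :: GroupList
  , cdNodes     :: NodeList
  , cdInstances :: InstanceList
  , cdTags      :: [String]
  } deriving (Eq, Show)

-- Before: the four pieces travel as a tuple and are matched by position.
countObjectsTuple :: (GroupList, NodeList, InstanceList, [String]) -> Int
countObjectsTuple (gl, nl, il, _) = length gl + length nl + length il

-- After: callers pass one value and select the fields they need by name.
countObjects :: ClusterData -> Int
countObjects cd =
  length (cdGroups cd) + length (cdNodes cd) + length (cdInstances cd)
```

With a record, adding a fifth piece of cluster data later only touches the type and the code that uses the new field, not every pattern match.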

Adjust hspace manpage for the new simulation syntax · 45cb5963

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

45cb5963

Simulation backend: read the allocation policy too · 6c7448bb

Iustin Pop authored 14 years ago


This patch moves the allocation policy from hardcoded to be read from
the given specification, and extends the error message for invalid
specifications.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

6c7448bb

Simulation backend: allow multiple node groups · 9983063b

Iustin Pop authored 14 years ago


This patch changes the behaviour of the --simulation option to be an
incremental option, where each new use defines a new node group. This
allows simulation of more complex clusters.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

9983063b

Dec 23, 2010

Merge branch 'stable-0.2' · 54cffd50

Iustin Pop authored 14 years ago

* stable-0.2:
  Move man files to man/ subdirectory

Conflicts (all removed):
        man/hail.1
        man/hbal.1
        man/hscan.1
        man/hspace.1

54cffd50

Move man files to man/ subdirectory · ab0521f9

Iustin Pop authored 14 years ago


This is just a change on the 0.2 branch to synchronize with the master
branch. It allows automated builds to work better across the two
versions.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

ab0521f9

Merge branch 'stable-0.2' · 50211c86

Iustin Pop authored 14 years ago

* devel-0.2:
  Update NEWS file for 0.2.8 release
  hbal: return meaningful exit code for job failures
  Change the balancing function

50211c86

Update NEWS file for 0.2.8 release · d7f18640

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

d7f18640

hbal: return meaningful exit code for job failures · 23448f82

Iustin Pop authored 14 years ago


Currently, LUXI job failures only display a warning message, while
still returning a success exit code. We change hbal to return
true/false from within execJobSet/runJobSet, and add a wrapper for
simpler code.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

23448f82

Change the balancing function · 4715711d

Iustin Pop authored 14 years ago


Currently the balancing function is a modified version of the standard
deviation (stddev divided by list length), due to historical reasons.

While this works fine for small clusters, for big clusters it makes
the balancing effect too "weak", and in some cases it fails to balance
some clusters correctly. It also makes the balancing behaviour
dependent on the cluster size, which is a big no-no.

Therefore we revert to the normal version of standard deviation, and
we also rename the function to reflect what it does. The new version
correctly balances some corner cases that the previous version didn't,
and passes the current balancing unittests.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

4715711d
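
To make the size dependence described in 4715711d concrete, here is a minimal sketch comparing the plain standard deviation with the "stddev divided by list length" variant. The function names are hypothetical, and the population form of the standard deviation is an assumption; the commit message does not say which form htools uses.

```haskell
-- Population standard deviation of a list of values.
-- Note: an empty list yields NaN (0/0); callers are assumed to pass data.
stdDev :: [Double] -> Double
stdDev xs = sqrt (sum [ (x - mean) ^ (2 :: Int) | x <- xs ] / n)
  where n    = fromIntegral (length xs)
        mean = sum xs / n

-- The modified variant the message describes: stddev divided by list length.
stdDevOverLength :: [Double] -> Double
stdDevOverLength xs = stdDev xs / fromIntegral (length xs)
```

Duplicating every element of a list (a cluster twice as large with the same imbalance) leaves `stdDev` unchanged but halves `stdDevOverLength`, which is why the old metric made balancing look less worthwhile on big clusters.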

Move some tiered spec functionality to Cluster.hs · 949397c8

Iustin Pop authored 14 years ago


This splits out a bit of code from hspace.hs and moves it into its own
function in Cluster.hs.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

949397c8

Dec 20, 2010

IAllocator: respect the alloc_policy for groups · 73206d0a

Iustin Pop authored 14 years ago


This patch changes the allocate mode to respect the alloc_policy for
groups. It does this by changing the sort key from simply the solution
score, to a tuple with two elements: the alloc policy (which is now an
Ord instance) and the solution score. Also, the unallocable groups are
filtered out in the filterMGResults phase.

The patch also slightly enhances the informational message by
including the policy in the group information, to help debugging.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

73206d0a
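
The sort-key change in 73206d0a can be sketched as follows. The constructor names, the `GroupResult` record and the "lower score is better" convention are assumptions for illustration; the commit message only says that the policy is an Ord instance, that the key is a (policy, score) tuple, and that unallocable groups are filtered out.

```haskell
import Data.List (sortOn)

-- Constructor order defines the derived Ord: earlier means more preferred.
data AllocPolicy = AllocPreferred | AllocLastResort | AllocUnallocable
  deriving (Eq, Ord, Show)

-- Hypothetical per-group allocation result.
data GroupResult = GroupResult
  { grName   :: String
  , grPolicy :: AllocPolicy
  , grScore  :: Double      -- assumed: lower is better
  } deriving (Eq, Show)

-- Drop groups that must not receive new allocations.
filterAllocable :: [GroupResult] -> [GroupResult]
filterAllocable = filter ((/= AllocUnallocable) . grPolicy)

-- Rank by policy first; the score only breaks ties within the same policy.
rankGroups :: [GroupResult] -> [GroupResult]
rankGroups = sortOn (\g -> (grPolicy g, grScore g)) . filterAllocable
```

Because tuples compare lexicographically, a preferred group with a worse score still ranks ahead of a last-resort group with a better one.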

hail: allow overriding cluster data from requests · 01fec0a1

Iustin Pop authored 14 years ago


Currently, it's not easy to generate “fake” IAllocator request files
for hail. As such, testing on simulated clusters is hard to do.

To work around this, we change hail to also take the ‘-t’ and
‘--simulate’ options, so that we can override the cluster data read
from the request. Note that this will not change the request itself
(so for example an evacuation will need to make sure it uses the correct
node names), but it's a step forward in testing hail. The other tools
can already use text files, which allow for better flexibility.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

01fec0a1

Text: read/write the allocation policy · f4c7d37a

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

f4c7d37a

Luxi: read the allocation policy from the cluster · c4c37257

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

c4c37257

Rapi: read the allocation policy from the cluster · 2ddabf4f

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

2ddabf4f

Implement a JSON instance for AllocPolicy · b2ba4669

Iustin Pop authored 14 years ago


This will allow reading this attribute via the Rapi/Luxi backends.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

b2ba4669

live-test: support multi-group clusters · 1a3cc8ad

Iustin Pop authored 14 years ago


Since currently hbal can only work on single groups at a time, we need
to be able to specify the target group when running the live test.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

1a3cc8ad

Text.hs: serialize cluster tags when writing data · 716c6be5

Iustin Pop authored 14 years ago


This is the complement to the reading part. Now the live-test works
correctly against clusters with configured exclusion tags.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

716c6be5

Text.hs: also read cluster tags from the data file · afcd5a0b

Iustin Pop authored 14 years ago


This means that a file with the correct information is as accurate as
the other backends (Luxi, Rapi). Serialization of tags is in the next
patch.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

afcd5a0b

Text.hs: change to use sepSplit · a604456d

Iustin Pop authored 14 years ago


The new sepSplit function can split based on empty lines, so we remove
the hackish text splitting from before and simply use sepSplit. This
is needed as the addition of extra sections would have increased the
code linearly, which we don't want :)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

a604456d

Generalise the sepSplit function · 748d5d50

Iustin Pop authored 14 years ago


Currently it works on splitting strings by individual chars, but we
can generalise it to split lists by list elements, which means we can
reuse it later in the Text module for splitting both lists of chars by
'|' or lists of lines by empty newlines. The change also makes the
code cleaner (uses “null xs” instead of string-specific “xs == ""”).

Note: I tried to rewrite this in a nicer, functional style using
unfolds, but I failed to account for the final terminator case
(e.g. ab|cd|) resulting in a valid but empty element.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

748d5d50
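
A generalised splitter along the lines described in 748d5d50 might look like the sketch below. It is written from the commit message alone (split any list on a separator element, use `null`, keep the empty element after a trailing separator) and is not claimed to be the exact code in the tree.

```haskell
-- Split a list on a separator element.
-- A trailing separator produces a final empty element ("ab|cd|" -> 3 parts).
sepSplit :: Eq a => a -> [a] -> [[a]]
sepSplit sep s
  | null s    = []                  -- nothing to split
  | null xs   = [x]                 -- no separator left: last chunk
  | null ys   = [x, []]             -- input ended with a separator
  | otherwise = x : sepSplit sep ys
  where (x, xs) = break (== sep) s  -- xs starts at the separator, if any
        ys      = drop 1 xs         -- skip the separator itself
```

The same function then serves both uses mentioned in the message: `sepSplit '|' "ab|cd|"` splits a string into fields, and `sepSplit "" (lines contents)` splits a list of lines into sections at empty lines.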

hail: display group names in info messages · aec636b9

Iustin Pop authored 14 years ago


This patch switches from the group index to the group name for the
informational messages in the hail results.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

aec636b9

hbal: display the group name in the multi-group case · e0c85e08

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

e0c85e08

Text.hs: also save the group data when serialising · e4d8071d

Iustin Pop authored 14 years ago


This should have been in the previous patches, but is sent separately
for clarity.

The live-test script is updated to read the first node from the
cluster, now that the text files no longer start with the node
data.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

e4d8071d

Change the Node.group attribute · 10ef6b4e

Iustin Pop authored 14 years ago


Currently, the Node.group attribute is the UUID of the group, as until
recently Ganeti didn't export the node group properties. Since it does
so now, we make the following changes (again apologies for a big
patch):

- we change the group attribute to be an index, similar to the way an
  Instance.pnode and snode attributes point to the parent node(s)
- on load, we read the group.uuid attribute and we use that to lookup
  the actual group index, from previously-loaded groups info
- this means that we now first read groups, then read nodes using the
  group info, and then read instances using the node info

This patch leaves a few functions showing the group index (ugly since
it's htools internal); these will be converted later.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

10ef6b4e
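
The load order described in 10ef6b4e (groups first, then nodes resolved against the group table) can be sketched like this. All types and function names here are hypothetical simplifications; the real loaders carry many more attributes.

```haskell
type Idx = Int

-- Minimal stand-ins for the group and node objects (assumed fields).
data Group = Group { groupName :: String, groupUuid :: String }
  deriving (Eq, Show)

data Node = Node { nodeName :: String, nodeGroup :: Idx }
  deriving (Eq, Show)

-- Step 1: give each loaded group an index and key the table by UUID.
indexGroups :: [Group] -> [(String, Idx)]
indexGroups gs = zip (map groupUuid gs) [0 ..]

-- Step 2: resolve a node's group UUID to the internal index, or fail
-- with a message if the UUID was not among the loaded groups.
loadNode :: [(String, Idx)] -> (String, String) -> Either String Node
loadNode table (name, uuid) =
  case lookup uuid table of
    Just idx -> Right (Node name idx)
    Nothing  -> Left ("Unknown group UUID '" ++ uuid ++ "' for node " ++ name)

-- Load all nodes; the first unknown UUID aborts the whole load.
loadNodes :: [Group] -> [(String, String)] -> Either String [Node]
loadNodes gs = mapM (loadNode (indexGroups gs))
```

Instances would then be loaded the same way against the node table, mirroring how pnode and snode already point at node indices.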

Rework the data loader pipelines to read groups · a679e9dc

Iustin Pop authored 14 years ago


This (invasive) patch changes all the loader pipelines to read the node
groups data from the cluster, via the various backends. It is invasive
as it needs coordinated changes across all the loaders.

Note that the new group data is not used, just returned.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

a679e9dc

Add lookupGroup utility function · f4531f51

Iustin Pop authored 14 years ago


This will be used in the various backends similar to the lookupNode
function.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

f4531f51

Add a new Group.hs module describing node groups · 0dc1bf87

Iustin Pop authored 14 years ago


This is not yet used by the rest of the code.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

0dc1bf87

Add the new OpQueryGroups opcode definition · edd0a48f

Iustin Pop authored 14 years ago


Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

edd0a48f

Dec 15, 2010

Update hscolour usage · 133f1791

Iustin Pop authored 14 years ago


This patch fixes two issues related to coloured sources generation.
First, recent hscolour has changed the css file (and we need to update
it), but it also can output it at runtime, so there's no need to store
it anymore in the source tree.

Second, the current source generation predates the addition of sources
in Ganeti/ (as opposed to just Ganeti/HTools), and thus we were missing
the sources in that directory. We replace the target file name to
account for different base directories.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

133f1791

Dec 09, 2010

Improve error reporting for small clusters · dec88196

Iustin Pop authored 14 years ago


When doing a two-node allocation on a cluster/group in which only one
node is online, or a one-node allocation without any online nodes, we
didn't show a valid error message. The patch changes tryAlloc to "fail
hard" in this case, to make the failure explicit.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

dec88196

hail/allocate: implement multi-group support · 9b1584fc

Iustin Pop authored 14 years ago


This is a bit hackish. We add a new function that takes the input data,
splits it into groups, runs the original tryAlloc for each group, and
then chooses the best solution, but adds the log messages from all the
groups, so as to give better debugging information. In hail, we just
point to this new function.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

9b1584fc
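
The wrapper described in 9b1584fc (split by group, run the single-group allocator on each, keep the best result, concatenate all logs) can be sketched as below. The `Solution` type, `toyAlloc`, and the "lower score wins" rule are assumptions for illustration and do not reflect the actual tryAlloc signature.

```haskell
import Data.Function (on)
import Data.List (groupBy, minimumBy, sortOn)
import Data.Ord (comparing)

type GroupIdx = Int

-- Hypothetical result type: best (score, node) if any, plus log lines.
data Solution = Solution
  { solBest :: Maybe (Double, String)
  , solLog  :: [String]
  } deriving (Eq, Show)

-- Partition items tagged with a group index into one list per group.
splitByGroup :: [(GroupIdx, a)] -> [(GroupIdx, [a])]
splitByGroup xs =
  [ (gi, map snd g)
  | g@((gi, _) : _) <- groupBy ((==) `on` fst) (sortOn fst xs) ]

-- Run the single-group allocator per group, keep the lowest-scoring
-- success, and merge the logs of every group (including failed ones).
tryMultiAlloc :: ([a] -> Solution) -> [(GroupIdx, a)] -> Solution
tryMultiAlloc allocOne nodes = Solution best (concatMap solLog perGroup)
  where perGroup = [ allocOne ns | (_, ns) <- splitByGroup nodes ]
        goodOnes = [ b | Solution (Just b) _ <- perGroup ]
        best | null goodOnes = Nothing
             | otherwise     = Just (minimumBy (comparing fst) goodOnes)

-- Stand-in allocator: each node carries a precomputed score, or Nothing
-- if it cannot host the instance.
toyAlloc :: [(String, Maybe Double)] -> Solution
toyAlloc ns =
  case [ (s, n) | (n, Just s) <- ns ] of
    [] -> Solution Nothing ["no usable node among " ++ show (map fst ns)]
    cs -> let b = minimum cs in Solution (Just b) ["picked " ++ snd b]
```

Keeping the logs from groups that lost or failed is what gives the "better debugging information" the message refers to.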

hail: remove the custom info message generation · db4d9a9b

Iustin Pop authored 14 years ago


Since the solutions are "self-annotated", we can remove the custom code
from hail, and just keep a very small processResults function.

After this change, allocation/failure shows the new detailed
information.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

db4d9a9b

Add a 'log' attribute to allocation solutions · 859fc11d

Iustin Pop authored 14 years ago


And also a couple of functions for describing a given solution; these
will be used in the future instead of the ones currently in hail.

The patch also enhances the description of failure messages.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

859fc11d

Change AllocSolution from tuple to its own type · 85d0ddc3

Iustin Pop authored 14 years ago


Tuples are good for two, three, at most four elements. Beyond that, the
continuous pattern matching and construction/deconstruction becomes
tedious.

Since in the future we'll probably keep more information in the
AllocSolution type, we change it now from a triple to a "real" data
type. We also do some cleanups: adding a real emptyAlloc value, instead
of the previous hardcoded ones, and add some more comments on how we do
the multi-evacuation.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

85d0ddc3
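
The move from a triple to a record in 85d0ddc3 might look roughly like this. Only the names AllocSolution and emptyAlloc come from the commit message; the fields, their types and the `addAttempt` helper are hypothetical.

```haskell
-- Before (assumed): a triple such as (failures, successes, best),
-- matched by position everywhere it is used.

-- After: a record with named fields.
data AllocSolution = AllocSolution
  { asFailures :: Int                        -- failed attempts so far
  , asAllocs   :: Int                        -- successful attempts so far
  , asBest     :: Maybe (Double, [String])   -- (score, nodes); lower wins
  } deriving (Eq, Show)

-- A single shared starting value instead of hardcoded empty tuples.
emptyAlloc :: AllocSolution
emptyAlloc = AllocSolution { asFailures = 0, asAllocs = 0, asBest = Nothing }

-- Fold one allocation attempt into the running solution.
addAttempt :: AllocSolution -> Maybe (Double, [String]) -> AllocSolution
addAttempt sol Nothing = sol { asFailures = asFailures sol + 1 }
addAttempt sol (Just cand@(score, _)) =
  sol { asAllocs = asAllocs sol + 1
      , asBest   = case asBest sol of
                     Just (old, _) | old <= score -> asBest sol
                     _                            -> Just cand
      }
```

Record update syntax means a later field, such as a log, can be added without rewriting the existing update sites.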

Dec 01, 2010

Cleanup AllocSolution after AllocElement changes · a334d536

Iustin Pop authored 14 years ago


Since we added the score to AllocElement, we don't need to wrap
AllocElement in yet another tuple, just to attach the cluster score. So
we simplify the AllocSolution type.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

a334d536

AllocElement: extend with the cluster score · 7d3f4253

Iustin Pop authored 14 years ago


AllocElement, a type used as a result of allocations, holds the status
of the nodes after the allocation. In most cases, we'll compare this
allocation result with others, to see which allocation decision makes
the most sense. This comparison is done via the cluster score.

However, if we later need to redo this computation, as part of other
comparisons, we'd need to evaluate it again, etc. So it's easier to just
compute the score at the place where we compute the node list in the
initial step.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

7d3f4253

Add two utility functions for the Result type · 06fb841e

Iustin Pop authored 14 years ago


Actually, this just moves the functions from the QC module to Types, and
removes a duplicate entry from Cluster.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

06fb841e