Commits · a629ecb966b86d039d800d593f88e01d18a8d45f · itminedu / snf-ganeti

Oct 12, 2011

Standardise LUXI call argument types · a629ecb9

Iustin Pop authored 13 years ago


Currently, we have 4 types of arguments in LUXI calls:

- most common, a list of values
- a single argument that is sent as a list of one element
- a single argument that is sent by itself
- a dictionary (only Query and QueryFields)

This inconsistency makes it harder not only to auto-generate the
HTools LUXI interface, but also, in general, to check the arguments and
(if we ever want to do it) to auto-generate the Python LUXI client.

Compare this with the node daemon, which consistently uses a list for
its arguments and, even with far more changes over time, has had no
issues with extending the interface.

In case we want to extend a call, there are two options:

- preferred: add a new call, keep the old one unchanged
- possible: add further parameters to the current argument list

The patch against HTools will follow; I am sending it separately, as
the Python changes are very clear by themselves.
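
A minimal sketch of the convention described above, not the actual
Ganeti client code: every call passes its arguments as a list, even
when there is only one. The helper name build_luxi_request and the
"method"/"args" key names are assumptions made for illustration.

```python
import json


def build_luxi_request(method, args):
    """Serialise a call, insisting that the arguments are always a list."""
    if not isinstance(args, (list, tuple)):
        raise TypeError("LUXI arguments must be a list, got %s" %
                        type(args).__name__)
    return json.dumps({"method": method, "args": list(args)})


# Two arguments travel as a two-element list; a single argument would
# still be wrapped in a one-element list.
request = build_luxi_request("QueryTags", ["cluster", ""])
```

Rejecting non-list arguments at one central point is what makes the
calls easy to check and to generate mechanically.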

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

a629ecb9

Rename filter and filter_ to qfilter · 2e5c33db

Iustin Pop authored 13 years ago


We currently use 'filter' as the OpCode, QueryRequest and RAPI field
name for representing a query filter. However, since 'filter' is a
built-in function, we actually have to use filter_ throughout the code
in order to not override the built-in function.

This patch simply goes and does a global sed over the code. Due to the
fact that the RAPI interface already exposed this field, we add
compatibility code for now which handles both forms.
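
The compatibility handling could look roughly like the sketch below.
The function name get_query_filter and the decision to reject requests
carrying both names are assumptions, not the code from this patch.

```python
def get_query_filter(body):
    """Return the query filter from a request body, accepting both names."""
    if "qfilter" in body and "filter" in body:
        raise ValueError("Only one of 'qfilter' and 'filter' may be given")
    if "qfilter" in body:
        return body["qfilter"]
    # Old field name, still accepted for backwards compatibility.
    return body.get("filter")
```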

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

2e5c33db

Haskell support for generic Query in Luxi · 92678b3c

Iustin Pop authored 13 years ago


Until now htools did not have support for generic Query in Luxi. This
patch introduces Query as a supported Luxi operation and replaces
QueryNodes, QueryInstances and QueryGroups with Query.

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

92678b3c

TH simplification for Luxi · 9d74cb04

Agata Murawska authored 13 years ago


This patch simplifies the generation of save constructors for LuxiOp
by always using showJSON over an array of JSValues, instead of having
to pass showJSON in most cases, except the 5-tuple case.

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
[iustin@google.com: fixed a few issues]

9d74cb04

Dots in docstrings and hlint error fixes for htools · 05ff7a00

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

05ff7a00

Add design doc for the resource model changes · d85f01e7

Iustin Pop authored 13 years ago


This is not complete, but is as close as I can get it for now. I
expect people actually implementing the various changes to extend the
design doc.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

d85f01e7

Oct 11, 2011

Remove the oneline output option in hbal · e19ee6e4

Iustin Pop authored 13 years ago


This was, AFAIK, never used, and complicates the output code enough
that it's better to remove it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

e19ee6e4

Rework/split hbal's main function · 5dad2589

Iustin Pop authored 13 years ago


This is just moving code around. A subsequent patch will do a bit more
cleanup and change the output.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

5dad2589

Skip application of 'id' in TH code · 60de49c3

Iustin Pop authored 13 years ago


This is just beautification when dumping splices to stdout, as ghc
will optimise the 'id' away anyway.

Original generated code:

  opToArgs QueryTags kind name = J.showJSON (id kind, id name)

Afterwards:

  opToArgs QueryTags kind name = J.showJSON (kind, name)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

60de49c3

Oct 07, 2011

Don't send gratuitous ARP if master IP setup fails · 9888b9e6

Andrea Spadaccini authored 13 years ago


Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

9888b9e6

Document --ignore-errors and --error-codes · 830fc5da

Andrea Spadaccini authored 13 years ago


Update the man page of gnt-cluster to contain the documentation of the
--ignore-errors and --error-codes verify options. Also, include the list
of the error codes and their documentation.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

830fc5da

Add error codes documentation · 3ac3f5e4

Andrea Spadaccini authored 13 years ago


lib/constants.py
* add to each CV_E* tuple the documentation of the error code
* add the DOCUMENTED_CONSTANTS constant for the doc preprocessor

autotools/docpp
* add a new directive class CONSTANTS_<kind>, that gets data from
  constants.DOCUMENTED_CONSTANTS

lib/cmdlib.py
* modify the code that unpacked the CV_E* tuples to ignore the
  documentation parameter
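
A sketch of the tuple layout and the adapted unpacking, under stated
assumptions: the entry CV_EEXAMPLE, its text and the helper
format_error are hypothetical, and only the three-element shape (with
the documentation last) follows the description above.

```python
CV_TCLUSTER = "cluster"

# Hypothetical entry: (item type, error code, documentation).
CV_EEXAMPLE = (CV_TCLUSTER, "EEXAMPLE", "Example error used for illustration")


def format_error(ecode, item, msg):
    """Format a verification error, ignoring the documentation element."""
    # The third element exists only for the documentation preprocessor.
    (itype, etxt, _) = ecode
    return "%s: %s %s: %s" % (etxt, itype, item, msg)
```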

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

3ac3f5e4

Generalize docpp and sphinx_ext · 12637df5

Andrea Spadaccini authored 13 years ago


autotools/docpp
* handle generic custom directives in the form <class>_<kind>
* adapt handling of query fields

build/sphinx_ext.py
* add the BuildValuesDoc function to output definitions using the sphinx
  syntax that was already used for query fields
* adapt BuildQueryFields to use BuildValuesDoc

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12637df5

Oct 06, 2011

Use TemplateHaskell to create LUXI operations · a0090487

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

a0090487

Oct 05, 2011

Documentation update for ovfconverter · a17deeab

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

a17deeab

Fixes for ovfconverter + vmware · fa337742

Agata Murawska authored 13 years ago


Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

fa337742

Demote to warnings the errors in --ignore-errors · 980d1330

Andrea Spadaccini authored 13 years ago


Treat the gnt-cluster verify errors identified by the error codes in
--ignore-errors as warnings; just print a warning message for the user.
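
The demotion can be sketched as follows. The function name report, the
message format and the assumption that ignored codes no longer make the
verification fail are illustrative, not taken from the patch.

```python
def report(failures, ignore_errors):
    """Return (ok, messages) for a list of (error_code, text) failures."""
    ok = True
    messages = []
    for code, text in failures:
        if code in ignore_errors:
            # Ignored codes are only printed as warnings.
            messages.append("WARNING: %s: %s" % (code, text))
        else:
            messages.append("ERROR: %s: %s" % (code, text))
            ok = False
    return ok, messages
```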

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

980d1330

Add --ignore-errors parameter to cluster verify · 93f2399e

Andrea Spadaccini authored 13 years ago


lib/cli.py
- add IGNORE_ERROR_OPT;

client/gnt_cluster.py
- pass the ignore_errors parameter to the opcodes

lib/opcode.py
- update OpClusterVerifyConfig, OpClusterVerify and OpClusterVerifyGroup
  to accept the ignore_errors parameter

lib/cmdlib.py
- pass the ignore_errors parameter to the opcodes that need it

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

93f2399e

Move cluster verify error codes to constants · eedf99b5

Andrea Spadaccini authored 13 years ago


- move the cluster verify error codes from cmdlib._VerifyErrors to
  constants;
- add to each of them the CV (Cluster Verify) prefix;
- add the CV_ALL_ECODES and CV_ALL_ECODES_STRINGS constants;
- wrap the lines that exceed 80 characters after changing the error
  code names to the new ones.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

eedf99b5

Restore backend.GetMasterInfo return values order · 909b3a0e

Andrea Spadaccini authored 13 years ago


Change 5a8648eb changed the order of the return values of
backend.GetMasterInfo(). This broke the users of the master_info RPC.

This change restores the original order, and adds a comment in
bootstrap.py about the new value added to the return values of
master_info.
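
Why the order matters can be shown with a small sketch. The field
names and values are hypothetical, and it assumes the new value is
placed after the existing ones; callers that unpack by position then
keep working.

```python
def get_master_info_old():
    # Hypothetical original layout.
    return ("eth0", "192.0.2.1", "node1.example.com")


def get_master_info_new():
    # New value appended, so existing positions keep their meaning.
    return ("eth0", "192.0.2.1", "node1.example.com", 32)


def master_ip(info):
    """Caller written against the old layout: unpacks by position."""
    netdev, ip, node = info[:3]
    return ip
```

Inserting the new value anywhere but at the end would silently shift
the positions that such callers rely on.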

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

909b3a0e

Add cluster netmask parameter · 5a8648eb

Andrea Spadaccini authored 13 years ago


Add the master_netmask cluster parameter, which represents the netmask
of the master IP, encoded as a CIDR suffix.

This parameter can be set via the --master-netmask of gnt-cluster init
and gnt-cluster modify. The default behaviour is to be consistent with
the old default (/32 for IPv4 and /128 for IPv6).
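
The default described above can be sketched like this; the function
name default_master_netmask is an assumption, only the /32 and /128
values come from the text.

```python
import socket


def default_master_netmask(family):
    """Return the default CIDR suffix for the given address family."""
    if family == socket.AF_INET:
        return 32
    if family == socket.AF_INET6:
        return 128
    raise ValueError("Unknown address family: %r" % (family,))
```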

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

5a8648eb

Add ValidateNetmask and GetClass IPAddress methods · 7df2c4f0

Andrea Spadaccini authored 13 years ago


Add the following methods to netutils.IPAddress:
* ValidateNetmask
* GetClassFromIpVersion
* GetClassFromIpFamily

Also, add related tests to the test suite.
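
A rough idea of what a netmask check on a CIDR suffix involves is
sketched below. This is not the netutils code: the standalone function
validate_netmask, its address_bits parameter and the accepted range
(greater than zero, at most the address length) are assumptions.

```python
def validate_netmask(netmask, address_bits):
    """Return True if netmask is a usable CIDR suffix for the address size."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(netmask, bool) or not isinstance(netmask, int):
        return False
    return 0 < netmask <= address_bits
```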

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

7df2c4f0

Oct 04, 2011

Merge branch 'devel-2.5' · ea9c753d

Andrea Spadaccini authored 13 years ago


* devel-2.5:
  cluster-merge: log an info message at node readd
  Bump version to 2.5.0~rc1
  Fix issue when verifying cluster files
  Revert "utils.log: Write error messages to stderr"
  Fix adding nodes after commit 64c7b383
  LUClusterVerifyGroup: Spread SSH checks over more nodes
  Optimise cli.JobExecutor with many pending jobs

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

ea9c753d

Merge branch 'stable-2.5' into devel-2.5 · a080bab8

Andrea Spadaccini authored 13 years ago


Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

a080bab8

cluster-merge: log an info message at node readd · 419bb2ef

Guido Trotter authored 13 years ago


Node readd can take a long time; it's good to have info messages to
see progress.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

419bb2ef

Bump version to 2.5.0~rc1 · 07cea902

Michael Hanselmann authored 13 years ago


Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

07cea902

Fix Makefile rules for QCHelper.hs · 9822b1dd

Iustin Pop authored 13 years ago


Include QCHelper.hs in the distributed files, and also exclude it and
the THH.hs file from coverage reports.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>

9822b1dd

Fix issue when verifying cluster files · 170b02b7

Michael Hanselmann authored 13 years ago


If a cluster has any non-master-candidate nodes, those don't contain all
files (e.g. config.data). With commit aef59ae7 (March 31st, 2011)
the logic was changed and subsequently verifying a cluster with non-mc
nodes would complain.

This patch fixes this issue by changing the algorithm. It also adds an
additional check for files which shouldn't exist on a machine. A newly
added unittest is included.
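
The two checks (missing files, and files that should not exist) can be
sketched as set differences. The function name verify_files, the data
layout and the message texts are assumptions, not the cmdlib code.

```python
def verify_files(expected, present):
    """Compare where files should be with where they actually are.

    expected: filename -> set of nodes that must have the file
    present:  filename -> set of nodes that do have the file
    """
    problems = []
    for filename in sorted(expected):
        should = expected[filename]
        have = present.get(filename, set())
        for node in sorted(should - have):
            problems.append("File %s is missing from node %s" %
                            (filename, node))
        for node in sorted(have - should):
            problems.append("File %s should not exist on node %s" %
                            (filename, node))
    return problems
```

With this layout a file such as config.data is simply expected on the
master candidates only, so nodes that are not master candidates no
longer trigger a complaint for lacking it.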

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

170b02b7

Oct 03, 2011

Revert "utils.log: Write error messages to stderr" · d728ac75

Michael Hanselmann authored 13 years ago


This reverts commit 34aa8b7c. Writing
error messages to stderr would also include backtraces, something we
tried to avoid in the past.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

d728ac75

Fix adding nodes after commit 64c7b383 · ca6b16e5

Michael Hanselmann authored 13 years ago


Commit 64c7b383 changed the RPC call for verifying SSH connections.
Unfortunately this case in adding nodes was missed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

ca6b16e5

Some TH simplifications · 53664e15

Iustin Pop authored 13 years ago


Now that the basic code works, let's use some aliases for simpler code
and less ))))))))).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

53664e15

A few minor test improvements · 72bb6b4e

Iustin Pop authored 13 years ago


This patch adds a few niceties to the test suite:

- allow matching test groups case-insensitively, and emit warnings
  when we give test group names that don't match anything
- add a new operator that is similar to assertEqual in Python: it
  tests for equality and emits the two values in case of error

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

72bb6b4e

Use TemplateHaskell to decorate tests with names · 23fe06c2

Iustin Pop authored 13 years ago


This makes error messages change from "Test 4 failed …" to "Test
prop_Loader_mergeData failed", which is much more readable. It also
removes the duplication of test suite names in the test.hs file.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

23fe06c2

Use TemplateHaskell to generate opcode serialisation · 12c19659

Iustin Pop authored 13 years ago


This replaces the hand-coded opcode serialisation code with
auto-generation based on TemplateHaskell.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

12c19659

Use TemplateHaskell to build the opID function · 6111e296

Iustin Pop authored 13 years ago


This replaces the hand-coded opID with one automatically generated
from the constructor names, similar to the way Python does it, except
it's done at compilation time as opposed to runtime.

Again, the code line delta does not favour this patch, but this
replaces error-prone, manual code with auto-generated code; in case
we add more opcode support, this will help a lot.
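
For comparison, deriving an ID from a class name at runtime can be
sketched in Python as below. The helper name_to_id, the regular
expression and the "Op" prefix check are illustrative assumptions
rather than a copy of the Ganeti code.

```python
import re

# Matches the boundary between a lower-case and an upper-case letter.
_CAMEL_BOUNDARY_RE = re.compile("([a-z])([A-Z])")


def name_to_id(name):
    """Convert a name such as 'OpTestDelay' to 'OP_TEST_DELAY'."""
    if not name.startswith("Op"):
        raise ValueError("Name %r does not start with 'Op'" % name)
    return "OP_" + _CAMEL_BOUNDARY_RE.sub(r"\1_\2", name[2:]).upper()
```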

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

6111e296

Use TemplateHaskell instead of hand-coded instances · e9aaa3c6

Iustin Pop authored 13 years ago

This patch replaces the current hard-coded JSON instances (all alike,
just manual conversion to/from string) with auto-generated code based
on Template Haskell
(http://www.haskell.org/haskellwiki/Template_Haskell).

The reduction in code lines is not big: as the helper module is well
documented, overall we actually gain about 70 code lines. However, if
we ignore comments we're in good shape, and any future addition of
such data types will be much simpler and less error-prone.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

e9aaa3c6

Rename some helper functions for consistency · 2c9336a4

Iustin Pop authored 13 years ago


This changes the names for some helper functions so that future
patches are touching less unrelated code. The change replaces
shortened prefixes with the full type name.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

2c9336a4

Split part of Utils.hs into JSON.hs · f047f90f

Iustin Pop authored 13 years ago


Utils is a bit big, so let's split the JSON stuff (not all of it) into
a separate module that doesn't have any other dependencies.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>

f047f90f

Sep 30, 2011

LUClusterVerifyGroup: Spread SSH checks over more nodes · 64c7b383

Michael Hanselmann authored 13 years ago


When verifying a group the code would always check SSH to all nodes in
the same group, as well as the first node for every other group. On big
clusters this can cause issues since many nodes will try to connect to
the first node of another group at the same time. This patch changes the
algorithm to choose a different node every time.

A unittest for the selection algorithm is included.
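
One way to spread the checks, shown only as a sketch: rotate through
the nodes of every other group so that consecutive nodes of the
verified group get different targets. The function name
select_ssh_check_nodes and the data layout are assumptions, and the
real selection algorithm may differ.

```python
import itertools


def select_ssh_check_nodes(group_nodes, other_groups):
    """Map each node to one node from every other group, rotating choices.

    group_nodes:  names of the nodes in the group being verified
    other_groups: group name -> list of node names in that group
    """
    cyclers = [itertools.cycle(sorted(nodes))
               for _, nodes in sorted(other_groups.items()) if nodes]
    return {node: [next(c) for c in cyclers]
            for node in sorted(group_nodes)}
```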

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

64c7b383

Optimise cli.JobExecutor with many pending jobs · 11705e3d

Iustin Pop authored 13 years ago


In the case we submit many pending jobs (> 100) to the masterd, the
JobExecutor 'spams' the master daemon with status requests for the
status of all the jobs, even though in the end it will only choose a
single job for polling.

This is very sub-optimal, because when the master is busy processing
small/fast jobs, this query forces reading all of the jobs.
Restricting the 'window' of jobs that we query from the entire
set to a smaller subset makes a huge difference (masterd only, 0s
delay jobs, all jobs to tmpfs thus no I/O involved):

- submitting/waiting for 500 jobs:
  - before: ~21 s
  - after:   ~5 s
- submitting/waiting for 1K jobs:
  - before: ~76 s
  - after:   ~8 s

This is with a batch of 25 jobs. With a batch of 50 jobs, it goes from
8s to 12s. I think that choosing the 'best' job for nice output only
matters with a small number of jobs, and that for more than that
people will not actually watch the jobs. So changing from 'perfect
job' to 'best job in the first 25' should be OK.

Note that most jobs won't execute as fast as 0 delay, but this is
still a good improvement.
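
The windowing idea can be sketched as follows. The function name
choose_job, the query_status callback, the status strings and the
preference for a running job are assumptions for illustration; only
the window of 25 jobs comes from the text above.

```python
def choose_job(pending, query_status, window=25):
    """Pick a job to poll, looking only at the first `window` pending jobs.

    pending:      list of job IDs, in submission order
    query_status: callable taking a list of job IDs and returning their
                  statuses in the same order
    """
    candidates = pending[:window]
    if not candidates:
        return None
    # One status query, limited to the window instead of all jobs.
    statuses = query_status(candidates)
    for job_id, status in zip(candidates, statuses):
        if status == "running":
            return job_id
    # Nothing is running yet: fall back to the oldest pending job.
    return candidates[0]
```

The trade-off is the one described above: the choice is the best job
within the window, not the best job overall.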

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

11705e3d