Commits · 8eb148ae5f1189e390ea7bb1ff50de07bd8afc10 · itminedu / snf-ganeti

May 04, 2009

Fix gnt-cluster getmaster on non-master nodes · 8eb148ae

Iustin Pop authored 15 years ago


The current implementation of “gnt-cluster getmaster” doesn't work on
non-master nodes, which is a regression from 1.2. This patch implements
it (again) via ssconf.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Alexander Schreiber <als@google.com>

8eb148ae

Apr 24, 2009

gnt-instance info --all · 220cde0b

Guido Trotter authored 15 years ago

Don't show all instances info by default, but require --all to be passed
for this time consuming operation.

Reviewed-by: iustinp

220cde0b

Apr 15, 2009

A bunch of doc and other small fixes · 949bdabe

Iustin Pop authored 15 years ago

This patch fixes a number of issues reported both externally and
internally:
  - missing SGML tags (Issue 54), report and patch by superdupont
  - wrong variable used in the init.d script, report and patch by
    Karsten Keil <karsten-keil@t-online.de>
  - man page for gnt-instance reinstall needs clarification (Issue 56)
  - gnt-instance man page missing --disks documentation for
    replace-disks
  - gnt-node modify help output is unclear about the -C/-D/-O input
    format, and the man page doesn't document this command at all
  - “gnt-node modify -C yes” for offline or drained nodes had wrong
    error message
  - “gnt-instance reinstall --select-os” has wrong prompt, we only
    accept a number for the OS and not the template name

Reviewed-by: ultrotter

949bdabe

Apr 06, 2009

Disable synchronous (locking) queries · 77921a95

Iustin Pop authored 15 years ago

This patch raises an error in the master daemon in case the user
requests a locking query; accordingly, all clients were modified to send
only lockless queries. This is a short-term fix; for a proper fix the
clients should be modified to submit a job when the user requests a
locking query.

The other approach would be to ignore the flag passed by the client;
this would be worse, as clients wouldn't even get an error.

The possible impact of this is multiple:
  - some commands might not have been converted, and thus fail; this
    can be remedied easily
  - the consistency of commands is lost; e.g. node failover will not
    lock the node *while we get the node info*, so we could miss some
    data; this is again in the thread of atomic operations which are
    missing in the current model of query-and-act from gnt-* scripts

Reviewed-by: imsnah, ultrotter

77921a95

Mar 20, 2009

Raise on invalid gnt-cluster queue commands · 2e668b38

Guido Trotter authored 16 years ago

 # gnt-cluster queue foo
 Failure: prerequisites not met for this operation:
 Command 'foo' is not valid.

Reviewed-by: iustinp

2e668b38

Mar 12, 2009

Fix the --net option to gnt-instance add · dc30b0e4

Iustin Pop authored 16 years ago

Similar to the --disk fixes a while ago, --net is broken too. This patch
fixes it.

Reviewed-by: imsnah

dc30b0e4

Feb 24, 2009

Remove the extra_args parameter in instance start · 07813a9e

Iustin Pop authored 16 years ago

This patch removes the extra_args parameter and instead switches the
instance to the HV_KERNEL_ARGS hypervisor option.

This is a big change, but it's a needed cleanup: this extra parameter on
all RPC calls is not generic, and we also need to have a persistent
value here.

Reviewed-by: imsnah

07813a9e

gnt-instance info: remove hvattr descriptions · dfff41f8

Guido Trotter authored 16 years ago

Having hvattr descriptions is only confusing for the user, because even
if they explain better what an attribute is about, they don't help in
deciding what keyword should be used to actually set it. If in the
future we want descriptions they should probably live in constants.py,
and be displayed together with the key, rather than instead of it.

This patch also changes the handling of the vnc console connection
description, making it work with both kvm and xen-hvm.

Reviewed-by: iustinp

dfff41f8

Feb 13, 2009

Implement the backward-compatible ‘-s’ disk option · c0e4a2c3

Iustin Pop authored 16 years ago

This patch adds back to the instance creation command (gnt-instance add,
gnt-backup import) the ‘-s’ short form option for specifying a
single-disk instance.

Also a small bug in gnt-backup import is fixed.

Reviewed-by: ultrotter

c0e4a2c3

Feb 12, 2009

Some command line scripts fixes · f1de3563

Iustin Pop authored 16 years ago

This patch changes the gnt-node and gnt-job list commands to accept
arguments and list only the selected items, which is useful when there
are many nodes or jobs.

It also removes the “--units” option from gnt-job list as we don't
actually use it.

Reviewed-by: imsnah

f1de3563

Always use the same short option for iallocator · 633b36db

Iustin Pop authored 16 years ago

This patch changes the scripts so that the short name for the
“--iallocator” option is always ‘-I’.

Reviewed-by: ultrotter

633b36db

Some batcher fixes · 4082e6f9

Iustin Pop authored 16 years ago

Currently the batcher hypervisor parameter must be a dict with one
element (e.g. {"xen-hvm": { "acpi": true }}). This is overly complex and
hard to validate correctly; the patch splits it in two:
  - one "hypervisor" string parameter, with the name of the hypervisor
  - one "hvparams" dictionary, with the hypervisor parameters

The patch also changes the error handling in parsing the definition file
- since this is not a long-running file, we are less concerned with safe
closing of the file, and more with presenting meaningful error
messages.
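The split described above can be illustrated with a small sketch. The helper name `split_hypervisor_spec` is hypothetical and not part of the batcher; only the two shapes (the old one-element dict, the new "hypervisor" and "hvparams" keys) come from the message.

```python
def split_hypervisor_spec(old_spec):
    """Split the old one-element dict into (hypervisor, hvparams).

    Hypothetical helper for illustration only.
    """
    if not isinstance(old_spec, dict) or len(old_spec) != 1:
        raise ValueError("expected a dict with exactly one element")
    # Unpack the single (name, parameters) pair.
    (hypervisor, hvparams), = old_spec.items()
    return hypervisor, dict(hvparams)


# Old style: hypervisor name and parameters folded into one dict.
old = {"xen-hvm": {"acpi": True}}

# New style: one string parameter and one dictionary parameter.
hypervisor, hvparams = split_hypervisor_spec(old)
new = {"hypervisor": hypervisor, "hvparams": hvparams}
```

With two flat keys, each one can be validated on its own (a string and a dict), which is the simplification the message argues for.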

Reviewed-by: killerfoxi

4082e6f9

Feb 11, 2009

gnt-instance fix a typo in AddInstance · 6633774e

Guido Trotter authored 16 years ago

It's hvparams, not opts.hvparams.

Reviewed-by: iustinp

6633774e

gnt-cluster, pass hvparams directly to dict() · f8e7ddca

Guido Trotter authored 16 years ago

If hvparams is not set it will be [], so dict() will transform it to an
empty dict, which is safe in all cases.
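The behaviour relied on here is plain Python: `dict()` applied to an empty list yields an empty dict. A minimal illustration follows; `to_hvparams` is a hypothetical stand-in, and the assumption that a set value is a list of (key, value) pairs is ours, not stated in the message.

```python
def to_hvparams(hvparams_opt):
    """Hypothetical stand-in for what the script does with the option.

    Assumes the option is [] when unset, otherwise a list of
    (key, value) pairs.
    """
    return dict(hvparams_opt)


unset = to_hvparams([])
given = to_hvparams([("xen-hvm", {"acpi": True})])
```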

Reviewed-by: iustinp

f8e7ddca

Feb 10, 2009

Sort instance data in gnt-node info · ae07a1d3

Iustin Pop authored 16 years ago

The patch sorts the instance list in gnt-node info output, in order to
make it more readable (and stable).

Reviewed-by: imsnah

ae07a1d3

Some fixes to node add and re-add · 82e12743

Iustin Pop authored 16 years ago

The patch changes the pre-checks in node-add and re-add:
  - if the node is not already in the cluster, refuse to re-add
  - when re-adding, reuse the secondary IP from the cluster
    configuration
  - when re-adding, reset the offline and drained flags, so that RPC
    calls work (and we can actually upload the keys)

The patch also adds a missing log entry in LUSetNodeParams.

Reviewed-by: imsnah

82e12743

Instance parameters: force typing · a5728081

Guido Trotter authored 16 years ago

We want all the hv/be parameters to have a known type, rather than a
random mix of empty string, boolean values, and None, so we declare the
type of each variable and we enforce/convert it.

- Add some new constants for enforceable value types
- Add new constants dicts HVS_PARAMETER_TYPES and BES_PARAMETER_TYPES
  holding not only the valid parameters but also their types
- Drop the old HVS_PARAMETERS and BES_PARAMETERS constants and calculate
  the values from the type dict
- Convert all the default parameters to a valid type value
- Create a new ForceDictType utils function, to check/enforce a dict's
  element value types, with relevant unit tests
- Drop a few custom functions to check/convert the BE param types in
  utils and cli, in favor of ForceDictType
- Double-check the parameter types using ForceDictType in both scripts
  and LogicalUnits, when possible.

As a bonus:
- Remove some old commented-out code in gnt-instance
- Remove some already fixed FIXME
- Fix a bug which prevented VALUE_DEFAULT from being applied to BE
  parameters in SetInstanceParams because the value was checked for
  validity before that transformation was made
- Fix a bug which prevented initializing a cluster with hvparams from
  working at all
- ForceDictType accepts an allowed_values list of exceptions, which
  lets us do the checking even when some values must not be
  converted/typechecked (for example the 'default' string in
  SetInstanceParameters)
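As a rough sketch of the idea (not the actual ForceDictType code, which is not shown in this log), a dict type enforcer with an exception list could look like the following. The type tags, the function name `force_dict_type` and the accepted boolean spellings are all assumptions made for the example.

```python
# Hypothetical type tags; the real constant names are not shown here.
TYPE_BOOL = "bool"
TYPE_INT = "int"
TYPE_STRING = "string"


def force_dict_type(target, key_types, allowed_values=None):
    """Convert the values of target in place to the types in key_types.

    Values listed in allowed_values (e.g. "default") are left untouched.
    Raises ValueError on unknown keys or unconvertible values.
    """
    allowed_values = allowed_values or []
    for key, value in list(target.items()):
        if key not in key_types:
            raise ValueError("Unknown parameter '%s'" % key)
        if value in allowed_values:
            continue
        wanted = key_types[key]
        if wanted == TYPE_BOOL:
            if isinstance(value, str):
                lowered = value.lower()
                if lowered in ("yes", "true", "1"):
                    target[key] = True
                elif lowered in ("no", "false", "0"):
                    target[key] = False
                else:
                    raise ValueError("Invalid boolean for '%s'" % key)
            else:
                target[key] = bool(value)
        elif wanted == TYPE_INT:
            try:
                target[key] = int(value)
            except (TypeError, ValueError):
                raise ValueError("Invalid integer for '%s'" % key)
        elif wanted == TYPE_STRING:
            if not isinstance(value, str):
                raise ValueError("Invalid string for '%s'" % key)
        else:
            raise ValueError("Unknown type '%s' for '%s'" % (wanted, key))


params = {"memory": "512", "auto_balance": "yes", "vcpus": "default"}
types = {"memory": TYPE_INT, "auto_balance": TYPE_BOOL, "vcpus": TYPE_INT}
force_dict_type(params, types, allowed_values=["default"])
```

Declaring the types in one dict means the list of valid parameter names can be derived from its keys, which mirrors the point about dropping the separate parameter-name constants.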

Reviewed-by: iustinp

a5728081

Implement modification of the drained flag · c9d443ea

Iustin Pop authored 16 years ago

This patch adds LU and cli-level support for modification of the node
drained flag. It is similar to the offline changes.

Reviewed-by: imsnah

c9d443ea

Allow query of the drained node attribute · 0b2454b9

Iustin Pop authored 16 years ago

This patch exports the drained attribute:
  - LUQueryNodes accepts now the drained field
  - RAPI exports it for node objects
  - gnt-node info shows it now (along newly-added master_candidate and
    offline flags)
  - gnt-node list can list it (but not by default)
  - the iallocator scripts receive it as well

Reviewed-by: imsnah

0b2454b9

Feb 09, 2009

Add a new instance query flag ‘disk_usage’ · 024e157f

Iustin Pop authored 16 years ago

This patch adds a new instance query flag called disk_usage that
retrieves the overall space used by an instance on each of its nodes.
This can be used when balancing the cluster or checking N+1 status.

The flag is also exported in RAPI. Note the flag is currently broken for
file-based instances, as it represents the amount of space in the
cluster volume group.

Reviewed-by: ultrotter

024e157f

Export the cpu nodes and sockets from Xen · 0105bad3

Iustin Pop authored 16 years ago

This is a hand-picked forward patch of commit 1755 on the 1.2 branch
(hand-picked since the trees diverged too much since then):

    The patch changed the xen hypervisor to compute the number of cpu
    sockets/nodes and enables the command line and the RAPI to show this
    information (for RAPI it is enabled by default in node details, for
    gnt-node one can use the new “cnodes” and “csockets” fields).

    Originally-Reviewed-by: ultrotter

For the KVM and fake hypervisors, the patch just exports 1 for both
nodes and sockets. This can be fixed, by looking at the
/sys/devices/system/cpu/cpuN/topology directories, and computing the
actual information, but that should be done in a separate patch.
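A possible shape for that follow-up, for sockets only, is sketched below. It is not taken from any Ganeti patch: the function name and the fallback are assumptions, and it relies on the Linux `physical_package_id` file found in each `cpuN/topology` directory. NUMA nodes are not covered by this sketch.

```python
import glob
import os


def count_cpu_sockets(sysfs_root="/sys/devices/system/cpu"):
    """Count distinct physical packages listed under sysfs_root.

    Hypothetical helper; sysfs_root is a parameter so the function can
    be pointed at a test directory.
    """
    packages = set()
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "topology",
                           "physical_package_id")
    for path in glob.glob(pattern):
        with open(path) as handle:
            packages.add(handle.read().strip())
    # Fall back to 1 when nothing is readable, matching the value the
    # patch exports today.
    return len(packages) or 1
```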

Reviewed-by: imsnah

0105bad3

Feb 05, 2009

Fix some issues for lockless queries · 2e7b8369

Iustin Pop authored 16 years ago

This patch converts some more jobs with only queries into cheaper luxi
queries (no job created), and fixes some fallout from the lockless
queries changes.

Reviewed-by: ultrotter

2e7b8369

Feb 04, 2009

Enable lockless node queries · bc8e4a1a

Iustin Pop authored 16 years ago

Similar to the instance list, this patch enables lockless node queries.
“gnt-node list” now accepts the “--sync” flag, which enables locking;
the default is lockless.

Reviewed-by: imsnah

bc8e4a1a

ssconf: add some more keys and some fixes · 81a49123

Iustin Pop authored 16 years ago

This patch adds the online node list and instance list to the ssconf
keys. In order to distribute the instance list correctly, we need to
update the cluster serial number on instance additions and removals.

The patch also changes the permissions on the ssconf files to be 0444:
  - no write for root, in order to signal that these files should not be
    modified
  - read for everyone since the files don't contain sensitive data
    anymore (and permissions can be controlled via the parent directory
    if needed)

The patch also fixes a small typo on gnt-cluster.

Reviewed-by: ultrotter

81a49123

Implement lockless query operations · ec79568d

Iustin Pop authored 16 years ago

This patch adds the framework for, and enables lockless OpQueryInstances. This
means that instances will be shown in ERROR_up or ERROR_down state, even though
this is not an error (but just an in-progress job).

The framework is implemented as follows:
  - the OpQueryInstances, OpQueryNodes and OpQueryExports opcodes take
    an additional “use_locking” flag which will denote whether to lock
    or not; this patch only implements this for LUQueryInstances
  - the luxi query functions take an additional argument use_locking
    which is passed to the master daemon, and then passed to the above
    opcodes
  - cli.py exports a new SYNC_OPT command line option which implements
    setting this flag to true
  - except for gnt-instance list, which uses this option, and for
    name-only queries (e.g. QueryNodes(fields=["names"])), all other
    callers are setting this flag to True
  - RAPI also sets the flag to True

The patch was tested with a continuous (0.2s sleep in-between)
gnt-instance list during a burnin, and no problems were observed.

Reviewed-by: ultrotter

ec79568d

Feb 03, 2009

Allow gnt-node evacuate to use an iallocator · c4ed32cb

Iustin Pop authored 16 years ago

This is a partial implementation of fully automated node evacuation:
we allow passing an iallocator and all instance replace-disks will be
executed via that iallocator.

The individual OpReplaceDisks opcodes are submitted in a single job,
which causes them to be executed serially and thus keeps the iallocator
runs consistent. This also changes the behaviour so that the first
reallocation that failed will stop all the reallocations.

Reviewed-by: ultrotter

c4ed32cb

Add gnt-node migrate · 40ef0ed6

Iustin Pop authored 16 years ago

This is a (modified) forward-port of commit 1190 on the 1.2 branch:

  This is the same as gnt-node failover, and is also a cut&paste of its
  code (almost). It will be really really useful to quickly empty a
  healthy node. I can be persuaded to merge MigrateNode and FailoverNode
  in a common codebase, but could also forget about it and submit it if
  nobody cares.

  Reviewed-by: iustinp

The original MigrateNode function has been converted to the 2.0 style
(cli.JobExecutor). Also commit 2076 has been added that fixes a missing
opcode parameter.

Original-author: ultrotter
Reviewed-by: ultrotter

40ef0ed6

An attempt at fixing some encoding issues · 26f15862

Iustin Pop authored 16 years ago

This patch unifies the hardcoded re-encoding attempts into a single
function in utils.py. This function is used to take either a unicode or
str object and convert it to an ASCII-only str object which can be
safely displayed and transmitted.

We then replace the current manual re-encodings with this function. In
mcpu we stop re-encoding the hooks output and instead we do it right at
the hook generation in backend.py.

This passes on my 'custom' lvs output with non-ASCII chars. But there
are probably other places we will need to fix.
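To make the idea concrete, here is a small sketch of such a helper. It is written for Python 3 (bytes and str rather than the str and unicode of the original code), the name `safe_encode` is hypothetical, and the choice of backslash escapes for non-ASCII characters is an assumption; the log does not describe the real escaping rules.

```python
def safe_encode(text):
    """Return an ASCII-only str for either bytes or str input.

    Hypothetical helper: non-ASCII characters are replaced by backslash
    escapes rather than dropped.
    """
    if isinstance(text, bytes):
        # Undecodable bytes become \xNN escapes instead of raising.
        text = text.decode("utf-8", errors="backslashreplace")
    return text.encode("ascii", errors="backslashreplace").decode("ascii")
```

Having one function means every caller produces the same escaped form, instead of each call site choosing its own codec and error handling.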

Reviewed-by: ultrotter

26f15862

Feb 01, 2009

gnt-instance: support no_PARAMETER value · e9d622bc

Guido Trotter authored 16 years ago

Since parameters get set to False when prefixed with no_, don't try to
interpret those boolean values; pass them unchanged.

Reviewed-by: iustinp

e9d622bc

Jan 29, 2009

gnt-instance list: accept input names · 5ffaa51d

Iustin Pop authored 16 years ago

Currently gnt-instance list will refuse to take arguments, and always
return the full list of instances. This patch allows it to pass names to
LUQueryInstances, so that we restrict the input to a given set of
instances.

Reviewed-by: ultrotter

5ffaa51d

Check that instance exists before confirm. queries · a76f0c4a

Iustin Pop authored 16 years ago

Currently we ask the user for confirmation, and only afterwards (try to)
remove, failover or migrate the instance. This doesn't work nicely if
the instance doesn't exist, so we make a query for the instance before
the prompt, which will throw an error in case it doesn't exist.

Side-note: the way the query works today is not really nice. It would be
better if we could query explicitly for a missing instance name, so that
this is done cleaner (explicit check) instead of side-effect (throw
exception). We do add code for this explicit check, except that today it
won't be used actually.

Reviewed-by: ultrotter

a76f0c4a

Jan 27, 2009

Rework the multi-instance gnt commands · 479636a3

Iustin Pop authored 16 years ago

This patch changes the multi-instance gnt-* commands (gnt-instance
start/stop, gnt-node evacuate/failover) such that the individual
operations are submitted in parallel, ideally improving the speed of the
execution.

The patch does this by abstracting the job set functionality into a new
class in cli.py, that takes care of the job submit, job poll and error
handling.
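The submit-all-then-poll pattern described above might look roughly like this. It is a sketch under stated assumptions, not the class added to cli.py: the name `JobSet` and the `submit` and `poll` callables are hypothetical placeholders for whatever the client library provides.

```python
class JobSet:
    """Queue named jobs, submit them all, then poll each for its result.

    Hypothetical sketch: submit(ops) must return a job id and
    poll(job_id) must return the result or raise on failure.
    """

    def __init__(self, submit, poll):
        self._submit = submit
        self._poll = poll
        self._queue = []

    def queue(self, name, ops):
        """Remember a job to be submitted later."""
        self._queue.append((name, ops))

    def run(self):
        """Return a list of (name, success, result_or_error) tuples."""
        # Submit everything first so the jobs can run in parallel.
        submitted = [(name, self._submit(ops)) for name, ops in self._queue]
        results = []
        for name, job_id in submitted:
            try:
                results.append((name, True, self._poll(job_id)))
            except Exception as err:
                # One failed job does not stop polling of the others.
                results.append((name, False, str(err)))
        return results
```

Separating submission from polling is what allows the individual operations to overlap; error handling lives in one place instead of in every command.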

Reviewed-by: ultrotter

479636a3

Jan 23, 2009

Sort the instance names in batcher · 7312b33d

Iustin Pop authored 16 years ago

In case we submit multiple instances via batcher, it's nicer to have
them sorted.

Reviewed-by: imsnah

7312b33d

Fix batcher for 2.0-style disks and nics · 9939547b

Iustin Pop authored 16 years ago

This patch fixes the gnt-instance batch-create command, and in doing so
also slightly changes two other functions:
  - we change utils.ParseUnit so that it accepts integer values also
    (both ParseUnit(5) and ParseUnit("5") return the same value)
  - a bridge 'None' in LUCreateInstance will be converted to the default
    bridge; currently only missing bridges will be accepted to mean the
    default one

The main changes to batcher were the change to variable number of disks
and NICs.

The patch also adds a batcher-instances.json example file copied from
the 1.2 branch and properly modified.
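The point about integers and strings parsing to the same value can be shown with a simplified sketch. This is not utils.ParseUnit itself: the name `parse_unit`, the suffix set and the choice of mebibytes as the base unit are assumptions for the example, and any rounding the real function may do is left out.

```python
import re

# Multipliers to mebibytes; the suffix set here is an assumption.
_UNITS = {"": 1, "m": 1, "g": 1024, "t": 1024 * 1024}


def parse_unit(value):
    """Parse 5, "5", "5m" or "2g" into an integer number of mebibytes."""
    # str() lets integers go through the same path as strings.
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]?)\s*", str(value))
    if not match:
        raise ValueError("Invalid format: %r" % (value,))
    number, suffix = match.groups()
    suffix = suffix.lower()
    if suffix not in _UNITS:
        raise ValueError("Unknown unit: %r" % suffix)
    return int(float(number) * _UNITS[suffix])
```

Accepting integers matters for a JSON definition file, where a disk size may be written either as a number or as a quoted string.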

Reviewed-by: imsnah, killerfoxi

9939547b

Make iallocator work with offline nodes · 1325da74

Iustin Pop authored 16 years ago

This patch changes the iallocator framework to work with and properly
export to plugins offline nodes. It does this by only exporting the
static configuration data for those nodes, and not attempting to parse
the runtime data.

The patch also fixes bugs in iallocator related to the RpcResult
conversion, changes the should_run to admin_up attribute name (as per
the internals change), and adds “-I” as a short option for
“--iallocator” in gnt-instance, gnt-backup and burnin.

Reviewed-by: ultrotter

1325da74

A couple of small fixes to iallocator · 8901997e

Iustin Pop authored 16 years ago

This removes some constraints:
  - only two disks supported, this is no longer true as the underlying
    functions can now compute size for a variable number of disks
  - error when the hypervisor was not being passed
  - typo error

Reviewed-by: imsnah

8901997e

Jan 20, 2009

Fix a couple of epydoc warnings · 2f907a8c

Iustin Pop authored 16 years ago

Reviewed-by: ultrotter

2f907a8c

Jan 19, 2009

Move the default MAC prefix to the constants file · c5e489f7

Iustin Pop authored 16 years ago

Instead of having the default live in the gnt-cluster script, we move it
to the constants file. The patch also fixes a typo on constants.py.

Reviewed-by: ultrotter

c5e489f7

Jan 13, 2009

Forward port the live migration from 1.2 branch · 53c776b5

Iustin Pop authored 16 years ago

This is a forward port via copy (rather than cherry-picking individual
patches) of the latest code on the 1.2 branch related to the migration.

The changes compared to 1.2 are the fact that we don't need the
IdentifyDisks step anymore (the drbd rpc calls are independent now), and
the rpc module improvements.

Reviewed-by: ultrotter

53c776b5

Jan 12, 2009

Skip offline nodes in gnt-cluster commands · 4040a784

Iustin Pop authored 16 years ago

This patch makes gnt-cluster copyfile and command skip the offline
nodes.

Reviewed-by: ultrotter, imsnah

4040a784