- Aug 08, 2011
-
-
Michael Hanselmann authored
Short: this patch enables the use of “gnt-instance list '*.site'”. Detailed description: This patch changes the command line interface code to try to deduce the kind of filter from the arguments to a “list” command. If it's a list of plain names, an old-style name filter is used. If filtering is forced or the single argument is potentially a filter, it is parsed as a query filter string. Any name looking like a globbing pattern (e.g. “*.site” or “web?.example.com”) is treated as such. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
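A minimal sketch of the deduction described above, in Python with hypothetical helper names (the real CLI code builds structured query filters rather than strings):

    import string

    # Characters that may appear in a (possibly globbed) name; anything
    # else suggests the argument is a full query filter string.
    _NAME_CHARS = frozenset(string.ascii_letters + string.digits + ".-_*?[]")

    def deduce_filter(args, force_filter=False):
        """Guess how the arguments of a "list" command should be interpreted."""
        if force_filter or (len(args) == 1 and not frozenset(args[0]) <= _NAME_CHARS):
            # Forced, or the single argument can't be a plain name:
            # parse it as a query filter string.
            return ("filter", args[0])
        # Otherwise build a name filter; names containing globbing
        # characters ("*.site", "web?.example.com") become glob matches.
        parts = []
        for name in args:
            if frozenset(name) & frozenset("*?[]"):
                parts.append('name =* "%s"' % name)
            else:
                parts.append('name == "%s"' % name)
        return ("names", " or ".join(parts))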
-
Guido Trotter authored
Sometimes it's good to tell the user about parameter differences but then proceed anyway. Strictness is still enforced for those parameters that would break the cluster (volume group name, storage dir if file storage is enabled). Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Guido Trotter authored
There's no point in checking whether the file storage dir in the two clusters is the same if file storage is not even enabled. Signed-off-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Currently, the IAllocator code requests strictly that the (set of) groups of the nodes we're relocating from is equal to the set of groups we're relocating to. This, however, makes it impossible to fix split instances, since (by definition) the secondary of a split instance is not in the same group as the primary node, whereas after the fix it is. The patch changes the test from group equality to check that the final group set (across both primary and secondary nodes) is a subset of the initial group set (again across both nodes). This means we can't “extend” the set of groups, but keeping it the same or shrinking it is allowed. After this patch, one can finally fix (automatically) split instances via gnt-instance replace-disks. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
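A rough illustration of the relaxed check using plain Python sets; the names are illustrative, not the actual IAllocator code:

    def check_relocation_groups(initial_groups, final_groups):
        """Allow the group set to stay the same or shrink, never grow.

        initial_groups: groups of the instance's nodes before relocation
        final_groups: groups across primary and secondary afterwards
        """
        # Old behaviour: set(final_groups) == set(initial_groups)
        # New behaviour: the final set may not contain any new group.
        if not set(final_groups) <= set(initial_groups):
            raise ValueError("relocation result uses groups outside the original set")

    # Fixing a split instance (secondary moves into the primary's group):
    check_relocation_groups(["group1", "group2"], ["group1"])  # passes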
-
Iustin Pop authored
As discussed offline, the new node-change mode could be used for evacuation, but it's not directly useful as it returns a list of opcodes; therefore, we need to partially revert commits fbe5fcf6 and 5b53ca79 that removed it (and multi-evacuate, which remains removed). The new version of relocate is actually just a wrapper over tryNodeEvac (which does the node evacuation); we run that and then do some extra checks that the nodes we got from it are consistent with the instance's new state. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Commit f0edfcf6 removed the parsing of the multi-evacuate result, but the code went from:
  if mode in (multi-evac, relocate): …
    if mode == relocate: …
to:
  if mode == relocate: …
    if mode == relocate: …
This patch simply removes the now-redundant nested if. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Iustin Pop authored
Commit 342f9172 added stricter checks for the iallocator result in evacuate mode, but it does this irrespective of the result status. When the result has failed and (according to the design) the list of nodes is empty, this code will trigger the following:
  node1# gnt-instance replace-disks -I hail instance14
  Failure: command execution error: Groups of nodes returned by iallocator () differ from original groups (default)
After the patch, the result is:
  node1# gnt-instance replace-disks -I hail instance14
  Failure: prerequisites not met for this operation: error type: insufficient_resources, error details: Can't compute nodes using iallocator 'hail': Request failed: … Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
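A hedged sketch of the ordering fix, with made-up names and assuming the usual success/info/result fields of an iallocator response:

    class IAllocatorFailure(Exception):
        """Stand-in for the prerequisite error raised by the LU."""

    def get_evacuation_nodes(result):
        """Validate an evacuate-mode iallocator result, checking status first."""
        # Look at the overall status before inspecting the node list; on
        # failure the list is empty by design and a group-consistency check
        # would only produce a misleading error message.
        if not result.get("success"):
            raise IAllocatorFailure("Can't compute nodes using iallocator: %s"
                                    % result.get("info"))
        return result["result"]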
-
- Aug 05, 2011
-
-
Michael Hanselmann authored
The operators “=*” and “!*” do globbing in filters, e.g.: $ gnt-instance list --no-headers -o name 'name =* "*.site"' inst1.site.example.com Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
The docstring of the DRBD8 class says: “… The meta device is checked for valid size and is zeroed on create.” This zeroing is not done today, hence we have http://code.google.com/p/ganeti/issues/detail?id=182 :
  node1# mkreiserfs -f /dev/xenvg/t8
  …
  ReiserFS is successfully created on /dev/xenvg/t8.
  node1# drbdmeta --force /dev/drbd256 v08 /dev/xenvg/t8 0 create-md
  md_offset 0
  al_offset 4096
  bm_offset 36864
  Found reiser filesystem
  This would corrupt existing data.
  If you want me to do this, you need to zero out the first part of the device (destroy the content).
  You should be very sure that you mean it.
  Operation refused.
I've tested that even just 1MB is enough to wipe the meta, but let's be safer and pass a 'clean' meta device to drbd. Note: I didn't copy _WipeDevice from backend.py since it seemed more complex than needed here. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
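A minimal sketch of zeroing the start of the meta device before handing it to drbdmeta; the function name is hypothetical and Ganeti's own code differs:

    import os

    def zero_device_start(path, size_mb=1):
        """Overwrite the first size_mb MiB of a block device with zeroes."""
        chunk = b"\0" * (1024 * 1024)
        # Open read-write without truncating; block devices can't be truncated.
        with open(path, "r+b") as dev:
            for _ in range(size_mb):
                dev.write(chunk)
            dev.flush()
            os.fsync(dev.fileno())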
-
Michael Hanselmann authored
It is no longer used and has been deprecated in 2.5. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This was never used by a stable version. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
The first argument to str.split is the separator, not the maximum number of splits. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
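For reference, the argument order the fix relies on (plain Python, nothing Ganeti-specific):

    # str.split(sep, maxsplit): the separator comes first.
    print("key=value=more".split("=", 1))   # ['key', 'value=more']

    # Passing the maxsplit count where the separator belongs fails:
    # "key=value".split(1)                  # raises TypeError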
-
Michael Hanselmann authored
… instead of getting the list of instances once again from the configuration. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Each per-group watcher process writes its own instance status file. Once that's done, it tries to acquire an exclusive lock on the global file and proceeds to read all status files, merging them based on each file's mtime. If an instance is moved to another group, the newer status will supersede that of an older file which hasn't yet been updated. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
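A simplified sketch of the mtime-based merge; the file format assumed here (one "instance status" pair per line) is illustrative:

    import os

    def merge_instance_status(paths):
        """Merge per-group status files; the newest file wins per instance."""
        merged = {}  # instance name -> (mtime, status)
        for path in paths:
            mtime = os.stat(path).st_mtime
            with open(path) as statefile:
                for line in statefile:
                    name, status = line.split(None, 1)
                    if name not in merged or merged[name][0] < mtime:
                        merged[name] = (mtime, status.strip())
        return dict((name, status) for name, (_, status) in merged.items())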
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This will be used by the watcher to store the file's fstat(2). It must be done from the filehandle. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
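The point about using the filehandle, shown with the standard library only (the temporary file is just for the example):

    import os
    import tempfile

    with tempfile.NamedTemporaryFile() as statefile:
        # fstat(2) on the open descriptor describes the file that was
        # actually opened, even if the path is later renamed or removed.
        st = os.fstat(statefile.fileno())
        print(st.st_ino, st.st_mtime)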
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 04, 2011
-
-
Agata Murawska authored
Signed-off-by:
Agata Murawska <agatamurawska@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
We just need an object that has a list_owned method. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Apollon Oikonomopoulos authored
Remove the 15-second sleep when wait_for_sync is not set. LUInstanceCreate already calls _WaitForSync with oneshot=True, which performs its own internal wait-loop for disks to start syncing. Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
lu.glm.list_owned becomes lu.owned_locks, which is clearer for the reader. Also rename three variables (previously all named owned_locks) to make it clearer what they track. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
The module is called “objects”, not “object”. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This is quite similar to evacuating a group, but the locking is different. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Michael Hanselmann authored
Watcher state files can stay around if node groups are removed. With this patch they're removed after 21 days. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
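The cleanup amounts to something like the following sketch; the path pattern and helper name are illustrative, not the actual watcher code:

    import glob
    import os
    import time

    MAX_AGE = 21 * 24 * 3600  # 21 days, in seconds

    def remove_stale_state_files(pattern="/var/lib/ganeti/watcher.*.data"):
        """Delete watcher state files whose mtime is older than MAX_AGE."""
        cutoff = time.time() - MAX_AGE
        for path in glob.glob(pattern):
            if os.stat(path).st_mtime < cutoff:
                os.unlink(path)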
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
This patch brings a huge change to ganeti-watcher to make it aware of node groups. Each node group is processed in its own subprocess, reducing the impact of long-running operations. The global watcher state file, $datadir/ganeti/watcher.data, is replaced with a state file per node group ($datadir/ganeti/watcher.${uuid}.data). Previously a lock on the state file was used to ensure only one instance of the watcher was running at the same time. Some operations, e.g. “gnt-cluster renew-crypto”, blocked the watcher by acquiring an exclusive lock on the state file. Since the watcher processes now use different files, this method is no longer usable; locking multiple files isn't atomic. Instead, a dedicated lock file is used and every watcher process acquires a shared lock on it. If a Ganeti command wants to block the watcher, it acquires the lock in exclusive mode. Each per-nodegroup watcher process also acquires an exclusive lock on its state file, which prevents multiple watchers from running for the same nodegroup. The code is reorganized heavily to clear up dependencies between functions and to get rid of the global “client” variable. The utility class “Watcher” is removed in favour of stand-alone utility functions. Since the parent watcher process won't wait for its children by default, a new option (--wait-children) was added. It is used, for example, by QA. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
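A condensed sketch of the locking scheme using POSIX file locks; the function and path handling are illustrative:

    import fcntl

    def acquire_watcher_lock(path, block_watcher=False):
        """Lock the dedicated watcher lock file.

        Watcher processes take a shared lock, so per-group watchers can
        run concurrently; a command that needs to block the watcher
        (e.g. renew-crypto) takes an exclusive lock instead.
        """
        lockfile = open(path, "w")
        mode = fcntl.LOCK_EX if block_watcher else fcntl.LOCK_SH
        fcntl.flock(lockfile, mode)
        return lockfile  # keep the handle open to hold the lock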
-
Michael Hanselmann authored
All potential target nodes should be locked while calculating a group evacuation. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
- Use OpPrereqError in CheckPrereq
- Clarify command synopsis
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
The same logic will be used for changing an instance's group. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Including the designs which were actually implemented. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Aug 03, 2011
-
-
Apollon Oikonomopoulos authored
When wait_for_sync is set to False in LUInstanceCreate, Ganeti lets DRBD sync in the background while performing the rest of the installation steps, including OS installation. However, OS installation is a very disk-intensive task that interferes badly with the background I/O caused by DRBD's initial sync. To avoid this, we pause the background sync before OS installation and unpause it afterwards, which yields a significant speed boost for OS installation. The following should be noted: a) The user has requested not to wait for sync, i.e. the instance will be non-redundant for an unspecified interval anyway, and delaying this by a couple of minutes is not a big compromise. b) This approach is also followed during disk wiping. Signed-off-by:
Apollon Oikonomopoulos <apollon@noc.grnet.gr> [iustin@google.com: simplify an if check] Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
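Schematically, the change wraps OS installation in a pause/resume pair; a hedged sketch with caller-supplied callables standing in for the real node RPCs:

    def install_os_with_paused_sync(pause_sync, resume_sync, run_os_install):
        """Pause DRBD's background resync while the OS installer runs."""
        pause_sync()
        try:
            run_os_install()
        finally:
            # Always restart the resync, even if OS installation failed.
            resume_sync()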
-