Commits · 05ccd98357162c2947c0984a82b44f7415a13ba4 · itminedu / snf-ganeti

Jun 27, 2008

LUAddNode: use node-verify to check node hostname · 5c0527ed

Guido Trotter authored 16 years ago

As we can't use ssh.VerifyNodeHostname directly, we'll set up a mini
node-verify to do checking between the master and the new node. In the
future networking checks, or more nodes, can be added as well.

Reviewed-by: iustinp

5c0527ed

LUAddNode: use self.sstore, not a local ss · 3d1e7706

Guido Trotter authored 16 years ago

Since we're inside a LU we have access to self.sstore.
No need to use ss, whose separate instantiation will disappear in a few
patches! ;)

Reviewed-by: iustinp

3d1e7706

LUAddNode: upload files via rpc, not scp · b5602d15

Guido Trotter authored 16 years ago

We used to scp all the ssconf files, and the vnc password file to the
new node. With this patch we use the upload_file rpc, specifying just
the new node as a destination. All the files previously copied by scp
are already allowed by the backend.

Reviewed-by: iustinp

b5602d15

Allow VNC_PASSWORD_FILE to be rpc-uploaded · 90fae627

Guido Trotter authored 16 years ago

What could possibly go wrong?

Reviewed-by: iustinp

90fae627

Change fping to TcpPing in two LUs · 937f983d

Guido Trotter authored 16 years ago

Two LUs use RunCmd to call fping in order to check whether an IP is
present on the network. Replacing it with TcpPing removes the external
command, so the check no longer breaks in the new world order, where the
master cannot fork.

Reviewed-by: iustinp

937f983d
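
As an illustration of the idea, a TCP-based reachability check needs no child process, unlike running fping through RunCmd. The sketch below is not the Ganeti TcpPing code: the name `tcp_ping`, its parameters, and the choice to treat every connection failure as "not present" are assumptions made for this example.

```python
import socket


def tcp_ping(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper: it runs entirely in-process, so it works even
    when the caller is not allowed to fork external commands.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as "not present".
        return False
```

A caller would probe a port that is expected to be open on the target address, for example `tcp_ping("192.0.2.10", 22)`.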

raise QuitGanetiException in LeaveCluster · 6d8b6238

Guido Trotter authored 16 years ago

Reviewed-by: iustinp

6d8b6238

Simplify QuitGanetiException instantiation · 9f9c8ee2

Guido Trotter authored 16 years ago

Rather than packing all the arguments in a tuple, let's pass them
plainly. The superclass won't complain.

Reviewed-by: iustinp

9f9c8ee2

logger: Set formatter for stderr · 5023934a

Michael Hanselmann authored 16 years ago

Having a timestamp on log messages is very useful. The default
format string doesn't include a timestamp.

Reviewed-by: ultrotter

5023934a
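
A minimal sketch of what setting a timestamped formatter on a stderr handler can look like with the standard logging module. The helper name `make_stderr_handler` and the exact format string are assumptions for illustration, not the ones used in the patch.

```python
import logging
import sys


def make_stderr_handler():
    """Build a stderr handler whose messages start with a timestamp."""
    handler = logging.StreamHandler(sys.stderr)
    # %(asctime)s adds the timestamp that the default format lacks.
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler
```

It would typically be attached with `logging.getLogger().addHandler(make_stderr_handler())`.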

Jun 26, 2008

When removing a node don't ssh to it · d489ca4f

Guido Trotter authored 17 years ago

Even in 1.2 this behaviour is broken, as the rpc call will remove the
ssh keys before we get a chance to log in. Now the rpc takes care of
shutting down the node daemon as well, so we definitely can avoid this.

This makes the LURemoveNode operation work again with the threaded
master daemon.

Reviewed-by: iustinp

d489ca4f

Add errors.QuitGanetiException · e50bdd68

Guido Trotter authored 17 years ago

This exception does not signal an error but serves the purpose of making
the ganeti daemon shut down after handling a request. Currently it will
be used by ganeti-noded but in the future ganeti-masterd might make use
of it as well. Its usage is documented in the docstring.

Reviewed-by: iustinp

e50bdd68
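
The pattern described above, an exception that is not an error but asks the daemon to stop once the current request has been answered, can be sketched as follows. `QuitDaemon` and `serve` are hypothetical names, and the two-argument form (error flag, result) is an assumption made for this example rather than a statement about the actual class.

```python
class QuitDaemon(Exception):
    """Hypothetical stand-in for errors.QuitGanetiException.

    Raised with two plain arguments: whether the caller should see an
    error, and the result to return for the current request.
    """


def serve(requests, handler):
    """Handle requests in order; stop after one raises QuitDaemon."""
    results = []
    for request in requests:
        try:
            results.append((True, handler(request)))
        except QuitDaemon as err:
            is_error, result = err.args
            # The request still gets an answer before the loop ends.
            results.append((not is_error, result))
            break
    return results
```

Each result is a (success, payload) pair; requests queued after the quitting one are never handled.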

Add missing empty line in SshKeyError's docstring · b0059682

Guido Trotter authored 17 years ago

Reviewed-by: iustinp

b0059682

Remove spurious check during LUAddNode · 49abbd3e

Guido Trotter authored 17 years ago

There is no point in checking whether the cluster VNC password file
exists as a prerequisite for AddNode, considering the check happens on
the master node, not the target one. Removing this check.

Reviewed-by: iustinp

49abbd3e

Improve LURemoveNode BuildHooksEnv docstring · d08869ee

Guido Trotter authored 17 years ago

Reviewed-by: iustinp

d08869ee

Jun 25, 2008

Cleanup old DRBD 0.7.x code · 00fb8246

Michael Hanselmann authored 17 years ago

Apparently there were still some leftovers. While removing an instance,
I got the message "unhandled exception 'module' object has no attribute
'LD_MD_R1'".

Reviewed-by: iustinp

00fb8246

Cleanup LV status computation · 99e8295c

Iustin Pop authored 17 years ago

Currently, when seeing if a LV is degraded or not (i.e. virtual volume),
we first attach to the device (which does an lvdisplay), then do a lvs
in order to display the lv_attr. This generates two external commands to
do (almost) the same thing.

This patch changes the Attach() method for LVs to call lvs and display
both the major/minor (needed for attach) and the lv_status (needed for
GetSyncStatus). Thus, later in GetSyncStatus, we don't need to run lvs
again, and instead just return the value computed in Attach().

Reviewed-by: imsnah

99e8295c

Jun 23, 2008

Remove lib/Makefile.libcommon · 5878b1b5

Michael Hanselmann authored 17 years ago

Reviewed-by: iustinp

5878b1b5

Fix gnt-cluster “command” and “copyfile” · b3989551

Iustin Pop authored 17 years ago

Since the disabling of forking in the master daemon, the two ssh-based
subcommands were not working anymore. However, there is no need at all
for the commands to be run from the master daemon (permissions to read
the cluster private ssh key notwithstanding), they can be run directly
from the command line utilities.

The patch removes the two opcodes OpRunClusterCommand and
OpClusterCopyFile (and their associated LUs) and changes the code in
‘gnt-cluster’ to query the list of nodes and run the SshRunner directly
over the list. As such, all forking is done from the gnt-cluster script,
and the commands are working again.

Reviewed-by: imsnah

b3989551

objects: Remove config_version from cluster configuration · cf9cb46a

Michael Hanselmann authored 17 years ago

Reviewed-by: ultrotter

cf9cb46a

Add functions to calculate version number to constants.py · 1b45f4e5

Michael Hanselmann authored 17 years ago

In cfgupgrade, we need to extract parts of and build new version numbers.

Reviewed-by: iustinp

1b45f4e5

utils.WriteFile: Remove optional check_abspath parameter · 04a8d789

Michael Hanselmann authored 17 years ago

cfgupgrade will not work with relative paths at all, but rather get them
from constants.py.

Reviewed-by: iustinp

04a8d789

Jun 22, 2008

Add a ‘tags’ field to instance and node listing · 130a6a6f

Iustin Pop authored 17 years ago

Currently there isn't any easy way to list all nodes or instances and
their tags; you have to query each node in turn, or list all the tags
via something like “gnt-cluster search-tags '.*'”. Of course, this is
not optimal.

The patch adds a new field to “gnt-instance list” and “gnt-node list”
called ‘tags’, that will list the tags of the object in comma-separated
form. This field will be empty if there are no tags (when using a
separator this output can still be parsed by other scripts).

At opcode level, there is a new field called ‘tags’ that returns a
(python) list of the object tags.

Reviewed-by: ultrotter

130a6a6f

Jun 21, 2008

Implement handling of luxi errors in cli.py · 03a8dbdc

Iustin Pop authored 17 years ago

Currently the generic handling of ganeti errors in cli.py (GenericMain
and FormatError) only handles the core ganeti errors, and not the client
protocol errors (which live in a separate hierarchy).

This patch adds handling of luxi errors too, and also adds another luxi
error for the case when the master is not running. This gives us a nice:

  gnta1:~# gnt-node list
  Cannot communicate with the master daemon.
  Is it running and listening on '/var/run/ganeti-master.sock'?

error message instead of a traceback.

Reviewed-by: amishchenko

03a8dbdc

Jun 20, 2008

Add a rpc call for BlockDev.Close() · d61cbe76

Iustin Pop authored 17 years ago

This patch adds rpc layer calls (in rpc.py and the equivalent in
ganeti-noded) to close a list of block devices, and the wrapper in
backend.py that takes a list of Disk objects, identifies them and
returns correctly formatted results.

The reason why this very basic call was missing until now from the rpc
layer is that we usually don't care about device closes (though we
should, and will do so in the future) as only drbd has a meaningful
Close() operation; right now we directly do Shutdown().

The patch is clean enough that it's actually independent of the live
migration implementation.

Reviewed-by: imsnah

d61cbe76

Jun 19, 2008

Use a single Makefile.am instead of many · e8230860

Michael Hanselmann authored 17 years ago

This change allows us to use cleaner dependencies between
directories. The build system is basically rewritten in large parts
and may contain bugs.

Reviewed-by: iustinp

e8230860

Jun 18, 2008

Rework the DRBD8 device status computation · 6b90c22e

Iustin Pop authored 17 years ago

Currently, we compute the status of a drbd8 device in GetSyncStatus and
return only the values that we need (and that fit in the framework of
GetSyncStatus). However, the full status details are useful (and needed)
in other places, so the patch attempts to improve this situation.

We abstract the status of a device outside in a separate class, that
knows how to parse contents from /proc/drbd and set easily accessible
attributes. We then simplify the GetSyncStatus to use this and return
the values that it needs, and add a separate method that returns the
full status object.

The move to a separate class cleans up the old sync-progress
computation from GetSyncStatus a little, but it's still many
regexes.

The patch also adds unittests for a few statuses, and modifies one
BaseDRBD call to accept a custom filename instead of '/proc/drbd' to
ease unittests.

Reviewed-by: imsnah

6b90c22e
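
To illustrate the "separate status class" idea, here is a small sketch that parses one device line of the kind found in /proc/drbd and exposes easily accessible attributes. The class name `DrbdStatus`, the attribute names, and the sample line layout (modelled on DRBD 8.x output) are assumptions for this example; the class added by the patch also covers sync progress, which is left out here.

```python
import re

# One device line, e.g.:
#  0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
# (assumed layout; newer DRBD versions use "ro:" instead of "st:")
_LINE_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cstate>\S+)"
    r"\s+(?:st|ro):(?P<lrole>\w+)/(?P<rrole>\w+)"
    r"\s+ds:(?P<ldisk>\w+)/(?P<rdisk>\w+)")


class DrbdStatus:
    """Hypothetical parsed view of a single DRBD device status line."""

    def __init__(self, line):
        match = _LINE_RE.match(line)
        if match is None:
            raise ValueError("unparseable status line: %r" % line)
        self.minor = int(match.group("minor"))
        self.cstate = match.group("cstate")
        self.local_role = match.group("lrole")
        self.remote_role = match.group("rrole")
        self.local_disk = match.group("ldisk")
        self.remote_disk = match.group("rdisk")
        # Convenience flags derived from the raw fields.
        self.is_connected = self.cstate == "Connected"
        self.is_primary = self.local_role == "Primary"
        self.is_disk_uptodate = self.local_disk == "UpToDate"
```

Because the parser takes text rather than reading a fixed path, it can be fed canned lines in unit tests, in the same spirit as the custom filename mentioned above.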

ganeti-watcher: Replace custom exceptions with ganeti.error.* · 7bca53e4

Michael Hanselmann authored 17 years ago

Reviewed-by: iustinp

7bca53e4

Add more parameters to utils.WriteFile · 71714516

Michael Hanselmann authored 17 years ago

- Make closing file optional: Required by ganeti-watcher to keep
  file open after writing it. Changes return value of utils.WriteFile
  if "close" parameter evaluates to True.
- Pre- and post-write functions: Can be used to lock files. This
  will be used by ganeti-watcher to lock the temporary file before
  renaming.

Reviewed-by: iustinp

71714516
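
The two additions listed above (optional closing, and pre- and post-write hooks) can be sketched on top of a write-to-temporary-then-rename helper. This is not the utils.WriteFile signature: the name `write_file_atomic`, its parameters, and the choice to return the file descriptor only when `close` is false are assumptions made for illustration.

```python
import os
import tempfile


def write_file_atomic(path, data, prewrite=None, postwrite=None, close=True):
    """Write bytes to a temporary file, then rename it over `path`.

    Hypothetical helper. `prewrite` and `postwrite` are called with the
    open file descriptor, e.g. to take a lock before the rename.
    Returns None if `close` is true, otherwise the still-open descriptor.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_name = tempfile.mkstemp(dir=dir_name)
    try:
        if prewrite is not None:
            prewrite(fd)
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        if postwrite is not None:
            postwrite(fd)
        os.rename(tmp_name, path)
    except BaseException:
        os.close(fd)
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    if close:
        os.close(fd)
        return None
    return fd
```

With `close=False` the caller keeps the descriptor, and with it any lock taken in a hook, after the file has been renamed into place.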

Replace custom logging code in watcher with logging module · 438b45d4

Michael Hanselmann authored 17 years ago

- Log timestamp for all messages
- Write everything to logfile and optionally to stderr
- Log messages are no longer buffered, allowing a user to see progress

Reviewed-by: ultrotter

438b45d4

Make sure serialized data ends with EOL character · e91ffe49

Michael Hanselmann authored 17 years ago

Also fix the regular expression to not remove newlines. The simplejson
module puts whitespace at line endings when using indentation. Remove
unnecessary import of ConfigParser module.

Reviewed-by: ultrotter

e91ffe49
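
A minimal sketch of the behaviour described above: strip whitespace at line endings without touching the newlines, and make sure the output ends with an EOL character. It uses the standard json module rather than simplejson, and the function name `dump_json` and the indentation width are assumptions for this example.

```python
import json
import re

# Matches spaces or tabs at the end of each line, but never the newline.
_TRAILING_WS_RE = re.compile(r"[ \t]+$", re.MULTILINE)


def dump_json(data):
    """Serialize with indentation, no trailing blanks, and a final EOL."""
    text = json.dumps(data, indent=2)
    text = _TRAILING_WS_RE.sub("", text)
    if not text.endswith("\n"):
        text += "\n"
    return text
```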

Jun 17, 2008

Allow disk objects to set their own physical ID · 0402302c

Iustin Pop authored 17 years ago

Currently, the way to customize a DRBD disk from (node name 1, node name
2, port) to (ip1, port, ip2, port) is to use the ConfigWriter method
SetDiskID. However, since this needs a ConfigWriter object, it can be
run only on the master, and therefore disk objects can't be passed to
more than one node unchanged. This, coupled with the rpc layer
limitation that all nodes in a multi-node call receive the same
arguments, prevents any kind of multi-node operation that has disks as an
argument.

This patch takes the SetDiskID method from ConfigWriter and ports it to
the disk object itself, and instead of the full node configuration it
uses a simple {node_name: replication_ip} mapping for all the nodes
involved in the disk tree (currently we only pass primary and secondary
node since we don't support nested drbd devices).

This allows us to send disks to both the primary and secondary nodes at
once and perform synchronized drbd activation on primary/secondary
nodes.

Note that while for the 1.2 branch this will not change old methods, it
is worth investigating, and possibly moving all such calls from the
master to the nodes themselves, for the 2.0 branch.

Reviewed-by: ultrotter

0402302c
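
The conversion described above, from (node name 1, node name 2, port) to (ip1, port, ip2, port) using only a {node_name: replication_ip} mapping, can be sketched as a plain function. The name `drbd_physical_id` and the "local node first" ordering are assumptions for this example; the patch itself implements this as a method on the disk object.

```python
def drbd_physical_id(logical_id, target_node, nodes_ip):
    """Compute a per-node physical ID for a DRBD disk.

    Hypothetical helper. `logical_id` is (primary, secondary, port) and
    `nodes_ip` maps node names to replication IPs. The result lists the
    target node's own endpoint first (an assumption of this sketch).
    """
    pnode, snode, port = logical_id
    if target_node not in (pnode, snode):
        raise ValueError("%s is not part of this disk" % target_node)
    pnode_ip = nodes_ip[pnode]
    snode_ip = nodes_ip[snode]
    if target_node == pnode:
        return (pnode_ip, port, snode_ip, port)
    return (snode_ip, port, pnode_ip, port)
```

Since the function needs no configuration object, each node can derive its own physical ID from the same logical ID and mapping, which is what makes a single multi-node call with identical arguments possible.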

Fix an error-handling case · c7cdfc90

Iustin Pop authored 17 years ago

There is a mistake in handling grow-disk for an invalid disk. This patch
fixes it.

Reviewed-by: imsnah

c7cdfc90

Implement disk grow at LU level · 8729e0d7

Iustin Pop authored 17 years ago

This patch adds a new opcode and LU for growing an instance's disk.

The opcode allows growing only one disk at a time, and will throw an error
if the operation fails midway (e.g. on the primary node after it has
been increased on the secondary node). As such, it might actually leave
different sized LVs on different nodes, but this will not create
problems.

Reviewed-by: imsnah

8729e0d7

Add method to update a disk object size · acec9d51

Iustin Pop authored 17 years ago

This patch adds a method that implements updating of a disk
(object.Disk) size, together with its children.

While this will not track the exact disk size, it allows at least an
approximate size to be recorded in the configuration (and queried).

Reviewed-by: imsnah

acec9d51

Implement block device grow at the rpc layer · 4c8ba8b3

Iustin Pop authored 17 years ago

This simple patch exposes the block device grow operation at the rpc
layer. It does not increase the protocol version as it has been recently
changed by the live failover rpc call.

Reviewed-by: imsnah

4c8ba8b3

Jun 16, 2008

Expose block device grow in backend.py · 594609c0

Iustin Pop authored 17 years ago

This patch adds a wrapper over the block device grow operation that
converts the input and output parameters as needed for the rpc layer.

Reviewed-by: imsnah

594609c0

bdev: implement disk resize for lvm/drbd8 · 1005d816

Iustin Pop authored 17 years ago

This patch implements disk resize at the bdev level for the LVM and
DRBD8 disk types. It is not implemented for DRBD7 and MD since the way
MD works with its underlying devices makes it harder and this
combination is also deprecated.

The LVM resize operation is tried three times, with different allocation
policies:
  - contiguous first, since this is best for allocation purposes (it
    won't fragment the PV too much)
  - cling, which is supported only by more recent LVM versions, will try
    to place the new extents on the same PV as the rest of the LV
  - and finally normal, which is the default

Reviewed-by: imsnah

1005d816
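
The three-attempt strategy above can be sketched as a loop over allocation policies that stops at the first success. The function name `grow_lv`, the injectable `run` parameter, and the exact command line are assumptions for illustration; only the policy order (contiguous, cling, normal) comes from the commit message.

```python
import subprocess

# Order taken from the commit message: best allocation first.
ALLOC_POLICIES = ("contiguous", "cling", "normal")


def grow_lv(dev_path, amount_mib, run=subprocess.run):
    """Try to extend an LV by amount_mib, one allocation policy at a time.

    Hypothetical helper. Returns the policy that succeeded and raises
    RuntimeError if every attempt fails.
    """
    for policy in ALLOC_POLICIES:
        cmd = ["lvextend", "--alloc", policy,
               "-L", "+%dm" % amount_mib, dev_path]
        if run(cmd).returncode == 0:
            return policy
    raise RuntimeError("cannot grow %s by %d MiB" % (dev_path, amount_mib))
```

Passing the runner as a parameter keeps the sketch testable without LVM installed; in real use the default `subprocess.run` executes the command.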

Move SetKey to WritableSimpleStore and use it · 05f86716

Guido Trotter authored 17 years ago

We used to be able to update SimpleStore by just calling SetKey; this
feature is now moved to an external class, which inherits from it. In this
patch the new WritableSimpleStore class is also put to use, in the LUs that
need it. Rather than making each LU instantiate it, we have a new LogicalUnit
flag REQ_WSSTORE which defaults to False, but when declared to be True asks the
LogicalUnit to be initialized with a writeable version of the SimpleStore.
LUMasterFailover and LURenameCluster are then changed to use it.

InitCluster is also changed to instantiate a WritableSimpleStore, rather
than a normal one.

Reviewed-by: imsnah

05f86716

Add migration support at the rpc layer · 2a10865c

Iustin Pop authored 17 years ago

This patch adds the migration rpc call and its implementation in the
backend. The patch does not deal with the correct activation of disks.

Because of the new RPC, the protocol version is increased.

Reviewed-by: imsnah

2a10865c

hypervisor: add live migration support · 6e7275c0

Iustin Pop authored 17 years ago

This is just the hypervisor-level migration (e.g. “xm migrate”) not the
whole node coordination work.

Reviewed-by: ultrotter

6e7275c0

Jun 15, 2008

Activate down instances' disks on replace-disks · 22985314

Guido Trotter authored 17 years ago

When replacing disks or evacuating nodes with instances that are
administratively down, ganeti fails because the instance disks are not
active. This patch activates them, performs the replacement, and shuts
them down again. Changing this also fixes the same issue on gnt-node
evacuate.

Reviewed-by: iustinp

22985314