Commit 7ed400f0 authored by Stratos Psomadakis, committed by Iustin Pop

rbd disk template documentation and manpages

Add documentation and modify manpages for the RBD disk template.
Signed-off-by: Constantinos Venetsanopoulos <cven@grnet.gr>
Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
parent 631aedf9
......@@ -19,6 +19,8 @@ Before installing, please verify that you have the following programs:
versions 0.11.X or above have shown good behavior).
- `DRBD <http://www.drbd.org/>`_, kernel module and userspace utils,
version 8.0.7 or above
- `RBD <http://ceph.newdream.net/>`_, kernel modules (rbd.ko/libceph.ko)
and userspace utils (ceph-common)
- `LVM2 <http://sourceware.org/lvm2/>`_
- `OpenSSH <http://www.openssh.com/portable.html>`_
- `bridge utilities <http://www.linuxfoundation.org/en/Net:Bridge>`_
......@@ -50,7 +52,7 @@ These programs are supplied as part of most Linux distributions, so
usually they can be installed via the standard package manager. Also
many of them will already be installed on a standard machine. On
Debian/Ubuntu, you can use this command line to install all required
packages, except for DRBD and Xen::
packages, except for RBD, DRBD and Xen::
$ apt-get install lvm2 ssh bridge-utils iproute iputils-arping \
ndisc6 python python-pyopenssl openssl \
......
......@@ -115,7 +115,7 @@ There are multiple options for the storage provided to an instance; while
the instance sees the same virtual drive in all cases, the node-level
configuration varies between them.
There are four disk templates you can choose from:
There are five disk templates you can choose from:
diskless
The instance has no disks. Only used for special purpose operating
......@@ -138,6 +138,10 @@ drbd
to obtain a highly available instance that can be failed over to a
remote node should the primary one fail.
rbd
The instance will use volumes inside a RADOS cluster as the backend for
its disks. It will access them using the RADOS block device (RBD).
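For example, such an instance could be created with a command like the
following (a sketch only; the OS, node and instance names are
placeholders)::

    gnt-instance add -t rbd -s 10G -o debootstrap \
      -n node1.example.com instance1.example.com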
IAllocator
~~~~~~~~~~
......@@ -510,6 +514,13 @@ The instance will be started with an amount of memory between its
target node, or the operation will fail if that's not possible. See
:ref:`instance-startup-label` for details.
If the instance's disk template is of type rbd, then you can specify
the target node (which can be any node) explicitly, or specify an
iallocator plugin. If you omit both, the default iallocator will be
used to determine the target node::
gnt-instance failover -n TARGET_NODE INSTANCE_NAME
Live migrating an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -530,6 +541,13 @@ migrating it, unless the ``--no-runtime-changes`` option is passed, in
which case the target node should have at least the instance's current
runtime memory free.
If the instance's disk template is of type rbd, then you can specify
the target node (which can be any node) explicitly, or specify an
iallocator plugin. If you omit both, the default iallocator will be
used to determine the target node::
gnt-instance migrate -n TARGET_NODE INSTANCE_NAME
Moving an instance (offline)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -1247,6 +1265,10 @@ of a cluster installation by following these steps on all of the nodes:
6. Remove the ganeti state directory (``rm -rf /var/lib/ganeti/*``),
replacing the path with the correct path for your installation.
7. If using RBD, run ``rbd unmap /dev/rbdN`` to unmap the RBD disks.
Then remove the RBD disk images used by Ganeti, identified by their
UUIDs (``rbd rm uuid.rbd.diskN``).
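A cleanup pass could look like the following sketch (assuming the
default ``rbd`` pool; take the actual device and image names from the
output of ``rbd showmapped`` and ``rbd ls``)::

    rbd showmapped              # list currently mapped RBD devices
    rbd unmap /dev/rbd0         # repeat for every mapped /dev/rbdN
    rbd ls rbd                  # list the images in the 'rbd' pool
    rbd rm rbd/uuid.rbd.disk0   # repeat for every Ganeti disk image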
On the master node, remove the cluster from the master-netdev (usually
``xen-br0`` for bridged mode, otherwise ``eth0`` or similar), by running
``ip a del $clusterip/32 dev xen-br0`` (use the correct cluster ip and
......
......@@ -41,7 +41,7 @@ using the first one whose filename matches the one given by the user.
Command line interface changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The node selection options in instanece add and instance replace disks
The node selection options in instance add and instance replace disks
can be replaced by the new ``--iallocator=NAME`` option (shortened to
``-I``), which will cause the auto-assignment of nodes with the
passed iallocator. The selected node(s) will be shown as part of the
......
......@@ -69,6 +69,10 @@ all Ganeti features. The volume group name Ganeti uses (by default) is
You can also use file-based storage only, without LVM, but this setup is
not detailed in this document.
If you choose to use RBD-based instances, there's no need for LVM
provisioning. However, this feature is experimental, and is not
recommended for production clusters.
While you can use an existing system, please note that the Ganeti
installation is intrusive in terms of changes to the system
configuration, and it's best to use a newly-installed system without
......@@ -300,6 +304,88 @@ instances on a node.
}
}
Installing RBD
+++++++++++++++
Recommended on all nodes: RBD_ is required if you want to create
instances with RBD disks residing inside a RADOS cluster (i.e. make
use of the rbd disk template). RBD-based instances can fail over or
migrate to any other node in the Ganeti cluster, enabling you to
exploit all of Ganeti's high availability (HA) features.
.. attention::
Be careful though: rbd is still experimental! For now it is
recommended only for testing purposes. No sensitive data should be
stored there.
.. _RBD: http://ceph.newdream.net/
You will need the ``rbd`` and ``libceph`` kernel modules, the RBD/Ceph
userspace utils (ceph-common Debian package) and an appropriate
Ceph/RADOS configuration file on every VM-capable node.
You will also need a working RADOS Cluster accessible by the above
nodes.
RADOS Cluster
~~~~~~~~~~~~~
You will need a working RADOS Cluster accessible by all VM-capable nodes
to use the RBD template. For more information on setting up a RADOS
Cluster, refer to the `official docs <http://ceph.newdream.net/>`_.
If you want to use a pool for storing RBD disk images other than the
default (``rbd``), you should first create the pool in the RADOS
Cluster, and then set the corresponding rbd disk parameter named
``pool``.
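For example (a sketch; the pool name is a placeholder, and the
pool-creation syntax may differ between Ceph versions)::

    rados mkpool my-ganeti-pool
    gnt-cluster modify -D rbd:pool=my-ganeti-pool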
Kernel Modules
~~~~~~~~~~~~~~
Unless your distribution already provides them, you might need to
compile the ``rbd`` and ``libceph`` modules from source. You will need
Linux kernel 3.2 or above for the in-tree kernel modules. If you want
to run a less recent kernel, or your kernel doesn't include them, you
will have to build them as external modules (from Linux kernel source
3.2 or above).
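Once the modules are in place, a quick sanity check on each node could
look like this (``rbd`` pulls in ``libceph`` as a dependency)::

    modprobe rbd
    lsmod | grep -e rbd -e libceph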
Userspace Utils
~~~~~~~~~~~~~~~
The RBD template has been tested with ``ceph-common`` v0.38 and
above. We recommend using the latest version of ``ceph-common``.
.. admonition:: Debian
On Debian, you can just install the RBD/Ceph userspace utils with
the following command::
apt-get install ceph-common
Configuration file
~~~~~~~~~~~~~~~~~~
You should also provide an appropriate configuration file
(``ceph.conf``) in ``/etc/ceph``. For the rbd userspace utils, you'll
only need to specify the IP addresses of the RADOS Cluster monitors.
.. admonition:: ceph.conf
Sample configuration file::
[mon.a]
host = example_monitor_host1
mon addr = 1.2.3.4:6789
[mon.b]
host = example_monitor_host2
mon addr = 1.2.3.5:6789
[mon.c]
host = example_monitor_host3
mon addr = 1.2.3.6:6789
For more information, please see the `Ceph Docs
<http://ceph.newdream.net/docs/latest/>`_.
Other required software
+++++++++++++++++++++++
......
......@@ -445,6 +445,13 @@ List of parameters available for the **plain** template:
stripes
Number of stripes to use for new LVs.
List of parameters available for the **rbd** template:
pool
The RADOS cluster pool inside which all rbd volumes will reside.
When a new RADOS cluster is deployed, the default pool for rbd
volumes (Images in RADOS terminology) is 'rbd'.
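For example, switching the cluster to a non-default pool could look
like this (a sketch; the pool must already exist in the RADOS cluster,
and its name here is a placeholder)::

    gnt-cluster modify -D rbd:pool=my-ganeti-pool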
The option ``--maintain-node-health`` allows one to enable/disable
automatic maintenance actions on nodes. Currently these include
automatic shutdown of instances and deactivation of DRBD devices on
......
......@@ -27,7 +27,7 @@ ADD
^^^
| **add**
| {-t|--disk-template {diskless | file \| plain \| drbd}}
| {-t|--disk-template {diskless | file \| plain \| drbd \| rbd}}
| {--disk=*N*: {size=*VAL* \| adopt=*LV*}[,vg=*VG*][,metavg=*VG*][,mode=*ro\|rw*]
| \| {-s|--os-size} *SIZE*}
| [--no-ip-check] [--no-name-check] [--no-start] [--no-install]
......@@ -588,6 +588,9 @@ plain
drbd
Disk devices will be drbd (version 8.x) on top of lvm volumes.
rbd
Disk devices will be rbd volumes residing inside a RADOS cluster.
The optional second value of the ``-n (--node)`` is used for the drbd
template type and specifies the remote node.
......@@ -1321,7 +1324,7 @@ GROW-DISK
{*amount*}
Grows an instance's disk. This is only possible for instances having a
plain or drbd disk template.
plain, drbd or rbd disk template.
Note that this command only changes the block device size; it will not
grow the actual filesystems, partitions, etc. that live on that
......@@ -1341,10 +1344,10 @@ amount to increase the disk with in mebibytes) or can be given similar
to the arguments in the create instance operation, with a suffix
denoting the unit.
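For example, to grow the first disk of an instance by 16 GiB (a
sketch; the instance name is a placeholder)::

    gnt-instance grow-disk instance1.example.com 0 16g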
Note that the disk grow operation might complete on one node but fail
on the other; this will leave the instance with different-sized LVs on
the two nodes, but this will not create problems (except for unused
space).
For instances with a drbd template, note that the disk grow operation
might complete on one node but fail on the other; this will leave the
instance with different-sized LVs on the two nodes, but this will not
create problems (except for unused space).
If you do not want gnt-instance to wait for the new disk region to be
synced, use the ``--no-wait-for-sync`` option.
......@@ -1401,16 +1404,25 @@ Recovery
FAILOVER
^^^^^^^^
**failover** [-f] [--ignore-consistency] [--shutdown-timeout=*N*]
[--submit] [--ignore-ipolicy] {*instance*}
| **failover** [-f] [--ignore-consistency] [--ignore-ipolicy]
| [--shutdown-timeout=*N*]
| [{-n|--target-node} *node* \| {-I|--iallocator} *name*]
| [--submit]
| {*instance*}
Failover will stop the instance (if running), change its primary node,
and if it was originally running it will start it again (on the new
primary). This only works for instances with the drbd template (in which
case you can only fail over to the secondary node) and for externally
mirrored templates (shared storage) (which can change to any other
mirrored templates (blockdev and rbd) (which can change to any other
node).
If the instance's disk template is of type blockdev or rbd, then you
can explicitly specify the target node (which can be any node) using
the ``-n`` or ``--target-node`` option, or specify an iallocator plugin
using the ``-I`` or ``--iallocator`` option. If you omit both, the default
iallocator will be used to determine the target node.
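For example (node and instance names are placeholders)::

    gnt-instance failover -n node2.example.com instance1.example.com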
Normally the failover will check the consistency of the disks before
failing over the instance. If you are trying to migrate instances off
a dead node, this will fail. Use the ``--ignore-consistency`` option
......@@ -1443,11 +1455,19 @@ MIGRATE
**migrate** [-f] [--allow-failover] [--non-live]
[--migration-mode=live\|non-live] [--ignore-ipolicy]
[--no-runtime-changes] {*instance*}
Migrate will move the instance to its secondary node without
shutdown. It only works for instances having the drbd8 disk template
type.
[--no-runtime-changes]
[{-n|--target-node} *node* \| {-I|--iallocator} *name*] {*instance*}
Migrate will move the instance to its secondary node without shutdown.
As with failover, it only works for instances having the drbd disk
template or an externally mirrored disk template type such as blockdev
or rbd.
If the instance's disk template is of type blockdev or rbd, then you can
explicitly specify the target node (which can be any node) using the
``-n`` or ``--target-node`` option, or specify an iallocator plugin
using the ``-I`` or ``--iallocator`` option. If you omit both, the
default iallocator will be used to determine the target node.
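For example, letting an iallocator pick the target node (a sketch;
``hail`` is the iallocator shipped with Ganeti's htools)::

    gnt-instance migrate -I hail instance1.example.com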
The migration command needs a perfectly healthy instance, as we rely
on the dual-master capability of drbd8 and the disks of the instance
......