diff --git a/Makefile.am b/Makefile.am index 0dd6922702e5e4ed47e7e5bcff44c1509817c84b..4303dfed6abcfc4e57d660ed39a8f44de80f5127 100644 --- a/Makefile.am +++ b/Makefile.am @@ -110,11 +110,11 @@ http_PYTHON = \ docsgml = \ doc/hooks.sgml \ doc/install.sgml \ - doc/admin.sgml \ doc/rapi.sgml \ doc/iallocator.sgml docrst = \ + doc/admin.rst \ doc/design-2.0.rst \ doc/security.rst diff --git a/doc/admin.rst b/doc/admin.rst new file mode 100644 index 0000000000000000000000000000000000000000..56a39726324bac4e499811394253f3c0593ac7b3 --- /dev/null +++ b/doc/admin.rst @@ -0,0 +1,294 @@ +Ganeti administrator's guide +============================ + +Documents Ganeti version 2.0 + +.. contents:: + +Introduction +------------ + +Ganeti is virtualization cluster management software. You are +expected to be a system administrator familiar with your Linux +distribution and the Xen or KVM virtualization environments before +using it. + + +The various components of Ganeti all have man pages and interactive +help. This manual, though, will help you get familiar with the +system by explaining the most common operations, grouped by related +use. + +After a terminology glossary and a section on the prerequisites needed +to use this manual, the rest of this document is divided into three +main sections, which group different features of Ganeti: + +- Instance Management +- High Availability Features +- Debugging Features + +Ganeti terminology +~~~~~~~~~~~~~~~~~~ + +This section provides a small introduction to Ganeti terminology, +which may be useful when reading the rest of the document. + +Cluster + A set of machines (nodes) that cooperate to offer a coherent, + highly available virtualization service. + +Node + A physical machine which is a member of a cluster. + Nodes are the basic cluster infrastructure, and are + not fault tolerant. + +Master node + The node which controls the cluster, and from which all + Ganeti commands must be given.
+ +Instance + A virtual machine which runs on a cluster. It can be a + fault-tolerant, highly available entity. + +Pool + A pool is a set of clusters sharing the same network. + +Meta-Cluster + Anything that concerns more than one cluster. + +Prerequisites +~~~~~~~~~~~~~ + +You need to have your Ganeti cluster installed and configured before +you try any of the commands in this document. Please follow the +*Ganeti installation tutorial* for instructions on how to do that. + +Managing Instances +------------------ + +Adding/Removing an instance +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Adding a new virtual instance to your Ganeti cluster is really easy. +The command is:: + + gnt-instance add \ + -n TARGET_NODE:SECONDARY_NODE -o OS_TYPE -t DISK_TEMPLATE \ + INSTANCE_NAME + +The instance name must be resolvable (e.g. exist in DNS) and usually +resolve to an address in the same subnet as the cluster itself. +Options you can give to this command include: + +- The disk size (``-s``) for a single-disk instance, or multiple + ``--disk N:size=SIZE`` options for multi-disk instances + +- The memory size (``-B memory``) + +- The number of virtual CPUs (``-B vcpus``) + +- Arguments for the NICs of the instance; by default, a single-NIC + instance is created. The IP and/or bridge of the NIC can be changed + via ``--nic 0:ip=IP,bridge=BRIDGE`` + + +There are four types of disk template you can choose from: + +diskless + The instance has no disks. Only used for special-purpose operating + systems or for testing. + +file + The instance will use plain files as the backend for its disks. No + redundancy is provided, and this is somewhat more difficult to + configure for high performance. + +plain + The instance will use LVM devices as the backend for its disks. No + redundancy is provided. + +drbd + .. note:: This is only valid for multi-node clusters using DRBD 8.0.x + + A mirror is set up between the local node and a remote one, which must + be specified with the second value of the ``--node`` option.
Use this + option to obtain a highly available instance that can be failed over + to a remote node should the primary one fail. + +For example, if you want to create a highly available instance, use the +drbd disk template:: + + gnt-instance add -n TARGET_NODE:SECONDARY_NODE -o OS_TYPE -t drbd \ + INSTANCE_NAME + +To see which operating systems your cluster supports, you can use +the command:: + + gnt-os list + +Removing an instance is even easier than creating one. This operation +is irreversible and destroys all the contents of your instance. Use +with care:: + + gnt-instance remove INSTANCE_NAME + +Starting/Stopping an instance +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Instances are automatically started at instance creation time. To +manually start one which is currently stopped, you can run:: + + gnt-instance startup INSTANCE_NAME + +While the command to stop one is:: + + gnt-instance shutdown INSTANCE_NAME + +The command to see all the instances configured and their status is:: + + gnt-instance list + +Do not use the Xen commands to stop instances. If you run, for example, +``xm shutdown`` or ``xm destroy`` on an instance, Ganeti will +automatically restart it (via ``ganeti-watcher``). + +Exporting/Importing an instance +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You can create a snapshot of an instance's disks and Ganeti +configuration, which you can then back up, or import into another +cluster. The way to export an instance is:: + + gnt-backup export -n TARGET_NODE INSTANCE_NAME + +The target node can be any node in the cluster with enough space under +``/srv/ganeti`` to hold the instance image. Use the ``--noshutdown`` +option to snapshot an instance without rebooting it. Any previous +snapshot of the same instance existing cluster-wide under +``/srv/ganeti`` will be removed by this operation: if you want to keep +them, move them out of the Ganeti exports directory. + +Importing an instance is similar to creating a new one.
The command is:: + + gnt-backup import -n TARGET_NODE -t DISK_TEMPLATE \ + --src-node=NODE --src-dir=DIR INSTANCE_NAME + +Most of the options available for the command *gnt-instance add* are +supported here too. + +High availability features +-------------------------- + +.. note:: This section only applies to multi-node clusters. + +Failing over an instance +~~~~~~~~~~~~~~~~~~~~~~~~ + +If an instance is built in highly available mode, you can at any time +fail it over to its secondary node, even if the primary has failed +and is no longer up. Doing so is easy; on the master node, just run:: + + gnt-instance failover INSTANCE_NAME + +That's it. After the command completes, the secondary node is now the +primary, and vice versa. + +Live migrating an instance +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If an instance is built in highly available mode, is currently running, +and both its nodes are running fine, you can migrate it over to its +secondary node, without downtime. On the master node you need to run:: + + gnt-instance migrate INSTANCE_NAME + +Replacing an instance's disks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +What if, instead, the secondary node for an instance has failed, or +you plan to remove a node from your cluster and have failed over all +its instances, but it is still the secondary for some? The solution +here is to replace the instance's disks, changing the secondary node:: + + gnt-instance replace-disks -n NODE INSTANCE_NAME + +This process is a bit long, but involves no instance downtime, and at +the end of it the instance has changed its secondary node, to which it +can, if necessary, be failed over. + +Failing over the master node +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is all good as long as the Ganeti master node is up. Should it go +down, or should you wish to decommission it, just run the following +command on any other node:: + + gnt-cluster masterfailover + +and the node you ran it on is now the new master.
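 + +As a worked example, here is how evacuating a node for maintenance +might look by combining the commands above; the node and instance +names are hypothetical, and the exact output depends on your cluster:: + + # live-migrate the instances that have node2 as their primary + gnt-instance migrate web1.example.com + gnt-instance migrate db1.example.com + + # move the DRBD mirror of an instance that has node2 as its + # secondary over to node3 + gnt-instance replace-disks -n node3.example.com mail1.example.com + +After these steps node2 is neither primary nor secondary for any of +these instances and can be safely taken down.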
+ +Adding/Removing nodes +~~~~~~~~~~~~~~~~~~~~~ + +And of course, now that you know how to move instances around, it's +easy to free up a node, and then you can remove it from the cluster:: + + gnt-node remove NODE_NAME + +and maybe add a new one:: + + gnt-node add --secondary-ip=ADDRESS NODE_NAME + +Debugging Features +------------------ + +At some point you might need to do some debugging operations on your +cluster or on your instances. This section will help you with the most +commonly used debugging functionality. + +Accessing an instance's disks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +From an instance's primary node you have access to its disks. Never +mount the underlying logical volumes manually on a fault-tolerant +instance, or you risk breaking replication. The correct way to access +them is to run the command:: + + gnt-instance activate-disks INSTANCE_NAME + +and then access the device that gets created. After you've finished, +you can deactivate them with the ``deactivate-disks`` command, which +works in the same way. + +Accessing an instance's console +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The command to access a running instance's console is:: + + gnt-instance console INSTANCE_NAME + +Use the console normally and then type ``^]`` when +done, to exit. + +Instance OS definitions debugging +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Should you have any problems with operating system support, the +command to run to see a complete status for all your nodes is:: + + gnt-os diagnose + +Cluster-wide debugging +~~~~~~~~~~~~~~~~~~~~~~ + +The *gnt-cluster* command offers several options to run tests or +execute cluster-wide operations. For example:: + + gnt-cluster command + gnt-cluster copyfile + gnt-cluster verify + gnt-cluster verify-disks + gnt-cluster getmaster + gnt-cluster version + +See the *gnt-cluster* man page to learn more about their usage.
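 + +As an illustration of the first two, ``gnt-cluster command`` runs a +shell command on every node, and ``gnt-cluster copyfile`` distributes +a local file to all nodes; the file used here is arbitrary, and exact +behaviour may vary between Ganeti versions:: + + # check the uptime of every node in the cluster + gnt-cluster command uptime + + # copy a local file to the same path on all nodes + gnt-cluster copyfile /etc/resolv.conf + + # check overall cluster health and instance disk consistency + gnt-cluster verify + gnt-cluster verify-disks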
diff --git a/doc/admin.sgml b/doc/admin.sgml deleted file mode 100644 index 239ba5612bd37c21716a8b7445846a485fbf8baa..0000000000000000000000000000000000000000 --- a/doc/admin.sgml +++ /dev/null @@ -1,455 +0,0 @@ -<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [ -]> - <article class="specification"> - <articleinfo> - <title>Ganeti administrator's guide</title> - </articleinfo> - <para>Documents Ganeti version 2.0</para> - <sect1> - <title>Introduction</title> - - <para> - Ganeti is a virtualization cluster management software. You are - expected to be a system administrator familiar with your Linux - distribution and the Xen or KVM virtualization environments - before using it. - </para> - - <para> - The various components of Ganeti all have man pages and - interactive help. This manual though will help you getting - familiar with the system by explaining the most common - operations, grouped by related use. - </para> - - <para> - After a terminology glossary and a section on the prerequisites - needed to use this manual, the rest of this document is divided - in three main sections, which group different features of - Ganeti: - <itemizedlist> - <listitem> - <simpara>Instance Management</simpara> - </listitem> - <listitem> - <simpara>High Availability Features</simpara> - </listitem> - <listitem> - <simpara>Debugging Features</simpara> - </listitem> - </itemizedlist> - </para> - - <sect2> - <title>Ganeti terminology</title> - - <para> - This section provides a small introduction to Ganeti terminology, which - might be useful to read the rest of the document. - - <glosslist> - <glossentry> - <glossterm>Cluster</glossterm> - <glossdef> - <simpara> - A set of machines (nodes) that cooperate to offer a - coherent highly available virtualization service. - </simpara> - </glossdef> - </glossentry> - <glossentry> - <glossterm>Node</glossterm> - <glossdef> - <simpara> - A physical machine which is member of a cluster. 
- Nodes are the basic cluster infrastructure, and are - not fault tolerant. - </simpara> - </glossdef> - </glossentry> - <glossentry> - <glossterm>Master node</glossterm> - <glossdef> - <simpara> - The node which controls the Cluster, from which all - Ganeti commands must be given. - </simpara> - </glossdef> - </glossentry> - <glossentry> - <glossterm>Instance</glossterm> - <glossdef> - <simpara> - A virtual machine which runs on a cluster. It can be a - fault tolerant highly available entity. - </simpara> - </glossdef> - </glossentry> - <glossentry> - <glossterm>Pool</glossterm> - <glossdef> - <simpara> - A pool is a set of clusters sharing the same network. - </simpara> - </glossdef> - </glossentry> - <glossentry> - <glossterm>Meta-Cluster</glossterm> - <glossdef> - <simpara> - Anything that concerns more than one cluster. - </simpara> - </glossdef> - </glossentry> - </glosslist> - </para> - </sect2> - - <sect2> - <title>Prerequisites</title> - - <para> - You need to have your Ganeti cluster installed and configured before - you try any of the commands in this document. Please follow the - <emphasis>Ganeti installation tutorial</emphasis> for instructions on - how to do that. - </para> - </sect2> - - </sect1> - - <sect1> - <title>Managing Instances</title> - - <sect2> - <title>Adding/Removing an instance</title> - - <para> - Adding a new virtual instance to your Ganeti cluster is really easy. - The command is: - - <synopsis>gnt-instance add -n <replaceable>TARGET_NODE<optional>:SECONDARY_NODE</optional></replaceable> -o <replaceable>OS_TYPE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis> - - The instance name must be resolvable (e.g. exist in DNS) and - usually to an address in the same subnet as the cluster - itself. 
Options you can give to this command include: - - <itemizedlist> - <listitem> - <simpara>The disk size (<option>-s</option>) for a - single-disk instance, or multiple <option>--disk - <replaceable>N</replaceable>:size=<replaceable>SIZE</replaceable></option> - options for multi-instance disks</simpara> - </listitem> - <listitem> - <simpara>The memory size (<option>-B memory</option>)</simpara> - </listitem> - <listitem> - <simpara>The number of virtual CPUs (<option>-B vcpus</option>)</simpara> - </listitem> - <listitem> - <para> - Arguments for the NICs of the instance; by default, a - single-NIC instance is created. The IP and/or bridge of - the NIC can be changed via <option>--nic - 0:ip=<replaceable>IP</replaceable>,bridge=<replaceable>BRIDGE</replaceable></option> - </para> - </listitem> - </itemizedlist> - </para> - - <para>There are four types of disk template you can choose from:</para> - - <variablelist> - <varlistentry> - <term>diskless</term> - <listitem> - <para>The instance has no disks. Only used for special purpouse - operating systems or for testing.</para> - </listitem> - </varlistentry> - - <varlistentry> - <term>file</term> - <listitem> - <para>The instance will use plain files as backend for its - disks. No redundancy is provided, and this is somewhat - more difficult to configure for high performance.</para> - </listitem> - </varlistentry> - - <varlistentry> - <term>plain</term> - <listitem> - <para>The instance will use LVM devices as backend for its disks. - No redundancy is provided.</para> - </listitem> - </varlistentry> - - <varlistentry> - <term>drbd</term> - <listitem> - <simpara><emphasis role="strong">Note:</emphasis> This is only - valid for multi-node clusters using drbd 8.0.x</simpara> - <simpara> - A mirror is set between the local node and a remote one, which - must be specified with the second value of the --node option. 
Use - this option to obtain a highly available instance that can be - failed over to a remote node should the primary one fail. - </simpara> - </listitem> - </varlistentry> - - </variablelist> - - <para> - For example if you want to create an highly available instance use the - drbd disk templates: - <synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable><optional>:<replaceable>SECONDARY_NODE</replaceable></optional> -o <replaceable>OS_TYPE</replaceable> -t drbd \ - <replaceable>INSTANCE_NAME</replaceable></synopsis> - - <para> - To know which operating systems your cluster supports you can use - <synopsis>gnt-os list</synopsis> - </para> - - <para> - Removing an instance is even easier than creating one. This - operation is irrereversible and destroys all the contents of - your instance. Use with care: - - <synopsis>gnt-instance remove <replaceable>INSTANCE_NAME</replaceable></synopsis> - </para> - </sect2> - - <sect2> - <title>Starting/Stopping an instance</title> - - <para> - Instances are automatically started at instance creation time. To - manually start one which is currently stopped you can run: - - <synopsis>gnt-instance startup <replaceable>INSTANCE_NAME</replaceable></synopsis> - - While the command to stop one is: - - <synopsis>gnt-instance shutdown <replaceable>INSTANCE_NAME</replaceable></synopsis> - - The command to see all the instances configured and their status is: - - <synopsis>gnt-instance list</synopsis> - - </para> - - <para> - Do not use the xen commands to stop instances. If you run for - example xm shutdown or xm destroy on an instance Ganeti will - automatically restart it (via the - <citerefentry><refentrytitle>ganeti-watcher</refentrytitle> - <manvolnum>8</manvolnum></citerefentry>) - </para> - - </sect2> - - <sect2> - <title>Exporting/Importing an instance</title> - - <para> - You can create a snapshot of an instance disk and Ganeti - configuration, which then you can backup, or import into - another cluster. 
The way to export an instance is: - - <synopsis>gnt-backup export -n <replaceable>TARGET_NODE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis> - - The target node can be any node in the cluster with enough - space under <filename class="directory">/srv/ganeti</filename> - to hold the instance image. Use the - <option>--noshutdown</option> option to snapshot an instance - without rebooting it. Any previous snapshot of the same - instance existing cluster-wide under <filename - class="directory">/srv/ganeti</filename> will be removed by - this operation: if you want to keep them move them out of the - Ganeti exports directory. - </para> - - <para> - Importing an instance is similar to creating a new one. The command is: - - <synopsis>gnt-backup import -n <replaceable>TARGET_NODE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> --src-node=<replaceable>NODE</replaceable> --src-dir=DIR INSTANCE_NAME</synopsis> - - Most of the options available for the command - <emphasis>gnt-instance add</emphasis> are supported here too. - - </para> - </sect2> - - </sect1> - - - <sect1> - <title>High availability features</title> - - <note> - <simpara>This section only applies to multi-node clusters.</simpara> - </note> - - <sect2> - <title>Failing over an instance</title> - - <para> - If an instance is built in highly available mode you can at - any time fail it over to its secondary node, even if the - primary has somehow failed and it's not up anymore. Doing it - is really easy, on the master node you can just run: - - <synopsis>gnt-instance failover <replaceable>INSTANCE_NAME</replaceable></synopsis> - - That's it. After the command completes the secondary node is - now the primary, and vice versa. 
- </para> - </sect2> - - <sect2> - <title>Live migrating an instance</title> - - <para> - If an instance is built in highly available mode, it currently - runs and both its nodes are running fine, you can at migrate - it over to its secondary node, without dowtime. On the master - node you need to run: - - <synopsis>gnt-instance migrate <replaceable>INSTANCE_NAME</replaceable></synopsis> - - </para> - </sect2> - - - <sect2> - <title>Replacing an instance disks</title> - - <para> - So what if instead the secondary node for an instance has - failed, or you plan to remove a node from your cluster, and - you failed over all its instances, but it's still secondary - for some? The solution here is to replace the instance disks, - changing the secondary node: - <synopsis>gnt-instance replace-disks <option>-n <replaceable>NODE</replaceable></option> <replaceable>INSTANCE_NAME</replaceable></synopsis> - - This process is a bit long, but involves no instance - downtime, and at the end of it the instance has changed its - secondary node, to which it can if necessary be failed over. - </para> - </sect2> - - <sect2> - <title>Failing over the master node</title> - - <para> - This is all good as long as the Ganeti Master Node is - up. Should it go down, or should you wish to decommission it, - just run on any other node the command: - - <synopsis>gnt-cluster masterfailover</synopsis> - - and the node you ran it on is now the new master. 
- </para> - </sect2> - <sect2> - <title>Adding/Removing nodes</title> - - <para> - And of course, now that you know how to move instances around, - it's easy to free up a node, and then you can remove it from - the cluster: - - <synopsis>gnt-node remove <replaceable>NODE_NAME</replaceable></synopsis> - - and maybe add a new one: - - <synopsis>gnt-node add <optional><option>--secondary-ip=<replaceable>ADDRESS</replaceable></option></optional> <replaceable>NODE_NAME</replaceable> - - </synopsis> - </para> - </sect2> - </sect1> - - <sect1> - <title>Debugging Features</title> - - <para> - At some point you might need to do some debugging operations on - your cluster or on your instances. This section will help you - with the most used debugging functionalities. - </para> - - <sect2> - <title>Accessing an instance's disks</title> - - <para> - From an instance's primary node you have access to its - disks. Never ever mount the underlying logical volume manually - on a fault tolerant instance, or you risk breaking - replication. The correct way to access them is to run the - command: - - <synopsis>gnt-instance activate-disks <replaceable>INSTANCE_NAME</replaceable></synopsis> - - And then access the device that gets created. After you've - finished you can deactivate them with the deactivate-disks - command, which works in the same way. - </para> - </sect2> - - <sect2> - <title>Accessing an instance's console</title> - - <para> - The command to access a running instance's console is: - - <synopsis>gnt-instance console <replaceable>INSTANCE_NAME</replaceable></synopsis> - - Use the console normally and then type - <userinput>^]</userinput> when done, to exit. 
- </para> - </sect2> - - <sect2> - <title>Instance OS definitions Debugging</title> - - <para> - Should you have any problems with operating systems support - the command to ran to see a complete status for all your nodes - is: - - <synopsis>gnt-os diagnose</synopsis> - - </para> - - </sect2> - - <sect2> - <title>Cluster-wide debugging</title> - - <para> - The gnt-cluster command offers several options to run tests or - execute cluster-wide operations. For example: - - <screen> -gnt-cluster command -gnt-cluster copyfile -gnt-cluster verify -gnt-cluster verify-disks -gnt-cluster getmaster -gnt-cluster version - </screen> - - See the man page <citerefentry> - <refentrytitle>gnt-cluster</refentrytitle> - <manvolnum>8</manvolnum> </citerefentry> to know more about - their usage. - </para> - </sect2> - - </sect1> - - </article>