Commit eec83a95 authored by Apollon Oikonomopoulos, committed by Guido Trotter

Add gnt-network design doc

This design covers high level network block definition and pool
management.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
parent 60152bbe
@@ -285,6 +285,7 @@ docrst = \
 doc/design-impexp2.rst \
 doc/design-lu-generated-jobs.rst \
 doc/design-multi-reloc.rst \
+doc/design-network.rst \
 doc/cluster-merge.rst \
 doc/design-shared-storage.rst \
 doc/devnotes.rst \
......
==================
Network management
==================
.. contents:: :depth: 4
This is a design document detailing the implementation of network resource
management in Ganeti.
Current state and shortcomings
==============================
Currently Ganeti supports two configuration modes for instance NICs:
routed and bridged mode. The ``ip`` NIC parameter, which is mandatory
for routed NICs and optional for bridged ones, holds the given NIC's IP
address and may be filled either manually, or via a DNS lookup for the
instance's hostname.
This approach presents some shortcomings:
a) It relies on external systems to perform network resource
management. Although large organizations may already have IP pool
management software in place, this is not usually the case with
stand-alone deployments. For smaller installations it makes sense to
allocate a pool of IP addresses to Ganeti and let it transparently
assign these IPs to instances as appropriate.
b) The NIC network information is incomplete, lacking netmask and
gateway. Operating system providers could for example use the
complete network information to fully configure an instance's
network parameters upon its creation.
Furthermore, having full network configuration information would
enable Ganeti nodes to become more self-contained and be able to
infer system configuration (e.g. /etc/network/interfaces content)
from Ganeti configuration. This should make configuration of
newly-added nodes a lot easier and less dependent on external
tools/procedures.
c) Instance placement must explicitly take network availability in
different node groups into account; the same ``link`` is implicitly
expected to connect to the same network across the whole cluster,
which may not always be the case with large clusters with multiple
node groups.
Proposed changes
----------------
In order to deal with the above shortcomings, we propose to extend
Ganeti with high-level network management logic, which consists of a new
NIC mode called ``managed``, a new "Network" configuration object and
logic to perform IP address pool management, i.e. maintain a set of
available and occupied IP addresses.
Configuration changes
+++++++++++++++++++++
We propose the introduction of a new high-level Network object,
containing (at least) the following data:
- Symbolic name
- UUID
- Network in CIDR notation (IPv4 + IPv6)
- Default gateway, if one exists (IPv4 + IPv6)
- IP pool management data (reservations)
- Default NIC connectivity mode (bridged, routed). This is the
functional equivalent of the current NIC ``mode``.
- Default host interface (e.g. br0). This is the functional equivalent
of the current NIC ``link``.
- Tags
Each network will be connected to any number of node groups, possibly
overriding connectivity mode and host interface for each node group.
This is achieved by adding a ``networks`` slot to the NodeGroup object
and using the networks' UUIDs as keys.
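
The relationship between the Network object and the per-group ``networks``
slot could look roughly like the following sketch. The class and field
names (``Network``, ``NodeGroup``, ``mode``, ``link``,
``effective_connectivity``) are illustrative assumptions and not the
actual Ganeti configuration classes; the example only shows how a
per-group override may fall back to the network's defaults.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
from uuid import uuid4


@dataclass
class Network:
    """Hypothetical shape of the proposed Network configuration object."""
    name: str                          # symbolic name
    network: str                       # IPv4 network in CIDR notation
    gateway: Optional[str] = None      # IPv4 default gateway, if any
    network6: Optional[str] = None     # IPv6 network in CIDR notation
    gateway6: Optional[str] = None     # IPv6 default gateway, if any
    mode: str = "bridged"              # default NIC connectivity mode
    link: str = "br0"                  # default host interface
    tags: List[str] = field(default_factory=list)
    uuid: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class NodeGroup:
    """Hypothetical node group with the proposed ``networks`` slot."""
    name: str
    # Network UUID -> overrides ("mode" and/or "link"); {} means defaults.
    networks: Dict[str, Dict[str, str]] = field(default_factory=dict)


def effective_connectivity(group: NodeGroup,
                           net: Network) -> Optional[Tuple[str, str]]:
    """Return (mode, link) for ``net`` on ``group``, or None if unconnected."""
    if net.uuid not in group.networks:
        return None
    overrides = group.networks[net.uuid]
    return (overrides.get("mode", net.mode), overrides.get("link", net.link))
```

Keying the slot by UUID rather than by name means that renaming a
network does not require touching any node group.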
IP pool management
++++++++++++++++++
A new helper library is introduced, wrapping around Network objects to
provide IP pool management capabilities. A network's pool is defined by
two bitfields, each as long as the network size (one bit per address):
``reservations``
This field holds all IP addresses reserved by Ganeti instances, as
well as cluster IP addresses (node addresses + cluster master)
``external reservations``
This field holds all IP addresses that are manually reserved by the
administrator, because some other equipment is using them outside the
scope of Ganeti.
The bitfields are implemented using the python-bitarray package for
space efficiency and their binary value stored base64-encoded for JSON
compatibility. This approach gives relatively compact representations
even for large IPv4 networks (e.g. /20).
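
As a rough illustration of the storage scheme, the sketch below keeps
one bit per address and serializes the field as base64. To stay
self-contained it uses a standard-library ``bytearray`` in place of the
python-bitarray package named above, and the class and method names
(``AddressBitfield``, ``to_json_value``, ``from_json_value``) are
assumptions made for this example only.

```python
import base64
import ipaddress


class AddressBitfield:
    """One bit per address; bit i corresponds to the i-th network address."""

    def __init__(self, cidr, data=None):
        self.net = ipaddress.ip_network(cidr)
        nbytes = (self.net.num_addresses + 7) // 8
        self.bits = bytearray(data) if data is not None else bytearray(nbytes)
        if len(self.bits) != nbytes:
            raise ValueError("bitfield length does not match network size")

    def _index(self, address):
        # Offset of the address from the start of the network.
        addr = ipaddress.ip_address(address)
        if addr not in self.net:
            raise ValueError("%s is not in %s" % (addr, self.net))
        return int(addr) - int(self.net.network_address)

    def reserve(self, address):
        i = self._index(address)
        self.bits[i // 8] |= 1 << (i % 8)

    def release(self, address):
        i = self._index(address)
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def is_reserved(self, address):
        i = self._index(address)
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def to_json_value(self):
        # Base64 text can be embedded directly in a JSON string.
        return base64.b64encode(bytes(self.bits)).decode("ascii")

    @classmethod
    def from_json_value(cls, cidr, value):
        return cls(cidr, base64.b64decode(value))
```

With this layout a /24 occupies 32 bytes (44 base64 characters) and a
/20 occupies 512 bytes (684 base64 characters).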
Ganeti-owned IP addresses (node + master IPs) are reserved automatically
if the cluster's data network itself is placed under pool management.
Helper ConfigWriter methods provide free IP address generation and
reservation, using a TemporaryReservationManager.
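
Free address generation can be pictured as a scan over the union of the
two bitfields. The sketch below is only an approximation: it represents
each bitfield as a plain Python integer, it does not model the
TemporaryReservationManager, and skipping the network and broadcast
addresses is an assumption of this example rather than something the
design specifies. The function name ``first_free_address`` is
hypothetical.

```python
import ipaddress


def first_free_address(cidr, reservations, external_reservations):
    """Return the first address free in both bitfields, or None if full.

    Both bitfields are plain ints: bit i set means the i-th address of
    the network is taken.
    """
    net = ipaddress.ip_network(cidr)
    taken = reservations | external_reservations
    # Assumption: never hand out the network or broadcast address.
    for i in range(1, net.num_addresses - 1):
        if not (taken >> i) & 1:
            return str(net[i])
    return None
```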
It should be noted that IP pool management is performed only for IPv4
networks, as they are expected to be densely populated. IPv6 networks
can use different approaches, e.g. sequential address assignment or
EUI-64 addresses.
Managed NIC mode
++++++++++++++++
In order to be able to use the new network facility while maintaining
compatibility with the current networking model, a new network mode is
introduced, called ``managed`` to reflect the fact that the given NIC's
network configuration is managed by Ganeti itself. A managed mode NIC
accepts the network it is connected to in its ``link`` argument.
Userspace tools can refer to networks using their symbolic names;
internally, however, the ``link`` argument stores the network's UUID.
We also introduce a new ``ip`` address value, ``constants.NIC_IP_POOL``,
that specifies that a given NIC's IP address should be obtained using
the IP address pool of the specified network. This value is only valid
for managed-mode NICs, where it is also used as a default instead of
``constants.VALUE_AUTO``. A managed-mode NIC's IP address can also be
specified manually, as long as it is compatible with the network the NIC
is connected to.
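
The handling of the ``ip`` value for a managed-mode NIC might be
sketched as follows. The literal ``"pool"`` is a placeholder, since the
actual value of ``constants.NIC_IP_POOL`` is not given here, and
"compatible with the network" is interpreted, as an assumption, to mean
that the address lies inside the network's CIDR block. The function
name and the ``allocate`` callback are hypothetical.

```python
import ipaddress

# Placeholder only; the real value of constants.NIC_IP_POOL is not
# stated in this document.
NIC_IP_POOL = "pool"


def resolve_managed_nic_ip(ip, cidr, allocate):
    """Return the address a managed-mode NIC should use.

    ``allocate`` is a callable returning a free address from the pool.
    """
    if ip is None or ip == NIC_IP_POOL:
        # Pool allocation is the default for managed-mode NICs.
        return allocate()
    if ipaddress.ip_address(ip) not in ipaddress.ip_network(cidr):
        raise ValueError("%s does not belong to network %s" % (ip, cidr))
    return ip
```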
Hooks
+++++
``OP_NETWORK_ADD``
Add a network to Ganeti
:directory: network-add
:pre-execution: master node
:post-execution: master node
``OP_NETWORK_CONNECT``
Connect a network to a node group. This hook can be used to e.g.
configure network interfaces on the group's nodes.
:directory: network-connect
:pre-execution: master node, all nodes in the connected group
:post-execution: master node, all nodes in the connected group
``OP_NETWORK_DISCONNECT``
Disconnect a network from a node group. This hook can be used to e.g.
deconfigure network interfaces on the group's nodes.
:directory: network-disconnect
:pre-execution: master node, all nodes in the connected group
:post-execution: master node, all nodes in the connected group
``OP_NETWORK_REMOVE``
Remove a network from Ganeti
:directory: network-remove
:pre-execution: master node, all nodes
:post-execution: master node, all nodes
Hook variables
^^^^^^^^^^^^^^
``INSTANCE_NICn_MANAGED``
Non-zero if NIC n is a managed-mode NIC
``INSTANCE_NICn_NETWORK``
The friendly name of the network
``INSTANCE_NICn_NETWORK_UUID``
The network's UUID
``INSTANCE_NICn_NETWORK_TAGS``
The network's tags
``INSTANCE_NICn_NETWORK_IPV4_CIDR``, ``INSTANCE_NICn_NETWORK_IPV6_CIDR``
The subnet in CIDR notation
``INSTANCE_NICn_NETWORK_IPV4_GATEWAY``, ``INSTANCE_NICn_NETWORK_IPV6_GATEWAY``
The subnet's default gateway
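
To make the naming scheme concrete, the following sketch builds the
variables listed above for NIC number ``n``. The input dictionary and
its keys are hypothetical stand-ins for the Network object; joining
tags with spaces and omitting unset IPv4/IPv6 values are assumptions of
this example, not behaviour defined by the design.

```python
def nic_hook_env(index, network):
    """Build the hook variables for managed NIC number ``index``.

    ``network`` is a plain dict standing in for the Network object
    (hypothetical keys: name, uuid, tags, cidr, gateway, cidr6, gateway6).
    """
    prefix = "INSTANCE_NIC%d_" % index
    env = {
        prefix + "MANAGED": "1",
        prefix + "NETWORK": network["name"],
        prefix + "NETWORK_UUID": network["uuid"],
        # Assumption: tags are joined with single spaces.
        prefix + "NETWORK_TAGS": " ".join(network.get("tags", [])),
    }
    optional = {
        "NETWORK_IPV4_CIDR": network.get("cidr"),
        "NETWORK_IPV4_GATEWAY": network.get("gateway"),
        "NETWORK_IPV6_CIDR": network.get("cidr6"),
        "NETWORK_IPV6_GATEWAY": network.get("gateway6"),
    }
    # Assumption: variables without a value are left out entirely.
    for key, value in optional.items():
        if value is not None:
            env[prefix + key] = value
    return env
```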
Backend changes
+++++++++++++++
In order to keep the hypervisor-visible changes to a minimum, and
maintain compatibility with the existing network configuration scripts,
the instance's hypervisor configuration will have host-level link and
mode replaced by the *connectivity mode* and *host interface* of the
given network on the current node group.
The managed mode can be detected by the presence of new environment
variables in network configuration scripts:
Network configuration script variables
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``MANAGED``
Non-zero if NIC is a managed-mode NIC
``NETWORK``
The friendly name of the network
``NETWORK_UUID``
The network's UUID
``NETWORK_TAGS``
The network's tags
``NETWORK_IPV4_CIDR``, ``NETWORK_IPV6_CIDR``
The subnet in CIDR notation
``NETWORK_IPV4_GATEWAY``, ``NETWORK_IPV6_GATEWAY``
The subnet's default gateway
Userland interface
++++++++++++++++++
A new client script is introduced, ``gnt-network``, which handles
network-related configuration in Ganeti.
Network addition/deletion
^^^^^^^^^^^^^^^^^^^^^^^^^
::

  gnt-network add --cidr=192.0.2.0/24 --gateway=192.0.2.1 \
                  --cidr6=2001:db8:2ffc::/64 --gateway6=2001:db8:2ffc::1 \
                  --nic_connectivity=bridged --host_interface=br0 public
  gnt-network remove public (only allowed if no instances are using the network)

Manual IP address reservation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

  gnt-network reserve-ips public 192.0.2.2 192.0.2.10-192.0.2.20
  gnt-network release-ips public 192.0.2.3

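
The commands above accept both single addresses and ``first-last``
ranges. One possible way to expand such arguments into individual
addresses is sketched below; the function name ``expand_ip_arguments``
is hypothetical, and treating ranges as inclusive on both ends is an
assumption of this example.

```python
import ipaddress


def expand_ip_arguments(args):
    """Expand 'a.b.c.d' and 'a.b.c.d-e.f.g.h' arguments into addresses."""
    result = []
    for arg in args:
        first, sep, last = arg.partition("-")
        start = ipaddress.IPv4Address(first)
        end = ipaddress.IPv4Address(last) if sep else start
        if end < start:
            raise ValueError("range end precedes start: %s" % arg)
        # Assumption: ranges include both endpoints.
        result.extend(str(ipaddress.IPv4Address(n))
                      for n in range(int(start), int(end) + 1))
    return result
```
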
Network modification
^^^^^^^^^^^^^^^^^^^^
::

  gnt-network modify --cidr=192.0.2.0/25 public (only allowed if all current reservations fit in the new network)
  gnt-network modify --gateway=192.0.2.126 public
  gnt-network modify --host_interface=test --nic_connectivity=routed public (issues warning about instances that need to be rebooted)
  gnt-network rename public public2

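
The precondition on ``--cidr`` changes (all current reservations must
fit in the new network) amounts to a simple containment check. A
minimal sketch, assuming reservations are available as a list of
address strings and using the hypothetical name ``reservations_fit``:

```python
import ipaddress


def reservations_fit(reserved_ips, new_cidr):
    """Return True if every reserved address lies inside ``new_cidr``."""
    new_net = ipaddress.ip_network(new_cidr)
    return all(ipaddress.ip_address(ip) in new_net for ip in reserved_ips)
```

For example, shrinking ``192.0.2.0/24`` to ``192.0.2.0/25`` would be
refused under this check while an address such as ``192.0.2.200`` is
still reserved.
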
Assignment to node groups
^^^^^^^^^^^^^^^^^^^^^^^^^
::

  gnt-network connect public nodegroup1
  gnt-network connect --host_interface=br1 public nodegroup2
  gnt-network disconnect public nodegroup1 (only permitted if no instances are currently using this network in the group)

Tagging
^^^^^^^
::

  gnt-network add-tags public foo bar:baz

Network listing
^^^^^^^^^^^^^^^
::

  gnt-network list
   Name     IPv4 Network  IPv4 Gateway  IPv6 Network             IPv6 Gateway           Connected to
   public   192.0.2.0/24  192.0.2.1     2001:db8:dead:beef::/64  2001:db8:dead:beef::1  nodegroup1:br0
   private  10.0.1.0/24   -             -                        -

Network information
^^^^^^^^^^^^^^^^^^^
::

  gnt-network info public
   Name: public
   IPv4 Network: 192.0.2.0/24
   IPv4 Gateway: 192.0.2.1
   IPv6 Network: 2001:db8:dead:beef::/64
   IPv6 Gateway: 2001:db8:dead:beef::1
   Total IPv4 count: 256
   Free address count: 201 (80% free)
   IPv4 pool status: XXX.........XXXXXXXXXXXXXX...XX.............
                     XXX..........XXX...........................X
                     ....XXX..........XXX.....................XXX
                                             X: occupied  .: free
   Externally reserved IPv4 addresses:
     192.0.2.3, 192.0.2.22
   Connected to node groups:
     default (link br0), other_group (link br1)
   Used by 22 instances:
     inst1
     inst2
     inst32
     ..

IAllocator changes
++++++++++++++++++
The IAllocator protocol can be made network-aware, i.e. also consider
network availability for node group selection. Networks, as well as
future shared storage pools, can be seen as constraints used to rule out
the placement on certain node groups.
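
Treating networks as a placement constraint could be as simple as
filtering out node groups that are not connected to every network an
instance's NICs require. The sketch below is not part of the IAllocator
protocol; the function name and the input shape (group name mapped to
the set of connected network UUIDs) are assumptions chosen for
illustration.

```python
def eligible_node_groups(groups, required_networks):
    """Return the names of groups connected to all required networks.

    ``groups`` maps a group name to the set of network UUIDs connected
    to that group.
    """
    required = set(required_networks)
    return sorted(name for name, connected in groups.items()
                  if required <= set(connected))
```

An allocator would then run its usual node selection only within the
groups returned by such a filter.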
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End:
@@ -22,6 +22,7 @@ Contents:
 design-htools-2.3.rst
 design-2.4.rst
 design-draft.rst
+design-network.rst
 cluster-merge.rst
 design-shared-storage.rst
 locking.rst
......