Commit 5ff7fba4 authored by Dimitris Aragiorgis, committed by Petr Pudlak

Add design doc for ifdown script support

The ifdown script will be responsible for deconfiguring network
devices and cleanup changes made by the ifup script. The first
implementation will target KVM but it could be ported to Xen as well
especially when Xen hotplug gets implemented.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
parent ce09eddd
......@@ -591,6 +591,7 @@ docinput = \
doc/design-htools-2.3.rst \
doc/design-http-server.rst \
doc/design-hugepages-support.rst \
doc/design-ifdown.rst \
doc/design-impexp2.rst \
doc/design-internal-shutdown.rst \
doc/design-kvmd.rst \
......
......@@ -24,6 +24,7 @@ Design document drafts
design-systemd.rst
design-cpu-speed.rst
design-disk-conversion.rst
design-ifdown.rst
.. vim: set textwidth=72 :
.. Local Variables:
......
======================================
Design for adding ifdown script to KVM
======================================
.. contents:: :depth: 4
This is a design document about adding support for an ifdown script responsible
for deconfiguring network devices and cleaning up changes made by the ifup
script. The first implementation will target KVM, but it could be ported to Xen
as well, especially once hotplug is implemented.
Current state and shortcomings
==============================
Currently, for KVM, Ganeti creates a tap before instance startup, instance
migration, and NIC hotplug, and explicitly invokes the kvm-ifup script with the
relevant environment (INTERFACE, MAC, IP, MODE, LINK, TAGS, and all the network
info, if any: NETWORK\_SUBNET, NETWORK\_TAGS, etc.).
For Xen we have the `vif-ganeti` script (associated with the vif-script
hypervisor parameter). The main difference is that Xen calls the script itself,
because it is passed as an extra option in the configuration file.
This ifup script can do several things: bridge a tap to a bridge, add ip rules,
update an external DNS or DHCP server, enable proxy ARP or proxy NDP, issue
openvswitch commands, etc. In general we can divide those actions into two
categories:
1) Commands that change the state of the host.
2) Commands that change the state of external components.
Currently those changes are not cleaned up or modified upon instance
shutdown, removal, migration, or NIC hot-unplug. Thus we have stale entries on
hosts and, more importantly, might have stale or invalid configuration on
external components such as routers, which could affect connectivity.
Hooks could serve as a workaround, but:
1) During migrate hooks the environment is the one held in config data
and not in runtime files. The NIC configuration might have changed on the
master but not in the running KVM process (unless hotplug is used).
In addition, the NIC order in config data might not match the one in the
KVM process.
2) On instance modification, changes are not available to hooks. In
other words, we do not know the configuration before and after modification.
Since Ganeti is the orchestrator and is the one that explicitly configures
host devices (tap, vif), it should be the one responsible for cleanup and
deconfiguration. Especially with an SDN approach, this kind of script might
be useful to clean up flows in the cluster in order to ensure correct paths
without ping-pongs between hosts or connectivity loss for the instance.
Proposed Changes
================
We add a new script, kvm-ifdown, that is explicitly invoked after:
1) instance shutdown on primary node
2) successful instance migration on source node
3) failed instance migration on target node
4) successful NIC hot-remove on primary node
If an administrator's custom ifdown script exists (e.g. `kvm-ifdown-custom`),
the `kvm-ifdown` script executes that script, as happens with `kvm-ifup`.
Along with that change we should rename the custom ifup script from
`kvm-vif-bridge` (a name that does not make any sense) to `kvm-ifup-custom`.
In contrast to `kvm-ifup`, one cannot rely on the `kvm-ifdown` script being
called. A node might die just after a successful migration or after an
instance shutdown. In that case, none of the "undo" operations will be invoked.
Thus, this script should work on a best-effort basis, and the network
should not rely on the script being called or being successful. Additionally,
it should modify *only* the node-local dynamic configuration (routes, ARP
entries, SDN, firewalls, etc.), whereas static configuration (DNS, DHCP, etc.)
should be modified via hooks.
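
The best-effort requirement above can be illustrated with a minimal
sketch. The function name, the timeout, and the logging policy are
assumptions made for this example and are not part of the proposal; the
only point is that a failing or missing ifdown script must never abort
the surrounding operation.

```python
import logging
import subprocess


def run_ifdown_best_effort(script, tap, env, timeout=30):
    """Run an ifdown script for one tap without ever raising.

    Hypothetical helper: returns True if the script ran and exited
    with status 0, and False in every other case.
    """
    try:
        # The tap name is passed as the first positional argument.
        result = subprocess.run([script, tap], env=env, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as err:
        # Missing script, permission problem, or a script that hangs.
        logging.warning("ifdown for %s could not complete: %s", tap, err)
        return False
    if result.returncode != 0:
        logging.warning("ifdown for %s exited with status %d",
                        tap, result.returncode)
        return False
    return True
```

The caller would log the result and carry on with shutdown, migration,
or hot-remove regardless of the return value.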
Implementation Details
======================
1) Where to get the NIC info?
We cannot count on config data since it might have changed. So the only
place we keep valid data is the runtime file. During instance
modifications (NIC hot-remove, hot-modify) we have the NIC object from
the RPC. We take its UUID and search for the corresponding entry in the
runtime file to get further info. After instance shutdown and migration
we just take all NICs from the runtime file and invoke the ifdown script
for each one.
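
A minimal sketch of this selection logic follows. It assumes the NICs
read from the runtime file are available as a list of dictionaries with
a `uuid` key; that representation and both function names are
hypothetical and only serve the illustration.

```python
def find_runtime_nic(runtime_nics, uuid):
    """Return the runtime entry whose UUID matches, or None."""
    for nic in runtime_nics:
        if nic.get("uuid") == uuid:
            return nic
    return None


def nics_to_deconfigure(runtime_nics, rpc_nic_uuid=None):
    """Select the NICs for which the ifdown script should run.

    With a UUID (hot-remove, hot-modify) only the matching runtime
    entry is returned. Without one (shutdown, migration) every NIC
    found in the runtime data is returned.
    """
    if rpc_nic_uuid is None:
        return list(runtime_nics)
    nic = find_runtime_nic(runtime_nics, rpc_nic_uuid)
    return [nic] if nic is not None else []
```

Matching on the UUID rather than on the position in the list is what
keeps the lookup correct when config data and the running process
disagree on NIC order.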
2) Where to find the corresponding TAP?
Currently TAP names are kept under
/var/run/ganeti/kvm-hypervisor/nics/<instance>/<nic\_index>.
This is not enough. As noted above, a NIC's index might change during the
instance's life. An example will make things clear:
* The admin starts an instance with three NICs.
* The admin removes the second without hotplug.
* The admin removes the third (now at index 1 in config data) with hotplug.
The index that will arrive with the RPC will be 1, and if we read the relevant
NIC file we will get the tap of the NIC that was removed in the second
step but still exists in the KVM process.
So upon TAP creation we write another file with the same info but named
after the NIC's UUID. The one named after its index can be left
for compatibility (Ganeti does not use it; external tools might).
Obviously this info will not be available for old instances in the cluster.
The ifdown script should be aware of this corner case.
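
The stale-index problem and the UUID-based lookup can be reproduced with
a small simulation. The directory is a temporary stand-in for the
per-instance NIC directory, and the UUIDs, tap names, and helper names
are made up for this sketch. The hot-removed NIC is the one that config
data lists at index 1 after the earlier removal without hotplug.

```python
import os
import tempfile


def write_tap_file(nic_dir, name, tap):
    """Store a tap name in a file called `name` (an index or a UUID)."""
    with open(os.path.join(nic_dir, name), "w") as f:
        f.write(tap)


def read_tap_file(nic_dir, name):
    """Return the stored tap name, or None if the file does not exist."""
    try:
        with open(os.path.join(nic_dir, name)) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def demo():
    with tempfile.TemporaryDirectory() as nic_dir:
        # Instance start: three NICs, one file per index and per UUID.
        started = [("uuid-a", "tap0"), ("uuid-b", "tap1"), ("uuid-c", "tap2")]
        for index, (uuid, tap) in enumerate(started):
            write_tap_file(nic_dir, str(index), tap)
            write_tap_file(nic_dir, uuid, tap)

        # The second NIC is removed without hotplug: config data changes,
        # but the running process and the files on disk do not.
        config_order = ["uuid-a", "uuid-c"]

        # Hot-remove of the NIC that config data now lists at index 1.
        removed = "uuid-c"
        index = config_order.index(removed)

        by_index = read_tap_file(nic_dir, str(index))  # tap of uuid-b
        by_uuid = read_tap_file(nic_dir, removed)      # tap of uuid-c
        # Instances started before the change have no UUID file at all.
        legacy = read_tap_file(nic_dir, "uuid-of-old-instance")
        return index, by_index, by_uuid, legacy
```

In this sketch the index lookup yields the tap of the NIC that only the
running process still knows about, the UUID lookup yields the intended
tap, and the missing file for an old instance shows up as `None`, which
is the corner case the ifdown path has to tolerate.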
3) What should we clean up/deconfigure?
Upon NIC hot-remove we obviously want to wipe everything. But on instance
migration we don't want to reset external configuration like DNS. So we choose
to pass an extra positional argument to the ifdown script (it already gets the
TAP name) that reflects the context in which it was invoked. Please note that
deconfiguration of external components is not encouraged and should be
done via hooks. Still, we could easily support it via this extra argument.
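
One possible shape for this extra argument is sketched below. The
context names and both helpers are hypothetical; the document only states
that the tap name is followed by a second positional argument describing
the invocation context. Treating shutdown like migration with respect to
external configuration is likewise an assumption of this sketch.

```python
from enum import Enum


class IfdownContext(Enum):
    """Hypothetical values for the extra positional argument."""
    SHUTDOWN = "shutdown"
    MIGRATION_SOURCE = "migration-source"
    MIGRATION_FAILED_TARGET = "migration-failed-target"
    HOT_REMOVE = "hot-remove"


def ifdown_argv(script, tap, context):
    """Build the command line: script, tap name, then the context."""
    return [script, tap, context.value]


def should_touch_external(context):
    """Decide whether external components may be reset.

    Assumption: only a hot-remove means the NIC is gone for good, so
    only then would external state such as DNS be a candidate for
    cleanup (and even then hooks remain the preferred place).
    """
    return context is IfdownContext.HOT_REMOVE
```

A custom ifdown script could branch on the second argument in the same
way, leaving external state alone for every migration-related context.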
4) What will be the script environment?
In general, the same environment that is passed to the ifup script, except for
the instance's tags. Those are the only info not kept in the runtime file, and
they can change between ifup and ifdown script execution. The ifdown
script must be aware of this and should clean up everything that the ifup script
might have set up depending on instance tags (e.g. firewalls, etc.).
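
As a rough sketch, the environment could be assembled from a runtime NIC
entry as follows. The variable names come from the ifup environment
described earlier; the dictionary layout of the runtime entry, the
mapping of `INTERFACE` to the tap name, and the function name are
assumptions of this example.

```python
def ifdown_environment(tap, runtime_nic):
    """Build the ifdown environment from a runtime NIC entry.

    TAGS is deliberately left out: instance tags are not stored in
    the runtime data and may have changed since ifup ran.
    """
    env = {
        "INTERFACE": tap,
        "MAC": runtime_nic["mac"],
        "MODE": runtime_nic["mode"],
        "LINK": runtime_nic["link"],
    }
    # IP is optional on a NIC.
    if runtime_nic.get("ip"):
        env["IP"] = runtime_nic["ip"]
    # Network info, if any, is exported as NETWORK_SUBNET, NETWORK_TAGS, ...
    for key, value in runtime_nic.get("network", {}).items():
        env["NETWORK_" + key.upper()] = value
    return env
```

Because `TAGS` is absent, an ifdown script cannot tell which tag-dependent
setup was applied at ifup time and has to undo all of it unconditionally.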
Configuration Changes
~~~~~~~~~~~~~~~~~~~~~
1) The `kvm-ifdown` script will be an extra file installed in the same directory
where `kvm-ifup` resides. We could have a single script (and symbolic links to
it) that shares the same code, where a second positional argument or an extra
environment variable would define whether we are bringing the interface up or
down. Still, this is not the best practice since it is not consistent
with how KVM uses `script` and `downscript` in the `netdev` option; the scripts
are different files that get the tap name as a positional argument. Of course,
common code will go in `net-common` so that it can be sourced from either
Xen- or KVM-specific scripts.
2) An extra file, written upon TAP creation, named after the NIC's UUID and
containing the TAP's name. Since this should be the authoritative file, to keep
backwards compatibility we create a symbolic link named after the NIC's
index and pointing to this new file.
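
The file-plus-symlink layout can be sketched as follows. The function
name and the use of a temporary link that is renamed into place are
choices made for this example only; the document itself only requires
that the index-named entry point to the UUID-named file.

```python
import os


def record_tap(nic_dir, uuid, index, tap):
    """Write the UUID-named tap file and an index-named symlink to it."""
    os.makedirs(nic_dir, exist_ok=True)

    # The UUID-named file holds the tap name and is the one to trust.
    with open(os.path.join(nic_dir, uuid), "w") as f:
        f.write(tap)

    # The index-named entry is only a compatibility alias. Build it
    # under a temporary name and rename it over any older entry, so
    # that readers never observe a missing link.
    link_path = os.path.join(nic_dir, str(index))
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(uuid, tmp_path)  # relative target inside nic_dir
    os.replace(tmp_path, link_path)
```

External tools that still read the index-named path keep working, since
opening the symlink returns the contents of the UUID-named file.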
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: