Commit bd45c6f1 authored by Dimitris Aragiorgis

Add separate docs for routed and ebtables setup

Remove further info and implementation details of ip-less-routed
and private-filtered setups from main page. Introduce routed and
ebtables page to include all this info.

Add /etc/network/interfaces examples for ip-less-routed
configuration.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
parent e65646dc
.. _ebtables:
L2 isolation
------------
Since providing a single VLAN for each private network across the whole data
center is practically impossible (currently even expensive switches support
fewer than 1024 VLANs trunked on all ports), L2 isolation can be achieved via
MAC filtering on a common bridge over a single VLAN.
To ensure isolation we should allow traffic coming from the tap only if it has
a specific source MAC and, at the same time, allow traffic going to the tap
only if its source MAC falls within the same MAC prefix. Applying those rules
only in the FORWARD chain will not guarantee isolation, because packets whose
target MAC is a `multicast address
<http://en.wikipedia.org/wiki/Multicast_address>`_ go through the INPUT and
OUTPUT chains instead.
.. code-block:: console
# Create new chains
ebtables -t filter -N FROMTAP5 -P RETURN
ebtables -t filter -N TOTAP5 -P RETURN
# Filter multicast traffic from VM
ebtables -t filter -A INPUT -i tap5 -j FROMTAP5
# Filter multicast traffic to VM
ebtables -t filter -A OUTPUT -o tap5 -j TOTAP5
# Filter traffic from VM
ebtables -t filter -A FORWARD -i tap5 -j FROMTAP5
# Filter traffic to VM
ebtables -t filter -A FORWARD -o tap5 -j TOTAP5
# Allow only specific src MAC for outgoing traffic
ebtables -t filter -A FROMTAP5 -s ! aa:55:66:1a:ae:82 -j DROP
# Allow only specific src MAC prefix for incoming traffic
ebtables -t filter -A TOTAP5 -s ! aa:55:60:0:0:0/ff:ff:f0:0:0:0 -j DROP
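When the tap interface goes down, the corresponding rules and chains should be
removed again. A minimal cleanup sketch, assuming the same ``tap5`` interface
and chain names as above, could be:
.. code-block:: console
# Unhook the per-tap chains added above
ebtables -t filter -D INPUT -i tap5 -j FROMTAP5
ebtables -t filter -D OUTPUT -o tap5 -j TOTAP5
ebtables -t filter -D FORWARD -i tap5 -j FROMTAP5
ebtables -t filter -D FORWARD -o tap5 -j TOTAP5
# Flush and delete the now unreferenced chains
ebtables -t filter -F FROMTAP5
ebtables -t filter -F TOTAP5
ebtables -t filter -X FROMTAP5
ebtables -t filter -X TOTAP5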
@@ -167,141 +167,8 @@ This setup has the following characteristics:
since the VMs are not on the same link as the router.
configuration
"""""""""""""
In order to use this setup, all nodes must be properly configured beforehand.
A sample `/etc/network/interfaces` could be:
.. code-block:: console
auto eth1
iface eth1 inet manual
up ip route add 192.0.2.0/24 dev eth1
up ip route add 192.0.2.0/24 dev eth1 table snf_public
up ip route add default via 192.0.2.1 dev eth1 table snf_public
up ip rule add iif eth1 lookup snf_public
up arptables -I OUTPUT -o eth1 --opcode 1 --mangle-ip-s 192.0.2.254
For an IPv6 setup this could be:
.. code-block:: console
auto eth1
iface eth1 inet6 manual
up ip -6 route add 2001:db8::/64 dev eth1
up ip -6 route add 2001:db8::/64 dev eth1 table snf_public
up ip -6 route add default via 2001:db8::1 dev eth1 table snf_public
up ip -6 rule add iif eth1 lookup snf_public
up echo 1 > /proc/sys/net/ipv6/conf/eth1/proxy_ndp
Of course we should enable forwarding and define the `snf_public` routing
table first:
.. code-block:: console
echo 1 > /proc/sys/net/ipv4/conf/all/forwarding
echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
echo 10 snf_public >> /etc/iproute2/rt_tables
In order to use a more compact `interfaces` file, custom ifup/ifdown scripts
should be used, since this setup is not common practice. Currently these
scripts are included only as examples in the snf-network package, but they will
soon be provided by `snf-network-helper`. Please see the `interfaces` example
along with `vmrouter.ifup` and `vmrouter.ifdown`.
So let's assume the following:
* ``IP`` is the instance's IP
* ``GW_IP`` is the external router's IP
* ``NODE_IP`` is the node's IP
* ``ARP_IP`` is a dummy IP inside the network needed for proxy ARP
* ``MAC`` is the instance's MAC
* ``TAP_MAC`` is the tap's MAC
* ``DEV_MAC`` is the host's DEV MAC
* ``GW_MAC`` is the external router's MAC
* ``DEV`` is the node's device that the router is visible from
* ``TAP`` is the host interface connected with the instance's eth0
Proxy ARP
"""""""""
Since the VM is supposed to be on the same link as the router, ARP takes place first:
1) The VM wants to know the GW_MAC. Since the traffic is routed we do proxy ARP.
- ARP, Request who-has GW_IP tell IP
- ARP, Reply GW_IP is-at TAP_MAC ``echo 1 > /proc/sys/net/ipv4/conf/TAP/proxy_arp``
- So `arp -na` inside the VM shows: ``(GW_IP) at TAP_MAC [ether] on eth0``
2) The host wants to know the GW_MAC. Since the node does **not** have an IP
inside the network we use the dummy one specified above.
- ARP, Request who-has GW_IP tell ARP_IP (Created by DEV)
``arptables -I OUTPUT -o DEV --opcode 1 -j mangle --mangle-ip-s ARP_IP``
- ARP, Reply GW_IP is-at GW_MAC
3) The host wants to know the VM's MAC so that it can proxy its IP.
- We simulate here that the VM sees **only** the GW on the link.
- ARP, Request who-has IP tell GW_IP (Created by TAP)
``arptables -I OUTPUT -o TAP --opcode 1 -j mangle --mangle-ip-s GW_IP``
- So `arp -na` inside the host shows:
``(GW_IP) at GW_MAC [ether] on DEV, (IP) at MAC on TAP``
4) The GW wants to know who proxies IP.
- ARP, Request who-has IP tell GW_IP
- ARP, Reply IP is-at DEV_MAC (Created by host's DEV)
L3 Routing
""""""""""
With the above we have a working proxy ARP configuration. The rest is done
via simple L3 routing. We assume the following:
* ``TABLE`` is the extra routing table
* ``SUBNET`` is the IPv4 subnet where the VM's IP resides
1) Outgoing traffic:
- Traffic coming out of TAP is routed via TABLE
``ip rule add dev TAP table TABLE``
- TABLE states that default route is GW_IP via DEV
``ip route add default via GW_IP dev DEV``
2) Incoming traffic:
- Packet arrives at router
- Router knows from proxy ARP that the IP is at DEV_MAC.
- Router sends Ethernet packet with tgt DEV_MAC
- Host receives the packet from DEV interface
- Traffic coming in from DEV is routed via TABLE
``ip rule add dev DEV table TABLE``
- Traffic targeting IP is routed to TAP
``ip route add IP dev TAP``
3) Host to VM traffic:
- Impossible if the VM resides on the host itself
- Otherwise there is a route for it: ``ip route add SUBNET dev DEV``
The IPv6 setup is pretty similar, but instead of proxy ARP we have proxy NDP,
and RS and NS packets coming from the TAP are served by nfdhcpd. RAs contain
the network's prefix and have the M flag unset, so that the VM obtains its IPv6
address via SLAAC, and the O flag set, so that it obtains static info
(nameservers, domain search list) via DHCPv6 (also served by nfdhcpd).
Again the VM sees only the TAP interface as router and neighbor on its
link-local space. The host must proxy the VM's IPv6 address:
``ip -6 neigh add proxy EUI64 dev DEV``.
When an interface comes up on a host we should invalidate all stale entries
related to its IP on the other nodes and the router. For proxy ARP we do
``arpsend -U -c 1 -i IP DEV`` and for proxy NDP we do ``ndsend EUI64 DEV``.
Please see :ref:`here <routed-conf>` for how to configure it, and :ref:`here
<routed-traffic>` for how it actually works.
private-filtered
@@ -315,34 +182,7 @@ ebtables and MAC prefix. The concept is that all interfaces on the same L2
should have the same MAC prefix. MAC prefix uniqueness is guaranteed by
Synnefo and passed to Ganeti as a network option.
To ensure isolation we should allow traffic coming from the tap only if it has
a specific source MAC and, at the same time, allow traffic going to the tap
only if its source MAC falls within the same MAC prefix. Applying those rules
only in the FORWARD chain will not guarantee isolation, because packets whose
target MAC is a `multicast address
<http://en.wikipedia.org/wiki/Multicast_address>`_ go through the INPUT and
OUTPUT chains instead. To sum up, the following ebtables rules are applied:
.. code-block:: console
# Create new chains
ebtables -t filter -N FROMTAP5 -P RETURN
ebtables -t filter -N TOTAP5 -P RETURN
# Filter multicast traffic from VM
ebtables -t filter -A INPUT -i tap5 -j FROMTAP5
# Filter multicast traffic to VM
ebtables -t filter -A OUTPUT -o tap5 -j TOTAP5
# Filter traffic from VM
ebtables -t filter -A FORWARD -i tap5 -j FROMTAP5
# Filter traffic to VM
ebtables -t filter -A FORWARD -o tap5 -j TOTAP5
# Allow only specific src MAC for outgoing traffic
ebtables -t filter -A FROMTAP5 -s ! aa:55:66:1a:ae:82 -j DROP
# Allow only specific src MAC prefix for incoming traffic
ebtables -t filter -A TOTAP5 -s ! aa:55:60:0:0:0/ff:ff:f0:0:0:0 -j DROP
For further info and implementation details please see :ref:`here <ebtables>`.
dns
.. _routed:
Routed Setup
------------
In the following section we are going to describe how to achieve a routed
setup for a specific subnet across the data center. We distinguish two ways
to do that:
1) All nodes are going to host VMs (VMC) and one separate node will be the
external router (Gateway).
2) All nodes are going to host VMs (VMC) and one of them will also be the
external router (Gateway).
Whether the external router will do NAT or not depends on whether we have
a public routable subnet available or just a single node with internet
access.
For the next examples we assume that the routable subnet will be
``192.0.2.0/24``, the gateway ``192.0.2.1``, and the nodes' primary interface
``eth0``, while VM traffic will go through the ``eth0.222`` physical VLAN.
Of course ``eth0.222`` can be substituted with a separate physical interface
(e.g. ``eth1``). All examples use the `/etc/network/interfaces` file, the
common way of configuring static interfaces under Debian.
.. _routed-conf:
Configuration
^^^^^^^^^^^^^
For a VMC that will just forward traffic to an external router the proposed
setup is:
.. code-block:: console
auto eth0.222
iface eth0.222 inet manual
up ip link set eth0.222 up
# Host can reach VMs in other hosts
up ip route add 192.0.2.0/24 dev eth0.222
# Incoming traffic will be routed via extra table
up ip rule add iif eth0.222 lookup 222
# VM-to-VM traffic will go direct through VLAN
up ip route add 192.0.2.0/24 dev eth0.222 table 222
# Outgoing VM traffic will go through external router on VLAN
up ip route add default via 192.0.2.1 dev eth0.222 table 222
# Enable proxy ARP and forwarding
up echo 1 > /proc/sys/net/ipv4/conf/eth0.222/proxy_arp
up echo 1 > /proc/sys/net/ipv4/conf/eth0.222/forwarding
# Mangle arp request originating from the host
up arptables -A OUTPUT -o eth0.222 --opcode request -j mangle --mangle-ip-s 192.0.2.254
down arptables -D OUTPUT -o eth0.222 --opcode request -j mangle
down ip rule del iif eth0.222 lookup 222
Of course, instead of the `222` routing table we could alias it with a more
descriptive name (e.g. `snf_routed`):
.. code-block:: console
echo 222 snf_routed >> /etc/iproute2/rt_tables
For a node that acts **only** as a router we have:
.. code-block:: console
auto eth0.222
iface eth0.222 inet manual
up ip link set eth0.222 up
# Add gateway address to the interface
up ip addr add 192.0.2.1/24 dev eth0.222
# Enable forwarding and NAT
up echo 1 > /proc/sys/net/ipv4/conf/eth0.222/forwarding
up iptables -t nat -I POSTROUTING -o eth0 -s 192.0.2.0/24 -j MASQUERADE
down iptables -t nat -D POSTROUTING -o eth0 -s 192.0.2.0/24 -j MASQUERADE
For a node that acts both as a router and a VMC we have:
.. code-block:: console
auto eth0.222
iface eth0.222 inet manual
up ip link set eth0.222 up
# Outgoing VM traffic is routed via extra table
up ip rule add iif eth0.222 lookup 222
# Host-to-VM traffic is routed via extra table
up ip rule add to 192.0.2.0/24 lookup 222
# VM-to-VM and Router-to-VM traffic will go direct through VLAN
up ip route add 192.0.2.0/24 dev eth0.222 table 222
# Add gateway address to the interface
up ip addr add 192.0.2.1 dev eth0.222
up echo 1 > /proc/sys/net/ipv4/conf/eth0.222/proxy_arp
up echo 1 > /proc/sys/net/ipv4/conf/eth0.222/forwarding
up iptables -t nat -I POSTROUTING -o eth0 -s 192.0.2.0/24 -j MASQUERADE
down iptables -t nat -D POSTROUTING -o eth0 -s 192.0.2.0/24 -j MASQUERADE
down ip rule del to 192.0.2.0/24 lookup 222
In order to use a more compact `interfaces` file, custom ifup/ifdown scripts
should be used, since this setup is not common practice. Currently these
scripts are included only as examples in the snf-network package, but they will
soon be provided by `snf-network-helper`. Please see the `interfaces` example
along with `vmrouter.ifup` and `vmrouter.ifdown`.
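For illustration only, a compact stanza along those lines might look like the
following; the script location is hypothetical and should point to wherever the
snf-network example scripts are installed:
.. code-block:: console
auto eth0.222
iface eth0.222 inet manual
# hypothetical install path of the snf-network example scripts
up /usr/lib/snf-network/vmrouter.ifup eth0.222
down /usr/lib/snf-network/vmrouter.ifdown eth0.222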
.. _routed-traffic:
Routed Traffic
^^^^^^^^^^^^^^
Here we break down all stages of networking and analyze how connectivity is
actually achieved. To do so, let's first assume the following:
* ``IP`` is the instance's IP
* ``GW_IP`` is the external router's IP
* ``NODE_IP`` is the node's IP
* ``ARP_IP`` is a dummy IP inside the network needed for proxy ARP
* ``MAC`` is the instance's MAC
* ``TAP_MAC`` is the tap's MAC
* ``DEV_MAC`` is the host's DEV MAC
* ``GW_MAC`` is the external router's MAC
* ``DEV`` is the node's device that the router is visible from
* ``TAP`` is the host interface connected with the instance's eth0
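For concreteness, the sketches that follow also use these hypothetical values,
matching the examples from the configuration section:
.. code-block:: console
# Hypothetical concrete values (for illustration only)
IP=192.0.2.10              # instance IP
GW_IP=192.0.2.1            # external router
DEV=eth0.222               # node device towards the router
TAP=tap5                   # host interface connected to the instance
MAC=aa:55:66:1a:ae:82      # instance MAC from the ebtables example
TABLE=222                  # extra routing table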
Proxy ARP
"""""""""
Since the VM is supposed to be on the same link as the router, ARP takes place first:
1) The VM wants to know the GW_MAC. Since the traffic is routed we do proxy ARP.
- ARP, Request who-has GW_IP tell IP
- ARP, Reply GW_IP is-at TAP_MAC ``echo 1 > /proc/sys/net/ipv4/conf/TAP/proxy_arp``
- So `arp -na` inside the VM shows: ``(GW_IP) at TAP_MAC [ether] on eth0``
2) The host wants to know the GW_MAC. Since the node does **not** have an IP
inside the network we use the dummy one specified above.
- ARP, Request who-has GW_IP tell ARP_IP (Created by DEV)
``arptables -I OUTPUT -o DEV --opcode 1 -j mangle --mangle-ip-s ARP_IP``
- ARP, Reply GW_IP is-at GW_MAC
3) The host wants to know the VM's MAC so that it can proxy its IP.
- We simulate here that the VM sees **only** the GW on the link.
- ARP, Request who-has IP tell GW_IP (Created by TAP)
``arptables -I OUTPUT -o TAP --opcode 1 -j mangle --mangle-ip-s GW_IP``
- So `arp -na` inside the host shows:
``(GW_IP) at GW_MAC [ether] on DEV, (IP) at MAC on TAP``
4) The GW wants to know who proxies IP.
- ARP, Request who-has IP tell GW_IP
- ARP, Reply IP is-at DEV_MAC (Created by host's DEV)
When an interface comes up on a host we should invalidate all stale entries
related to its IP on the other nodes and the router. Specifically we use:
``arpsend -U -c 1 -i IP DEV``.
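A quick way to sanity-check the proxy ARP state on a host, assuming the
hypothetical ``tap5`` and ``eth0.222`` interfaces, could be:
.. code-block:: console
# proxy_arp must be enabled on the tap and the upstream device
cat /proc/sys/net/ipv4/conf/tap5/proxy_arp
cat /proc/sys/net/ipv4/conf/eth0.222/proxy_arp
# the mangle rules installed via arptables
arptables -L OUTPUT
# watch the ARP exchanges described above
tcpdump -nei tap5 arp
tcpdump -nei eth0.222 arp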
L3 Routing
""""""""""
With the above we have a working proxy ARP configuration. The rest is done
via simple L3 routing. We assume the following:
* ``TABLE`` is the extra routing table
* ``SUBNET`` is the IPv4 subnet where the VM's IP resides
1) Outgoing traffic:
- Traffic coming out of TAP is routed via TABLE
``ip rule add dev TAP table TABLE``
- TABLE states that default route is GW_IP via DEV
``ip route add default via GW_IP dev DEV``
2) Incoming traffic:
- Packet arrives at router
- Router knows from proxy ARP that the IP is at DEV_MAC.
- Router sends Ethernet packet with tgt DEV_MAC
- Host receives the packet from DEV interface
- Traffic coming in from DEV is routed via TABLE
``ip rule add dev DEV table TABLE``
- Traffic targeting IP is routed to TAP
``ip route add IP dev TAP``
3) Host to VM traffic:
- Impossible if the VM resides on the host itself
- If the router is also a VMC there is a rule for it: ``ip rule add to SUBNET lookup TABLE``
- Otherwise there is a route for it: ``ip route add SUBNET dev DEV``
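Putting the above together, a minimal per-instance sketch (using the
hypothetical values ``192.0.2.10`` for IP, ``tap5`` for TAP, ``eth0.222`` for
DEV and ``222`` for TABLE; in practice these commands are issued by the ifup
scripts) could be:
.. code-block:: console
# Enable proxy ARP on the tap and route its outgoing traffic via table 222
echo 1 > /proc/sys/net/ipv4/conf/tap5/proxy_arp
ip rule add dev tap5 table 222
# ARP requests towards the VM appear to come from the gateway
arptables -I OUTPUT -o tap5 --opcode request -j mangle --mangle-ip-s 192.0.2.1
# Route traffic targeting the instance's IP to its tap; the copy in table 222
# is needed because incoming traffic from eth0.222 is looked up there
ip route add 192.0.2.10 dev tap5
ip route add 192.0.2.10 dev tap5 table 222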
IPv6
^^^^
The IPv6 setup is pretty similar, but instead of proxy ARP we have proxy NDP,
and RS and NS packets coming from the TAP are served by nfdhcpd. RAs contain
the network's prefix and have the M flag unset, so that the VM obtains its IPv6
address via SLAAC, and the O flag set, so that it obtains static info
(nameservers, domain search list) via DHCPv6 (also served by nfdhcpd).
Again the VM sees only the TAP interface as router and the only neighbor on its
link-local space. The host must proxy the VM's IPv6 address:
``ip -6 neigh add proxy EUI64 dev DEV``.
When an interface comes up on a host we should invalidate all stale entries
related to its IPv6 address on the other nodes and the router. Specifically we
use: ``ndsend EUI64 DEV``.
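A minimal per-instance IPv6 sketch, assuming the instance's EUI-64 address is
``2001:db8::a855:66ff:fe1a:ae82`` (derived from the example MAC) and the same
hypothetical ``tap5``/``eth0.222``/``222`` names, could be:
.. code-block:: console
# Proxy the instance's IPv6 address towards the router
ip -6 neigh add proxy 2001:db8::a855:66ff:fe1a:ae82 dev eth0.222
# Route traffic targeting the instance to its tap
ip -6 route add 2001:db8::a855:66ff:fe1a:ae82 dev tap5 table 222
# Invalidate stale neighbor entries on the other nodes and the router
ndsend 2001:db8::a855:66ff:fe1a:ae82 eth0.222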
An example `interfaces` file for the case where the host is only a VMC could be:
.. code-block:: console
auto eth0.222
iface eth0.222 inet6 manual
up ip link set eth0.222 up
up ip -6 route add 2001:db8::/64 dev eth0.222
up ip -6 route add 2001:db8::/64 dev eth0.222 table 222
up ip -6 route add default via 2001:db8::1 dev eth0.222 table 222
up ip -6 rule add iif eth0.222 lookup 222
up echo 1 > /proc/sys/net/ipv6/conf/eth0.222/proxy_ndp
down ip -6 rule del iif eth0.222 lookup 222