10. Linux Configuration Hints
We are not aware of any major issues with Linux boxes used as routers, and they seem to be pretty rare on the Exchange. Having said that, there are a few parameters that can (and usually should) be tuned:
ARP filtering & source routing
ARP cache timeout
Reverse Path (RP) filter
For more information on tuning your Linux system for routing, see the Linux Advanced Routing & Traffic Control HOWTO.
10.1. ARP Filtering and Source Routing
The Linux approach to IP addresses is that they belong to the system, not any single interface. As a result, Linux hosts have a default behaviour that is different from most other systems: interfaces semi-promiscuously answer for all IP addresses of all other interfaces. Example:

In this example, host tuxco is a Linux box with a peering connection on eth0 (192.168.1.1/24) and a backbone link on eth1 (10.0.0.1/24).
When host kannix (192.168.1.2) sends an ARP query for 10.0.0.1, it will get a reply from tuxco's eth0 interface!
In other words, a Linux host will answer ARP queries coming in on any interface if the queried address is configured on any of its interfaces. The idea behind this is that an IP address belongs to the system, not just a single interface. Although this may work well for server or desktop systems, it is not desirable behaviour in a router. One reason is that it is a limited version of proxy-arp, which is forbidden on the AMS-IX peering LAN. Another reason is that two separate routers could potentially answer ARP queries for the same RFC 1918 address.
10.1.1. Fixing ARP
The ARP behaviour can be fixed by using arp_ignore and arp_announce on the WAN interface:
tuxco# sysctl -w net/ipv4/conf/eth0/arp_ignore=1
tuxco# sysctl -w net/ipv4/conf/eth0/arp_announce=1
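To confirm that both settings took effect, the values can be read back from procfs. The following is a minimal sketch and not part of the original guide: the function name check_arp_sysctls and its first argument (the sysctl tree to inspect, normally /proc/sys) are assumptions made for illustration.

```shell
#!/bin/sh
# check_arp_sysctls ROOT IFNAME   (hypothetical helper)
# Reports whether arp_ignore and arp_announce are both set to 1 for IFNAME.
# ROOT is the sysctl tree to inspect, normally /proc/sys.
# Returns 0 if both values are 1, and 1 otherwise.
check_arp_sysctls() {
    root=$1
    ifname=$2
    status=0
    for key in arp_ignore arp_announce; do
        file="$root/net/ipv4/conf/$ifname/$key"
        if [ ! -r "$file" ]; then
            echo "$key: not readable ($file)"
            status=1
            continue
        fi
        value=$(cat "$file")
        if [ "$value" = "1" ]; then
            echo "$key: ok"
        else
            echo "$key: is $value, expected 1"
            status=1
        fi
    done
    return "$status"
}
```

On the example host this would be called as check_arp_sysctls /proc/sys eth0.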
10.1.2. Multiple Interfaces on One Subnet
If you have multiple interfaces on the same subnet, you may also want to enable arp_filter:
tuxco# sysctl -w net/ipv4/conf/eth0/arp_filter=1
This prevents the ARP entry for an interface from fluctuating between two or more MAC addresses. However, you need to use source routing to make this work correctly. From the Documentation/networking/ip-sysctl-2.6.txt file in the kernel source:
[ … ]
1 - Allows you to have multiple network interfaces on the same subnet, and have the ARPs for each interface be answered based on whether or not the kernel would route a packet from the ARP'd IP out that interface (therefore you must use source based routing for this to work). In other words it allows control of which cards (usually 1) will respond to an arp request.
[ … ]
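The quoted documentation requires source-based routing but does not show what that looks like. As a rough sketch, assuming iproute2 is available, each interface can get its own routing table that is selected by source address. The function name, the table number and the addresses in the usage note are hypothetical. The function only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# print_source_routing IFNAME ADDRESS SUBNET TABLE   (hypothetical helper)
# Prints (does not run) the iproute2 commands that make traffic sourced
# from ADDRESS leave through IFNAME, using a dedicated routing table.
print_source_routing() {
    ifname=$1
    address=$2
    subnet=$3
    table=$4
    # Route for the shared subnet, bound to this interface and source address.
    echo "ip route add $subnet dev $ifname src $address table $table"
    # Policy rule: packets sourced from ADDRESS consult that table.
    echo "ip rule add from $address table $table"
}
```

For two interfaces on the same subnet, it would be called once per interface with a different table number each time, for example print_source_routing eth0 192.168.1.1 192.168.1.0/24 100.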
10.2. IPv4 ARP Cache Timeout
The ARP cache timeout on Linux-based routers should be changed from the default, especially if you have a large number of peers. This parameter can be tuned by setting the appropriate procfs variable through the sysctl interface. The Linux arp(7) manual says:
[ … ]
ARP supports a sysctl interface to configure parameters on a global or per-interface basis. The sysctls can be accessed by reading or writing the /proc/sys/net/ipv4/neigh/*/* files or with the sysctl(2) interface. Each interface in the system has its own directory in /proc/sys/net/ipv4/neigh/. The setting in the ‘default’ directory is used for all newly created devices. Unless otherwise specified time related sysctls are specified in seconds.
[ … ]
Once a neighbour has been found, the entry is considered to be valid for at least a random value between base_reachable_time/2 and 3*base_reachable_time/2. An entry's validity will be extended if it receives positive feedback from higher level protocols. Defaults to 30 seconds.
This means that Linux systems keep ARP entries in their cache for some time between 15 and 45 seconds (and yes, the average works out to 30 seconds). This is not very high. In fact, it is lower than the typical BGP KEEPALIVE interval and may thus result in excessive ARPs.
We suggest a timeout of at least two hours for ARP entries on your AMS-IX interface. Because an entry is only guaranteed to stay valid for base_reachable_time/2, you have to set base_reachable_time to 2 x 2 hours = 4 hours (14400 seconds).
tuxco1# sysctl net.ipv4.neigh.ifname.base_reachable_time
net.ipv4.neigh.ifname.base_reachable_time = 30
The above command tells you that the ARP cache timeout averages 30 seconds. To change it so that it lies between 2 and 6 hours, use the following command:
tuxco1# sysctl -w net.ipv4.neigh.ifname.base_reachable_time=14400
net.ipv4.neigh.ifname.base_reachable_time = 14400
Here ifname is the name of the interface that connects to AMS-IX. You can also use “default” here, but that may have undesired side-effects for your other interfaces.
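The arithmetic behind these numbers can be checked with a few lines of shell. This is only an illustrative sketch based on the range given in the arp(7) excerpt above (base_reachable_time/2 to 3*base_reachable_time/2). The function names are made up for this example.

```shell
#!/bin/sh
# reachable_range BASE   (hypothetical helper)
# Prints the lower and upper bound, in seconds, of the validity period
# for a given base_reachable_time: BASE/2 and 3*BASE/2.
reachable_range() {
    base=$1
    echo "$((base / 2)) $((3 * base / 2))"
}

# base_for_minimum MIN   (hypothetical helper)
# Prints the base_reachable_time needed so that entries stay valid
# for at least MIN seconds (the lower bound is BASE/2, so BASE = 2*MIN).
base_for_minimum() {
    min=$1
    echo "$((2 * min))"
}
```

With the default, reachable_range 30 prints 15 45. With the suggested value, reachable_range 14400 prints 7200 21600, which is 2 to 6 hours, and base_for_minimum 7200 prints 14400.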
10.3. IPv6 Neighbor Cache Timeout
As with the IPv4 ARP cache, Linux systems set the lifetime of IPv6 neighbor cache entries quite short by default. The lifetime is controlled in the same way as for IPv4 ARP:
tuxco1# sysctl net.ipv6.neigh.ifname.base_reachable_time
net.ipv6.neigh.ifname.base_reachable_time = 30
tuxco1# sysctl -w net.ipv6.neigh.ifname.base_reachable_time=14400
net.ipv6.neigh.ifname.base_reachable_time = 14400
10.4. RP Filter Setting
You may need to turn off the Reverse Path Filter (rp_filter) functionality on a Linux-based router to allow asymmetric routing, particularly on your WAN interface.
To disable the RP filter:
tuxco1# sysctl -w net.ipv4.conf.ifname.rp_filter=0
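The ip-sysctl documentation shipped with current kernels states that the larger of conf/all/rp_filter and conf/ifname/rp_filter is used for source validation on an interface. On such kernels a non-zero value in conf/all can keep the filter active even after the command above. The sketch below assumes that rule and reports the resulting value. The function name and the ROOT argument (normally /proc/sys) are hypothetical, and older kernels may behave differently.

```shell
#!/bin/sh
# effective_rp_filter ROOT IFNAME   (hypothetical helper)
# Prints the larger of conf/all/rp_filter and conf/IFNAME/rp_filter.
# ROOT is the sysctl tree to inspect, normally /proc/sys.
# Assumption: the kernel applies the maximum of the two values.
effective_rp_filter() {
    root=$1
    ifname=$2
    all=$(cat "$root/net/ipv4/conf/all/rp_filter") || return 1
    own=$(cat "$root/net/ipv4/conf/$ifname/rp_filter") || return 1
    if [ "$all" -gt "$own" ]; then
        echo "$all"
    else
        echo "$own"
    fi
}
```

If effective_rp_filter /proc/sys ifname prints anything other than 0, the setting in conf/all is worth checking as well.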
10.5. Running the “sysctl” Commands at Boot
The various system parameters discussed above can be set at boot time by adding them to a file such as /etc/sysctl.conf. The exact name, location and very existence of this file typically depend on the Linux distribution in use, but both Debian and Red Hat/Fedora use /etc/sysctl.conf:
# file: /etc/sysctl.conf
# These settings should be duplicated for all interfaces that are
# on a peering LAN.

### Typical stuff you really want on a router

# Fix the "promiscuous ARP" thing...
net/ipv4/conf/ifname/arp_ignore=1
net/ipv4/conf/ifname/arp_announce=1

# Turn off RP filtering to allow asymmetric routing:
net/ipv4/conf/ifname/rp_filter=0

# Multiple (non-aggregated) interfaces on the same peering LAN.
# READ THE MANUAL FIRST!
#net/ipv4/conf/ifname/arp_filter=1

### Keep the AMS-IX ARP Police happy. :-)
net/ipv4/neigh/ifname/base_reachable_time=14400
net/ipv6/neigh/ifname/base_reachable_time=14400
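Since these settings have to be duplicated for every interface on a peering LAN, it can help to generate the lines instead of copying them by hand. The following is a small sketch of that idea. The function name is hypothetical, and it deliberately leaves out the commented arp_filter line.

```shell
#!/bin/sh
# peering_sysctl_conf IFNAME   (hypothetical helper)
# Prints the active sysctl.conf lines from the listing above for one
# interface, with "ifname" replaced by IFNAME.
peering_sysctl_conf() {
    ifname=$1
    cat <<EOF
net/ipv4/conf/$ifname/arp_ignore=1
net/ipv4/conf/$ifname/arp_announce=1
net/ipv4/conf/$ifname/rp_filter=0
net/ipv4/neigh/$ifname/base_reachable_time=14400
net/ipv6/neigh/$ifname/base_reachable_time=14400
EOF
}
```

Redirecting the output of peering_sysctl_conf eth0 to a scratch file first makes it possible to review the lines before merging them into /etc/sysctl.conf.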
Modules must be loaded before sysctl is executed
On Debian systems, kernel modules for some network interfaces (e.g. 10GE cards) are not loaded before the init process executes the script that runs the sysctl commands. In those cases, it is necessary to force the module to be loaded earlier. The same goes for the IPv6 settings; the ipv6 module is usually not loaded until the network interfaces are brought up, which is typically after the sysctl variables are set by the procps.sh script. (On Red Hat/Fedora systems no action needs to be taken; the /etc/init.d/network script automatically (re-)sets the sysctl variables before and after bringing up the interfaces.) There are a few ways around this:
10.6. Linux Aggregated Links
Enable bonding driver support in the kernel (CONFIG_BONDING=m)
Edit /etc/modules to load the bonding driver on boot:
bonding miimon=100
The miimon parameter specifies the MII link-monitoring interval in milliseconds, i.e. how often the link state of each physical interface is checked.
Install the ifenslave package (apt-get install ifenslave). This package provides the /sbin/ifenslave tool, which is used to attach physical interfaces to the bonding interface.
Add the bonding interface to /etc/network/interfaces:
# Ams-IX side
auto bond0
iface bond0 inet static
    address 195.69.14x.y
    netmask 255.255.254.0
    post-up /sbin/ifenslave bond0 eth0 eth1
The above example creates a bonding interface with two physical interfaces.
For more information see the file Documentation/networking/bonding.txt in the kernel source tree.
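Once the bond is up, the bonding driver exposes its state in /proc/net/bonding/bond0. The sketch below is not part of the original guide. It pulls the link status of each physical interface out of that file, assuming the usual layout in which every "Slave Interface:" line is followed by an "MII Status:" line. The function name is hypothetical.

```shell
#!/bin/sh
# bond_slave_status FILE   (hypothetical helper)
# Prints "<interface> <MII status>" for every slave listed in FILE,
# which is expected to have the layout of /proc/net/bonding/bond0.
bond_slave_status() {
    awk '
        # Remember the slave name until its status line shows up.
        /^Slave Interface:/ { slave = $3; next }
        # The first "MII Status" line belongs to the bond itself and is
        # skipped because no slave name has been seen yet.
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    ' "$1"
}
```

For the example configuration above, bond_slave_status /proc/net/bonding/bond0 would be expected to list eth0 and eth1 with their current link state.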


