This document suggests, as a guideline, a method that gives enterprise servers a basic level of IP “dead-gateway detection” and adjusts their routing tables accordingly.
In today’s complicated world of network computing, the need for multiple network interfaces in any given enterprise-level server system is obvious. Especially in a “five-nines” environment, disaster recovery sites must be able to power up in an emergency and run just as the former production sites did. However, most network operating systems and network drivers do not provide any form of “dead-gateway detection”.
Case in point: a customer has SUSE Linux Enterprise Server 9 servers distributed throughout their production environment. These servers are cloned on a regular basis and re-centralized at a disaster recovery site in case something goes awry in production. The problem is that the production IP addressing scheme is vastly different from that of the disaster recovery site. Without some sort of “dead-gateway detection”, these systems cannot communicate with the rest of the world when they come online. One would think that simply adding another interface configuration would suffice; however, most enterprise servers don’t have DHCP-assigned host addresses, so the routing table gets messy in this situation.
Here’s the solution: as each network interface is brought up, test whether or not the associated gateway is reachable. If it is, leave the interface up and add that interface’s gateway to the routing table. If it is not, bring the interface down to avoid routing confusion. The solution is basically accomplished through simple scripting.
First, each network interface’s start-up script should be modified. These are located under /etc/sysconfig/network, and the colon-separated id in each file name is the MAC address of the corresponding interface. They should be modified by adding a POST_UP_SCRIPT parameter (see man 5 ifcfg). For example:
BOOTPROTO='static'
MTU=''
REMOTE_IPADDR=''
STARTMODE='onboot'
UNIQUE='XXXX.xxXxXXxxXXX'
_nm_name='bus-pci-0000:00:12.0'
BROADCAST='10.0.0.255'
IPADDR='10.0.0.129'
NETMASK='255.255.255.0'
NETWORK='10.0.0.0'
POST_UP_SCRIPT='postup-eth-id-xx:xx:xx:xx:xx:xx'
By default, the system looks for post_up_scripts in /etc/sysconfig/network/scripts. If a different location is desired, then the full path should be used in the parameter.
Next, the post_up_script should perform a simple test of the gateway’s connectivity and, based on the return code, either leave the interface up or bring it down. An example of this type of script could look like:
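#!/bin/sh
# Send five pings to this interface's gateway.
ping -c 5 10.0.0.2
# A non-zero return code means the gateway did not answer.
if [ "$?" -ne 0 ]
then
    /sbin/ifdown ethx
fi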
Notice the ethx specification. This should be modified according to which interface is being tested, e.g., eth0 or eth1.
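The hard-coded interface and gateway in the example script can also be factored into parameters, so that one function serves every interface. The following is only a sketch under stated assumptions: the names check_gateway, PING_CMD, and IFDOWN_CMD are hypothetical and do not come from SUSE, and the two command variables exist solely so the logic can be exercised in a dry run without touching a real interface.

```shell
#!/bin/sh
# Sketch only: check_gateway, PING_CMD and IFDOWN_CMD are hypothetical names.
# The defaults mirror the commands used in the example script.
: "${PING_CMD:=ping -c 5}"
: "${IFDOWN_CMD:=/sbin/ifdown}"

# check_gateway <interface> <gateway>
# Leaves the interface up if the gateway answers, otherwise brings it down.
check_gateway() {
    iface=$1
    gateway=$2
    # $PING_CMD is unquoted on purpose so "ping -c 5" splits into words.
    if $PING_CMD "$gateway" >/dev/null 2>&1; then
        echo "$iface: gateway $gateway reachable, leaving interface up"
    else
        echo "$iface: gateway $gateway unreachable, bringing interface down"
        $IFDOWN_CMD "$iface"
    fi
}

# Dry run: "true" stands in for a successful ping, "echo" for ifdown.
PING_CMD=true
IFDOWN_CMD=echo
check_gateway eth1 10.0.0.2
```

In a real post_up_script the three dry-run lines would be replaced by a single call such as check_gateway eth1 10.0.0.2, with the interface name and gateway address adjusted for each interface.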
Finally, the /etc/sysconfig/network/routes file should be moved, renamed, or deleted. Then, interface-specific routes files should replace it (see man 5 routes). One should exist for each interface configured with a post_up_script. Each file should be named, again, with the colon-separated id that is the MAC address of the corresponding interface. Such a file could look something like:
default 10.0.0.2 0.0.0.0 eth1
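For readers unfamiliar with the routes file layout, the four columns in the line above are destination, gateway, netmask, and interface. The small helper below, whose name describe_route is hypothetical and used only for illustration, labels each column of such a line.

```shell
#!/bin/sh
# Illustration only: describe_route is a hypothetical helper.
# It splits one routes-file line into its four columns and labels them.
describe_route() {
    # Unquoted $1 is split on whitespace into the positional parameters.
    set -- $1
    printf 'destination=%s gateway=%s netmask=%s interface=%s\n' \
        "$1" "$2" "$3" "$4"
}

describe_route "default 10.0.0.2 0.0.0.0 eth1"
```

Here the destination "default" combined with the netmask 0.0.0.0 makes 10.0.0.2 the default gateway, reached through eth1.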
Once completed, rcnetwork restart can be run to test the scripts. During startup of each interface, pings should be sent to that interface’s gateway. If the gateway is alive, the interface should remain active and the routing table should have that interface’s gateway as default. If the gateway is dead, the interface should deactivate.
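One way to confirm the result is to look for the default route in the kernel routing table. The sketch below assumes the net-tools route command is available and filters its route -n output for the default entry. The function name default_route is hypothetical, and the sample input is illustrative data in the layout of route -n, not output captured from a real system.

```shell
#!/bin/sh
# Sketch only: default_route is a hypothetical helper name.
# Reads `route -n` style text on stdin and prints "<gateway> <interface>"
# for every default route (destination 0.0.0.0 with the G flag set).
default_route() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2, $8 }'
}

# Illustrative sample in the layout of `route -n`.
default_route <<'EOF'
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 eth1
0.0.0.0         10.0.0.2        0.0.0.0         UG    0      0        0 eth1
EOF
```

On a live system the same check would be run as route -n | default_route. An empty result would suggest that no gateway was found alive.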
Note: If both interfaces and both gateways are active, the last gateway tested will be the default.