Automatically Restart a Service if it Crashes | SUSE Communities


We all run into services that die at random. The proper fix, such as applying a patch, rebuilding the server or waiting for a service window, can take some time. Being able to quickly set up a watchdog for that service in the meantime makes our life as admins so much better. The following solution is simple, quick and works in most cases. I have it in production use right now with very good results.


The solution I use isn’t my own invention, but I really like its simplicity. It’s basically just a shell script called from cron. The script checks on the service and restarts it if it has crashed. That saves the users on our network from loads of grief.
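As a minimal sketch of the cron side, the snippet below builds an entry that runs the watchdog every five minutes. The script path, the five-minute interval and the file name under /etc/cron.d are assumptions for illustration, not something taken from my setup; adjust them to your system.

```shell
#!/bin/sh
# Hypothetical location of the watchdog script; use wherever you saved it.
WATCHDOG=/usr/local/sbin/service-watchdog.sh

# System crontab format: minute hour day month weekday user command.
# This entry runs the watchdog every five minutes as root.
CRON_LINE="*/5 * * * * root $WATCHDOG"
echo "$CRON_LINE"

# To install it, write the line to a file in /etc/cron.d (file name assumed):
#   echo "$CRON_LINE" > /etc/cron.d/service-watchdog
```

A shorter interval means a crashed service is noticed sooner, at the cost of running the check more often.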


This is what a sample script for LUM on SLED10 looks like:


#!/bin/sh
MYPROC=namcd   # The name of the process
INITS=namcd    # The name of the /etc/init.d/ file

# Count the occurrences of the command $MYPROC. If it is not running, this gives 0.
COUNT=$(UNIX95=1 ps -C "$MYPROC" -o pid= -o args= | wc -l)
if [ "$COUNT" -lt 1 ]   # Checks whether the service seems to be running or not
then
    /etc/init.d/$INITS start   # The command to start the service
fi
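If several services need watching, the same check can be wrapped in a function. The following is only a sketch under a few assumptions: the function name ensure_running is made up for this example, it uses a simplified ps call (only -o pid=), and it relies on the Linux procps ps that understands -C.

```shell
#!/bin/sh
# ensure_running PROCESS COMMAND...
# Runs COMMAND if no process named PROCESS is found by ps.
# (Hypothetical helper, not part of any SUSE tool.)
ensure_running() {
    proc=$1
    shift
    # One line of output per matching process; 0 lines means not running.
    count=$(ps -C "$proc" -o pid= 2>/dev/null | wc -l)
    if [ "$count" -lt 1 ]
    then
        echo "$proc is not running, starting it"
        "$@"
    fi
}

# Example call, using the service from the script above:
#   ensure_running namcd /etc/init.d/namcd start
```

When run from cron, anything the function prints is normally mailed to the owner of the crontab, so the echo doubles as a simple restart notification.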

If we want to check for an open port, we get a script that looks like this:

#!/bin/sh
# The port. The leading : makes it easy to snag only ports and not other
# numbers in the output.
PORT=:445
INITS=samba   # The name of the service in /etc/init.d/
COUNT=$(netstat -lpn | grep "$PORT" | wc -l)
if [ "$COUNT" -lt 1 ]
then
    /etc/init.d/$INITS start
fi
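One caveat with a plain grep is that :445 also matches a longer port such as :4450. A stricter check could compare the port at the end of the local address column instead. This is a sketch only: the function name port_in_listing is made up, it assumes the usual netstat layout with the local address in the fourth column, and the sample lines stand in for real netstat output.

```shell
#!/bin/sh
# port_in_listing PORT
# Reads netstat-style lines on stdin and succeeds if any local address
# (4th column) ends in exactly :PORT. (Hypothetical helper.)
port_in_listing() {
    awk -v port="$1" '
        { n = split($4, parts, ":") }              # e.g. 0.0.0.0:445 or :::445
        n > 1 && parts[n] == port { found = 1 }    # last piece is the port
        END { exit (found ? 0 : 1) }
    '
}

# Illustrative sample data, not real output from my servers.
sample='tcp        0      0 0.0.0.0:4450            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:139             0.0.0.0:*               LISTEN'

if printf '%s\n' "$sample" | port_in_listing 445
then
    echo "port 445 is open"
else
    echo "port 445 is closed"   # 4450 does not count as 445
fi
```

In the watchdog itself the function would be fed live data, for example with netstat -ltn | port_in_listing 445, and the init script started when the check fails.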

We can also change what happens when we find out the service isn’t running. For example, with GroupWise we probably want to add a command after “then” to remove the leftover pid file:

rm /var/run/novell/groupwise/ 

The file to remove in that directory is named after the service that has crashed. If the leftover pid file stays in place, the agent won’t restart.
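A slightly more careful variant removes the pid file only when the process it points to is really gone. This goes beyond the plain rm above and is just a sketch: the function name remove_stale_pidfile is made up, the demo uses a throwaway temporary file rather than a real GroupWise path, and the kill -0 test assumes the script runs as root (as it would from root's cron).

```shell
#!/bin/sh
# remove_stale_pidfile FILE
# Deletes FILE if it exists and the PID stored in it is not a live process.
# Returns non-zero if the process is still alive and the file was kept.
# (Hypothetical helper.)
remove_stale_pidfile() {
    pidfile=$1
    [ -e "$pidfile" ] || return 0          # nothing left over
    pid=$(cat "$pidfile" 2>/dev/null)
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
    then
        return 1                           # still running, keep the file
    fi
    rm -f "$pidfile"
}

# Demo with a temporary file and a PID far above any real process ID.
demo=$(mktemp)
echo 999999999 > "$demo"
remove_stale_pidfile "$demo" && echo "stale pid file removed"
```

In the watchdog, the call would go right after “then” and before the init script is started.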


This script should work on any SUSE version.
