iTCO_wdt does not accept a Watchdog Timeout greater than 63 seconds
This document (7011426) is provided subject to the disclaimer at the end of this document.
SUSE Linux Enterprise High Availability Extension 11 Service Pack 1
SUSE Linux Enterprise High Availability Extension 11 Service Pack 2
SUSE Linux Enterprise Server 10 Service Pack 4
SUSE Linux Enterprise High Availability Extension 11
jupiter:~ # sbd -d /dev/disk/by-id/scsi-mywatchdogdevice dump
==Dumping header on disk /dev/disk/by-id/scsi-mywatchdogdevice
Header version : 2
Number of slots : 255
Sector size : 512
Timeout (watchdog) : 65
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 130
However, the log files show entries like these at cluster start:
Nov 22 12:33:32 jupiter sbd: : ERROR: WDIOC_SETTIMEOUT: Failed to set watchdog timer to 65 seconds.: Invalid argument
Nov 22 12:33:32 jupiter sbd: : CRIT: Please validate your watchdog configuration!
Nov 22 12:33:32 jupiter sbd: : CRIT: Choose a different watchdog driver or specify -T to silence this check if you are sure.
These entries indicate that the Watchdog Timeout value could not be passed to the hardware watchdog. This also means that the cluster is not protected by the watchdog, which is, as the log file states, a critical issue. It should be resolved as soon as possible.
This applies to all Watchdog Timeout values greater than 63 seconds when the iTCO_wdt hardware watchdog is used.
sbd -d /dev/disk/by-id/scsi-mywatchdogdevice -1 60 -4 120 create
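Recreating the SBD device with a Watchdog Timeout of at most 63 seconds, as in the command above (-1 60 sets the watchdog timeout, -4 120 sets the msgwait timeout), and then restarting all cluster nodes resolves the issue.
If a Watchdog Timeout greater than 63 seconds is really necessary, check whether another hardware watchdog is available.
As a last resort, softdog could be used.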
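The constraint behind the sbd create command above can be sketched as a small check. This is only an illustration, not part of sbd: the function name and the constant are hypothetical, the 63 second limit is the one stated in this document, and the rule that msgwait should be at least twice the watchdog timeout is an assumption read off the values used here (65/130 and 60/120).

```c
#include <stdbool.h>

/* Limit stated in this document for the iTCO_wdt hardware watchdog. */
#define ITCO_MAX_WATCHDOG_TIMEOUT 63u

/*
 * Hypothetical helper: returns true if the pair of SBD timeouts (in seconds)
 * fits the iTCO_wdt limit and keeps msgwait at least twice the watchdog
 * timeout (assumed ratio, taken from the examples in this document).
 */
static bool sbd_timeouts_fit_itco(unsigned int watchdog, unsigned int msgwait)
{
    if (watchdog == 0 || watchdog > ITCO_MAX_WATCHDOG_TIMEOUT)
        return false;
    return msgwait >= 2u * watchdog;
}
```

With the values from this document, the original configuration (65, 130) is rejected by this check, while the recreated device (60, 120) passes.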
/* iTCO_wdt rejects timer values that exceed the range of the TCO version */
if (((iTCO_wdt_private.iTCO_version == 2) && (tmrval > 0x3ff)) ||
    ((iTCO_wdt_private.iTCO_version == 1) && (tmrval > 0x03f)))
        return -EINVAL;
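The range check above can be restated as a standalone function to make the two limits explicit: 0x03f is 63 (a 6-bit value) for TCO version 1, and 0x3ff is 1023 (a 10-bit value) for TCO version 2. This is a minimal sketch, not driver code: the function and macro names are hypothetical, and it models only the comparison shown in the excerpt. How the driver derives tmrval from the requested timeout in seconds is not shown in this document and is not modeled here.

```c
#include <stdbool.h>

/* Upper bounds taken from the driver excerpt above. */
#define TCO_V1_MAX_TMRVAL 0x03fu  /* 63   */
#define TCO_V2_MAX_TMRVAL 0x3ffu  /* 1023 */

/*
 * Hypothetical helper mirroring the excerpt: returns false exactly when the
 * driver's condition would be true, that is, when the timer value is too
 * large for the given TCO version.
 */
static bool tco_tmrval_accepted(int tco_version, unsigned int tmrval)
{
    if (tco_version == 2 && tmrval > TCO_V2_MAX_TMRVAL)
        return false;
    if (tco_version == 1 && tmrval > TCO_V1_MAX_TMRVAL)
        return false;
    return true;
}
```

A value of 65 is therefore rejected for a version 1 device but would pass this particular comparison on a version 2 device, which is consistent with the "Invalid argument" error in the log above.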
This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7011426
- Creation Date: 27-NOV-12
- Modified Date: 29-NOV-12
- SUSE Linux Enterprise Server