All HAE nodes fail to start clustering after reboot

This document (7011302) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension 11 (HAE)
SUSE Linux Enterprise Server 11 (SLES)
Split Brain Detection (SBD) Partitions

Situation

The boot screen shows the following errors:

Starting OpenAIS/Corosync daemon (corosync): Starting SBD - SBD failed to start; aborting.
Failed services in runlevel 3: openais

The /etc/sysconfig/sbd configuration file on all nodes shows:
SBD_DEVICE="/dev/sdc1;/dev/sdd1;/dev/sde1"
SBD_OPTS="-W"

Querying the CIB fails:
# /usr/sbin/cibadmin -Q
Signon to CIB failed: connection failed
Init failed, could not perform requested operations

The stonith resource in /var/lib/heartbeat/crm/cib.xml shows:
<primitive class="stonith" id="stonith-sbd" type="external/sbd">
  <instance_attributes id="stonith-sbd-instance_attributes">
    <nvpair id="stonith-sbd-instance_attributes-sbd_device" name="sbd_device" value="/dev/sdb1;/dev/sdc1;/dev/sdd1"/>
  </instance_attributes>
</primitive>

Resolution

Make the device list identical in /etc/sysconfig/sbd on every node and in the CIB database. Which side to change depends on which one holds the correct list of SBD devices: use Method 1 if the CIB database is correct, or Method 2 if /etc/sysconfig/sbd is correct.
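
The comparison can be scripted. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and it assumes the name and value attributes of the sbd_device nvpair appear in that order on one line, as in the cib.xml excerpt above.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, file paths are passed as arguments.

# Print the value of SBD_DEVICE from a sysconfig-style file.
sysconfig_sbd_devices() {
    sed -n 's/^SBD_DEVICE="\(.*\)"[[:space:]]*$/\1/p' "$1"
}

# Print the value of the nvpair named sbd_device from a CIB XML file.
# Assumes name="..." is directly followed by value="..." on the same line.
cib_sbd_devices() {
    sed -n 's/.*name="sbd_device"[[:space:]]*value="\([^"]*\)".*/\1/p' "$1"
}

# Succeed if the CIB has no sbd_device list or if it equals SBD_DEVICE.
# $1 = sysconfig file, $2 = CIB XML file
sbd_lists_consistent() (
    cib_list=$(cib_sbd_devices "$2")
    [ -z "$cib_list" ] || [ "$cib_list" = "$(sysconfig_sbd_devices "$1")" ]
)

# Example call (not executed here):
#   sbd_lists_consistent /etc/sysconfig/sbd /var/lib/heartbeat/crm/cib.xml \
#       && echo "consistent" || echo "mismatch"
```

With the files shown in the Situation section, the check would report a mismatch, because the two lists name different devices.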



Method 1: When the CIB Database Is Correct
For example, the stonith resource in the /var/lib/heartbeat/crm/cib.xml shows:
<primitive class="stonith" id="stonith-sbd" type="external/sbd">
  <instance_attributes id="stonith-sbd-instance_attributes">
    <nvpair id="stonith-sbd-instance_attributes-sbd_device" name="sbd_device" value="/dev/sdb1;/dev/sdc1;/dev/sdd1"/>
  </instance_attributes>
</primitive>

1. On one node, modify the /etc/sysconfig/sbd file.
2. Change the SBD_DEVICE variable to match the CIB database.
SBD_DEVICE="/dev/sdb1;/dev/sdc1;/dev/sdd1"
SBD_OPTS="-W"
3. Save the /etc/sysconfig/sbd file and copy it to all other nodes in the cluster.
scp /etc/sysconfig/sbd node2:/etc/sysconfig/sbd

4. Recreate the sbd partitions as listed in the CIB database
sbd -d /dev/sdb1 -d /dev/sdc1 -d /dev/sdd1 create
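
To avoid retyping the device list, the -d arguments can be derived from the semicolon-separated list. This is a small sketch with a hypothetical helper name. It only prints the arguments and does not call sbd itself. Keep in mind that "create" writes new SBD metadata to each device.

```shell
#!/bin/sh
# Sketch only: sbd_device_args is a hypothetical helper, not part of the sbd package.

# Turn "dev1;dev2;dev3" into "-d dev1 -d dev2 -d dev3".
sbd_device_args() (
    IFS=';'
    set -f            # disable glob expansion while splitting the list
    set -- $1         # split the list on ';' into positional parameters
    out=""
    for dev in "$@"; do
        if [ -n "$dev" ]; then
            out="$out -d $dev"
        fi
    done
    printf '%s\n' "${out# }"
)

# Example calls (not executed here):
#   sbd $(sbd_device_args "/dev/sdb1;/dev/sdc1;/dev/sdd1") create
#   sbd $(sbd_device_args "/dev/sdb1;/dev/sdc1;/dev/sdd1") dump   # inspect the headers
```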



Method 2: When /etc/sysconfig/sbd Is Correct
The correct /etc/sysconfig/sbd shows:
SBD_DEVICE="/dev/sdc1;/dev/sdd1;/dev/sde1"
SBD_OPTS="-W"

1. Rename the /etc/sysconfig/sbd file to /etc/sysconfig/sbd.save on all nodes in the cluster.
mv /etc/sysconfig/sbd /etc/sysconfig/sbd.save

2. Reboot all nodes in the cluster
3. Delete the stonith resource, then recreate it either with the correct sbd_device list or with no sbd_device list at all.
Assuming a stonith resource named stonith-sbd:
crm_resource --delete --resource stonith-sbd --resource-type primitive

crm configure primitive stonith-sbd stonith:external/sbd params sbd_device="/dev/sdc1;/dev/sdd1;/dev/sde1"
-OR-
crm configure primitive stonith-sbd stonith:external/sbd

4. Rename the /etc/sysconfig/sbd.save back to /etc/sysconfig/sbd on all nodes in the cluster
mv /etc/sysconfig/sbd.save /etc/sysconfig/sbd

5. Reboot all nodes in the cluster
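
For step 3, the crm command can be generated from the saved configuration file so that the device list is copied rather than retyped. This is a hedged sketch: the helper name is hypothetical, the default resource name is an assumption, and the function only prints the command for review.

```shell
#!/bin/sh
# Sketch only: crm_sbd_primitive_cmd is a hypothetical helper.
# $1 = sysconfig file to read (for example /etc/sysconfig/sbd.save)
# $2 = stonith resource name (assumed default: stonith-sbd)
crm_sbd_primitive_cmd() (
    devices=$(sed -n 's/^SBD_DEVICE="\(.*\)"[[:space:]]*$/\1/p' "$1")
    id=${2:-stonith-sbd}
    if [ -z "$devices" ]; then
        echo "no SBD_DEVICE entry found in $1" >&2
        exit 1
    fi
    # Print the command instead of running it, so it can be checked first.
    printf 'crm configure primitive %s stonith:external/sbd params sbd_device="%s"\n' \
        "$id" "$devices"
)

# Example call (not executed here):
#   crm_sbd_primitive_cmd /etc/sysconfig/sbd.save
```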

Cause

The SBD_DEVICE list in /etc/sysconfig/sbd did not match the sbd_device list of the stonith resource in the Cluster Information Base (CIB) database. The two lists must match, or the CIB stonith resource must not specify an sbd_device list at all. Without an sbd_device list in the CIB database, clustering uses the SBD_DEVICE list in /etc/sysconfig/sbd.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7011302
  • Creation Date: 02-Nov-2012
  • Modified Date: 03-Mar-2020
    • SUSE Linux Enterprise High Availability Extension
    • SUSE Linux Enterprise Server

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback@suse.com
