SUSE Support

Cluster SBD partition fails from mismatched metadata

This document (7010933) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension 15 (HAE)
SUSE Linux Enterprise High Availability Extension 12 (HAE)
SUSE Linux Enterprise High Availability Extension 11 (HAE)
STONITH Block Device (SBD) partition

Situation

The cluster nodes do not fence properly. Three SBD partitions are configured:
# cat /etc/sysconfig/sbd
SBD_DEVICE="/dev/disk/by-id/scsi-360014054cbc289648e646bba32e4e59b;/dev/disk/by-id/scsi-3600224800a49a62b9e0c800cd1bccec5;/dev/disk/by-id/scsi-360022480336fb6b910671b436698eaab"
SBD_OPTS="-W"
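Dumping the header of each SBD partition shows that the metadata is not identical. The watchdog timeout on the second device is 25, while it is 5 on the other two: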
# /usr/sbin/sbd -d /dev/disk/by-id/scsi-360014054cbc289648e646bba32e4e59b dump
==Dumping header on disk /dev/disk/by-id/scsi-360014054cbc289648e646bba32e4e59b
Header version     : 2
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 10
==Header on disk /dev/disk/by-id/scsi-360014054cbc289648e646bba32e4e59b is dumped

# /usr/sbin/sbd -d /dev/disk/by-id/scsi-3600224800a49a62b9e0c800cd1bccec5 dump
==Dumping header on disk /dev/disk/by-id/scsi-3600224800a49a62b9e0c800cd1bccec5
Header version     : 2
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 25
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 10
==Header on disk /dev/disk/by-id/scsi-3600224800a49a62b9e0c800cd1bccec5 is dumped

# /usr/sbin/sbd -d /dev/disk/by-id/scsi-360022480336fb6b910671b436698eaab dump
==Dumping header on disk /dev/disk/by-id/scsi-360022480336fb6b910671b436698eaab
Header version     : 2
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 10
==Header on disk /dev/disk/by-id/scsi-360022480336fb6b910671b436698eaab is dumped
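
Comparing the dumps by eye becomes error-prone with long device names. The following is a minimal sketch, not a SUSE-provided tool, that automates the comparison. It assumes that /etc/sysconfig/sbd can be sourced as shell and that the `Timeout (...)` lines look like those shown above. The function names `sbd_timeouts` and `check_sbd_headers` are hypothetical helpers introduced for illustration only.

```shell
#!/bin/sh
# Sketch: compare the timeout values in the headers of all configured SBD devices.

# Reduce `sbd ... dump` output (read from stdin) to lines such as "watchdog=5".
sbd_timeouts() {
    sed -n 's/^Timeout (\([a-z]*\)) *: *\([0-9]*\).*$/\1=\2/p'
}

# Compare every device listed in SBD_DEVICE against the first readable one.
# Exits 0 if all headers agree, 1 otherwise. The body runs in a subshell, so
# the sourced variables and the IFS change do not leak into the caller.
check_sbd_headers() (
    conf="${1:-/etc/sysconfig/sbd}"
    SBD_DEVICE=""
    . "$conf" || exit 2
    status=0
    first=""
    reference=""
    IFS=';'
    set -f    # no glob expansion while splitting the list on ';'
    for dev in $SBD_DEVICE; do
        current=$(sbd -d "$dev" dump | sbd_timeouts)
        if [ -z "$current" ]; then
            echo "NO HEADER: $dev"
            status=1
            continue
        fi
        if [ -z "$first" ]; then
            first=$dev
            reference=$current
        elif [ "$current" != "$reference" ]; then
            echo "MISMATCH: $dev differs from $first:"
            echo "$current"
            status=1
        fi
    done
    exit "$status"
)
```

Run as root on a cluster node, `check_sbd_headers` with no argument reads /etc/sysconfig/sbd. With the headers shown above it would be expected to flag the second device, because only its watchdog timeout differs.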

Resolution

Recreate the SBD partitions so that all of them carry identical metadata, then perform a rolling restart of the cluster nodes.  For example:

1. Display the current SBD partitions:
# cat /etc/sysconfig/sbd
SBD_DEVICE="/dev/disk/by-id/scsi-360014054cbc289648e646bba32e4e59b;/dev/disk/by-id/scsi-3600224800a49a62b9e0c800cd1bccec5;/dev/disk/by-id/scsi-360022480336fb6b910671b436698eaab"
SBD_OPTS="-W"
2. Reformat the SBD partition on each device listed:
# sbd -d /dev/disk/by-id/scsi-360014054cbc289648e646bba32e4e59b -d /dev/disk/by-id/scsi-3600224800a49a62b9e0c800cd1bccec5 -d /dev/disk/by-id/scsi-360022480336fb6b910671b436698eaab create
3. Reboot one node in the cluster.  When it comes back online, reboot another node, repeating the process until each node in the cluster has been rebooted.
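
Typing three long device paths by hand invites mistakes. As a hedged sketch, the command from step 2 can be assembled from the configured device list. The helper name `sbd_create_command` is hypothetical, and the sketch assumes /etc/sysconfig/sbd can be sourced as shell. It only prints the command and does not run it, because `create` overwrites the existing header on every device named.

```shell
#!/bin/sh
# Sketch: print (do not run) one `sbd ... create` command that covers every
# device listed in SBD_DEVICE.

sbd_create_command() (
    conf="${1:-/etc/sysconfig/sbd}"
    SBD_DEVICE=""
    . "$conf" || exit 2
    if [ -z "$SBD_DEVICE" ]; then
        echo "no SBD_DEVICE set in $conf" >&2
        exit 1
    fi
    cmd="sbd"
    IFS=';'
    set -f    # no glob expansion while splitting the list on ';'
    for dev in $SBD_DEVICE; do
        cmd="$cmd -d $dev"
    done
    echo "$cmd create"
)
```

Review the printed command before running it as root. Creating all devices in a single invocation, as in step 2, gives every header the same values.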

Cause

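The SBD partition metadata did not match across devices: one partition had a watchdog timeout of 25, while the other two had a watchdog timeout of 5.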

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7010933
  • Creation Date: 15-Oct-2012
  • Modified Date: 28-Jun-2023
    • SUSE Linux Enterprise High Availability Extension

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com
