SBD STONITH fails to fence the other node when using the fully qualified domain name.

This document (000019877) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension 15 SP1
SUSE Linux Enterprise High Availability Extension 15 SP2
SUSE Linux Enterprise High Availability Extension 12 SP4
SUSE Linux Enterprise High Availability Extension 12 SP5

Situation

SBD STONITH fails to resolve the name of the node to fence when an FQDN (Fully Qualified Domain Name) is used in the cluster configuration.
Errors from /var/log/messages on <node1>:
node1 stonith-ng[3008]:   notice: Couldn't find anyone to fence (reboot) node2.example.com with any device
node1 stonith-ng[3008]:    error: Operation reboot of node2.example.com by <no-one> for crmd.3012@node2.example.com: No such device
node1 crmd[3012]:   notice: Stonith operation 2/1:0:0:edac53d5-64ec-4650-a447-5aa2a5fc004a: No such device (-19)
node1 crmd[3012]:   notice: Stonith operation 2 for node2.example.com failed (No such device): aborting transition.
node1 crmd[3012]:  warning: No devices found in cluster to fence node2.example.com, giving up

 

Resolution

Normally, it is recommended to use the DNS short name, the IP address, or the value returned by "uname -n".

Preferred Solution:
Modify /etc/corosync/corosync.conf to use the DNS short name or IP address under
     nodelist --> node --> ring0_addr

If the short name is used rather than the IP address, it is also recommended to add an entry to /etc/hosts to eliminate the dependency on an external DNS server.
Example:
nodelist {
        node {
                ring0_addr: node1
                nodeid: 1
        }

        node {
                ring0_addr: node2
                nodeid: 2
        }
}
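
One way to confirm that the short names used for ring0_addr are covered by /etc/hosts, as recommended above, is a small check like the one below. This is only a sketch: the function name hosts_has_name is made up for this example, and node1 and node2 are the example names from this document.

```shell
# hosts_has_name FILE NAME  (hypothetical helper, not part of any SUSE tool)
# Succeeds if NAME appears as a hostname or alias on a non-comment line of FILE.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                                        # strip comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }   # field 1 is the address
        END { exit !found }
    ' "$1"
}

# Check the example node names against the local hosts file.
for n in node1 node2; do
    if hosts_has_name /etc/hosts "$n"; then
        echo "$n: present in /etc/hosts"
    else
        echo "$n: missing from /etc/hosts"
    fi
done
```

The check only looks at the file itself; "getent hosts <name>" can be used in addition to see what the resolver actually returns.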

Although FQDNs (Fully Qualified Domain Names) are supported by Pacemaker, Corosync and SBD, they require the additional configuration outlined in the Optional Solution below.

Optional Solution:
If an FQDN is used for ring0_addr in /etc/corosync/corosync.conf, follow these steps:
  1.  Remove /etc/sysconfig/sbd from /etc/csync2/csync2.cfg so that it is not synchronized across cluster nodes.  This file must be managed outside of csync2 because it will differ on each node.
  2.  Use the "-n node" option for SBD.      Reference: sbd(8) manual page
      Set SBD_OPTS="-n <FQDN of node1>" in /etc/sysconfig/sbd on the first node.
      Set SBD_OPTS="-n <FQDN of node2>" in /etc/sysconfig/sbd on the second node.
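
Because the SBD_OPTS line differs on each node, it can help to generate it rather than type it by hand. The following is a minimal sketch; the function name sbd_opts_line is hypothetical and node1.example.com is only an example FQDN. It prints the line and does not modify /etc/sysconfig/sbd.

```shell
# sbd_opts_line NODE_NAME  (hypothetical helper)
# Prints the SBD_OPTS assignment for /etc/sysconfig/sbd for the given node name.
sbd_opts_line() {
    printf 'SBD_OPTS="-n %s"\n' "$1"
}

# Example FQDN; replace with the value used for this node's ring0_addr.
sbd_opts_line node1.example.com
```

On a real node the argument could be "$(hostname -f)", assuming that command returns exactly the FQDN configured as ring0_addr for that node.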
 

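Cause

By default, SBD identifies the local node by the value of "uname -n", whereas Pacemaker takes the node name from ring0_addr in corosync.conf when that value is not an IP address. If ring0_addr holds an FQDN and "uname -n" returns a different value (for example, the short name), the two names do not match and no fencing device is found for the FQDN, as shown in the log messages above.
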

Additional Information

The Pacemaker documentation [1] explains how the cluster manager obtains node names:
The name Pacemaker uses is:
1) The value stored in corosync.conf under ring0_addr in the nodelist, if it does not contain an IP address; otherwise
2) The value stored in corosync.conf under name in the nodelist; otherwise
3) The value of uname -n
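
The three rules above can be illustrated with a short shell sketch. The function names are hypothetical, and the IP address check is a deliberate simplification (a dotted quad, or anything containing a colon) rather than Pacemaker's own logic.

```shell
# is_ip VALUE  (hypothetical helper)
# Rough check for an IPv4 or IPv6 literal; a simplification for illustration.
is_ip() {
    case "$1" in
        *:*) return 0 ;;                                   # treat anything with ":" as IPv6
    esac
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+){3}$'  # dotted-quad IPv4
}

# pacemaker_node_name RING0_ADDR NAME UNAME_N  (hypothetical helper)
# Applies the three documented rules in order and prints the resulting name.
pacemaker_node_name() {
    ring0_addr=$1 name=$2 uname_n=$3
    if [ -n "$ring0_addr" ] && ! is_ip "$ring0_addr"; then
        printf '%s\n' "$ring0_addr"     # rule 1: ring0_addr, if not an IP address
    elif [ -n "$name" ]; then
        printf '%s\n' "$name"           # rule 2: name from the nodelist
    else
        printf '%s\n' "$uname_n"        # rule 3: uname -n
    fi
}

pacemaker_node_name node2.example.com "" node2   # prints node2.example.com
pacemaker_node_name 192.168.1.2 "" node2         # prints node2
```

The first call shows the situation in this document: with an FQDN in ring0_addr, Pacemaker uses the FQDN even though "uname -n" returns the short name.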

Beware: If the output of `uname -n` does not match the name of the node in the cluster configuration, you must pass the advertised name to SBD with the `-n` option.
Example: SBD_OPTS="-n <FQDN host name>"

The sbd(8) manual page describes this option as follows:
    -n node
        Set local node name; defaults to "uname -n". This should not need to be set.
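
To decide whether the `-n` option is needed on a given node, the name the cluster uses can be compared with the output of `uname -n`. The sketch below assumes a hypothetical helper named needs_sbd_n and uses the example names from this document.

```shell
# needs_sbd_n CLUSTER_NAME UNAME_N  (hypothetical helper)
# Succeeds (exit 0) when the two names differ, i.e. when "-n" would be needed.
needs_sbd_n() {
    [ "$1" != "$2" ]
}

# On a cluster node the real values could be passed in, for example:
#   needs_sbd_n "$(crm_node -n)" "$(uname -n)"
if needs_sbd_n node2.example.com node2; then
    echo 'names differ: set SBD_OPTS="-n node2.example.com"'
fi
```

"crm_node -n" prints the name the cluster uses for the local node, so it is a convenient source for the first argument when Pacemaker is running.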

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019877
  • Creation Date: 11-Feb-2021
  • Modified Date: 12-Feb-2021
  • SUSE Linux Enterprise High Availability Extension
