IP cluster resource failed to fail over with a blocked status

This document (000020600) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server for SAP Applications 12 SP3 x86_64

Situation

In a two-node HANA cluster, the DC/primary node (node#1) rebooted abnormally. The IP resources (admin_ip and rsc_ip) then failed to start on the new DC/primary node (node#2) and were reported with a blocked status:
     Transition Summary:
 * Start      admin_ip             ( node2 )   blocked
 * Start      rsc_ip               ( node2 )   blocked

Resolution

The solution is to delay the fast reboot of the failed VM so that the surviving node has time to recognize that the other node is active again. In addition, if fencing of the failed node has been issued, it will be executed before the failed node starts joining the cluster.

Edit /etc/sysconfig/sbd. This is an online change; the pacemaker service does not need to be restarted.

The default value:

SBD_DELAY_START=no 

The suggested value:

SBD_DELAY_START=60 

(The value is in seconds. It can also be set to "yes", in which case the delay equals the "msgwait" value of the SBD device; however, it is better to allow more time (60 seconds). The optimum is 2*"token", assuming the "token" value is greater than "msgwait".)
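The rule of thumb above can be written out as a short calculation. This is only a sketch: the function name suggested_sbd_delay_start is hypothetical, the msgwait value of 10 seconds in the example is an assumed placeholder (this document does not state the actual msgwait), and falling back to msgwait when token is not greater than msgwait is an assumption that mirrors what SBD_DELAY_START=yes would give.

```python
def suggested_sbd_delay_start(token_s: int, msgwait_s: int) -> int:
    """Return a candidate SBD_DELAY_START value in seconds.

    Follows the guidance in this document: use 2 * token when token is
    greater than msgwait.
    """
    if token_s > msgwait_s:
        return 2 * token_s
    # Assumption (not from this document): otherwise fall back to msgwait,
    # which is the delay that SBD_DELAY_START=yes would apply.
    return msgwait_s


if __name__ == "__main__":
    # token = 30 s as in this scenario; msgwait = 10 s is a placeholder value.
    delay = suggested_sbd_delay_start(token_s=30, msgwait_s=10)
    print(f"SBD_DELAY_START={delay}")
```

With the 30 second token from this scenario, the result is the suggested value of 60 seconds.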

Cause

Two reboots occurred in a short time. 

In this scenario, the token and consensus values are 30 and 36 seconds. These values help the cluster survive a network interruption. However, they also mean that the cluster needs some time to re-form its membership after losing one of the nodes.

After the first reboot, the cluster issued a fencing order for the failed node. Because the VM rebooted quickly and the cluster needed time to re-form, the fencing of the failed node happened while that node was already joining the cluster. As a result, the cluster did not start the IP resources on either node, because the failed node had not been cleanly stopped and was fenced while it was starting.
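The relationship between the two values can be sketched as follows. Corosync documents the default consensus timeout as 1.2 times the token timeout, which matches the 30 and 36 seconds seen here. Treating token plus consensus as a rough estimate of how long the surviving node needs before a new membership is formed is an assumption made for illustration, not a statement from this document; the function names are hypothetical, and the values are expressed in milliseconds as in corosync.conf.

```python
def default_consensus_ms(token_ms: int) -> int:
    """Default corosync consensus timeout: 1.2 * token (integer arithmetic)."""
    return token_ms * 12 // 10


def approx_membership_window_ms(token_ms: int, consensus_ms: int) -> int:
    """Rough estimate (an assumption) of the time before a new membership
    forms: token loss detection followed by the consensus timeout."""
    return token_ms + consensus_ms


if __name__ == "__main__":
    token_ms = 30_000  # 30 seconds, as in this scenario
    consensus_ms = default_consensus_ms(token_ms)
    window_ms = approx_membership_window_ms(token_ms, consensus_ms)
    print(f"consensus: {consensus_ms} ms")
    print(f"approximate window: {window_ms} ms")
```

A VM that reboots and rejoins in less time than this estimate can reach the cluster while the fencing order is still pending, which is the situation described in this section.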

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020600
  • Creation Date: 08-Mar-2022
  • Modified Date: 08-Mar-2022
    • SUSE Linux Enterprise High Availability Extension
    • SUSE Linux Enterprise Server for SAP Applications
