Preventing a Fence Race in Split Brain (COROSYNC, PACEMAKER)
This document (7022467) is provided subject to the disclaimer at the end of this document.
Node1 sees Node2 gone and fences:
Nov 20 15:17:40  node1 cib: notice: crm_update_peer_state_iter: Node node2 state is now lost | nodeid=168364360 previous=member source=crm_update_peer_proc
Nov 20 15:17:41  node1 pengine: warning: pe_fence_node: Node node2 will be fenced because the node is no longer part of the cluster
Node2 sees Node1 gone and fences at the same time:
Nov 20 15:17:40  node2 cib: notice: crm_update_peer_state_iter: Node node1 state is now lost | nodeid=168364359 previous=member source=crm_update_peer_proc
Nov 20 15:17:41  node2 pengine: warning: stage6: Scheduling Node node1 for STONITH
The result is that both nodes fence each other. Data integrity is maintained, but all services are lost completely.
To make this less likely, a random fencing delay can be configured with the parameter pcmk_delay_max, which in the case of an IPMI device could look like this:
primitive brie_stonith_ducal stonith:external/ipmi \
    params pcmk_delay_max=20 hostname=ducal ipaddr=10.162.192.209 userid=admin passwd=xxxx interface=lanplus \
    op monitor interval=1800 timeout=20
This makes it more likely that one fencing device acts with a delay relative to the other, so that one node fences its peer before it is fenced itself. Which node fences which is irrelevant at that moment, because a cluster without quorum has no way to determine the right node to fence.
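Alternatively, pcmk_delay_base sets a fixed delay, which can be given a different value on each fencing device:
params pcmk_delay_base=0 ...
params pcmk_delay_base=36 ...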
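As a rough sketch, the two pcmk_delay_base values above could belong to two complete fencing resources like the following. This assumes the same external/ipmi agent as in the earlier example. The resource names, hostnames, and IP addresses are hypothetical placeholders and are not taken from a real cluster.

```
# Hypothetical names and addresses, for illustration only.
# Device that fences node2: no delay, so node2 is fenced immediately.
primitive stonith_node2 stonith:external/ipmi \
    params pcmk_delay_base=0 hostname=node2 ipaddr=192.0.2.12 userid=admin passwd=xxxx interface=lanplus \
    op monitor interval=1800 timeout=20
# Device that fences node1: waits 36 seconds before acting,
# so node1 has time to fence node2 first.
primitive stonith_node1 stonith:external/ipmi \
    params pcmk_delay_base=36 hostname=node1 ipaddr=192.0.2.11 userid=admin passwd=xxxx interface=lanplus \
    op monitor interval=1800 timeout=20
```

With fixed delays the outcome of a fence race is predictable: in this sketch the delay sits on the device that targets node1, so node1 would be expected to survive. The fixed delay should be longer than the time a fencing action needs to take effect on the hardware in use.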
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7022467
- Creation Date: 18-Dec-2017
- Modified Date: 23-Feb-2021
- SUSE Linux Enterprise High Availability Extension