sleha-join does not join a cluster using udpu (unicast) until corosync is restarted on the existing nodes

This document (7021065) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension 11 Service Pack 4
SUSE Linux Enterprise High Availability Extension 12

Situation

When the sleha-join script is used to add a new node to the cluster, it appears to work: it connects to an existing node, pulls down the configuration from it via csync2, and then starts the cluster stack (the openais service). However, at this point the new node does not join the existing cluster; it ends up forming its own cluster instead.

Resolution

Manual steps are required to add a new node to an existing cluster that uses unicast (transport: udpu):
1. On one of the existing nodes in the cluster, manually add the new node's IP address to /etc/corosync/corosync.conf. This will be a new "memberaddr:" entry.
2. Distribute the new configuration to all nodes in the cluster, for example with csync2 -xv, scp, or rsync.
    At this point, the new corosync configuration needs to be reloaded, which means the "openais" service must be stopped and started again on each existing node.
    You can do this as a rolling update, restarting one node at a time and allowing resources to migrate to the other nodes in the cluster.
3. Once the new corosync.conf is active on the existing nodes, you can run the "sleha-join" script on the new node.
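
Step 1 can be scripted. The following is a minimal sketch, not a SUSE-provided tool: the function name add_member is hypothetical, and it assumes the "member { memberaddr: ... }" layout shown under Additional Information, with at least one member block already present. It inserts a new member block after the last existing one and does nothing if the address is already listed.

```shell
#!/bin/sh
# add_member FILE ADDR  (hypothetical helper, not part of sleha-* or corosync)
# Adds a "member { memberaddr: ADDR }" block after the last existing
# member block in FILE. Assumes FILE already has at least one member block.
add_member() {
    [ $# -eq 2 ] || return 2
    am_conf=$1
    am_addr=$2

    # Nothing to do if the address is already configured.
    if awk -v addr="$am_addr" \
        '$1 == "memberaddr:" && $2 == addr { found = 1 } END { exit !found }' \
        "$am_conf"; then
        return 0
    fi

    am_tmp=$(mktemp) || return 1
    awk -v addr="$am_addr" '
        { lines[NR] = $0 }
        # Track where the last member block closes.
        /^[[:space:]]*member[[:space:]]*[{]/ { in_member = 1 }
        in_member && /[}]/ { last = NR; in_member = 0 }
        END {
            if (!last) exit 1          # no member block found: leave file alone
            for (i = 1; i <= NR; i++) {
                print lines[i]
                if (i == last) {
                    print "\t\tmember {"
                    print "\t\t\tmemberaddr:\t" addr
                    print "\t\t}"
                }
            }
        }
    ' "$am_conf" > "$am_tmp" && cat "$am_tmp" > "$am_conf"
    am_rc=$?
    rm -f "$am_tmp"
    return $am_rc
}
```

On an existing node this would be called as add_member /etc/corosync/corosync.conf followed by the new node's IP address, preferably after taking a backup copy of the file. Steps 2 and 3 (distributing the file, stopping and starting the openais service node by node, and running sleha-join) still have to be carried out as described above.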

Cause

The version of corosync that ships with SLES11 cannot dynamically reload its configuration.
The only way to apply an updated corosync.conf is to stop and start the openais service, which updates the member / node list that corosync holds in memory.

Additional Information

If your SLES11 cluster uses multicast (transport: udp), the sleha-join script works as expected. By default, the sleha-init script configures multicast in corosync.conf, so any subsequent sleha-join on new nodes should work as documented.
This has been fixed in SLES12 SP2. The command "corosync-cfgtool -R" reloads the corosync configuration, and starting with SLES12 SP2 the sleha-join script does this automatically.
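
To see which of the two cases applies to a given cluster, check the transport setting in corosync.conf. The sketch below is illustrative only: the function name corosync_transport is hypothetical, and it assumes that a missing transport line means the corosync default, udp (multicast).

```shell
#!/bin/sh
# corosync_transport FILE  (hypothetical helper)
# Prints the value of the "transport:" directive in FILE, or "udp"
# when the directive is absent (assumed corosync default).
corosync_transport() {
    awk '
        $1 == "transport:" { t = $2 }
        END { print (t == "" ? "udp" : t) }
    ' "$1"
}
```

If corosync_transport /etc/corosync/corosync.conf prints udpu on a SLES11 node, the manual steps in the Resolution section are needed before running sleha-join.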

Relevant corosync.conf parameters:
transport:      udpu
member {
        memberaddr:     151.155.233.141
}
member {
        memberaddr:     151.155.233.143
}
member {
        memberaddr:     151.155.233.129
}
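
Before running sleha-join, it can help to confirm that every existing node carries the same member list. A minimal sketch, assuming the memberaddr layout above (the function name list_members is hypothetical):

```shell
#!/bin/sh
# list_members FILE  (hypothetical helper)
# Prints every memberaddr value found in FILE, one per line, sorted,
# so the output from different nodes can be compared directly.
list_members() {
    awk '$1 == "memberaddr:" { print $2 }' "$1" | sort
}
```

Running list_members /etc/corosync/corosync.conf on each node and comparing the output shows whether the new node's address has reached all of them.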

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7021065
  • Creation Date: 06-Jul-2017
  • Modified Date: 03-Mar-2020
    • SUSE Linux Enterprise High Availability Extension
