How to set up corosync token and consensus in a cluster with more than 2 nodes using unicast (udpu)

This document (000020513) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension 12
SUSE Linux Enterprise High Availability Extension 15

Situation

As a rule of thumb, consensus should be 1.2 times the token.

Example from /etc/corosync/corosync.conf on a 2-node cluster:

token: 5000
consensus: 6000

consensus = 1.2 * token = 1.2 * 5000 = 6000

This holds for a 2-node cluster. corosync-cmapctl shows the following runtime values:

node1:~ # corosync-cmapctl | egrep 'runtime.config.totem.token|runtime.config.totem.consensus'
runtime.config.totem.consensus (u32) = 6000
runtime.config.totem.token (u32) = 5000
runtime.config.totem.token_retransmit (u32) = 490
runtime.config.totem.token_retransmits_before_loss_const (u32) = 10


However, on a cluster with, for example, 11 nodes and the same token and consensus values in corosync.conf, corosync-cmapctl shows:

node1:~ # corosync-cmapctl | egrep 'runtime.config.totem.token|runtime.config.totem.consensus'
runtime.config.totem.consensus (u32) = 6000
runtime.config.totem.token (u32) = 10850
runtime.config.totem.token_retransmit (u32) = 384
runtime.config.totem.token_retransmits_before_loss_const (u32) = 28


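The value for consensus is now smaller than the value for the token: the token in effect was raised from 5000 to 10850, while consensus stayed at the configured 6000.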
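To spot this condition on a running node, the comparison can be scripted. The following is a minimal sketch, not a supported tool: the function name check_totem_timeouts is hypothetical, and it assumes the corosync-cmapctl output format shown above (key, type, "=", value).

```shell
# Hypothetical helper: reads corosync-cmapctl output on stdin and reports
# whether the runtime consensus is at least 1.2 times the runtime token.
check_totem_timeouts() {
    awk '
        # Match the two keys exactly, so token_retransmit etc. are ignored.
        $1 == "runtime.config.totem.token"     { token = $NF }
        $1 == "runtime.config.totem.consensus" { consensus = $NF }
        END {
            if (token == "" || consensus == "") { print "MISSING"; exit 2 }
            # Integer form of: consensus >= 1.2 * token
            if (consensus * 10 >= token * 12) {
                print "OK consensus=" consensus " token=" token
            } else {
                print "LOW consensus=" consensus " token=" token
                exit 1
            }
        }'
}
```

It would be used as "corosync-cmapctl | check_totem_timeouts". With the 11-node values above it prints "LOW consensus=6000 token=10850" and returns a non-zero exit status.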

Resolution

The following is based on the information found in "man corosync.conf":

token
This timeout is used directly or as a base for real token timeout calculation (explained in token_coefficient section). Token timeout specifies in milliseconds until a token loss is declared after not receiving a token. This is the time spent detecting a failure of a processor in the current configuration. Reforming a new configuration takes about 50 milliseconds in addition to this timeout.
For real token timeout used by totem it's possible to read cmap value of runtime.config.token key.
The default is 1000 milliseconds.

token_coefficient
This value is used only when nodelist section is specified and contains at least 3 nodes. If so, real token timeout is then computed as token + (number_of_nodes - 2) * token_coefficient. This allows cluster to scale without manually changing token timeout every time new node is added. This value can be set to 0 resulting in effective removal of this feature.
The default is 650 milliseconds.

consensus
This timeout specifies in milliseconds how long to wait for consensus to be achieved before starting a new round of membership configuration. The minimum value for consensus must be 1.2 * token.
This value will be automatically calculated at 1.2 * token if the user doesn't specify a consensus value.

For two node clusters, a consensus larger than the join timeout but less than token is safe. For three node or larger clusters, consensus should be larger than token.
There is an increasing risk of odd membership changes, which still guarantee virtual synchrony, as node count grows if consensus is less than token.
The default is 1200 milliseconds.
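
The two formulas quoted above can be tried out for any node count with a few lines of shell. This is only a sketch: the function name totem_timeouts is hypothetical, it assumes a nodelist section is present (otherwise token_coefficient is not applied), and it assumes consensus is not set in corosync.conf, so that the automatic 1.2 * token value applies.

```shell
# Hypothetical helper. Arguments: token (ms), number of nodes,
# token_coefficient (ms, optional, defaults to 650 as in the man page).
totem_timeouts() {
    token=$1
    nodes=$2
    coeff=${3:-650}
    # token_coefficient only takes effect with at least 3 nodes.
    if [ "$nodes" -ge 3 ]; then
        real=$(( token + (nodes - 2) * coeff ))
    else
        real=$token
    fi
    # Automatic consensus: 1.2 * real token timeout, in integer arithmetic.
    consensus=$(( real * 12 / 10 ))
    echo "token=$real consensus=$consensus"
}
```

For example, "totem_timeouts 5000 11" prints "token=10850 consensus=13020", and "totem_timeouts 5000 2" prints "token=5000 consensus=6000".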



There are 2 ways to set this up:

1. token is specified and consensus is not specified (token_coefficient stays at its default of 650), which will result in:

real token timeout = token + (number_of_nodes - 2) * token_coefficient = 5000 + (11-2)*650 = 10850
consensus = 1.2 * real token timeout = 1.2 * 10850 = 13020

node1:~ # corosync-cmapctl | egrep 'runtime.config.totem.token|runtime.config.totem.consensus'
runtime.config.totem.consensus (u32) = 13020
runtime.config.totem.token (u32) = 10850
runtime.config.totem.token_retransmit (u32) = 384
runtime.config.totem.token_retransmits_before_loss_const (u32) = 28
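
For illustration, the totem section for this first variant might look like the fragment below. It is a sketch only: all other totem settings (transport, interface, cluster_name and so on) are omitted and need to stay as they are in the existing /etc/corosync/corosync.conf.

```
totem {
    # ... existing settings unchanged ...
    token: 5000
    # No consensus line: corosync calculates it as 1.2 * real token timeout.
    # No token_coefficient line: the default of 650 applies.
}
```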


2. token is specified, consensus is specified, and "token_coefficient: 0" is set, which will result in:

real token timeout = token + (number_of_nodes - 2) * token_coefficient = 5000 + (11-2)*0 = 5000
consensus = 6000 (as configured; 1.2 * 5000 = 6000)

node1:~ # corosync-cmapctl | egrep 'runtime.config.totem.token|runtime.config.totem.consensus'
runtime.config.totem.consensus (u32) = 6000
runtime.config.totem.token (u32) = 5000
runtime.config.totem.token_retransmit (u32) = 177
runtime.config.totem.token_retransmits_before_loss_const (u32) = 28
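
For illustration, the totem section for this second variant might look like the fragment below. As before, this is a sketch: all other totem settings are omitted and need to stay as they are in the existing /etc/corosync/corosync.conf.

```
totem {
    # ... existing settings unchanged ...
    token: 5000
    # Disable scaling of the token timeout with the node count.
    token_coefficient: 0
    # Set explicitly; keep it at 1.2 * token or higher.
    consensus: 6000
}
```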


 

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020513
  • Creation Date: 25-Nov-2021
  • Modified Date: 25-Nov-2021
    • SUSE Linux Enterprise High Availability Extension

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback@suse.com
