
Dynamically changing the Cluster Size of a UDPU HAE Cluster

This document (7023669) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension 12

Situation

This is an advanced cluster change; please perform it with care.

By default, the HAE Corosync cluster is configured to use multicast for Corosync communication. One advantage of multicast is that membership is handled at the network level, so nodes can join and leave the cluster without changing corosync.conf.

Some environments require the use of UDPU (unicast), which introduces a set of node definitions in the

   nodelist

section of the corosync.conf file. As a result, the members of the cluster are hard coded into one of the main configuration files.

Consequently, in such a scenario any addition or removal of cluster nodes requires a reload or restart of the corosync service; the latter would effectively take the complete cluster down.

Resolution

To change the node list dynamically, the configuration file has to be changed on all nodes, and then, by invoking

   corosync-cfgtool -R

Corosync on all members of the cluster is ordered to reread the configuration.

 
The machines in the example are

    oldhanaa1
    oldhanaa2
    oldhanad1
    oldhanad2

   
The starting point is a working cluster comprising the nodes

   oldhanad1
   oldhanad2


then to add

   oldhanaa1
   oldhanaa2


and then to remove

   oldhanad1
   oldhanad2


Essentially it is a move from a 2 node cluster -> 4 node cluster -> 2 node cluster. This is not an ideal example, as an odd number of cluster nodes (e.g. 3 or 5) is recommended, but it was picked because it was easy to implement in an existing test setup.

The cluster has fencing via SBD, with all nodes able to reach the device.

All nodes are in a two ring setup; this is not a requirement, but shows that the procedure works with two rings as well.



The starting configuration on

   oldhanad1
   oldhanad2


would be

totem {
        version: 2
        secauth: off
        crypto_hash: sha1
        crypto_cipher: aes256
        cluster_name: nirvanacluster
        clear_node_high_bit: yes
        token: 10000
        consensus: 12000
        token_retransmits_before_loss_const: 10
        join: 60
        max_messages: 20
        interface {
                ringnumber: 0
                bindnetaddr: 10.162.192.0
                mcastport: 5405
                ttl: 1
        }

        interface {
                ringnumber: 1
                bindnetaddr: 192.168.128.0
                mcastport: 5405
                ttl: 1
        }
        transport: udpu
        rrp_mode: active
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        to_syslog: yes
        debug: off
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }

}

nodelist {
        node {
                ring0_addr: 10.162.193.133
                ring1_addr: 192.168.128.3
                nodeid: 3
        }
        node {
                ring0_addr: 10.162.193.134
                ring1_addr: 192.168.128.4
                nodeid: 4
        }
}


quorum {

        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        expected_votes: 2
        two_node: 1
}


and the cluster looks like

Stack: corosync
Current DC: oldhanad1 (version 1.1.16-6.5.1-77ea74d) - partition with quorum
Last updated: Wed Jan 23 16:00:04 2019
Last change: Wed Jan 23 15:09:21 2019 by root via crm_node on oldhanad2

2 nodes configured
9 resources configured

Online: [ oldhanad1 oldhanad2 ]

Active resources:

 killer (stonith:external/sbd): Started oldhanad1
 Clone Set: base-clone [base-group]
     Started: [ oldhanad1 oldhanad2 ]
 test1  (ocf::heartbeat:Dummy): Started oldhanad1
 test2  (ocf::heartbeat:Dummy): Started oldhanad2
 test3  (ocf::heartbeat:Dummy): Started oldhanad1
 test4  (ocf::heartbeat:Dummy): Started oldhanad2



It is advisable to set

   no-quorum-policy=freeze

as the cluster will otherwise stop resources on loss of quorum, which might defeat the purpose of this procedure.
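
For example, assuming the crm shell is used for cluster configuration, the property could be set from any active cluster node as in the following sketch:

   # freeze (rather than stop) resources if quorum is lost
   crm configure property no-quorum-policy=freeze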


to add the nodes

   oldhanaa1
   oldhanaa2


the following is done

1) make sure that the cluster is not running on

   oldhanaa1
   oldhanaa2
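
A quick check could look like the following sketch, assuming the systemd-managed cluster stack of SLE HA 12 (run on both oldhanaa1 and oldhanaa2; both services should be reported as inactive):

   # neither pacemaker nor corosync should be running on the new nodes
   systemctl status pacemaker corosync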


2) create the new corosync.conf

in this example

--- corosync.conf       2019-01-23 15:09:32.944006828 +0100
+++ corosync.conf.4node 2019-01-23 13:14:01.845075910 +0100
@@ -44,6 +44,16 @@
 
 nodelist {
         node {
+                ring0_addr: 10.162.193.12
+                ring1_addr: 192.168.128.1
+                nodeid: 1
+        }
+        node {
+                ring0_addr: 10.162.193.126
+                ring1_addr: 192.168.128.2
+                nodeid: 2
+        }
+        node {
                 ring0_addr: 10.162.193.133
                 ring1_addr: 192.168.128.3
                 nodeid: 3
@@ -61,6 +71,6 @@
         # Enable and configure quorum subsystem (default: off)
         # see also corosync.conf.5 and votequorum.5
         provider: corosync_votequorum
-        expected_votes: 2
-        two_node: 1
+        expected_votes: 4
+        two_node: 0
 }


we add the 2 new nodes and change the quorum section accordingly

3) copy the new corosync.conf onto all servers

   oldhanaa1
   oldhanaa2
   oldhanad1
   oldhanad2
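
One possible way to distribute the file, assuming passwordless root SSH between the nodes, the default /etc/corosync/corosync.conf location, and that the new file was prepared locally as corosync.conf.4node (a sketch only, adjust to your environment):

   # copy the prepared file to every node, current and new
   for host in oldhanaa1 oldhanaa2 oldhanad1 oldhanad2; do
       scp corosync.conf.4node root@${host}:/etc/corosync/corosync.conf
   done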


4) check all resources that have a local dependency on a node, for example HANA databases. If in doubt, set these resources to

   unmanaged

to prevent them from being moved by the cluster.
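
With the crm shell this could be done as in the following sketch (test1 is just an example resource from this test setup):

   # stop the cluster from acting on the resource during the change
   crm resource unmanage test1

and, once the procedure is finished:

   crm resource manage test1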

5) invoke on one of the already active cluster nodes, in this example
that would be

    oldhanad1 or oldhanad2
   
the command

   corosync-cfgtool -R

the cluster will now change to

Stack: corosync
Current DC: oldhanad1 (version 1.1.16-6.5.1-77ea74d) - partition WITHOUT quorum
Last updated: Wed Jan 23 16:10:20 2019
Last change: Wed Jan 23 15:09:21 2019 by root via crm_node on oldhanad2

2 nodes configured
9 resources configured

Online: [ oldhanad1 oldhanad2 ]

Active resources:

 killer (stonith:external/sbd): Started oldhanad1
 Clone Set: base-clone [base-group]
     Started: [ oldhanad1 oldhanad2 ]
 test1  (ocf::heartbeat:Dummy): Started oldhanad1
 test2  (ocf::heartbeat:Dummy): Started oldhanad2
 test3  (ocf::heartbeat:Dummy): Started oldhanad1
 test4  (ocf::heartbeat:Dummy): Started oldhanad2


because a 4 node cluster with only 2 nodes online does not have quorum.
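
This can be cross-checked on the Corosync level, for example with the following sketch, which at this point should report 4 expected votes but only 2 total votes:

   corosync-quorumtool -s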

6) start pacemaker on nodes

   oldhanaa1
   oldhanaa2
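
Assuming the systemd-managed stack, a sketch to be run on each of the two new nodes:

   systemctl start pacemaker

(crmsh offers "crm cluster start" as an alternative)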


the cluster will change to

Stack: corosync
Current DC: oldhanad1 (version 1.1.16-6.5.1-77ea74d) - partition with quorum
Last updated: Wed Jan 23 16:11:05 2019
Last change: Wed Jan 23 16:10:49 2019 by hacluster via crmd on oldhanad1

4 nodes configured
13 resources configured

Online: [ oldhanaa1 oldhanaa2 oldhanad1 oldhanad2 ]

Active resources:

 killer (stonith:external/sbd): Started oldhanad1
 Clone Set: base-clone [base-group]
     Started: [ oldhanaa1 oldhanaa2 oldhanad1 oldhanad2 ]
 test1  (ocf::heartbeat:Dummy): Started oldhanad1
 test2  (ocf::heartbeat:Dummy): Started oldhanaa2
 test3  (ocf::heartbeat:Dummy): Started oldhanaa1
 test4  (ocf::heartbeat:Dummy): Started oldhanad2




Second step: remove the original nodes, i.e. shrink the cluster back to 2 nodes.

1) put the nodes to be removed in standby

   crm node standby oldhanad1
   crm node standby oldhanad2


2) stop the cluster on the nodes to be removed

   oldhanad1
   oldhanad2
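
Again assuming systemd, a sketch to be run on each of the two nodes to be removed:

   systemctl stop pacemaker

(or, alternatively, "crm cluster stop")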


the cluster will change to

Stack: corosync
Current DC: oldhanaa1 (version 1.1.16-6.5.1-77ea74d) - partition WITHOUT quorum
Last updated: Wed Jan 23 16:17:09 2019
Last change: Wed Jan 23 16:16:49 2019 by root via crm_attribute on oldhanad2

4 nodes configured
13 resources configured

Node oldhanad1: OFFLINE (standby)
Node oldhanad2: OFFLINE (standby)
Online: [ oldhanaa1 oldhanaa2 ]

Active resources:

 killer (stonith:external/sbd): Started oldhanaa1
 Clone Set: base-clone [base-group]
     Started: [ oldhanaa1 oldhanaa2 ]
 test1  (ocf::heartbeat:Dummy): Started oldhanaa1
 test2  (ocf::heartbeat:Dummy): Started oldhanaa2
 test3  (ocf::heartbeat:Dummy): Started oldhanaa1
 test4  (ocf::heartbeat:Dummy): Started oldhanaa2


3) modify the corosync.conf to

--- corosync.conf       2019-01-23 12:04:53.851763302 +0100
+++ corosync.conf.2Node 2019-01-23 16:15:59.077676093 +0100
@@ -53,16 +53,6 @@
                 ring1_addr: 192.168.128.2
                 nodeid: 2
         }
-        node {
-                ring0_addr: 10.162.193.133
-                ring1_addr: 192.168.128.3
-                nodeid: 3
-        }
-        node {
-                ring0_addr: 10.162.193.134
-                ring1_addr: 192.168.128.4
-                nodeid: 4
-        }
 }
 
 
@@ -71,6 +61,6 @@
         # Enable and configure quorum subsystem (default: off)
         # see also corosync.conf.5 and votequorum.5
         provider: corosync_votequorum
-        expected_votes: 4
-        two_node: 0
+        expected_votes: 2
+        two_node: 1
 }


we remove the original 2 nodes and change the quorum section again


4) copy this modified corosync.conf file onto

   oldhanaa1
   oldhanaa2


5) remove the old nodes

  crm node delete oldhanad1
  crm node delete oldhanad2

   
6) invoke on one of the nodes that still actively run the cluster,
in this example that would be

    oldhanaa1 or oldhanaa2
   
the command

   corosync-cfgtool -R
   
the cluster will change to

Stack: corosync
Current DC: oldhanaa1 (version 1.1.16-6.5.1-77ea74d) - partition with quorum
Last updated: Wed Jan 23 16:21:36 2019
Last change: Wed Jan 23 16:21:34 2019 by root via crm_node on oldhanaa2

2 nodes configured
9 resources configured

Online: [ oldhanaa1 oldhanaa2 ]

Active resources:

 killer (stonith:external/sbd): Started oldhanaa1
 Clone Set: base-clone [base-group]
     Started: [ oldhanaa1 oldhanaa2 ]
 test1  (ocf::heartbeat:Dummy): Started oldhanaa1
 test2  (ocf::heartbeat:Dummy): Started oldhanaa2
 test3  (ocf::heartbeat:Dummy): Started oldhanaa1
 test4  (ocf::heartbeat:Dummy): Started oldhanaa2

 

Additional Information

This is not supposed to be used as a blueprint, but as a guideline for an experienced Pacemaker/Corosync cluster administrator.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7023669
  • Creation Date: 24-Jan-2019
  • Modified Date: 12-Mar-2024
  • SUSE Linux Enterprise High Availability Extension