Optimal SAP HANA maintenance procedure using handshake takeover for SUSE clusters
This blog describes the optimal maintenance procedure using SAP HANA handshake takeover on SUSE clusters.
Major Steps for a successful maintenance procedure
When you maintain an SAP HANA database that uses system replication and is controlled by the SUSE cluster, we recommend always following these three major steps.
- Begin the maintenance: Set the multi-state cluster resource into maintenance.
- Do the changes on the SAP workload: Make the required changes on the SAP HANA instance pair. In this blog we swap the primary and secondary roles using the handshake takeover and online registration.
- End the maintenance: Refresh the multi-state cluster resource. Set it back to cluster operation.
These three steps are quite abstract, but easy to remember. Follow them for every maintenance until we document an updated list of steps.
The green boxes describe the commands run against the SUSE cluster. The blue box is your interaction with the SAP HANA databases. Typically the work on the SAP HANA databases is the most time-consuming part of the maintenance.
The handshake maintenance procedure step-by-step
Before starting any maintenance procedure, check the cluster for errors. Note down the SAP HANA site names, the SID and the instance number. The SAP HANA site names known to the cluster can be obtained with SAPHanaSR-showAttr. The site names must not be changed during the maintenance; they must match the site names known to the cluster.
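The values noted down here are reused in every later command, so it can help to keep them in one place. The following is a minimal sketch, assuming the example values of this blog (SID HA1, instance number 10) and assuming the cluster resource follows the `msl_SAPHana_<SID>_HDB<instance number>` pattern seen in the commands of this blog. The resource name is chosen at cluster setup time, so verify it against your configuration. The helper names `sidadm` and `msl_resource` are hypothetical.

```shell
# Example values from this blog; replace them with your own.
SID="${SID:-HA1}"
INSTANCE_NO="${INSTANCE_NO:-10}"

# sidadm: name of the SAP HANA administration user, <sid>adm in lower case.
sidadm() {
    printf '%sadm\n' "$(printf '%s' "$SID" | tr '[:upper:]' '[:lower:]')"
}

# msl_resource: ASSUMED naming pattern of the multi-state resource.
# Check the real name in your cluster configuration before using it.
msl_resource() {
    printf 'msl_SAPHana_%s_HDB%s\n' "$SID" "$INSTANCE_NO"
}
```

With the defaults, `sidadm` prints the user name used for `su -` in the steps below, and `msl_resource` prints the resource name passed to the `crm` commands.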
Begin the maintenance
- Set the maintenance meta attribute of the multi-state resource for SAP HANA.
# crm resource maintenance msl_SAPHana_HA1_HDB10 on
- Wait until the cluster is in idle (S_IDLE) state. Remark: cs_wait_for_idle is part of the ClusterTools2 package.
# cs_wait_for_idle -s 5
Cluster state: S_IDLE
Do the changes on the SAP workload
- Before triggering the takeover, check that the system replication is in sync. Log in on the primary.
# su - ha1adm
~> HDBSettings.sh systemReplicationStatus.py --sapcontrol=1 | egrep -i '(site|overall).*replication_status'
site/2/REPLICATION_STATUS=ACTIVE
overall_replication_status=ACTIVE
Check the site and overall replication status. Ideally both are “ACTIVE” before you trigger the takeover.
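If you want to script this check instead of reading the output by eye, the key=value lines can be evaluated with a small filter. This is a minimal sketch, not part of the SUSE or SAP tooling: `sr_all_active` is a hypothetical helper that reads the `--sapcontrol=1` output from standard input and succeeds only if the overall replication status is present and every key ending in `replication_status` has the value ACTIVE.

```shell
# sr_all_active: hypothetical helper, reads key=value lines from stdin.
# Exit status 0 only if overall_replication_status was seen and no
# *replication_status key carries a value other than ACTIVE.
sr_all_active() {
    awk -F= '
        { key = tolower($1) }
        key == "overall_replication_status" { seen = 1 }
        key ~ /replication_status$/ && $2 != "ACTIVE" { bad = 1 }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}
```

As ha1adm on the primary, it could be used as `HDBSettings.sh systemReplicationStatus.py --sapcontrol=1 | sr_all_active && echo "replication in sync"`. The filter fails on empty input, so a command that produces no output is not mistaken for a healthy state.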
- Now trigger the takeover in handshake mode. The primary suspends the transaction activity, while primary and secondary continue to communicate about the system replication status. They check whether the system replication is in sync, which means the last committed transaction is available on the secondary. Only then is the takeover released: the former secondary becomes the new primary, and the former primary stays up but blocks all transactions. Log in on the secondary.
# su - ha1adm
~> hdbnsutil -sr_takeover --suspendPrimary
- Once the takeover is successful, register the old primary so that it becomes the new secondary. This blog documents the online mode. As an alternative, first stop SAP HANA, register offline and then start SAP HANA. Make sure to use the same site names as noted down before (here WDF). Log in on the ‘old’ primary.
# su - ha1adm
~> hdbnsutil -sr_register \
     --online \
     --name=WDF \
     --remoteHost=suse02 --remoteInstance=10 \
     --replicationMode=sync --operationMode=logreplay
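Because the site name must match the one known to the cluster, it may help to assemble the registration command from the values noted down earlier and review it before running it. The following is a minimal sketch using the example values of this blog as defaults. `register_cmd` and the variable names are hypothetical; the helper only prints the command and does not execute anything.

```shell
# Example values from this blog; replace them with the values you noted down.
SITE_NAME="${SITE_NAME:-WDF}"       # site name of the OLD primary
REMOTE_HOST="${REMOTE_HOST:-suse02}" # host of the NEW primary
INSTANCE_NO="${INSTANCE_NO:-10}"

# register_cmd: hypothetical helper, prints the online registration command.
register_cmd() {
    printf 'hdbnsutil -sr_register --online --name=%s --remoteHost=%s --remoteInstance=%s --replicationMode=sync --operationMode=logreplay\n' \
        "$SITE_NAME" "$REMOTE_HOST" "$INSTANCE_NO"
}
```

Compare the printed `--name` value with the site name shown by SAPHanaSR-showAttr, then run the line as ha1adm on the ‘old’ primary.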
- Repeat the first step of this section: check the replication status, this time on the ‘new’ primary. Before we end the maintenance, both instances of the new SAP HANA pair should be up and running. The system replication status must be ‘ACTIVE’. The cluster should show no resource errors.
End the maintenance
- Refresh the multi-state resource. The resource agent now detects the new system replication topology and adjusts the promotion scores.
# crm resource refresh msl_SAPHana_HA1_HDB10
# cs_wait_for_idle -s 5
Cluster state: S_IDLE
- End the maintenance. The resource agent then sets attributes such as srHook for the new primary site to ‘PRIM’.
# crm resource maintenance msl_SAPHana_HA1_HDB10 off
Finally, check that the cluster is in normal operation, is in idle state and does not show resource errors.
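The cluster-side steps at the beginning and the end of the maintenance can be wrapped in two small functions so that the order is never mixed up. This is a minimal sketch, assuming the resource name used in this blog. `run`, `begin_maintenance`, `end_maintenance` and the `DRY_RUN` switch are hypothetical names introduced for illustration. The refresh uses `crm resource refresh`, the crmsh subcommand for refreshing a resource.

```shell
# Resource name from this blog; adjust it to your cluster.
RSC="${RSC:-msl_SAPHana_HA1_HDB10}"

# run: hypothetical helper. With DRY_RUN=1 it only prints the command,
# otherwise it executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Begin the maintenance: set the resource into maintenance, wait for idle.
begin_maintenance() {
    run crm resource maintenance "$RSC" on &&
    run cs_wait_for_idle -s 5
}

# End the maintenance: refresh the resource, wait for idle,
# then hand control back to the cluster.
end_maintenance() {
    run crm resource refresh "$RSC" &&
    run cs_wait_for_idle -s 5 &&
    run crm resource maintenance "$RSC" off
}
```

Setting `DRY_RUN=1` before calling `begin_maintenance` or `end_maintenance` prints the commands without touching the cluster, which is a cheap way to review them first. The `&&` chaining stops the sequence at the first failing command, so the maintenance attribute is not switched off if the refresh or the idle wait fails.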
Where to get more information?
For our blog series #TowardsZeroDowntime my colleague has described other maintenance related tasks in his blog SLES for SAP HANA Maintenance Procedures – Part -1.
As always, please have a look at the man pages of the packages SAPHanaSR, ClusterTools2 and others. In particular, for the seamless maintenance procedure consult the man pages SAPHanaSR (7), SAPHanaSR-showAttr (8) and SAPHanaSR_maintenance_examples (7).
What to take away
It is very easy to run the seamless maintenance procedure for SAP HANA in a SUSE cluster. As described, some basics need to be taken into account. The maintenance procedure provided by SUSE also supports the handshake takeover. In the handshake takeover, the SAP HANA primary suspends the transaction activity, and the secondary takes over only if the system replication is in sync.
Please also read our other blogs about #TowardsZeroDowntime.