SAP HANA Scale-Out Upgrade Details


Starting with version 0.180, the SAPHanaSR-ScaleOut package supports SAP HANA scale-out multi-target system replication. That means you can connect a third HANA site by system replication to either of the two HANA sites that are managed by the SUSE HA cluster. This article looks at the details of upgrading an existing cluster to that setup.

Picture: Preparing the details

In this blog article you will find details on selected steps of the upgrade procedure from old-style to multi-target that is outlined in our blog SAPHanaSR-ScaleOut Multi-Target Upgrade. As mentioned there, we cannot write down a complete upgrade procedure that works for all customer environments. Instead we give detailed examples for some important tasks. These examples are:

  • Check and document status of SUSE HA cluster and HANA system replication (step 1)
  • Upgrade SUSE HA cluster srHook attribute from old-style to multi-target (step 7)
  • Set resources SAPHanaController and SAPHanaTopology back from maintenance into managed mode (step 9)

The manual pages SAPHanaSR-manageAttr(8), SAPHanaSrMultiTarget.py(7), SAPHanaSR-showAttr(8) and SAPHanaSR_maintenance_examples(7) show more examples.

In the examples below we use SAP HANA SID “HA1” and instance number “00”.

Check and document status of SUSE HA cluster and HANA system replication (step 1)

Checking and documenting the initial status before changing anything is always good practice. Do the same after the change. You can use a similar sequence before and after the upgrade, with different directories, e.g. “entry” and “final”. The same commands serve for both checking and documenting. For documenting, just redirect the output into files. Linux admins know these commands from daily work:

# mkdir entry; cd entry
# crm_mon -1r >crm_mon.txt
# crm configure show >crm_configure.txt
# cibadmin -Ql >cibadmin.txt
# SAPHanaSR-showAttr >SAPHanaSR-showAttr.txt
# rpm -qa | grep -i SAP >rpm-qa-SAP.txt
# cp /hana/shared/HA1/global/hdb/custom/config/global.ini .
# md5sum /usr/share/SAPHanaSR-ScaleOut/SAPHanaSR.py >SAPHanaSR.py.md5
# md5sum /usr/share/SAPHanaSR-ScaleOut/SAPHanaSrMultiTarget.py >SAPHanaSrMultiTarget.py.md5
md5sum: /usr/share/SAPHanaSR-ScaleOut/SAPHanaSrMultiTarget.py: No such file or directory
# su - ha1adm -c "HDBSettings.sh systemReplicationStatus.py" >systemReplicationStatus.py.txt
# cd ..

Example: Documenting current status before the upgrade

Note: Expect the md5sum error “No such file …” until you have installed version 0.180 or later of SAPHanaSR-ScaleOut. This error should not occur when documenting the final state.
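
Once both the “entry” and the “final” directory exist, the two snapshots can be compared file by file. The following is a minimal sketch of such a comparison. The function name compare_status is hypothetical and not part of SAPHanaSR-ScaleOut; it only assumes that both directories were filled with the command sequence shown above.

```shell
# Hypothetical helper, not part of SAPHanaSR-ScaleOut.
# Reports for every file in the "before" directory whether its
# counterpart in the "after" directory is missing, identical or changed.
compare_status() {
    before=$1
    after=$2
    for f in "$before"/*; do
        name=$(basename "$f")
        if [ ! -e "$after/$name" ]; then
            echo "MISSING  $name"
        elif cmp -s "$f" "$after/$name"; then
            echo "SAME     $name"
        else
            echo "CHANGED  $name"
        fi
    done
}
```

Call it as `compare_status entry final` and inspect the files reported as CHANGED with `diff -u entry/<file> final/<file>`. Some changes are expected, for example in crm_configure.txt, because the upgrade removes the global srHook attribute.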

Upgrade SUSE HA cluster srHook attribute from old-style to multi-target (step 7)

Upgrading the SUSE HA cluster srHook attribute and checking the necessary prerequisites from the OS, SUSE HA and the SAP HANA HA/DR provider might not be familiar to the Linux admin. Therefore we have encapsulated the needed steps in SAPHanaSR-manageAttr. As mentioned earlier, SAPHanaSR-manageAttr can check the prerequisites as well as upgrade the attributes. Let us assume all checks have already passed, so we can now upgrade the srHook attributes:

# SAPHanaSR-manageAttr --sid=HA1 --ino=00 --case multi-target --migrate msl_SAPHanaCon_HA1_HDB00; echo "rc: $?"
 *** INFO: Start checking....
 *** INFO: Check multi-target requirements.
 ...
 *** INFO: Check cluster health.
 ...
 *** INFO: Check resource agent update state on all cluster nodes.
 ...
 *** INFO: Check srHook generation attribute on all cluster nodes.
 ...
 *** INFO: Check sudoers configuration.
 ...
 *** INFO: Check srHook across all cluster nodes.
 ...
 *** INFO: Check multi-target attribute.
 *** INFO: Start migration....
 *** INFO: Check cluster health.
 *** INFO: Cluster has unmanaged parts.
 *** The multi-state resource for SID 'HA1' is currently unmanaged.
 *** The clone resource for SID 'HA1' is currently unmanaged.
 *** INFO: Set resource 'msl_SAPHanaCon_HA1_HDB00' to maintenance, if needed.
 *** INFO: Resource 'msl_SAPHanaCon_HA1_HDB00' already unmanaged.
 *** INFO: Start migration, remove globale SRHook attribute as it's no longer needed.
 Deleted crm_config option: id=SAPHanaSR-hana_ha1_glob_srHook name=hana_ha1_glob_srHook
 *** INFO: Release resource 'msl_SAPHanaCon_HA1_HDB00' from maintenance.
 *** WARNING: As WE do not set the resource 'msl_SAPHanaCon_HA1_HDB00' into maintenance
before, we will not release the maintenance state for the resource now. Please check and
release it by yourself.
 rc: 0

Example: Upgrading srHook attribute to multi-target

Once the command has returned successfully, the srHook attribute has been changed inside the CIB. However, the changes are not immediately visible in SAPHanaSR-showAttr, for two reasons. First, the resource agents are still in maintenance. Second, the site-specific srHook attributes are set by the HANA HA/DR provider only on srConnectionChanged() events. We will end the SUSE HA cluster resource maintenance soon, see below. A SAP HANA HA/DR provider action can be triggered by testing the new setup for HANA primary failure.

In rare cases the upgrade might fail at one of the health checks, even if a previous check returned successfully:

 *** INFO: Check cluster health.
...
 *** ERROR: Cluster NOT in state S_IDLE, but in state S_TRANSITION_ENGINE.
 Please check.

Example: Upgrade interrupted due to busy cluster

In these cases the cluster is busy at the moment SAPHanaSR-manageAttr wants to be sure the cluster is idle. Usually the cause is just a regular monitor operation. After you have checked that everything is fine and the cluster is idle again, re-run SAPHanaSR-manageAttr.
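
If this happens often, the re-run can be wrapped in a small retry loop. The following is a minimal sketch only. The function name retry_if_busy and the variable WAIT_CMD are hypothetical. The sketch assumes that a busy cluster can be recognized by the text “Cluster NOT in state S_IDLE” in the command output, as in the example above, and that cs_wait_for_idle from ClusterTools2 is installed. It does not replace checking the cluster yourself.

```shell
# Hypothetical helper, not part of SAPHanaSR-ScaleOut.
# Usage: retry_if_busy <max_tries> <command> [args...]
# Re-runs the command while its output contains the "busy cluster" error.
# WAIT_CMD (assumption) defaults to the cs_wait_for_idle call used in this article.
retry_if_busy() {
    max_tries=$1; shift
    try=1
    while :; do
        if out=$("$@" 2>&1); then rc=0; else rc=$?; fi
        printf '%s\n' "$out"
        case $out in
            *"Cluster NOT in state S_IDLE"*) ;;   # busy: maybe try again
            *) return "$rc" ;;                    # anything else: hand back the result
        esac
        if [ "$try" -ge "$max_tries" ]; then
            return 1                              # still busy after the last try
        fi
        try=$((try + 1))
        ${WAIT_CMD:-cs_wait_for_idle -s 5}        # wait before the next attempt
    done
}
```

With the command from this article, a call would look like `retry_if_busy 3 SAPHanaSR-manageAttr --sid=HA1 --ino=00 --case multi-target --migrate msl_SAPHanaCon_HA1_HDB00`. Any other error ends the loop immediately, so that it can be examined by hand.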

Set resources SAPHanaController and SAPHanaTopology back from maintenance into managed mode (step 9)

Setting back the SAPHanaController from maintenance into managed mode is a common task. On the other hand, for SAPHanaTopology you do not do it that often. Usually the SAPHanaTopology stays managed all the time. So keep in mind: When both resources are in maintenance, obeying correct order and timings is crucial when setting back to managed mode.

Before starting, check that the SUSE HA cluster and the HANA instances, including system replication, are fine. If everything looks good, perform a resource refresh on SAPHanaTopology and SAPHanaController. The cluster needs some time to finish. Wait for each action to complete before calling the next command. cs_wait_for_idle from package ClusterTools2 helps with waiting for the cluster:

# cs_wait_for_idle -s 5
Cluster state: S_IDLE
# crm resource refresh cln_SAPHanaTop_HA1_HDB00; cs_wait_for_idle -s 5
...
Cluster state: S_IDLE
# crm resource refresh msl_SAPHanaCon_HA1_HDB00; cs_wait_for_idle -s 5
...
Cluster state: S_IDLE

Example: Refreshing SUSE HA cluster resources before setting back to managed

Now the cluster has updated the attributes and can take back control over HANA. First set SAPHanaTopology back to managed, second set back SAPHanaController. Again it takes some time before all resources have been monitored on every node. Finally you can display the new site-specific srHook and auxiliary attributes:

# cs_wait_for_idle -s 5
Cluster state: S_IDLE
# crm resource maintenance cln_SAPHanaTop_HA1_HDB00 off; cs_wait_for_idle -s 5
...
Cluster state: S_IDLE
# crm resource maintenance msl_SAPHanaCon_HA1_HDB00 off; cs_wait_for_idle -s 5
...
Cluster state: S_IDLE
# SAPHanaSR-showAttr
Glo cib-time                 mts  prim sec srmode sync_state upd 
-----------------------------------------------------------------
HA1 Fri Jul 23 13:32:54 2021 true S1   S2  sync   SOK        ok

Resource                 maintenance 
-------------------------------------
cln_SAPHanaTop_HA1_HDB00 false 
msl_SAPHanaCon_HA1_HDB00 false

Si lpt        lss mns    srHook srr 
------------------------------------
S1 1627039974   4 suse11 PRIM   P 
S2 30           4 suse21        S 

Hosts clone_state gra gsh node_state roles                          score site
-------------------------------------------------------------------------------
suse00 online                        :::
suse11 PROMOTED   2.0 2.0 online     master1:master:worker:master  150    S1
suse12 DEMOTED    2.0 2.0 online     slave:slave:worker:slave      -10000 S1
suse21 DEMOTED    2.0 2.0 online     master1:master:worker:master  100    S2
suse22 DEMOTED    2.0 2.0 online     slave:slave:worker:slave      -12200 S2

Example: Setting SUSE HA cluster resources back to managed

The Glo(bal) section no longer contains a global “srHook” attribute. Instead the Si(te) section shows two site-specific attributes. For the primary site “S1” the site-specific “srHook” is “PRIM”. This is an auxiliary value, because the primary does not have this attribute on its own. For the secondary site “S2” the attribute is empty. This is because the srHook will be updated by HANA on srConnectionChanged() events. You can find more information about this state in the above section on migration prerequisites.

The Glo(bal) section also contains two new auxiliary attributes, “mts” and “upd”. They indicate the overall multi-target awareness of the cluster. “mts true” means that multi-target support is enabled, “upd ok” means the update was done successfully.

The Hosts section also shows two new auxiliary attributes. “gra” shows the generation of the resource agent, “gsh” the generation of the HA/DR provider script for srConnectionChanged(). In our case both are “2.0” for all relevant nodes, which leads to the overall “mts true”. You can find more details in manual page SAPHanaSR-manageAttr(8).
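
For a quick check of “mts” and “upd”, both values can be extracted from the SAPHanaSR-showAttr output. The following is a rough sketch. The function name glob_mts_upd is hypothetical. It assumes exactly the Glo(bal) layout of the example in this article: a header line starting with “Glo”, a line of dashes, then one data line in which all attributes are set. If an attribute such as “sec” is empty, counting the fields from the end no longer works.

```shell
# Hypothetical helper, not part of SAPHanaSR-ScaleOut.
# Reads SAPHanaSR-showAttr output on stdin and prints mts and upd.
# Assumption: layout as in the example of this article, where "upd" is the
# last field of the data line and "mts" is the sixth field from the end.
glob_mts_upd() {
    awk '
        /^Glo / { header = NR }              # remember the Glo(bal) header line
        header && NR == header + 2 {         # data line follows the dashes line
            print "mts=" $(NF - 5), "upd=" $NF
            exit
        }
    '
}
```

A call would look like `SAPHanaSR-showAttr | glob_mts_upd`. Fields are counted from the end because the cib-time value itself contains blanks.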

To complete this step, check the overall cluster status once more, for example by calling “crm_mon -1r” and “cs_clusterstate -i”.
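
The order used throughout this step, SAPHanaTopology first, SAPHanaController second, and waiting for an idle cluster after each command, can be condensed into one shell function. This is a sketch only. The function name back_to_managed and the variables TOP and CON are hypothetical, the resource names are the ones from this article, and the sketch assumes that cs_wait_for_idle returns 0 once the cluster is idle.

```shell
# Hypothetical wrapper around the commands shown in this step.
# Resource names as used in this article; adapt them to your cluster.
TOP=cln_SAPHanaTop_HA1_HDB00
CON=msl_SAPHanaCon_HA1_HDB00

back_to_managed() {
    cs_wait_for_idle -s 5                     || return 1
    # Refresh first: SAPHanaTopology, then SAPHanaController.
    for res in "$TOP" "$CON"; do
        crm resource refresh "$res"           || return 1
        cs_wait_for_idle -s 5                 || return 1
    done
    # Then end maintenance in the same order.
    for res in "$TOP" "$CON"; do
        crm resource maintenance "$res" off   || return 1
        cs_wait_for_idle -s 5                 || return 1
    done
}
```

The function stops at the first failing command, so a problem during the refresh never leads to a resource being set back to managed. Running the commands one by one, as shown in the examples, remains the safer choice, because you can inspect the cluster between the steps.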

Where can I find further information?

Please have a look at the reference part of this blog series (link will follow soon).

– Related blog articles

https://www.suse.com/c/tag/towardszerodowntime/

– Product documentation

https://documentation.suse.com/

– Manual pages
SAPHanaSR-manageAttr(8), SAPHanaSR-ScaleOut(7), ocf_suse_SAPHanaController(7), SAPHanaSrMultiTarget.py(7), SAPHanaSR-ScaleOut_basic_cluster(7), SAPHanaSR_maintenance_examples(7), SAPHanaSR-showAttr(8), ha_related_suse_tids(7), crm(8), crm_attribute(8), crm_mon(8), sudo(8), cs_wait_for_idle(8), cs_clusterstate(8)

lpinne