Showing SOK Status in Cluster Monitoring Tools Workaround

This document (7023526) is provided subject to the disclaimer at the end of this document.

Environment

This is an advanced technical workaround that makes the SOK status more visible in the cluster. It applies to SAP HANA scale-up, performance-optimized scenarios on SLE 11 and SLE 12.

Situation

In a setup built according to the Performance Optimized Best Practice Guide for HANA on SLES, it is not obvious whether HANA is ready to fail over if the primary fails. The resource agent promotes the secondary after a failure of the primary only if the system replication is in sync.

Whether the cluster has detected this state can be checked with

  SAPHanaSR-showAttr

from the SAPHanaSR RPM, which shows the system replication state.

Alternatively, one can invoke

  crm_mon -A

and look for "SOK" on the secondary in the Node Attributes section.
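
As a rough sketch, the manual check of the attribute section could be scripted as follows. The function name show_sync_state is made up for this illustration, and the parsing assumes the crm_mon -A1 layout shown in the example output of this document (Pacemaker 1.1.16); other versions may format the attribute section differently.

```shell
#!/bin/sh
# Hypothetical helper, not part of SAPHanaSR or crmsh.
# Reads `crm_mon -A1` style output on stdin and prints "<node> <value>"
# for every node that reports the attribute given as $1.
show_sync_state() {
    awk -v attr="$1" '
        # Remember the node name from lines like "* Node oldhanad1:"
        $1 == "*" && $2 == "Node" { node = $3; sub(/:$/, "", node) }
        # Attribute lines look like "+ hana_hd0_sync_state : SOK"
        $1 == "+" && $2 == attr   { print node, $4 }
    '
}

# Usage on a cluster node (assuming SID HA1):
#   crm_mon -A1 | show_sync_state hana_ha1_sync_state
```

On a healthy system the secondary should be listed with the value SOK.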

In scenarios where only the HAWK output or the plain crm_mon output is checked, this information is not visible.

Resolution

The necessary information is already in the cluster CIB. One can add a resource that works like a flag, plus a location constraint that allows the resource to start only where the cluster detects "SOK" for a HANA.

Assume the SID of the HANA in question is HA1.

First, create the resource that will act as the flag:

   crm configure primitive rsc_HANAinSync_HA1_HDB00 Dummy

Only the name carries information in this case. Then add the location rule:

   crm configure location loc_HA1_HDB00_inSync rsc_HANAinSync_HA1_HDB00 rule -inf: not_defined hana_ha1_sync_state or hana_ha1_sync_state ne "SOK"

In more general terms, the logic is encapsulated in the two conditions of the location rule,

 not_defined hana_ha1_sync_state
 hana_ha1_sync_state ne "SOK"

which query the cluster for the attribute

   hana_ha1_sync_state

which is set to SOK only while a HANA with SID HA1 is running and its system replication is in sync.
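
The effect of the two conditions can be illustrated with a small shell sketch. This is only a model of the rule, not how Pacemaker evaluates it internally; the function name rule_outcome is invented for this example, and an empty argument stands in for "attribute not defined".

```shell
#!/bin/sh
# Hypothetical model of the location rule:
#   -inf: not_defined <attr> or <attr> ne "SOK"
# $1 is the attribute value seen on a node (empty or missing = not defined).
rule_outcome() {
    if [ -z "${1:-}" ] || [ "$1" != "SOK" ]; then
        echo "banned"     # rule matches, score -inf: flag must not run here
    else
        echo "allowed"    # rule does not match: flag may run on this node
    fi
}

# Example: the primary reports PRIM, the in-sync secondary reports SOK.
#   rule_outcome PRIM
#   rule_outcome SOK
```

With these semantics the flag resource can only ever run on a node whose attribute value is exactly SOK.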

For a HANA with SID PL4, the commands would therefore look like

   crm configure primitive rsc_HANAinSync_PL4_HDB00 Dummy

   crm configure location loc_PL4_HDB00_inSync rsc_HANAinSync_PL4_HDB00 rule -inf: not_defined hana_pl4_sync_state or hana_pl4_sync_state ne "SOK"

Adapting the SID in the name of the flag resource is not technically necessary, but the name is the information that the cluster passes on to the administrator.
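
Since only the SID changes between the two variants, the commands could be generated from a template. The following is a minimal sketch under the naming pattern used above; the function name print_flag_commands is hypothetical, and it only prints the commands for review instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: print the two crm commands for a given SID.
# $1: SID in upper case (e.g. PL4), $2: instance number (default 00).
print_flag_commands() {
    sid=$1
    inst=${2:-00}
    # The cluster attribute uses the SID in lower case (hana_<sid>_sync_state).
    lsid=$(printf '%s' "$sid" | tr '[:upper:]' '[:lower:]')
    attr="hana_${lsid}_sync_state"
    rsc="rsc_HANAinSync_${sid}_HDB${inst}"
    printf 'crm configure primitive %s Dummy\n' "$rsc"
    printf 'crm configure location loc_%s_HDB%s_inSync %s rule -inf: not_defined %s or %s ne "SOK"\n' \
        "$sid" "$inst" "$rsc" "$attr" "$attr"
}

# Usage: review the output, then paste it into a root shell on a cluster node.
#   print_flag_commands PL4 00
```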

Example from a system

oldhanad2:~ # crm_mon -A1
Stack: corosync
Current DC: oldhanad1 (version 1.1.16-6.5.1-77ea74d) - partition with quorum
Last updated: Fri Nov 16 15:50:15 2018
Last change: Fri Nov 16 15:49:49 2018 by root via crm_attribute on oldhanad1

2 nodes configured
7 resources configured

Online: [ oldhanad1 oldhanad2 ]

Active resources:

 billythekid    (stonith:external/sbd): Started oldhanad1
 rsc_ip_HD0_HDB00       (ocf::heartbeat:IPaddr2):       Started oldhanad1
 Master/Slave Set: msl_SAPHana_HD0_HDB00 [rsc_SAPHana_HD0_HDB00]
     Masters: [ oldhanad1 ]
     Slaves: [ oldhanad2 ]
 Clone Set: cln_SAPHanaTopology_HD0_HDB00 [rsc_SAPHanaTopology_HD0_HDB00]
     Started: [ oldhanad1 oldhanad2 ]

Node Attributes:
* Node oldhanad1:
    + hana_hd0_clone_state              : PROMOTED 
    + hana_hd0_op_mode                  : logreplay
    + hana_hd0_remoteHost               : oldhanad2
    + hana_hd0_roles                    : 4:P:master1:master:worker:master
    + hana_hd0_site                     : matti    
    + hana_hd0_srmode                   : sync     
    + hana_hd0_sync_state               : PRIM     
    + hana_hd0_version                  : 2.00.030.00.1522209842
    + hana_hd0_vhost                    : oldhanad1
    + lpa_hd0_lpt                       : 1542379789
    + master-rsc_SAPHana_HD0_HDB00      : 150      
* Node oldhanad2:
    + hana_hd0_clone_state              : DEMOTED  
    + hana_hd0_op_mode                  : logreplay
    + hana_hd0_remoteHost               : oldhanad1
    + hana_hd0_roles                    : 4:S:master1:master:worker:master
    + hana_hd0_site                     : teppo    
    + hana_hd0_srmode                   : sync     
    + hana_hd0_sync_state               : SOK      
    + hana_hd0_version                  : 2.00.030.00.1522209842
    + hana_hd0_vhost                    : oldhanad2
    + lpa_hd0_lpt                       : 30       
    + master-rsc_SAPHana_HD0_HDB00      : 100      

Here the SID is HD0 and the attribute to look for is hana_hd0_sync_state. In this case, the commands to enter are

 oldhanad2:~ # crm configure primitive rsc_HANAinSync_HD0_HDB00 Dummy
 oldhanad2:~ #

and

 oldhanad2:~ # crm configure location loc_HD0_HDB00_inSync rsc_HANAinSync_HD0_HDB00 rule -inf: not_defined hana_hd0_sync_state or hana_hd0_sync_state ne "SOK"
 oldhanad2:~ #

which will result in

Stack: corosync
Current DC: oldhanad1 (version 1.1.16-6.5.1-77ea74d) - partition with quorum
Last updated: Fri Nov 16 15:53:23 2018
Last change: Fri Nov 16 15:52:57 2018 by root via crm_attribute on oldhanad1

2 nodes configured
8 resources configured

Online: [ oldhanad1 oldhanad2 ]

Active resources:

billythekid     (stonith:external/sbd): Started oldhanad1
rsc_ip_HD0_HDB00        (ocf::heartbeat:IPaddr2):       Started oldhanad1
 Master/Slave Set: msl_SAPHana_HD0_HDB00 [rsc_SAPHana_HD0_HDB00]
     Masters: [ oldhanad1 ]
     Slaves: [ oldhanad2 ]
 Clone Set: cln_SAPHanaTopology_HD0_HDB00 [rsc_SAPHanaTopology_HD0_HDB00]
     Started: [ oldhanad1 oldhanad2 ]
rsc_HANAinSync_HD0_HDB00      (ocf::heartbeat:Dummy): Started oldhanad2


The resource rsc_HANAinSync_HD0_HDB00 will not be active unless the HANA with SID HD0 is in the SOK (sync OK) state.
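
If the flag is to be evaluated by a script rather than read by eye, a check along these lines might serve as a starting point. The function name flag_is_started is an assumption for this sketch, and it relies on the plain crm_mon -1 layout shown above, in which only active resources are listed.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the flag resource named in $1
# appears as "Started" in `crm_mon -1` style output read from stdin.
flag_is_started() {
    awk -v rsc="$1" '
        $1 == rsc && /Started/ { found = 1 }
        END { exit !found }
    '
}

# Usage on a cluster node:
#   if crm_mon -1 | flag_is_started rsc_HANAinSync_HD0_HDB00; then
#       echo "HD0 is in sync (SOK)"
#   else
#       echo "HD0 is NOT in sync"
#   fi
```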

Additional Information

The location rule logic is a variation of a pingd setup.

Change Log

Created 16.11.2018 by rschmid
Changed 20.11.2018 by rschmid
Changed 21.11.2018 by rschmid - adapted the naming style from the Best Practice Guide


Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID:7023526
  • Creation Date:16-NOV-18
  • Modified Date:21-NOV-18
  • SUSE Linux Enterprise High Availability Extension