[Tutorial] Deploying SAP Data Hub on SUSE CaaS Platform
SUSE is the trusted and preferred open source platform for SAP customers who want to unlock data intelligence, drive innovation and run with the best. So it’s no wonder that SUSE CaaS Platform is one of the few cloud platforms verified by SAP for its premier data analytics platform, SAP Data Hub. SAP Data Hub manages incoming data from SAP systems and provides a dashboard where users can work with that data. It is delivered as a Kubernetes (K8s) ready workload.
With the announcement of SUSE Start for CaaS Platform, SUSE Global Services delivers a two-week engagement that gets you up and running quickly. But can you really get the infrastructure you need implemented in two weeks? This post, written by Jean Marc Lambert and Martin Weiss, two senior architects with SUSE Global Services, describes how to do just that. So let’s follow along with the experts.
But First: What is SAP Data Hub?
Taken from SAP’s own website, SAP Data Hub is an “all-in-one data orchestration solution [that] discovers, refines, enriches, and governs any type, variety, and volume of data across your entire distributed data landscape. It supports your intelligent enterprise by rapidly delivering trustworthy data to the right users with the right context at the right time.”
And why SUSE? Simply put, SUSE Linux Enterprise Server is a reference development platform for SAP applications, including SAP HANA. It is also validated by SAP and has a proven history of customer success, with more than 90 percent of SAP HANA deployments running on SUSE Linux Enterprise Server for SAP Applications. This operating system is also widely used for cloud deployments (such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform).
Simply put, SAP Data Hub manages the data analytics provided by the SAP systems and showcases that data on one single pane of glass.
So while SUSE is a provider of a few cloud solutions, including SUSE OpenStack Cloud and SUSE Cloud Application Platform, it’s really SUSE CaaS Platform that provides a full-fledged, certified Kubernetes distribution, complemented with a management layer. Therefore, it is the obvious choice for our framework on which to run Data Hub.
And because local cloud deployments require dynamic provisioning of storage to complement the platform, we chose SUSE Enterprise Storage as our software-defined storage solution. SUSE Enterprise Storage provides a cost-effective way to manage large volumes of data, with redundancy, high availability, scaling and on-demand allocation as key properties.
Building the SAP Data Hub Framework
The remainder of this post provides the information you need to build the framework to support SAP Data Hub, based on SUSE CaaS Platform, with SUSE Enterprise Storage managing the storage through RBD. RBD (RADOS Block Device) is the block storage interface of Ceph, the technology behind SUSE Enterprise Storage; rbd is also the name of the command-line utility for manipulating RBD images. The combination of SUSE CaaS Platform and SUSE Enterprise Storage has been verified using the SAP qualification framework.
Assumptions for This Solution
Because we want close access to the SAP applications hosted in the customer’s data center, we will implement this as an on-premises proof of concept (PoC) deployment.
For our PoC, we will:
- Deploy 3 physical servers to host the whole solution (see sizing below) and provide high availability (HA) at the hardware level.
- Host SUSE CaaS Platform on SLES KVM hosts. We will also deploy an HA CaaS Platform with 3 masters and 3 nodes (workers).
- Deploy SUSE Enterprise Storage on bare metal to ensure full performance. (For the PoC we will deploy SUSE Enterprise Storage on the same servers as SUSE CaaS Platform; however, dedicated nodes are recommended when moving to production, to ensure a future-proof solution that is performant, adaptable and scalable.)
- Install high-performance NVMe drives to provide fast disk operations in SUSE Enterprise Storage.
- Deploy the infrastructure harness (infra) required for the deployment and lifecycle management of the solution. This will include:
  - Registry: a local Docker Distribution registry
  - Portus: the SUSE access controller for the registry
  - SMT/RMT: local store and gateway to the SUSE update channels
  - Jumpbox: a stepping-stone VM to access and administer the solution
Note that we will deploy all of these services as VMs on a SUSE Linux Enterprise Server KVM host.
Sizing the Solution
To size the solution for a realistic PoC, we use the following figures, which we will be able to mirror in a production environment. We will set up three physical servers with:
- CPU: 2 CPUs x 24 cores/server
- RAM: 144 GB/server
- Disks: NVMe, 800 GB; storage will be thin provisioned
- Network: 4 x 10G/server
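As a quick sanity check on these figures, the aggregate capacity of the three-server PoC can be worked out as in the sketch below. It assumes the 800 GB NVMe figure applies to each server, which the sizing above does not state explicitly; the class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Per-server hardware figures taken from the sizing above."""
    sockets: int = 2
    cores_per_socket: int = 24
    ram_gb: int = 144
    nvme_gb: int = 800   # assumption: 800 GB of NVMe per server
    nics_10g: int = 4


def cluster_totals(spec: ServerSpec, servers: int = 3) -> dict:
    """Multiply the per-server figures by the number of servers."""
    return {
        "cores": servers * spec.sockets * spec.cores_per_socket,
        "ram_gb": servers * spec.ram_gb,
        "raw_nvme_gb": servers * spec.nvme_gb,
        "network_gbit": servers * spec.nics_10g * 10,
    }


if __name__ == "__main__":
    for name, value in cluster_totals(ServerSpec()).items():
        print(f"{name}: {value}")
```

Keep in mind that raw NVMe capacity is not usable capacity: replication in SUSE Enterprise Storage reduces it, depending on the replica count chosen for each pool.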
In addition, we will also require a LAN switch with sufficient ports and ample internet connectivity.
Note: For an air-gapped deployment, you will need one additional server to host the infra services as a gateway between the internet and this solution.
Production-ready deployments might also require dedicated servers for SUSE Enterprise Storage, but we will study the exact sizing with the customer to meet their expectations (performance, redundancy, distribution and so on).
This architecture is a recap of what we will deploy to host the SAP Data Hub framework for a realistic PoC solution.
To deploy and operate this solution, we will require ample internet access: to SAP, to download Data Hub, including the Helm charts and the container images, which are quite big (20GB); and to the SUSE registry and update channels, to install the components and then update and patch them.
At a minimum, the infra node must be connected; the other servers can be air-gapped.
Platform Installation Steps
Now that the servers are racked, plugged and network connected, we can set up the platform to support SAP Data Hub.
Install Host & Infra
- Install the SLES KVM Host on each server
- On the infra host, deploy the various components as VMs:
  - Jumpbox: where you will install the various tools you need to deploy and administer the solution
  - SMT/RMT: repository connected to the SUSE channels to receive packages and updates
  - Registry: store for Docker images (e.g., a copy of the SAP Data Hub images)
  - Portus: access control for the registry
  - Optional infrastructure services: load balancer, DNS server, NTP server, HTTP proxy server
Install SUSE CaaS Platform
- Download the SUSE Linux Enterprise Server and CaaS Platform installation sources, and create a CaaS admin/infrastructure VM (that is, an admin or infra server / management node).
- Create the DNS entries for your CaaS Platform (at least for the admin node) and set up a load balancer with or without a VIP (nginx, haproxy, apache…).
- Bootstrap as many VMs as required to create your CaaS cluster (3 masters + 3 nodes).
- Use the CaaS cluster bootstrap method when the various VMs are ready to join, and:
  - Make 3 VMs (1 per host) masters
  - Make the rest of your VMs nodes
- From your Jumpbox, download the kubeconfig file to access the Kubernetes cluster.
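Once the kubeconfig is on the Jumpbox, it is worth confirming that the cluster really has the 3 masters and 3 workers planned for the PoC. The sketch below counts node roles in the JSON produced by `kubectl get nodes -o json`. It assumes masters carry the `node-role.kubernetes.io/master` label, which may differ on your cluster; the function names and the synthetic sample data are illustrative.

```python
import json
from collections import Counter

# Assumption: master nodes are marked with this label on your cluster.
MASTER_LABEL = "node-role.kubernetes.io/master"


def parse_nodes(text: str) -> dict:
    """Parse the text output of `kubectl get nodes -o json`."""
    return json.loads(text)


def count_roles(node_list: dict) -> Counter:
    """Count masters and workers in a parsed node list."""
    roles = Counter()
    for node in node_list.get("items", []):
        labels = node.get("metadata", {}).get("labels", {})
        roles["master" if MASTER_LABEL in labels else "worker"] += 1
    return roles


def matches_poc_layout(node_list: dict, masters: int = 3, workers: int = 3) -> bool:
    """True when the cluster has exactly the expected number of each role."""
    roles = count_roles(node_list)
    return roles["master"] == masters and roles["worker"] == workers


def sample_node_list(masters: int, workers: int) -> dict:
    """Build a synthetic node list for demonstration (not real cluster output)."""
    items = [{"metadata": {"name": f"master-{i}", "labels": {MASTER_LABEL: ""}}}
             for i in range(masters)]
    items += [{"metadata": {"name": f"worker-{i}", "labels": {}}}
              for i in range(workers)]
    return {"kind": "List", "items": items}


if __name__ == "__main__":
    doc = sample_node_list(3, 3)
    print(dict(count_roles(doc)))
    print("PoC layout OK" if matches_poc_layout(doc) else "unexpected layout")
```

Against a real cluster, you would feed the output of `kubectl get nodes -o json` to `parse_nodes` instead of using the synthetic sample.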
Initialize Helm in Your Cluster
- Install an ingress controller (nginx-ingress) to provide routing to your cluster
- Create the namespace required for Data Hub
- On each node, install the ceph-common package (zypper in -y ceph-common), which the SES RBD provisioner uses to interact with SES.
- You will complete this setup with the SES-specific parts after SES is installed.
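The namespace creation and the per-node package installation lend themselves to a small script. The sketch below only builds the command lines so they can be reviewed before anything is run. The namespace name `datahub`, the node hostnames and the use of SSH as root are assumptions made for illustration; only the `zypper in -y ceph-common` command comes from the steps above.

```python
from typing import List


def namespace_command(namespace: str) -> List[str]:
    """Command that creates the Kubernetes namespace for Data Hub."""
    return ["kubectl", "create", "namespace", namespace]


def ceph_common_command(node: str, user: str = "root") -> List[str]:
    """Command that installs ceph-common on one node over SSH (assumed access method)."""
    return ["ssh", f"{user}@{node}", "zypper", "in", "-y", "ceph-common"]


def preparation_plan(nodes: List[str], namespace: str) -> List[List[str]]:
    """Namespace first, then one package installation per node."""
    return [namespace_command(namespace)] + [ceph_common_command(n) for n in nodes]


if __name__ == "__main__":
    # Hypothetical hostnames; replace them with the names of your own nodes.
    hypothetical_nodes = ["worker-1.example.com", "worker-2.example.com"]
    for argv in preparation_plan(hypothetical_nodes, "datahub"):
        print(" ".join(argv))
```

Each entry is an argument list, so it can be passed to `subprocess.run` as is once you are happy with the plan.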
Install SUSE Enterprise Storage
- Download the SUSE Linux Enterprise Server and SUSE Enterprise Storage sources
- Install the SUSE Enterprise Storage admin VM
- Create 4 OSD VMs (with 1 OS disk and 8 or more data/OSD disks)
- Set up the Salt master on the SUSE Enterprise Storage admin VM and connect the Salt minions on the OSD VMs to it
- Set the deepsea grain
- Deploy SUSE Enterprise Storage using the DeepSea processes (run stage.0 and stage.1, configure policy.cfg and the storage profiles, then run stage.2, stage.3 and stage.4)
- Set up the required Ceph pools and create a CephX user with its access key and rights
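To make the order of operations concrete, the sketch below prints the DeepSea stage commands followed by the Ceph commands that create a pool for RBD and a CephX user restricted to it. DeepSea's stage.2 (configuration), which is run after policy.cfg has been edited, is included in the sequence. The pool name, client name and placement group count are hypothetical values chosen for illustration; size the placement groups for your own cluster.

```python
from typing import Iterable, List


def deepsea_stage_commands(stages: Iterable[int] = (0, 1, 2, 3, 4)) -> List[str]:
    """DeepSea orchestration commands, run on the Salt master in this order."""
    return [f"salt-run state.orch ceph.stage.{n}" for n in stages]


def pool_and_user_commands(pool: str, client: str, pg_num: int) -> List[str]:
    """Create an RBD pool and a CephX user whose rights are limited to it."""
    return [
        f"ceph osd pool create {pool} {pg_num}",
        f"ceph osd pool application enable {pool} rbd",
        f"ceph auth get-or-create client.{client} "
        f"mon 'profile rbd' osd 'profile rbd pool={pool}'",
    ]


if __name__ == "__main__":
    # policy.cfg and the storage profiles are edited between stage.1 and stage.2.
    for cmd in deepsea_stage_commands():
        print(cmd)
    # Hypothetical pool name, client name and placement group count.
    for cmd in pool_and_user_commands("datahub", "caasp", 128):
        print(cmd)
```

The key printed by `ceph auth get-or-create` is what later goes into the Kubernetes secret used by the storage class.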
Make the SUSE Enterprise Storage Available in SUSE CaaS Platform
- Install the Storage Class for SUSE Enterprise Storage/RBD as the default class in SUSE CaaS Platform
- Test it, for example by creating a small persistent volume claim and checking that it gets bound
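The storage class and a test claim can be sketched as the two manifests generated below. This assumes the in-tree `kubernetes.io/rbd` provisioner and its documented parameters; all names, monitor addresses and secret names are placeholders, and the secrets holding the CephX keys must already exist in the cluster. The output is JSON, which `kubectl apply -f` accepts just like YAML.

```python
import json
from typing import List


def rbd_storage_class(name: str, monitors: List[str], pool: str, user_id: str,
                      user_secret: str, admin_secret: str,
                      admin_secret_namespace: str = "kube-system") -> dict:
    """StorageClass manifest for Ceph RBD, marked as the cluster default."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {
            "name": name,
            "annotations": {"storageclass.kubernetes.io/is-default-class": "true"},
        },
        "provisioner": "kubernetes.io/rbd",  # assumption: in-tree RBD provisioner
        "parameters": {
            "monitors": ",".join(monitors),
            "adminId": "admin",
            "adminSecretName": admin_secret,
            "adminSecretNamespace": admin_secret_namespace,
            "pool": pool,
            "userId": user_id,
            "userSecretName": user_secret,
        },
    }


def probe_claim(name: str, storage_class: str, size: str = "1Gi") -> dict:
    """Small PersistentVolumeClaim used to check that provisioning works."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Placeholder monitor addresses, pool, user and secret names.
    sc = rbd_storage_class("ses-rbd", ["192.0.2.11:6789", "192.0.2.12:6789"],
                           "datahub", "caasp", "ceph-secret-user", "ceph-secret-admin")
    print(json.dumps(sc, indent=2))
    print(json.dumps(probe_claim("rbd-probe", "ses-rbd"), indent=2))
```

After applying both manifests, `kubectl get pvc rbd-probe` should eventually report the claim as Bound if the provisioner can reach the storage cluster.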
Deploy SAP Data Hub
Deploy SAP Data Hub using an SAP-provided script. You will need access to the SAP Data Hub Helm repository and image registry. Find complete details about this process in the SAP-provided installation guide for SAP Data Hub.
The following is just a rough overview of the steps required:
Preparation (for the most current information, please reference the SAP Data Hub installation guide)
- Pre-create the namespace
- Ensure the management host has a valid ~/.kube/config with rights to the namespace
- Ensure that Storage Classes are accessible and available
- Install Docker
- Install Helm
- Pre-create Registry Credentials (in the target namespace) and Registry Namespace
- On the management host, download the installation script and run it
- The script will:
  - Ask for several options and configuration details for SAP Data Hub
  - Download the required images and store them in the on-premises registry
  - Deploy SAP Data Hub
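Before launching the SAP installer, a short preflight check on the management host can catch missing prerequisites early. The sketch below is not part of the SAP tooling; it simply verifies that the tools named in the preparation list are on the PATH and that a kubeconfig file exists. Including `kubectl` in the tool list is an assumption, since the list above only implies it through ~/.kube/config.

```python
import shutil
from pathlib import Path
from typing import Callable, List, Optional

# docker and helm come from the preparation list; kubectl is an assumption.
REQUIRED_TOOLS = ("kubectl", "docker", "helm")


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    kubeconfig: Optional[Path] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    if kubeconfig is None:
        kubeconfig = Path.home() / ".kube" / "config"
    problems = [f"{tool} not found on PATH"
                for tool in REQUIRED_TOOLS if which(tool) is None]
    if not kubeconfig.is_file():
        problems.append(f"no kubeconfig at {kubeconfig}")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    if issues:
        for issue in issues:
            print(f"MISSING: {issue}")
    else:
        print("Management host looks ready for the installer.")
```

The check does not verify rights on the namespace or the reachability of the registry; those still need to be confirmed as described in the SAP installation guide.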
Find out More!
And there you have it, the implementation of a PoC of SUSE CaaS Platform and SUSE Enterprise Storage to support SAP Data Hub.
If you would like help setting up your PoC, please learn more about our SUSE Start for CaaS Platform by downloading the SUSE Start for SUSE CaaS Platform data sheet. SUSE Start is a 2-week engagement that jumpstarts the implementation with product and technical experts for a fixed cost.
To learn about SUSE and SAP Data Hub, we invite you to download our free whitepaper – SAP Data Hub on SUSE Container as a Service Platform.
If you’d like to learn more about SUSE and SAP, we invite you to Join the Best; Run Your SAP Solutions on SUSE.