The Challenge of Persistent Data on Kubernetes
Guest blog by Chris Crow | Technical Marketing Manager at Portworx by Pure Storage
Persistent data can be a challenge for any Kubernetes practitioner. After all, Kubernetes did not start with stateful workloads in mind. In many ways, persistence seemingly goes against the grain of Kubernetes’ declarative configurations: building to a desired state doesn’t have an easy analog in the world of data.
In our last blog we explored a number of challenges to running Kubernetes in the enterprise. We learned that the very promise of Kubernetes—scalability, resilience and agility—spawns a number of challenges when it comes to installing and maintaining the platform. The extensible nature of Kubernetes tends to give us ample opportunity for self-sabotage. The folks at SUSE have solved a number of these problems with management tools and default configurations that give administrators a standard, secure set of configurations. In this blog, we will turn the spotlight on the challenges surrounding persistent data and the applications that rely on it in Kubernetes environments.
Why run persistent applications on Kubernetes?
Kubernetes excels at management at scale, but one of its biggest benefits is its use of declarative configurations. Declarative (as opposed to imperative) configurations build to a desired state. We don’t have to turn back the clock very far to remember when declarative configuration was considered the pinnacle of OS configuration management, embodied by tools like Ansible and Puppet. These tools are ultimately a convenience for an administrator: they run an imperative set of instructions to bring a server into a desired state.
Kubernetes achieves the same goal by a different path: instead of declarative language that a tool translates into imperative commands, what if we simply rebuild the workload every time we detect a change? Now, to bring a workload into a desired state, we simply need to start it. This method has challenges of its own – there are well-built container images where all configuration is managed through environment variables and ConfigMaps, and there are those that are… not – but the gains in simplicity are huge. One of the biggest challenges to come out of this style of declarative workload management is the simple fact that we throw away the data with every restart.
So why do we even want to run applications that require persistence on Kubernetes? For the same reason we want to run applications without persistence: simple, declarative language to describe the configuration and operation of our applications. When scaling a database server is as simple as changing a replica count, we have gained so much in scale and efficiency that the engineering becomes worth it.
The challenge: persistence in an ephemeral world
I assume that everyone reading this blog will be familiar with most of the Kubernetes constructs that help solve this problem: persistent volumes and persistent volume claims, StatefulSets, and the Container Storage Interface (CSI). We also know that many management functions familiar to modern storage administrators are not present in Kubernetes by default. Functions like mobility between Kubernetes clusters or capacity automation are important for these persistent workloads.
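As a quick refresher, the basic building block is a PersistentVolumeClaim that a CSI driver satisfies with a PersistentVolume. A minimal sketch (the StorageClass name `px-csi-db` is an assumption; substitute whatever your cluster provides):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  accessModes:
    - ReadWriteOnce            # single-node read/write, typical for a database
  storageClassName: px-csi-db  # assumed StorageClass; use one defined in your cluster
  resources:
    requests:
      storage: 10Gi
```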
In addition, persistent applications on Kubernetes must account for potential drift between the configuration of the application and the data it is managing. Imagine a simple Postgres database running in Kubernetes: all of the configuration details (including the version of the database) are declarative and ephemeral. The data, however, must be persistent. This means that to protect, migrate or manage this application we must account for both the configuration manifests and the PVC.
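To make that split concrete, here is a minimal sketch of the Postgres example as a StatefulSet (image tag and sizes are illustrative): everything in the pod template is declarative and rebuilt on every restart, while the `volumeClaimTemplates` section provisions the PVC that must outlive any individual pod.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16    # declarative and ephemeral: change it and pods are rebuilt
          env:
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:         # persistent: PVCs created here survive pod restarts
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Upgrading the database version is an edit to `image`; the data in the PVC stays put. That gap between manifests and data is exactly the drift described above.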
Portworx – enabling stateful workloads in Kubernetes
Portworx by Pure Storage brings a number of features to Kubernetes environments to help with managing stateful workloads, but just as important as a list of storage features is how Portworx provides these features. Portworx installs as a Level 5 operator and adds custom resources to the Kubernetes cluster that enable a well-designed application (and, probably more importantly, a less well-designed application) to use many of the standard storage features that we have come to rely on as server administrators:
Disaster Recovery – Portworx provides disaster recovery schedules, application consistency, and recovery using declarative custom resources. This means that the protection and recovery of an application can be integrated into any standard GitOps tool, including SUSE Rancher’s Fleet. And because Portworx manages both the data (PVs and PVCs) and the manifests as they existed at the point of protection, we can manage the application configuration drift that can sometimes happen. Portworx can create DR pairs between any supported storage providers or Kubernetes clusters.
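As an illustration of the declarative approach, protection schedules are themselves custom resources. A sketch of a Stork `SchedulePolicy` (field names follow the Portworx/Stork CRDs, but exact shapes can vary by version, so treat this as illustrative):

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: SchedulePolicy
metadata:
  name: nightly
policy:
  daily:
    time: "10:00PM"   # run the associated DR/backup action once a day
```

Because this is just another manifest, the policy can live in the same Git repository as the application and be delivered by Fleet like everything else.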
Migration – Portworx can also manage data and manifest migrations between clusters. This enables blue/green upgrades of clusters running stateful workloads, as well as mobility between regions and datacenters. And because Portworx supports many Kubernetes orchestrators and storage backends, we can also use this mobility to migrate to different environments as our company’s needs change. I have used this in the past to build GitOps-based data testing as part of database GitOps workflows.
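A migration between paired clusters is likewise declarative. A hedged sketch of a Stork `MigrationSchedule` (the cluster pair name, namespace, and policy name are assumptions):

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: MigrationSchedule
metadata:
  name: pg-migration
  namespace: pg
spec:
  schedulePolicyName: nightly      # assumed SchedulePolicy defined elsewhere
  template:
    spec:
      clusterPair: remote-cluster  # assumed ClusterPair to the target cluster
      namespaces:
        - pg
      includeResources: true       # copy the manifests...
      includeVolumes: true         # ...and the data
      startApplications: false     # leave the app stopped on the target (blue/green)
```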
I/O Control and Storage Automation – Portworx can also shape I/O coming from applications to help prevent “noisy neighbors,” and provides storage automation not only to automatically grow PVCs as they fill, but also to expand the underlying storage pools.
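Capacity automation is again expressed declaratively. A sketch of a Portworx `AutopilotRule` that grows a volume once it crosses a usage threshold (the label selector and thresholds are assumptions; check the Autopilot documentation for the metrics available in your version):

```yaml
apiVersion: autopilot.libopenstorage.org/v1alpha1
kind: AutopilotRule
metadata:
  name: volume-resize
spec:
  selector:
    matchLabels:
      app: postgres            # assumed label on the PVCs to watch
  conditions:
    expressions:
      - key: "100 * (px_volume_usage_bytes / px_volume_capacity_bytes)"
        operator: Gt
        values:
          - "80"               # act when the volume is more than 80% full
  actions:
    - name: openstorage.io.action.volume/resize
      params:
        scalepercentage: "50"  # grow the volume by 50%
```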
Application-aware backups and clones – Portworx also provides CSI-based snapshot capabilities, and additionally can run scripts that quiesce the application for consistency and upload the snapshot data to an object store.
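Those quiesce scripts are themselves custom resources. A hedged sketch of a Stork pre-snapshot `Rule` for Postgres (the pod selector and SQL command are illustrative, and CRD shapes can differ between Stork versions):

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: Rule
metadata:
  name: pg-presnap
rules:
  - podSelector:
      app: postgres            # assumed label on the database pods
    actions:
      - type: command
        # flush dirty pages to disk before the snapshot is taken
        value: psql -U postgres -c "CHECKPOINT;"
```

The rule is then referenced from the snapshot or backup object, so application consistency travels with the protection policy rather than living in a runbook.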
How can we use these features?
So how can we use these features to implement storage services in a Kubernetes environment? I find that the best proof points come from how some of the teams I have worked with have implemented Portworx.
Disaster Recovery – Persistent workloads have different needs from ephemeral workloads, particularly when they back critical applications. Enterprises must adhere to strict compliance and often regulatory requirements for data availability. Banks, including a German bank managed by technology partner DXC, had requirements of near-zero data loss and a Recovery Time Objective (RTO) of less than 60 minutes. They identified Portworx synchronous disaster recovery as one of the few capabilities in the market that could meet this need for their container workloads.
Storage Automation – As Kubernetes workloads grow in scale and scope, so too do the Day 2 storage operations that developers or platform teams must manage. Tasks like manual storage provisioning can be time-consuming and expose businesses to risk when storage is under- or over-provisioned. Under-provisioning storage can result in applications crashing, while over-provisioning can inflate storage costs. Automated storage provisioning makes managing storage capacity easy—so teams can focus their attention elsewhere.
Application-aware Backups – Backing up data can often be an afterthought for busy teams, but it is a critical component of managing persistent data on Kubernetes. Backups need to be application-aware and container-granular to ensure fast restores that don’t inexplicably fail. Losing critical data can take many forms—sometimes something as simple as an admin accidentally deleting all the volumes on a cluster. Backup or platform teams need to be able to quickly target and recover this data before customers are impacted.
Conclusion
Kubernetes management is complex. Data on Kubernetes can be more complex still. Portworx and SUSE Rancher Prime offer ways to make managing that complexity easier. When something is easier, it costs less. It costs less because teams make fewer configuration mistakes, which means more uptime. It costs less because it is secure by design, helping ensure that costly breaches and ransomware attacks don’t happen. And it costs less in stress for the platform administrators who are tasked with keeping all of this running. Won’t someone please think of the administrators?
Stay tuned for our next blog where we cover how Portworx and SUSE work together in an enterprise environment to enable simple operations at scale.