EKS and SUSE Rancher Prime: A Best Practice Guide for Cloud Native Deployments
This guide offers a set of best practices for both greenfield (new) and brownfield (existing) deployments of Amazon EKS clusters managed by SUSE Rancher Prime. Following these practices helps keep your cloud-native infrastructure secure, scalable, and optimized for performance.
1. Understanding Your Foundation: Requirements and SUSE Portfolio
Before deploying, a solid understanding of your environment and available tools is crucial.
Reviewing SUSE on AWS and the Portfolio
First, review your options for running Rancher Prime on AWS to select the right architecture for your needs.
Next, familiarize yourself with the Cloud Native SUSE Portfolio, which is built from open-source software with a focus on security and compliance (SLSA Level 3 certified):
- SUSE Rancher Prime: The core management platform, including Rancher, SUSE Security, SUSE Observability, and platform choices like RKE2/k3s.
- SUSE Security: For security features like image scanning, compliance, zero trust, and network segmentation.
- SUSE Observability: Provides end-to-end monitoring, tracing, and historical data to reduce Mean Time to Resolution (MTTR).
- SUSE Storage: Persistent storage solution for Kubernetes.
- SUSE Virtualization: An open-source hyper-converged infrastructure (HCI) platform.
- SUSE Rancher Prime Hosted for AWS: A fully supported, managed Rancher Manager with a 99.9% SLA.
2. Planning Your EKS Cluster: Size, Workloads, and Networking
A well-planned EKS cluster is the bedrock of a stable environment.
Plan Node Types Based on Workloads
Effective planning starts with a deep understanding of your workloads:
- Application Type & Resource Usage: Identify your application types (e.g., web servers, databases, ML) and estimate their average and peak CPU, memory, and storage requirements.
- Traffic & Latency: Analyze traffic patterns and latency sensitivity to inform your node selection and scaling strategy.
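The estimation step above can be sketched as a small calculation. This is a rough illustration, not a sizing tool: the `Workload` fields, the 25% headroom, and the three-node minimum are assumptions chosen for the example, and real capacity also depends on system pods, daemonsets, and per-node pod limits.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    replicas: int
    peak_cpu: float      # vCPU per replica at peak (your own estimate)
    peak_mem_gib: float  # GiB per replica at peak (your own estimate)


def nodes_needed(workloads, node_cpu, node_mem_gib, headroom=0.25, min_nodes=3):
    """Estimate node count from peak demand, using whichever resource is tighter.

    headroom and min_nodes are illustrative defaults, not recommendations.
    """
    total_cpu = sum(w.replicas * w.peak_cpu for w in workloads)
    total_mem = sum(w.replicas * w.peak_mem_gib for w in workloads)
    usable_cpu = node_cpu * (1 - headroom)
    usable_mem = node_mem_gib * (1 - headroom)
    by_cpu = math.ceil(total_cpu / usable_cpu)
    by_mem = math.ceil(total_mem / usable_mem)
    return max(by_cpu, by_mem, min_nodes)


workloads = [
    Workload("web", replicas=6, peak_cpu=0.5, peak_mem_gib=1.0),
    Workload("cache", replicas=2, peak_cpu=1.0, peak_mem_gib=6.0),
]
# A node with 4 vCPU and 16 GiB, the shape of an m5.xlarge.
print(nodes_needed(workloads, node_cpu=4, node_mem_gib=16))
```

Taking the maximum of the CPU-bound and memory-bound counts means the tighter resource drives the answer, and the minimum keeps at least one node available per Availability Zone in a three-AZ layout.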
Choosing Node Types (from AWS Best Practices):
| Workload Focus | AWS Instance Series | Use Case Example |
| --- | --- | --- |
| General Purpose | “m” series (e.g., m5.large, m5.xlarge) | Diverse, balanced workloads. |
| CPU-Optimized | “c” series (e.g., c5.xlarge) | High-performance computing, batch processing. |
| Memory-Optimized | “r” series (e.g., r5.large) | Large data processing, in-memory caches. |
| GPU-Optimized | “p” series (e.g., p3.2xlarge) | Machine learning, graphics rendering. |
Initial Cluster Size and Scaling:
Start small to test and optimize, but ensure redundancy by distributing nodes across multiple Availability Zones (AZs). Plan a clear scaling strategy using the Kubernetes Cluster Autoscaler to dynamically add nodes based on demand. Use multiple node groups with different instance types, and use Taints and Tolerations to ensure specific pods only schedule on appropriate nodes.
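To make the taint and toleration mechanism concrete, here is a simplified Python model of the matching rule the scheduler applies. It is a sketch for reasoning about node group design, not the scheduler's actual code, and the `workload=gpu` taint is a hypothetical example.

```python
def tolerates(toleration, taint):
    """Return True if one toleration matches one taint (simplified Kubernetes rules)."""
    # A toleration with an effect only matches taints with the same effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key combined with Exists tolerates every taint.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def can_schedule(pod_tolerations, node_taints):
    """A pod fits only if every hard taint on the node is tolerated."""
    hard = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations) for taint in hard)


# Hypothetical taint placed on a GPU node group.
gpu_taint = {"key": "workload", "value": "gpu", "effect": "NoSchedule"}
ml_pod = [{"key": "workload", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}]
web_pod = []

print(can_schedule(ml_pod, [gpu_taint]))   # True
print(can_schedule(web_pod, [gpu_taint]))  # False
```

A toleration only permits scheduling onto a tainted node; it does not attract the pod there. Pairing it with a node selector or node affinity is what keeps GPU workloads on the GPU node group.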
Deciding on Your Networking Strategy
Your Virtual Private Cloud (VPC) configuration is essential for security and isolation:
- Network Segmentation: Divide your VPC into multiple subnets based on application tiers for isolation.
- Dedicated Security Groups: Create unique security groups for each tier, leveraging SUSE Rancher Prime’s RBAC for granular access control.
- Monitoring and Review: Implement network monitoring (using SUSE Observability) and regularly review and update your security group rules.
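The segmentation step can be planned before anything is created in AWS. The sketch below uses Python's standard `ipaddress` module to carve one subnet per tier and Availability Zone out of a VPC range; the CIDR, tier names, AZ names, and the /20 prefix are illustrative assumptions.

```python
import ipaddress


def plan_subnets(vpc_cidr, tiers, azs, new_prefix):
    """Assign one subnet per (tier, AZ) pair from consecutive blocks of the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)
    plan = {}
    for tier in tiers:
        for az in azs:
            try:
                plan[(tier, az)] = next(blocks)
            except StopIteration:
                raise ValueError("VPC CIDR too small for the requested layout") from None
    return plan


plan = plan_subnets(
    "10.0.0.0/16",                               # example VPC range
    ["public", "app", "data"],                   # example application tiers
    ["us-east-1a", "us-east-1b", "us-east-1c"],  # example AZs
    new_prefix=20,
)
for (tier, az), subnet in plan.items():
    print(f"{tier:7} {az}  {subnet}")
```

Because the blocks are handed out in order, each tier occupies a contiguous range, which can make security group and route table rules easier to express per tier.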
3. Deployment: AWS Setup, EKS Creation, and Rancher Installation
With planning complete, you can move to the technical setup.
AWS Account Setup and Tooling
- Account and Credentials: Create an AWS account, configure appropriate AWS credentials, and set up an IAM user and role with the necessary policies for Rancher Prime to operate.
- Tools: Install and configure essential tools like AWS CLI, eksctl, kubectl, and Helm.
- Networking: Design your VPC with multiple Availability Zones for high availability and plan your load balancing strategy for Rancher Prime and your applications.
EKS Cluster Creation
- Use eksctl: Leverage eksctl for reproducible cluster creation and management, defining the configuration in a YAML file.
- Managed Node Groups: Opt for managed node groups for simplified maintenance and updates, and configure them for auto-scaling.
- Kubernetes Version: Choose a supported Kubernetes version that aligns with Rancher Prime’s compatibility matrix.
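As a sketch of what such a configuration might contain, the following Python builds an eksctl `ClusterConfig` document with managed node groups and prints it as JSON. The cluster name, region, node group sizes, and the version string are placeholders, and the field names follow the `eksctl.io/v1alpha5` schema as I understand it, so check them against the eksctl documentation for your release. JSON is a subset of YAML, so the output should load as a config file, though that is worth verifying before relying on it.

```python
import json


def eksctl_cluster_config(name, region, version, azs, node_groups):
    """Build an eksctl ClusterConfig document as a plain dict."""
    managed = []
    for ng in node_groups:
        if not ng["min"] <= ng["desired"] <= ng["max"]:
            raise ValueError(f"node group {ng['name']}: need min <= desired <= max")
        managed.append({
            "name": ng["name"],
            "instanceType": ng["instance_type"],
            "minSize": ng["min"],
            "maxSize": ng["max"],
            "desiredCapacity": ng["desired"],
            "labels": {"workload": ng["name"]},
        })
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": name, "region": region, "version": version},
        "availabilityZones": list(azs),
        "managedNodeGroups": managed,
    }


config = eksctl_cluster_config(
    name="demo-workloads",  # hypothetical cluster name
    region="us-east-1",
    version="1.30",         # placeholder; pick one from the Rancher Prime support matrix
    azs=["us-east-1a", "us-east-1b", "us-east-1c"],
    node_groups=[
        {"name": "general", "instance_type": "m5.large", "min": 2, "desired": 3, "max": 6},
        {"name": "memory", "instance_type": "r5.large", "min": 0, "desired": 1, "max": 4},
    ],
)
print(json.dumps(config, indent=2))
```

Keeping the generator (or the resulting file) in version control gives you a reviewable record of every change to cluster shape.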
SUSE Rancher Prime Installation
- Helm Chart: Install SUSE Rancher Prime on your EKS cluster using the Helm chart from the AWS Marketplace. Customization via Helm chart values is key to matching your environment.
- High Availability (HA): Configure Rancher Prime for HA with multiple replicas. For isolation, use a dedicated EKS cluster for Rancher Prime management, separate from your workload clusters. Alternatively, you can deploy the management plane on EC2 using RKE2.
- Ingress: Set up an Ingress controller (e.g., NGINX) to expose SUSE Rancher Prime externally and configure SSL certificates for secure access.
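The installation steps above could be scripted roughly as follows. This sketch only assembles the Helm command line; it does not run it. The chart reference is deliberately a placeholder because the guide does not give the AWS Marketplace location, the hostname is an example, and the three values shown (`hostname`, `replicas`, `ingress.tls.source`) are the commonly used Rancher chart options, which you should confirm against the chart version you actually install.

```python
import shlex

VALID_TLS_SOURCES = {"rancher", "letsEncrypt", "secret"}


def rancher_helm_command(chart_ref, hostname, replicas=3, tls_source="rancher"):
    """Assemble a 'helm upgrade --install' command for Rancher as an argument list."""
    if tls_source not in VALID_TLS_SOURCES:
        raise ValueError(f"unsupported ingress.tls.source: {tls_source}")
    values = {
        "hostname": hostname,
        "replicas": replicas,              # more than one replica for HA
        "ingress.tls.source": tls_source,
    }
    cmd = ["helm", "upgrade", "--install", "rancher", chart_ref,
           "--namespace", "cattle-system", "--create-namespace"]
    for key, value in values.items():
        cmd += ["--set", f"{key}={value}"]
    return cmd


# CHART_REF_FROM_MARKETPLACE is a placeholder, not a real chart location.
cmd = rancher_helm_command("CHART_REF_FROM_MARKETPLACE", "rancher.example.com")
print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids quoting mistakes if it is later passed to `subprocess.run`.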
4. Security, Monitoring, and Disaster Recovery
These steps are non-negotiable for a production-ready environment.
Security Best Practices
- Deploy SUSE Security: Implement SUSE Security to enable image scanning, zero trust L7 runtime security, and network policies within Kubernetes to restrict pod communication.
- IAM & Least Privilege: Follow the principle of least privilege for IAM roles and policies, and regularly rotate AWS access keys.
- Secrets Management: Utilize AWS Secrets Manager to securely store and manage sensitive data, integrating it with Rancher Prime for Kubernetes secrets.
- Pod Security: Enforce pod-level security policies with Kubewarden, the policy engine included with SUSE Rancher Prime.
- Audits: Conduct regular security audits to proactively identify vulnerabilities.
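One small piece of a least-privilege review can be automated: scanning IAM policy documents for wildcard grants. The sketch below is a simple heuristic rather than a full policy analyzer, and the policy shown is an invented example (the account ID is the one AWS uses in its documentation).

```python
def _as_list(value):
    """IAM allows a single string or a list for Action, Resource, and Statement."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy):
    """Return the Sid of each Allow statement with a wildcard Action or Resource."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        broad_resource = "*" in resources
        if broad_action or broad_resource:
            findings.append(stmt.get("Sid", "<no Sid>"))
    return findings


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "DescribeCluster", "Effect": "Allow",
         "Action": ["eks:DescribeCluster"],
         "Resource": "arn:aws:eks:us-east-1:111122223333:cluster/demo"},
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "ec2:*", "Resource": "*"},
    ],
}
print(overly_broad_statements(policy))  # ['TooBroad']
```

Some actions legitimately require `"Resource": "*"`, so treat the output as a list of statements to review, not as failures.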
Monitoring and Logging
- SUSE Rancher Prime Monitoring & Observability: Leverage the built-in logging and monitoring capabilities, including SUSE Observability, for end-to-end tracing and performance metrics.
- CloudWatch Integration: Integrate EKS with CloudWatch for general cluster health, and set up alerts for critical events.
- Centralized Logging: Centralize logs with a collector such as Fluentd and a store such as Elasticsearch, and monitor them within SUSE Observability for errors and security issues.
Backup and Disaster Recovery
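- etcd and Rancher Backups: Regularly back up your etcd data for cluster recovery, storing the snapshots in a secure S3 bucket. Note that on EKS the control-plane etcd is managed by AWS, so etcd snapshots apply to clusters where you run the control plane yourself (for example, RKE2 on EC2). Also, use SUSE Rancher Prime’s built-in functionality to schedule or run on-demand backups of your configuration.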
- Disaster Recovery Plan: Develop a Disaster Recovery (DR) plan to ensure business continuity, enabling DR for your SUSE Rancher Prime clusters across different AZs and regions.
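A scheduled Rancher backup to S3 might be declared along these lines. The sketch generates a `Backup` resource for the Rancher Backup operator as JSON, which `kubectl apply -f` accepts. The bucket, secret name, schedule, and retention count are hypothetical, and the field names and the `rancher-resource-set` default reflect the operator's `resources.cattle.io/v1` API as I recall it, so verify them against the operator version you have installed.

```python
import json


def rancher_backup_manifest(name, bucket, region, cron, retention,
                            secret_name, secret_namespace="default",
                            resource_set="rancher-resource-set"):
    """Build a Rancher Backup operator 'Backup' resource as a plain dict."""
    if retention < 1:
        raise ValueError("retention must keep at least one backup")
    return {
        "apiVersion": "resources.cattle.io/v1",
        "kind": "Backup",
        "metadata": {"name": name},
        "spec": {
            "resourceSetName": resource_set,
            "schedule": cron,
            "retentionCount": retention,
            "storageLocation": {
                "s3": {
                    "bucketName": bucket,
                    "region": region,
                    "folder": "rancher",
                    "endpoint": f"s3.{region}.amazonaws.com",
                    "credentialSecretName": secret_name,
                    "credentialSecretNamespace": secret_namespace,
                }
            },
        },
    }


manifest = rancher_backup_manifest(
    name="nightly",
    bucket="example-rancher-backups",  # hypothetical bucket name
    region="us-east-1",
    cron="0 2 * * *",                  # every day at 02:00
    retention=14,                      # illustrative retention count
    secret_name="s3-backup-creds",     # hypothetical Kubernetes secret
)
print(json.dumps(manifest, indent=2))
```

For the cross-region part of a DR plan, S3 replication on the bucket is one way to get a second copy without changing the backup schedule itself.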
5. Ongoing Management and Optimization
Deployment is not a one-time event; continuous management is essential.
Updates, Scaling, and Cost Optimization
- Updates and Upgrades: Stay current with the latest Rancher Prime and Kubernetes releases. If you use Rancher Prime Hosted, SUSE manages the updates for your Rancher instances.
- Cluster Scaling: Continuously monitor resource consumption and utilize auto-scaling to dynamically adjust your cluster size.
- Cost Optimization: Right-size your nodes and consider using spot instances (a feature SUSE Rancher Prime can help manage) to optimize EKS costs. Use the OpenCost enterprise build available in SUSE Application Collection to monitor AWS billing.
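Right-sizing starts with comparing what nodes use against what they offer. The sketch below classifies nodes from utilization figures you would export from your monitoring stack; the node names, numbers, and the 30% and 80% thresholds are made up for illustration.

```python
def rightsizing_report(nodes, low=0.30, high=0.80):
    """Classify each node by the higher of its CPU and memory utilization."""
    report = {}
    for name, usage in nodes.items():
        peak = max(usage["cpu_used"] / usage["cpu_total"],
                   usage["mem_used"] / usage["mem_total"])
        if peak < low:
            report[name] = "downsize"
        elif peak > high:
            report[name] = "upsize"
        else:
            report[name] = "ok"
    return report


# Hypothetical peak usage per node: vCPU and GiB.
nodes = {
    "node-a": {"cpu_used": 0.8, "cpu_total": 4, "mem_used": 3.0, "mem_total": 16},
    "node-b": {"cpu_used": 3.6, "cpu_total": 4, "mem_used": 6.0, "mem_total": 16},
    "node-c": {"cpu_used": 2.0, "cpu_total": 4, "mem_used": 8.0, "mem_total": 16},
}
print(rightsizing_report(nodes))
```

Using the busier of the two resources avoids recommending a smaller instance for a node that is idle on CPU but nearly full on memory.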
💡 Additional Tips
- Infrastructure as Code (IaC): Use tools like Terraform or CloudFormation to manage your AWS infrastructure, together with the IaC tooling available for SUSE Rancher Prime, for reproducible, consistent deployments.
- GitOps: Implement GitOps with SUSE Fleet to manage Kubernetes configurations and applications, ensuring a version-controlled and auditable deployment pipeline.
- Support: Be sure to leverage the full suite of SUSE and AWS support resources.
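For the GitOps tip above, a Fleet `GitRepo` resource is the piece that tells Fleet which repository to watch and which clusters to deploy to. The sketch below generates one as JSON; the repository URL, paths, and cluster labels are hypothetical, and the field names follow Fleet's `fleet.cattle.io/v1alpha1` API as I understand it.

```python
import json


def fleet_gitrepo(name, repo_url, branch, paths, cluster_labels,
                  namespace="fleet-default"):
    """Build a Fleet GitRepo resource targeting clusters by label."""
    return {
        "apiVersion": "fleet.cattle.io/v1alpha1",
        "kind": "GitRepo",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "repo": repo_url,
            "branch": branch,
            "paths": list(paths),
            "targets": [{"clusterSelector": {"matchLabels": dict(cluster_labels)}}],
        },
    }


gitrepo = fleet_gitrepo(
    name="platform-apps",
    repo_url="https://git.example.com/platform/fleet-config",  # hypothetical repo
    branch="main",
    paths=["apps/ingress", "apps/monitoring"],                  # hypothetical paths
    cluster_labels={"env": "production"},                       # hypothetical label
)
print(json.dumps(gitrepo, indent=2))
```

Targeting by label means a newly imported EKS cluster picks up the same configuration as soon as it carries the matching label.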