Best Practices for EKS Cluster Management on AWS Using SUSE Rancher Prime
This article provides best-practice guidelines for greenfield or brownfield deployments on EKS clusters managed by SUSE Rancher Prime.
- Understand Requirements:
- Review Options for running Rancher Prime on AWS.
- Read the blog: https://www.suse.com/c/options-for-running-rancher-on-aws/
- Explore the Cloud Native SUSE Portfolio built from open source software with SLSA Level 3 certification.
- SUSE Security = NeuVector + Kubewarden
- SUSE Observability = StackState
- SUSE Rancher Prime = Application Collection + Rancher + SUSE Security + SUSE Observability + RKE2 + k3s
- SUSE Rancher Prime Hosted for AWS – Fully SUSE Supported Rancher Manager with 99.9% SLA uptime
- SUSE Virtualization = Harvester
- SUSE Storage = Longhorn
- SUSE Rancher Suite = All of the above
- Installation Requirements :: Rancher product documentation
- Plan your EKS cluster size and node types based on your workload needs. (pulled from AWS best practices)
- Understand your workloads:
- Application type: Identify the types of applications running on your cluster (e.g., web servers, databases, batch processing, machine learning).
- Resource usage: Estimate the average and peak CPU, memory, and storage needs of each application pod.
- Traffic patterns: Analyze expected traffic fluctuations and peak usage times to determine scaling requirements.
- Latency sensitivity: Consider whether your application has strict latency requirements, which might influence node selection.
- Choose node types:
- General purpose instances: For diverse workloads with balanced CPU and memory needs, consider “m” series instances (e.g., m5.large, m5.xlarge).
- Compute-optimized instances: If your workloads are primarily CPU intensive, choose “c” series instances (e.g., c5.xlarge, c5.2xlarge).
- Memory-optimized instances: For large data processing or in-memory applications, opt for “r” series instances (e.g., r5.large, r5.2xlarge).
- GPU-accelerated instances: If your workload involves machine learning or graphics rendering, use “p” series instances (e.g., p3.2xlarge, p3.8xlarge).
- Determine initial cluster size:
- Start small: Begin with a small cluster with a few nodes to test your application and optimize resource usage.
- Consider redundancy: For high availability, distribute nodes across multiple Availability Zones (AZs).
- Scaling strategy: Plan how to scale your cluster horizontally by adding nodes as needed using the Kubernetes Cluster Autoscaler.
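The sizing guidance above can be turned into a rough first estimate. The following is a minimal sketch, not an official sizing method: the function name, the 15% per-node reserve for system components, and the rounding to a multiple of the AZ count are all assumptions made for illustration. The vCPU and memory figures are the published specifications for the listed instance types. It ignores per-node pod limits and storage.

```python
import math

# (vCPU, memory in GiB) for a few of the instance types mentioned above.
INSTANCE_SPECS = {
    "m5.large": (2, 8),
    "m5.xlarge": (4, 16),
    "c5.xlarge": (4, 8),
    "r5.large": (2, 16),
}


def estimate_nodes(pods, cpu_per_pod, mem_gib_per_pod, instance_type,
                   reserved_fraction=0.15, zones=3):
    """Estimate nodes needed for peak pod requests, spread evenly across AZs.

    reserved_fraction is an assumed share of each node held back for the
    kubelet and system daemons; tune it for your environment.
    """
    vcpu, mem_gib = INSTANCE_SPECS[instance_type]
    usable_cpu = vcpu * (1 - reserved_fraction)
    usable_mem = mem_gib * (1 - reserved_fraction)
    by_cpu = math.ceil(pods * cpu_per_pod / usable_cpu)
    by_mem = math.ceil(pods * mem_gib_per_pod / usable_mem)
    nodes = max(by_cpu, by_mem, 1)
    # Round up so every Availability Zone gets the same number of nodes.
    return math.ceil(nodes / zones) * zones


print(estimate_nodes(pods=20, cpu_per_pod=0.25, mem_gib_per_pod=0.5,
                     instance_type="m5.large"))
```

Treat the result as a starting point for the "start small" approach and let the Cluster Autoscaler correct it against real demand.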
- Configure node groups:
- Multiple node groups: Create separate node groups with different instance types to cater to specific application requirements.
- Taints and tolerations: Use taints and tolerations to restrict pod scheduling to specific node types.
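To make the taint and toleration behavior concrete, here is a small model of the matching rule in plain Python. It is a simplified sketch of the Kubernetes scheduling semantics, not the scheduler's actual code, and the `workload=gpu` taint is a hypothetical example.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint (simplified)."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key combined with Exists tolerates every taint.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def can_schedule(pod_tolerations, node_taints):
    """A pod fits on a node if every NoSchedule/NoExecute taint is tolerated."""
    blocking = [t for t in node_taints
                if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations)
               for taint in blocking)


# Hypothetical taint placed on a GPU node group.
gpu_taint = {"key": "workload", "value": "gpu", "effect": "NoSchedule"}
gpu_toleration = {"key": "workload", "operator": "Equal",
                  "value": "gpu", "effect": "NoSchedule"}

print(can_schedule([gpu_toleration], [gpu_taint]))  # pod that declares the toleration
print(can_schedule([], [gpu_taint]))                # pod without any tolerations
```

A toleration only permits scheduling onto the tainted nodes. To also steer pods toward them, pair it with a node selector or node affinity on the node group's labels.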
- Example scenarios:
- Web application with moderate traffic: Start with a small cluster using m5.large instances, utilizing the Cluster Autoscaler to scale based on real-time demand.
- High-performance computing workload: Deploy a cluster with c5.xlarge or c5.2xlarge instances for optimal CPU performance.
- Large-scale data processing: Use r5.xlarge or r5.2xlarge instances to handle large datasets with high memory requirements.
- Decide on your networking strategy (VPC, subnets, security groups).
- Network Segmentation: Divide your VPC into multiple subnets based on application functionality, ensuring isolation between different tiers.
- Dedicated Security Groups: Create unique security groups for each application tier to enforce granular access controls. Role-based access control (RBAC) for the clusters themselves is available through SUSE Rancher Prime.
- Regular Review and Updates: Periodically review your security group rules and update them as needed to maintain a secure network.
- Monitoring: Implement network monitoring tools to detect suspicious activity and identify potential security vulnerabilities. SUSE Observability is included in SUSE Rancher Prime and SUSE Rancher Suite.
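As an illustration of the segmentation advice, the sketch below carves a VPC CIDR into one subnet per tier and Availability Zone using Python's standard `ipaddress` module. The CIDR, tier names, AZ names, and `/20` subnet size are placeholder assumptions, not recommendations from SUSE or AWS.

```python
import ipaddress


def plan_subnets(vpc_cidr, tiers, zones, new_prefix):
    """Assign one subnet per (tier, zone) pair, carved in order from the VPC CIDR."""
    pool = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = {}
    for tier in tiers:
        for zone in zones:
            try:
                plan[(tier, zone)] = next(pool)
            except StopIteration:
                raise ValueError("VPC CIDR is too small for this layout") from None
    return plan


# Placeholder layout: three tiers across three AZs in a /16 VPC.
plan = plan_subnets(
    "10.0.0.0/16",
    tiers=["public", "app", "data"],
    zones=["us-east-1a", "us-east-1b", "us-east-1c"],
    new_prefix=20,
)
for (tier, zone), network in plan.items():
    print(f"{tier:<8}{zone:<12}{network}")
```

Keeping each tier in its own subnets makes it straightforward to attach a dedicated security group and route table per tier.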
- AWS Account Setup:
- Create an AWS account if you don’t have one.
- Configure AWS credentials with appropriate permissions for EKS and other AWS services.
- Set up an IAM user and role with the necessary policies for Rancher Prime to operate.
- Tooling:
- Install and configure essential tools: AWS CLI, eksctl, kubectl, and Helm.
- Networking:
- Design your VPC and subnets for high availability and security.
- Consider using multiple Availability Zones for your EKS cluster.
- Plan your load balancing strategy for Rancher Prime and your applications.
EKS Cluster Creation
- Creating an EKS Cluster | Rancher
- EKS Cluster Configuration Reference | Rancher
- Using eksctl:
- Leverage eksctl for EKS cluster creation and management.
- Define your cluster configuration in a YAML file for reproducibility.
- Managed Node Groups:
- Opt for managed node groups for simplified node management and updates.
- Configure auto-scaling for your node groups to handle varying workloads.
- Kubernetes Version:
- Choose a supported Kubernetes version that aligns with Rancher Prime’s compatibility.
- Stay updated with the latest Kubernetes releases for security and features.
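One way to keep the cluster definition reproducible is to generate the eksctl configuration from code and commit the result. The sketch below builds a `ClusterConfig` document with managed node groups and autoscaling bounds. The cluster name, Region, Kubernetes version, and node group sizes are placeholders; check the version against the Rancher Prime support matrix before using it.

```python
import json


def eksctl_config(name, region, version, node_groups):
    """Build an eksctl ClusterConfig document as a plain dictionary."""
    managed = []
    for ng in node_groups:
        if not ng["min"] <= ng["desired"] <= ng["max"]:
            raise ValueError(f"node group {ng['name']}: need min <= desired <= max")
        managed.append({
            "name": ng["name"],
            "instanceType": ng["instance_type"],
            "minSize": ng["min"],
            "maxSize": ng["max"],
            "desiredCapacity": ng["desired"],
            "labels": {"role": ng["name"]},
        })
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": name, "region": region, "version": version},
        "managedNodeGroups": managed,
    }


# Placeholder values for illustration only.
config = eksctl_config(
    name="demo-cluster",
    region="us-east-1",
    version="1.30",
    node_groups=[
        {"name": "general", "instance_type": "m5.large",
         "min": 2, "desired": 3, "max": 6},
    ],
)
print(json.dumps(config, indent=2))
```

Because JSON is also valid YAML, the output can be saved to a file and should be usable with `eksctl create cluster -f <file>`. Verify that against the eksctl version you run.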
Rancher Prime Installation
- Helm Chart:
- Use the Helm chart from AWS Marketplace for installing Rancher Prime on your EKS cluster.
- Customize the Helm chart values to match your environment and preferences.
- Installing Rancher on Amazon EKS
- Optionally, you can create the SUSE Rancher Prime management deployment on EC2 directly. This will use RKE2 as the Kubernetes platform.
- High Availability:
- Configure Rancher Prime for high availability with multiple replicas.
- Use a dedicated EKS cluster for Rancher Prime to isolate it from workloads.
- Ingress:
- Set up an Ingress controller (e.g., Nginx) to expose Rancher Prime externally.
- Obtain and configure SSL certificates for secure access.
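The installation choices above (hostname, replica count, TLS source) map onto Helm chart values. The sketch below assembles a `helm upgrade --install` command line as a list of arguments. The chart reference `rancher-prime/rancher` and the hostname are placeholders; substitute the chart location given by your AWS Marketplace subscription or the Rancher documentation. `hostname`, `replicas`, and `ingress.tls.source` are options of the Rancher chart.

```python
import shlex


def rancher_helm_command(hostname, replicas=3, chart="rancher-prime/rancher",
                         namespace="cattle-system", tls_source="secret"):
    """Return the argument list for installing or upgrading Rancher with Helm."""
    values = {
        "hostname": hostname,
        "replicas": replicas,               # multiple replicas for high availability
        "ingress.tls.source": tls_source,   # "secret" means you supply the certificate
    }
    cmd = ["helm", "upgrade", "--install", "rancher", chart,
           "--namespace", namespace, "--create-namespace"]
    for key, value in values.items():
        cmd += ["--set", f"{key}={value}"]
    return cmd


# Placeholder hostname for illustration.
command = rancher_helm_command("rancher.example.com")
print(shlex.join(command))
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting problems, and printing it first gives you a chance to review the values.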
Security Best Practices
- Deploy SUSE Security:
- SUSE Security enables image scanning, compliance checks, CI/CD integration, zero-trust Layer 7 runtime security, and network segmentation.
- Use network policies within Kubernetes to restrict pod communication.
- Read the blog for more information on best practices for securing Amazon EKS with SUSE Rancher Prime.
- AWS Partner SUSE – Security ISV Competency | Amazon EKS Ready Product
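For the Kubernetes network policy recommendation, a common pattern is to deny all traffic in a namespace by default and then allow specific flows. The sketch below generates both manifests as JSON, which `kubectl apply -f` accepts. The namespace, labels, policy names, and port are hypothetical.

```python
import json


def default_deny(namespace):
    """NetworkPolicy that selects every pod and allows no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_ingress(name, namespace, to_labels, from_labels, port):
    """NetworkPolicy that lets pods with from_labels reach to_labels on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": to_labels},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": from_labels}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


# Hypothetical namespace and labels.
policies = [
    default_deny("shop"),
    allow_ingress("web-to-api", "shop", {"tier": "api"}, {"tier": "web"}, 8080),
]
for policy in policies:
    print(json.dumps(policy, indent=2))
```

A default-deny egress rule also blocks DNS lookups, so a real rollout usually needs an additional policy that allows egress to the cluster DNS service.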
- IAM Roles and Policies:
- Follow the principle of least privilege when assigning IAM roles and policies.
- Regularly rotate your AWS access keys.
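A simple way to support least-privilege reviews is to scan policy documents for wildcard actions. The sketch below is a hypothetical helper, not an AWS tool: it flags `Allow` statements whose actions are `*` or `service:*`. The sample policy and its `Sid` values are made up for illustration and do not represent the permissions Rancher Prime needs.

```python
def overly_broad_statements(policy):
    """Return the Allow statements that grant '*' or 'service:*' actions."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a policy may hold a single statement object
        statements = [statements]
    findings = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if any(a == "*" or a.endswith(":*") for a in actions):
            findings.append(statement)
    return findings


# Made-up policy for illustration.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ReadClusters", "Effect": "Allow",
         "Action": ["eks:DescribeCluster", "eks:ListClusters"], "Resource": "*"},
        {"Sid": "TooBroad", "Effect": "Allow",
         "Action": "ec2:*", "Resource": "*"},
    ],
}

for statement in overly_broad_statements(sample_policy):
    print("Review statement:", statement["Sid"])
```

A check like this can run in CI against policy files kept in version control, so broad grants are caught before they reach an account.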
- Network Security:
- Implement security groups to control traffic flow to and from your EKS cluster.
- Secrets Management:
- Utilize AWS Secrets Manager to securely store and manage sensitive data.
- Integrate Rancher Prime with Secrets Manager for Kubernetes secrets.
- Pod Security Policies:
- The PodSecurityPolicy API was removed in Kubernetes 1.25. Enforce equivalent pod security constraints for your workloads with Kubewarden admission policies, included with SUSE Rancher Prime.
- Regular Security Audits:
- Conduct regular security audits to identify and address vulnerabilities.
Monitoring and Logging
- SUSE Rancher Prime Monitoring plus Observability:
- Use SUSE Rancher Prime's built-in logging and monitoring capabilities.
- SUSE Observability
- End-to-end monitoring, tracing, and historical data help reduce MTTR and prevent future downtime.
- CloudWatch:
- Integrate EKS with CloudWatch for monitoring cluster health and performance.
- Set up alerts for critical events and metrics.
- Logging:
- Centralize your logs using a logging solution such as Elasticsearch or Fluentd.
- Monitor logs for errors and potential security issues in SUSE Observability.
Backup and Disaster Recovery
- etcd Backups:
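- Regularly back up your etcd data for cluster recovery. On EKS, AWS manages the control-plane etcd, so this applies when Rancher Prime runs on a self-managed distribution such as RKE2 on EC2.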
- Store backups securely in an S3 bucket.
- SUSE Rancher Prime Backups:
- Use SUSE Rancher Prime’s backup and restore functionality to protect your configuration. Backups can be scheduled or run on demand.
- Disaster Recovery Plan:
- Develop a disaster recovery plan to ensure business continuity in case of an outage. DR for your SUSE Rancher Prime clusters can be set up across Availability Zones and Regions.
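A scheduled backup to S3 can be described as a `Backup` custom resource for the Rancher backup operator. The sketch below generates such a manifest. The field names follow the operator's `Backup` resource as documented at the time of writing, and the bucket, Region, secret name, schedule, and resource set name are placeholders, so confirm them against the documentation for your Rancher version.

```python
import json


def rancher_backup(name, bucket, region, schedule, retention,
                   secret_name, secret_namespace="default",
                   resource_set="rancher-resource-set"):
    """Build a Backup custom resource that stores recurring backups in S3."""
    return {
        "apiVersion": "resources.cattle.io/v1",
        "kind": "Backup",
        "metadata": {"name": name},
        "spec": {
            "resourceSetName": resource_set,
            "schedule": schedule,          # cron expression
            "retentionCount": retention,   # number of backup files to keep
            "storageLocation": {
                "s3": {
                    "bucketName": bucket,
                    "region": region,
                    "endpoint": f"s3.{region}.amazonaws.com",
                    "credentialSecretName": secret_name,
                    "credentialSecretNamespace": secret_namespace,
                },
            },
        },
    }


# Placeholder values for illustration.
backup = rancher_backup(
    name="nightly-s3-backup",
    bucket="example-rancher-backups",
    region="us-east-1",
    schedule="0 2 * * *",
    retention=14,
    secret_name="s3-backup-creds",
)
print(json.dumps(backup, indent=2))
```

Omitting the schedule and retention fields would turn this into a one-time, on-demand backup instead of a recurring one.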
Ongoing Management
- Updates and Upgrades:
- Stay up to date with the latest Rancher Prime and Kubernetes releases. SUSE Rancher Prime Hosted handles upgrades of the Rancher instances for you.
- Follow the recommended upgrade procedures to minimize downtime.
- Cluster Scaling:
- Monitor your cluster resources and scale your nodes as needed.
- Utilize auto-scaling to dynamically adjust your cluster size.
- Cost Optimization:
- Optimize your EKS cluster costs by right-sizing your nodes and using spot instances (SUSE Rancher Prime does this for you).
- Monitor your AWS billing and identify areas for cost savings. SUSE Application Collection provides an enterprise build of OpenCost.
Additional Tips
- Infrastructure as Code:
- Use the SUSE Rancher Prime IaC tools.
- Use tools like Terraform or CloudFormation to manage your AWS infrastructure.
- This allows for reproducible deployments and easier management.
- GitOps:
- Implement GitOps with SUSE Fleet to manage your Kubernetes configurations and applications.
- This provides a version-controlled and auditable approach to deployments.
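With Fleet, a GitOps deployment is declared as a `GitRepo` resource that points at a repository and selects target clusters by label. The sketch below generates one. The repository URL, branch, paths, and cluster label are hypothetical, and `fleet-default` is assumed as the namespace for downstream clusters.

```python
import json


def fleet_gitrepo(name, repo_url, branch, paths, cluster_labels,
                  namespace="fleet-default"):
    """Build a Fleet GitRepo resource targeting clusters that carry the given labels."""
    return {
        "apiVersion": "fleet.cattle.io/v1alpha1",
        "kind": "GitRepo",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "repo": repo_url,
            "branch": branch,
            "paths": paths,
            "targets": [
                {"clusterSelector": {"matchLabels": cluster_labels}},
            ],
        },
    }


# Hypothetical repository and labels.
gitrepo = fleet_gitrepo(
    name="platform-apps",
    repo_url="https://git.example.com/platform/apps.git",
    branch="main",
    paths=["monitoring", "ingress"],
    cluster_labels={"env": "production"},
)
print(json.dumps(gitrepo, indent=2))
```

Labeling EKS clusters in Rancher (for example by environment) lets one `GitRepo` roll the same configuration out to every matching cluster.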
- Support:
- Leverage SUSE’s support resources.