How to migrate etcd data directory to a dedicated filesystem?
This document (000020050) is provided subject to the disclaimer at the end of this document.
Situation
Task
When running large Rancher installations or large clusters, it may be necessary to reduce IO contention on the disks used by etcd. By default, etcd data is stored in /var/lib/etcd, which usually resides on the root file system. Moving the etcd data directory to a dedicated file system keeps etcd from sharing disk IOPS with other system components and can improve performance.
Pre-requisites
- RKE cluster
- Root access to all etcd nodes.
- A new file system with at least 2GB free, but we recommend 8GB or higher. Please work with your systems team to create and mount the file system.
- Etcd backups should be configured and verified.
- Schedule at least an hour of downtime during your change management maintenance window.
- It is highly recommended to pause/halt any new deployments and CI/CD jobs during this change window.
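The free-space requirement above can be checked from the shell before the change window. The following is a minimal sketch, not part of the official procedure; the function name has_free_gb is hypothetical, and it assumes a POSIX-compatible `df` and `awk`.

```shell
# Hypothetical helper: succeed when the file system that holds path $1
# has at least $2 GB available.
has_free_gb() {
    path=$1
    min_gb=$2
    # `df -Pk` prints one header line, then one line per file system.
    # Column 4 is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example (run on an etcd node once the new file system is mounted):
#   has_free_gb /var/lib/etcd 8 && echo "enough space" || echo "less than 8 GB free"
```

The example uses the recommended 8GB; pass 2 to test against the stated minimum instead.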
Resolution
Before making any changes, please take an etcd snapshot.
For new clusters
For a new cluster, please see our installation documentation
NOTE: Please make sure you have a file system mounted to "/var/lib/etcd/" before creating the cluster.
For existing clusters
Option A - In-place migration
- SSH into the first etcd node and become root.
- Stop the etcd container and prevent it from restarting automatically
docker update --restart=no etcd && docker stop etcd
- Verify etcd is stopped and there are no open files. The following command should return no output.
lsof | grep '/var/lib/etcd/'
- Move etcd data to a temporary location
mv /var/lib/etcd /var/lib/etcd_tmp
- Create a new file system and mount it to "/var/lib/etcd". Please work with your systems team for this step.
- Verify the new file system is mounted
df -H /var/lib/etcd
- Copy etcd data from the temporary location to the new file system
rsync -av --progress /var/lib/etcd_tmp/ /var/lib/etcd/
- Restart etcd
docker update --restart=always etcd && docker start etcd
- Verify etcd health
docker exec -it etcd etcdctl member list
- Repeat the process until all etcd nodes have been updated.
- Once all nodes have been updated, please clean up the temporary data on each node.
rm -rf /var/lib/etcd_tmp/
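Because the temporary copy is deleted at the end, it can be worth confirming that the rsync step copied everything. The following is a minimal sketch, assuming `diff -r` is available on the node; the function name same_tree is hypothetical. A check like this only makes sense right after the rsync step and before etcd is restarted, because etcd changes its data as soon as it runs again.

```shell
# Hypothetical helper: succeed only when the two directory trees contain
# the same files with identical content.
same_tree() {
    # `diff -r` exits 0 when the trees match and non-zero when they
    # differ or when either path cannot be read.
    diff -r "$1" "$2" >/dev/null 2>&1
}

# Example (run as root after rsync, before "docker start etcd"):
#   same_tree /var/lib/etcd_tmp /var/lib/etcd && echo "copy verified"
```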
Option B - Rolling replacement
- Create a new node with the dedicated file system mounted at "/var/lib/etcd/".
- Join the new node to the existing cluster.
- Wait for the cluster update to finish.
- Verify etcd health
docker exec -it etcd etcdctl member list
- Remove the old node from the cluster as described in the documentation.
- Repeat the process until all etcd nodes have been replaced.
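When repeating the health check on several nodes, the member list output can be checked in a script rather than by eye. The following is a minimal sketch, assuming the comma-separated format printed by the etcdctl v3 API (ID, status, name, peer URLs, client URLs); the function name all_members_started is hypothetical, and the older v2 output format is not handled.

```shell
# Hypothetical helper: read `etcdctl member list` output on stdin and
# succeed only when at least one member is listed and every member
# reports the status "started".
all_members_started() {
    awk -F', ' '
        NF >= 2 { total++; if ($2 != "started") bad++ }
        END { if (total > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Example (run on an etcd node; -it is omitted because the output is piped):
#   docker exec etcd etcdctl member list | all_members_started && echo "all members started"
```

This only confirms that each member has joined and started; it does not replace a full health check of the cluster.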
Further reading
For additional disk tuning, please see the etcd documentation.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020050
- Creation Date: 06-May-2021
- Modified Date: 06-May-2021
- SUSE Rancher
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com