Supportability and Challenges for Live Migration of VMs in a SUSE HA Cluster Environment

This document (000020918) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension

Situation

Live migration of active cluster nodes presents a considerable challenge within an HA Pacemaker/Corosync stack. Unexpected cluster behavior, such as dropped Corosync/Pacemaker communication and node fencing, may occur during or after the migration.

Resolution

The potential challenges linked to VM live migration, including temporary network connectivity drops and disturbances in real-time processing, can lead to unexpected cluster behavior within the HA stack. Therefore, SUSE doesn't provide support for incidents or issues that may arise from the live migration of nodes.

Note that this doesn't suggest that live migration or similar technologies are ineffective. Rather, it means that SUSE cannot guarantee the absence of cluster issues during such operations.

While SUSE doesn't provide support for cluster issues during the live migration of VMs, if you still find it necessary to proceed at your own risk, we recommend following these steps to ensure a smoother process:

  1. Place the cluster in maintenance mode (# crm maintenance on)
  2. Stop cluster services on the node to be moved (# crm cluster stop)
  3. Migrate the VM
  4. Start cluster services again on the node that was moved (# crm cluster start)
  5. Exit maintenance mode (# crm maintenance off)
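
The steps above can be sketched as a small shell helper. This is only an illustration, not a SUSE-provided tool: the function names (run, pre_migration, post_migration) and the DRY_RUN variable are hypothetical, and the sketch assumes both functions are run as root on the node being moved. Step 3, the migration itself, is carried out on the hypervisor side with whatever tooling is in use there, so it is deliberately left out.

```shell
#!/bin/sh
# Sketch only: wraps the crm commands from the steps above.
# DRY_RUN (hypothetical variable) defaults to 1, so commands are
# printed instead of executed until it is explicitly set to 0.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Steps 1-2: run on the cluster node (VM) that is about to be migrated.
pre_migration() {
    run crm maintenance on &&
    run crm cluster stop
}

# Steps 4-5: run on the same node after the hypervisor-side
# migration (step 3) has completed.
post_migration() {
    run crm cluster start &&
    run crm maintenance off
}
```

With the default DRY_RUN=1, calling pre_migration or post_migration only prints the commands that would be executed. Before leaving maintenance mode, it may be worth confirming with crm status that the node has rejoined the cluster.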

Cause

Live migration can increase network latency and cause temporary drops in connectivity, which affects communication and synchronization among cluster nodes. This can lead to split-brain scenarios and node fencing. Additionally, pausing a VM during migration can disrupt real-time processing tasks, resulting in missed deadlines or delays.

Additional Information

The challenges and recommendations in this document do not apply to live migration processes performed transparently by Cloud Service Providers (CSPs) such as Microsoft Azure, Google Cloud Platform (GCP), Amazon Web Services (AWS), and Alibaba Cloud. In these environments, live migration is an integral part of automatic operations (node healing, cluster re-balancing, and rolling out updates), and customers cannot prevent it or opt out of it. Since these migrations are designed to be transparent to workloads, advising customers to place the cluster in maintenance mode is not practical in these contexts.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020918
  • Creation Date: 13-Jan-2023
  • Modified Date: 16-Sep-2024
    • SUSE Linux Enterprise High Availability Extension
