XEN Virtual Machine Migration
Architecture: IBM BladeCenter E with LS41 double wide blades. Connected to an IBM DS3400 storage system via two Cisco Fibre Channel Switch Modules (multiple physical paths).
The Open Source XEN hypervisor includes advanced technology which allows you to relocate your virtual machines to different host servers on your network. This makes it easier for you to meet your employer's or clients' requirement for 99.999% availability and to move workloads easily for maintenance or performance reasons.
XEN gives you the ability to migrate a virtual machine from one physical server to another. There are a couple of different methods of doing this, each of which will be explained below. The basic requirement when migrating virtual machines is the ability to fully save the running state of the virtual machine. This is far more involved than simply copying a virtual machine (configuration and image files) to another machine. The system needs to create a point-in-time snapshot of its memory, device I/O states, network connections and the contents of the virtual CPU registers. XEN has the ability to save all of this information to disk and then restart the virtual machine on a different host server.
Option 1: Save & Restore Migration
Save and Restore migration does not migrate the virtual machine in real time. Instead it uses checkpoint files to save the current state of the machine and then uses these same files to restart the virtual machine on either the same host or a different host (you’ll obviously need shared storage for this). It's worth noting that when you save/suspend a virtual machine its resources are de-allocated and returned to domain0 (host server). These resources can then be used by running virtual machines on the host. Be aware that this means network connections to the virtual machine will be lost.
This process is achieved with the save and restore subcommands of the xm tool.
Using the xm save command, the existing state of the virtual machine is stored in a file, usually on shared storage so that it can be restored to another host server. You can, however, save and restore a virtual machine on the same host server. For example, you may not need the virtual machine for a while and therefore want to free up the resources it is using, but still keep it in exactly the same state as it is currently. The xm restore command restores the file to the host server and restarts the virtual machine.
Do the following to save and restore a virtual machine:
- On the existing host server type xm list. Make a note of the ID of the domain that you want to save. This is displayed in the second column.
- On the existing host server type xm save <domainID> <filename>
If you wish you can use the virtual machine name rather than the domainID. The filename is where the virtual machine state is going to be stored (known as the checkpoint file). This does not erase any previously saved checkpoints. In fact if you wish you can save a number of different running states of the virtual machine. You can include a path to the file. If using shared storage I always recommend creating a folder on the logical drive that holds your virtual machine files called something like suspended. You can then save these checkpoint files to this location.
Example: xm save 1 /vms/suspended/oes2
- Now run the xm list command again. The domain will still be listed but nothing should be shown in the ‘State’ column, indicating that it is not running. You can also check the directory where you placed the checkpoint file. For example type ls -lah /vms/suspended/.
- We are now ready to restore the file. Depending upon your environment you may be restoring the virtual machine to the same host or to a different host. It doesn’t make any difference except for deciding which server to execute the following commands on. To restore the virtual machine you simply need to enter xm restore <filename>.
Example: xm restore /vms/suspended/oes2
As mentioned earlier the checkpoint file created during the save and restore process is not automatically deleted and can therefore be used at a later date.
The save and restore functionality can be used for a number of purposes including testing, debugging, relocating virtual machines and quick crash recovery. For example if a virtual machine crashes I can restore it to a known working state very quickly.
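The save and restore steps above can be wrapped in a couple of small shell functions. This is only a sketch: the function names, the CHECKPOINT_DIR variable and the timestamped file naming are my own assumptions for illustration, and only xm save and xm restore come from the procedure above. Timestamping each checkpoint makes it easy to keep several saved states of the same virtual machine side by side.

```shell
# Sketch only: CHECKPOINT_DIR and the naming scheme are assumptions.
CHECKPOINT_DIR="${CHECKPOINT_DIR:-/vms/suspended}"

# Build a checkpoint path such as /vms/suspended/oes2-20080101-120000.
# A timestamp can be passed as $2; otherwise the current time is used.
checkpoint_path() {
    local domain="$1"
    local stamp="${2:-$(date +%Y%m%d-%H%M%S)}"
    printf '%s/%s-%s\n' "$CHECKPOINT_DIR" "$domain" "$stamp"
}

# Save a domain (by name or ID) and print where the checkpoint was written.
save_domain() {
    local domain="$1" file
    file="$(checkpoint_path "$domain")"
    xm save "$domain" "$file" && printf '%s\n' "$file"
}

# Restore a domain from a checkpoint file, on this host or on another
# host that can see the same shared storage.
restore_domain() {
    local file="$1"
    [ -f "$file" ] || { echo "no such checkpoint: $file" >&2; return 1; }
    xm restore "$file"
}
```

For example, save_domain oes2 on the existing host prints the checkpoint path, and restore_domain with that path on the target host brings the virtual machine back.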
Option 2: Regular and Live Migration
Using the xm migrate command you can perform a ‘warm’ migration of a virtual machine from one host to another. In this scenario the virtual machine is still temporarily stopped and then restarted, but not in the same way as Save & Restore. You can also use the ‘live’ option with the xm migrate command, which allows you to move a virtual machine from one host to another seamlessly, often without any network packet loss, so the end user is unaware of the change.
At the time of writing, live migration in SLES 10 SP1 is only possible with paravirtualised virtual machines; support for Windows virtual machines has been added in SP2 (I have not yet fully tested this). To make use of regular or live migration you need either a copy of the virtual machine image and configuration files on each host server (not very practical or easy to keep up to date) or a shared storage system to hold your virtual machine files. Having multiple copies of a virtual machine on your network can be very dangerous: if both are accidentally started at the same time, this can easily corrupt the data they both access. Before proceeding, test that you can start your virtual machines from your shared storage on each of the hosts that you want to configure migration between. This will involve using persistent block device naming in your configuration files.
With regular migration the following occurs:
- The XEN host pauses the execution of the virtual machine's processes.
- Memory and process information for the virtual machine is transferred from the source host to the destination.
- Execution of the virtual machine is resumed on the destination server.
A regular migration basically automates the manual checkpoint process explained above and because of this it doesn’t actually make use of XEN’s integrated migration tools. This process is not transparent to the end user as it still leads to a loss in network connections.
Live migration reduces the amount of time it takes to migrate a virtual machine to seconds (depending on your network infrastructure and the resources allocated to the virtual machine). This type of migration is suitable for applications where your business cannot tolerate any downtime, as the virtual machine is effectively migrated while in operation. This process is far more complicated for the underlying XEN hypervisor than a regular migration, because the state of the virtual machine is always changing (as it is running) while the migration is taking place. To achieve this, an iterative multipass algorithm is used to transfer the virtual machine's memory in successive steps.
The following occurs with live migration:
- First, the destination server is checked to ensure it has enough available resources to run the virtual machine.
- An initial copy of the virtual machine's memory is performed and passed to the destination server.
- With each successive iteration, after the initial copy, only memory that has changed in the interim is sent to the destination.
- When the number of changing pages in memory is low enough, or the number of pages remaining to be transferred is no longer decreasing with each pass, the virtual machine is paused and its final state is sent to the destination server.
- Control of the virtual machine is transferred to the new host server.
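To get a feel for how the iterative copy converges, here is a toy simulation of the passes described above. It is not XEN's actual implementation: the precopy_sim function, its parameters and the simple "pages dirtied per 100 pages sent" model are all assumptions made purely for illustration.

```shell
# precopy_sim TOTAL_PAGES DIRTY_PCT THRESHOLD [MAX_ROUNDS]
# DIRTY_PCT is a made-up model parameter: how many pages the guest
# dirties for every 100 pages sent during a pass.
precopy_sim() {
    local to_send="$1" dirty_pct="$2" threshold="$3" max_rounds="${4:-30}"
    local round=1 dirtied
    while :; do
        echo "round $round: sent $to_send pages"
        # Pages changed by the running guest while this pass was in flight.
        dirtied=$(( to_send * dirty_pct / 100 ))
        if [ "$dirtied" -le "$threshold" ]; then
            echo "final: paused guest, sent last $dirtied pages (few dirty pages left)"
            return 0
        fi
        if [ "$dirtied" -ge "$to_send" ] || [ "$round" -ge "$max_rounds" ]; then
            echo "final: paused guest, sent last $dirtied pages (not converging)"
            return 0
        fi
        to_send="$dirtied"
        round=$(( round + 1 ))
    done
}
```

Running precopy_sim 100000 10 50 shows each pass shrinking by a factor of ten until only a handful of pages are left for the final stop-and-copy, whereas a dirty rate of 100 gives up after the first pass because the remaining work is not decreasing.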
Before you can implement live migration in your environment check you have the following in place:
- Two XEN host servers correctly configured for migration (see below).
- A fast stable network connection between the host servers. Both servers must be on the same layer 2 network and IP subnet. This allows for the migration of network connections to the virtual machine.
- Shared storage accessible to both host servers. I recommend storing both the virtual machine's disk image and its configuration file on the shared storage.
- The same version of XEN on both hosts.
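The subnet requirement is easy to get wrong when hosts have several interfaces. As a rough sketch, the following helper checks whether two IPv4 addresses fall within the same subnet for a given prefix length. The function names are hypothetical, and the address 10.0.0.55 used in the usage note is an assumed source host, not one taken from my environment.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    echo "$1" | {
        IFS=. read -r a b c d
        echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
    }
}

# same_subnet IP1 IP2 PREFIX_LEN
# Exit status 0 if both addresses share the same network, non-zero otherwise.
same_subnet() {
    local ip1 ip2 mask
    ip1="$(ip_to_int "$1")"
    ip2="$(ip_to_int "$2")"
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( ip1 & mask )) -eq $(( ip2 & mask )) ]
}
```

For example, same_subnet 10.0.0.55 10.0.0.56 24 && echo "same subnet" confirms that both hosts sit in the same /24. This only checks the addressing; it says nothing about whether the hosts really share a layer 2 segment.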
We are now ready to perform the initial setup required for live migration to work. Please follow through the steps below.
Edit the xend configuration file
The file xend-config.sxp is the main configuration file for the XEN daemon. Using a text editor such as vi or gedit, open /etc/xen/xend-config.sxp.
Go through the file and ensure the relocation options (xend-relocation-server, xend-relocation-port, xend-relocation-address and xend-relocation-hosts-allow) are NOT commented out (i.e. remove the # sign at the beginning of the line). The relocation server must be set to yes, and the port used in this article is 8002.
The xend-relocation-address option allows you to specify which IP address the XEN daemon should listen on for migration requests. Leaving this blank lets the server listen on all addresses. The xend-relocation-hosts-allow option allows you to limit which hosts can contact the server for migration requests. Ideally you should limit access to the server, using this option, to provide better security.
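As a quick sanity check after editing, a small helper can confirm that the relocation options are active in the file. This is a sketch under assumptions: the function name is hypothetical, and the example values in the comments (relocation server enabled, port 8002, empty address and empty hosts-allow, i.e. no restrictions) are the open settings rather than the restricted ones I would recommend for production.

```shell
# Expected active (uncommented) lines, assuming port 8002 and no restrictions:
#   (xend-relocation-server yes)
#   (xend-relocation-port 8002)
#   (xend-relocation-address '')
#   (xend-relocation-hosts-allow '')
check_relocation_config() {
    local file="${1:-/etc/xen/xend-config.sxp}" opt missing=0
    for opt in 'xend-relocation-server yes' 'xend-relocation-port 8002' \
               'xend-relocation-address' 'xend-relocation-hosts-allow'; do
        # A line counts only if it starts with "(" and is not commented out.
        if ! grep -Eq "^[[:space:]]*\($opt([[:space:]]|\))" "$file"; then
            echo "missing or commented out: ($opt ...)" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Run check_relocation_config on both hosts. Remember that xend has to be restarted before changes to this file take effect.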
If your server is running a firewall ensure you open the port specified above (8002) on the specified IP addresses on your network card.
Migrate a virtual machine
In its most basic form the xm migrate command is very simple. You only need to specify the domain ID (or name) of the domain to be migrated and the destination server. On the existing host server use the xm list command to find the domain ID for the virtual machine you wish to migrate. Then use the xm migrate command to move the virtual machine. For example xm migrate 1 10.0.0.56.
You can then use xm list on the destination server to ensure the virtual machine was migrated successfully.
Without any additional parameters the xm migrate command will perform a regular migration. However, if we add the --live option a live migration is performed. Therefore issuing the command xm migrate --live 1 10.0.0.56 migrates the virtual machine with a domain ID of 1 (as above) but this time uses the live method. A good test of how well the live migration works is to ping the IP address of the virtual machine during the process and see how many packets (if any) are lost.
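The ping test can be scripted so that the packet loss figure is captured automatically. The following is a sketch only: the function names and the 0.2 second ping interval are my own assumptions, and it relies on the Linux ping printing its usual summary line when it receives an interrupt signal.

```shell
# Extract the packet-loss percentage from ping's summary output on stdin.
packet_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Ping the guest in the background while it is live-migrated, then report loss.
# Arguments: DOMAIN (ID or name), DEST_HOST, GUEST_IP.
migrate_with_ping() {
    local domain="$1" dest="$2" guest_ip="$3" log
    log="$(mktemp)"
    ping -i 0.2 "$guest_ip" > "$log" 2>&1 &
    local ping_pid=$!
    xm migrate --live "$domain" "$dest"
    local status=$?
    kill -INT "$ping_pid"      # interrupt ping so that it prints its summary
    wait "$ping_pid"
    echo "packet loss during migration: $(packet_loss < "$log")%"
    rm -f "$log"
    return "$status"
}
```

For example, migrate_with_ping 1 10.0.0.56 followed by the IP address of the virtual machine performs the live migration shown above and prints the percentage of pings that went unanswered while it ran.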