Using Velero to Back Up and Restore SUSE Virtualization VMs with External CSI Storage

SUSE Virtualization 1.5 introduces support for the provisioning of virtual machine root volumes and data volumes using external Container Storage Interface (CSI) drivers.

This article demonstrates how to use Velero 1.16.0 to perform backup and restore of virtual machines in SUSE Virtualization.

It goes through commands and manifests to:

  • Back up virtual machines in a namespace, their NFS CSI volumes, and associated namespace-scoped configuration
  • Export the backup artifacts to an AWS S3 bucket
  • Restore to a different namespace on the same cluster
  • Restore to a different cluster
  • Ensure data consistency during backup using filesystem freeze techniques (advanced topic)

Velero is a Kubernetes-native backup and restore tool that enables users to perform scheduled and on-demand backups of virtual machines to external object storage providers such as S3, Azure Blob, or GCS, aligning with enterprise backup and disaster recovery practices.

Note: The commands and manifests used in this article are tested with SUSE Virtualization 1.6.1.

The CSI NFS driver and Velero configuration and versions used are for demonstration purposes only. Adjust them according to your environment and requirements.

Important: The examples provided are intended for backing up and restoring Linux virtual machine workloads. They are not suitable for backing up guest clusters provisioned through the Harvester Rancher integration.

To back up and restore guest clusters such as RKE2, refer to the distribution's official documentation.

SUSE Virtualization Installation

Refer to the Harvester documentation for installation requirements and options.

The kubeconfig file of the SUSE Virtualization cluster can be retrieved following the instructions here.

Install and Configure Velero

Download the Velero CLI.

Set the following shell variables:

BUCKET_NAME=<your-s3-bucket-name>
BUCKET_REGION=<your-s3-bucket-region>
AWS_CREDENTIALS_FILE=<absolute-path-to-your-aws-credentials-file>
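
The credentials file follows the standard AWS shared-credentials format, for example:

[default]
aws_access_key_id=<your-access-key-id>
aws_secret_access_key=<your-secret-access-key>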

Install Velero on the SUSE Virtualization cluster:

velero install \
  --provider aws \
  --features=EnableCSI \
  --plugins "velero/velero-plugin-for-aws:v1.12.0,quay.io/kubevirt/kubevirt-velero-plugin:v0.7.1" \
  --bucket "${BUCKET_NAME}" \
  --secret-file "${AWS_CREDENTIALS_FILE}" \
  --backup-location-config region="${BUCKET_REGION}" \
  --snapshot-location-config region="${BUCKET_REGION}" \
  --use-node-agent

In this setup, Velero is configured to:

  • Run in the velero namespace
  • Enable CSI volume snapshot APIs
  • Enable the built-in node agent data movement controllers and pods
  • Use the velero-plugin-for-aws plugin to manage interactions with the S3 object store
  • Use the kubevirt-velero-plugin plugin to back up and restore KubeVirt resources

Confirm that Velero is installed and running:

kubectl -n velero get po
NAME                      READY   STATUS    RESTARTS         AGE
node-agent-875mr          1/1     Running   0                1d
velero-745645565f-5dqgr   1/1     Running   0                1d

Configure the velero CLI to output the backup and restore status of CSI objects:

velero client config set features=EnableCSI

Deploy the NFS CSI and Example Server

Follow the instructions in the NFS CSI documentation to set up the NFS CSI driver, its storage class, and an example NFS server.

The NFS CSI volume snapshotting capability must also be enabled following the instructions here.
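
If the driver is installed with Helm, snapshot support can typically be enabled at install time. The sketch below assumes the upstream csi-driver-nfs chart; the externalSnapshotter.enabled parameter may differ between chart versions:

helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm upgrade --install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
  --namespace kube-system \
  --set externalSnapshotter.enabled=true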

Confirm that the NFS CSI and example server are running:

kubectl get po -A -l 'app in (csi-nfs-node,csi-nfs-controller,nfs-server)'
NAMESPACE     NAME                                  READY   STATUS    RESTARTS    AGE
default       nfs-server-b767db8c8-9ltt4            1/1     Running   0           1d
kube-system   csi-nfs-controller-5bf646f7cc-6vfxn   5/5     Running   0           1d
kube-system   csi-nfs-node-9z6pt                    3/3     Running   0           1d

The default NFS CSI storage class is named nfs-csi:

kubectl get sc nfs-csi
NAME      PROVISIONER      RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-csi   nfs.csi.k8s.io   Delete          Immediate           true                   14d

Confirm that the default NFS CSI volume snapshot class csi-nfs-snapclass is also installed:

kubectl get volumesnapshotclass csi-nfs-snapclass
NAME                DRIVER           DELETIONPOLICY   AGE
csi-nfs-snapclass   nfs.csi.k8s.io   Delete           14d
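
If the snapshot class is missing, it can be created from a manifest equivalent to the following (the values mirror the output above):

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-nfs-snapclass
driver: nfs.csi.k8s.io
deletionPolicy: Delete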

Preparing the Virtual Machine and Image

Create a custom namespace named demo-src:

kubectl create ns demo-src

Follow the instructions in the Image Management documentation to upload the Ubuntu 24.04 raw image from https://cloud-images.ubuntu.com/minimal/releases/noble/ to SUSE Virtualization.

The storage class of the image must be set to nfs-csi, per the Third-Party Storage Support documentation.

Confirm that the virtual machine image is successfully uploaded to SUSE Virtualization.

Follow the instructions in the third-party storage documentation to create a virtual machine with NFS root and data volumes, using the image uploaded in the previous step.

For NFS CSI snapshots to work, the NFS data volume must have volumeMode set to Filesystem.
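
As an illustration, the relevant stanza of such a volume claim might look like the following; the name and size are placeholders:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-nfs-data        # placeholder name
  namespace: demo-src
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem   # required for NFS CSI snapshots
  storageClassName: nfs-csi
  resources:
    requests:
      storage: 10Gi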

Note (optional): For testing purposes, once the virtual machine is ready, access it via SSH and add some files to both the root and data volumes.

The data volume must be partitioned, formatted with a file system, and mounted before files can be written to it; a minimal sketch follows.
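
The steps inside the guest, assuming the data disk appears as /dev/vdb (verify with lsblk first):

# Inside the guest VM
lsblk                                    # identify the data disk
sudo parted --script /dev/vdb mklabel gpt mkpart primary ext4 0% 100%
sudo mkfs.ext4 /dev/vdb1                 # create a file system on the new partition
sudo mkdir -p /mnt/data
sudo mount /dev/vdb1 /mnt/data
echo "hello from demo-src" | sudo tee /mnt/data/marker.txt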

Back Up the Source Namespace

Use the velero CLI to create a backup of the demo-src namespace using Velero’s built-in data mover:

BACKUP_NAME=backup-demo-src-$(date "+%s")

velero backup create "${BACKUP_NAME}" \
  --include-namespaces demo-src \
  --snapshot-move-data

Info: For more information on Velero's data mover, see its documentation on the CSI snapshot data movement capability.

This creates a backup of the demo-src namespace containing resources like the virtual machine created earlier, its volumes, secrets and other associated configuration.

Depending on the size of the virtual machine and its volumes, the backup may take a while to complete.

The DataUpload custom resources provide insights into the backup progress:

kubectl -n velero get datauploads -l velero.io/backup-name="${BACKUP_NAME}"

Confirm that the backup completed successfully:

velero backup get "${BACKUP_NAME}"
NAME                         STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
backup-demo-src-1747954979   Completed   0        0          2025-05-22 16:04:46 -0700 PDT   29d       default            <none>

After the backup completes, Velero removes the CSI snapshots from the storage side to free up the snapshot data space.

Tip: The velero backup describe and velero backup logs commands can be used to assess details of the backup including resources included, skipped, and any warnings or errors encountered during the backup process.
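
For the scheduled backups mentioned in the introduction, a Velero schedule can wrap the same options in a cron expression; the schedule name and retention period below are illustrative:

velero schedule create demo-src-daily \
  --schedule "0 2 * * *" \
  --include-namespaces demo-src \
  --snapshot-move-data \
  --ttl 720h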

Restore to a Different Namespace

This section describes how to restore the backup from the demo-src namespace to a new namespace named demo-dst.

Save the following restore modifier to a local file named modifier-data-volumes.yaml:

version: v1
resourceModifierRules:
- conditions:
    groupResource: persistentvolumeclaims
    matches:
    - path: /metadata/annotations/harvesterhci.io~1volumeForVirtualMachine
      value: "\"true\""
  patches:
  - operation: remove
    path: /metadata/annotations/harvesterhci.io~1volumeForVirtualMachine

This restore modifier removes the harvesterhci.io/volumeForVirtualMachine annotation from the virtual machine data volumes to ensure that the restoration does not conflict with the CDI volume import populator.

Create the restore modifier:

kubectl -n velero create cm modifier-data-volumes --from-file=modifier-data-volumes.yaml

Assign the backup name to a shell variable:

BACKUP_NAME=backup-demo-src-1747954979

Start the restore operation:

velero restore create \
  --from-backup "${BACKUP_NAME}" \
  --namespace-mappings "demo-src:demo-dst" \
  --exclude-resources "virtualmachineimages.harvesterhci.io" \
  --resource-modifier-configmap "modifier-data-volumes" \
  --labels "velero.kubevirt.io/clear-mac-address=true,velero.kubevirt.io/generate-new-firmware-uuid=true"

During the restore:

  • The virtual machine MAC address and firmware UUID are reset to avoid potential conflicts with existing virtual machines.
  • The virtual machine image manifest is excluded because Velero restores the entire state of the virtual machine from the backup.
  • The modifier-data-volumes restore modifier is invoked to adjust the virtual machine data volume metadata, preventing conflicts with the CDI volume import populator.

While the restore operation is in progress, the DataDownload custom resources can be used to track its progress:

RESTORE_NAME=backup-demo-src-1747954979-20250522164015

kubectl -n velero get datadownload -l velero.io/restore-name="${RESTORE_NAME}"

Confirm that the restore completed successfully:

velero restore get
NAME                                        BACKUP                       STATUS      STARTED                         COMPLETED                       ERRORS   WARNINGS   CREATED                         SELECTOR
backup-demo-src-1747954979-20250522164015   backup-demo-src-1747954979   Completed   2025-05-22 16:40:15 -0700 PDT   2025-05-22 16:40:49 -0700 PDT   0        6          2025-05-22 16:40:15 -0700 PDT   <none>

Verify that the virtual machine and its configuration are restored to the new demo-dst namespace.

Note: Velero uses Kopia as its default data mover. This issue describes some of its limitations on advanced file system features such as setuid/gid, hard links, mount points, sockets, xattr, ACLs, etc.

Velero provides the --data-mover option to configure custom data movers for different use cases. For more information, see Velero's documentation.

Tip: The velero restore describe and velero restore logs commands provide more insights into the restore operation, including the resources restored, skipped, and any warnings or errors encountered during the restore process.

Restore to a Different Cluster

This section extends the above scenario to demonstrate the steps to restore the backup to a different SUSE Virtualization cluster.

On the target cluster, install Velero, and set up the NFS CSI and NFS server following the instructions from the Deploy the NFS CSI and Example Server section.

Once Velero is configured to use the same backup location as the source cluster, it automatically discovers the available backups:

velero backup get
NAME                         STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
backup-demo-src-1747954979   Completed   0        0          2025-05-22 16:04:46 -0700 PDT   29d       default            <none>

Follow the steps in the Restore to a Different Namespace section to restore the backup on the target cluster.

Omit the --namespace-mappings option so that the backup is restored to the original demo-src namespace on the target cluster.

Confirm that the virtual machine and its configuration are restored to the demo-src namespace on the target cluster.

Limitations

Enhancements related to the limitations described in this section are tracked at https://github.com/harvester/harvester/issues/8367.

  • By default, Velero only supports resource filtering by resource groups and labels. To back up or restore a single virtual machine instance, you must apply custom labels to the virtual machine and its related resources: the virtual machine instance, pod, data volumes, persistent volume claims, persistent volumes, and cloud-init secret. It is recommended to back up the entire namespace and filter resources during the restore process, so that the backup includes all dependency resources required by the virtual machine (see the sketch after this list).
  • The restoration of virtual machine images is not fully supported yet.
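
As a sketch of the label-based filtering described above, apply a custom label (the key below is hypothetical) to the virtual machine and each of its dependent resources before the backup, then restore with a selector:

# Hypothetical grouping label; repeat for the VMI, pod, data volumes,
# PVCs, PVs, and cloud-init secret belonging to the same virtual machine.
kubectl -n demo-src label vm vm-nfs backup.example.com/vm=vm-nfs

# Restore only the labeled resources from a whole-namespace backup:
velero restore create \
  --from-backup "${BACKUP_NAME}" \
  --selector "backup.example.com/vm=vm-nfs"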

Advanced Topic: Ensuring Data Consistency with Filesystem Freeze

In certain scenarios, you may require the VM filesystem to be quiesced during Velero backup creation to prevent data corruption, especially when the VM is experiencing heavy I/O operations. This section describes how to customize Velero Backup Hooks to implement filesystem freeze during Velero backup processing, ensuring data consistency in the backup content.

Background: KubeVirt virt-freezer

KubeVirt’s virt-freezer provides a mechanism to freeze and thaw guest filesystems. This capability can be leveraged to ensure filesystem consistency during VM backups. However, certain prerequisites must be met for filesystem freeze/thaw operations to function properly.

Prerequisites for Filesystem Freeze

  • QEMU Guest Agent must be enabled in the guest VM
    • Verify this by checking whether the VMI reports AgentConnected in its status (see the check after this list)
  • Guest VM must be properly configured for related libvirt commands
    • When virt-freezer is triggered, KubeVirt communicates with the QEMU Guest Agent via libvirt commands such as guest-fsfreeze-freeze
    • The guest agent translates these commands to OS-specific calls:
      • Linux systems: Uses fsfreeze syscalls
      • Windows systems: Uses VSS (Volume Shadow Copy Service) APIs
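
A quick way to perform the AgentConnected check mentioned above (prints True when the guest agent is connected):

kubectl -n <VM Namespace> get vmi <VM Name> \
  -o jsonpath='{.status.conditions[?(@.type=="AgentConnected")].status}'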

Common Configuration Challenges

Based on experience from the SUSE Virtualization project, some guest operating systems require additional configuration:

  • Linux distributions (e.g., RHEL, SLE Micro): May lack sufficient permissions for filesystem freeze operations by default, requiring custom policies
  • Windows guests: Require the VSS service to be enabled for filesystem freeze functionality

Important: Filesystem freeze/thaw functionality depends on guest VM configuration, which is outside SUSE Virtualization’s control. Users are responsible for ensuring compatibility before implementing Velero backup hooks with filesystem freeze.

Verifying Filesystem Freeze Compatibility

To confirm that your VM supports filesystem freeze operations:

  1. Access the virtual machine’s virt-launcher compute container:
    POD=$(kubectl get pods -n <VM Namespace> \
      -l vm.kubevirt.io/name=<VM Name> \
      -o jsonpath='{.items[0].metadata.name}')
    kubectl exec -it $POD -n <VM Namespace> -c compute -- bash
  2. Test filesystem freeze using the virt-freezer application available in the compute container:
    virt-freezer --freeze --namespace <VM namespace> --name <VM name>
  3. Critical: Always verify the freeze operation result and thaw the VM filesystems before performing any other operations (see the command below).
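
To thaw the filesystems after a manual freeze test, run the complementary command from the same compute container:

virt-freezer --unfreeze --namespace <VM namespace> --name <VM name>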

Implementing Filesystem Freeze Hooks for VM Backup

Velero supports pre and post backup hooks that can be integrated with KubeVirt’s virt-freezer to ensure filesystem consistency during VM backups.

Configuring VM Template Annotations

For all VMs requiring data consistency, add the following annotations to the VM template:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-nfs
  namespace: demo
spec:
  template:
    metadata:
      annotations:
        # These annotations will be applied to the virt-launcher pod
        pre.hook.backup.velero.io/command: '["/usr/bin/virt-freezer", "--freeze", "--namespace", "<VM Namespace>", "--name", "<VM Name>"]'
        pre.hook.backup.velero.io/container: compute
        pre.hook.backup.velero.io/on-error: Fail
        pre.hook.backup.velero.io/timeout: 30s
        
        post.hook.backup.velero.io/command: '["/usr/bin/virt-freezer", "--unfreeze", "--namespace", "<VM Namespace>", "--name", "<VM Name>"]'
        post.hook.backup.velero.io/container: compute
        post.hook.backup.velero.io/timeout: 30s
    spec:
      # ...rest of VM spec...

These annotations will be propagated to the related virt-launcher pod and instruct Velero to:

  • Freeze the VM filesystem before backup creation begins
  • Thaw the VM filesystem after backup completion

Important: Replace <VM Namespace> and <VM Name> with the actual namespace and name of your VM.
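
One way to add these annotations to an existing VM without editing the full manifest is a merge patch; the namespace and VM name below match the example manifest above. Note that template changes only reach the virt-launcher pod after the virtual machine is restarted:

kubectl -n demo patch vm vm-nfs --type merge -p '{
  "spec": {"template": {"metadata": {"annotations": {
    "pre.hook.backup.velero.io/command": "[\"/usr/bin/virt-freezer\", \"--freeze\", \"--namespace\", \"demo\", \"--name\", \"vm-nfs\"]",
    "pre.hook.backup.velero.io/container": "compute",
    "pre.hook.backup.velero.io/on-error": "Fail",
    "pre.hook.backup.velero.io/timeout": "30s",
    "post.hook.backup.velero.io/command": "[\"/usr/bin/virt-freezer\", \"--unfreeze\", \"--namespace\", \"demo\", \"--name\", \"vm-nfs\"]",
    "post.hook.backup.velero.io/container": "compute",
    "post.hook.backup.velero.io/timeout": "30s"
  }}}}}'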

Creating a Velero Backup with Filesystem Freeze

After applying the Velero pre/post hook annotations to the VM manifest, follow the backup procedures described earlier in this article.

Verifying Successful Hook Execution

If the guest VM is configured correctly, the Velero backup completes successfully, with HooksAttempted reflecting the number of hooks executed and HooksFailed remaining 0.

Check the backup status using:

velero backup describe <backup-name> --details

Example output showing successful hook execution:

Name:         demo
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/resource-timeout=10m0s
              velero.io/source-cluster-k8s-gitversion=v1.33.3+rke2r1
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=33

Phase:  Completed


Namespaces:
  Included:  demo
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Or label selector:  <none>

Storage Location:  default

Velero-Native Snapshot PVs:  auto
Snapshot Move Data:          true
Data Mover:                  velero

....

Backup Volumes:
  Velero-Native Snapshots: <none included>

  CSI Snapshots:
    demo/vm-nfs-disk-0-au2ej:
      Data Movement:
        Operation ID: du-be5417aa-498e-4b93-b59f-e6498f95a6df.d7f97dab-3bb1-41e189381
        Data Mover: velero
        Uploader Type: kopia
        Moved data Size (bytes): 5368709120
        Result: succeeded

  Pod Volume Backups: <none included>

HooksAttempted:  2
HooksFailed:     0

The output shows that the Velero pre and post backup hooks completed successfully. In this case, the hooks performed the guest VM filesystem freeze and thaw operations that ensure data consistency.

Troubleshooting Filesystem Freeze Issues

If you encounter issues with filesystem freeze operations:

  1. Verify QEMU Guest Agent status in the VMI
  2. Check guest OS configuration for filesystem freeze support
  3. Review Velero hook logs for specific error messages (see the command after this list)
  4. Test virt-freezer manually as described in the verification section
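
For step 3, hook execution results are recorded in the main Velero pod log; a simple filter such as the following can surface them:

kubectl -n velero logs deploy/velero | grep -i hook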

Conclusion

This guide has covered the complete workflow for backing up and restoring SUSE Virtualization virtual machines with external CSI storage using Velero:

Basic Operations:

  • Setting up Velero with external CSI drivers (NFS) for VM backup and restore
  • Creating namespace-scoped backups with CSI volume snapshots and data movement
  • Restoring VMs to different namespaces on the same cluster
  • Migrating VMs across different SUSE Virtualization clusters using S3-compatible storage

Advanced Data Consistency:

  • Implementing filesystem freeze/thaw hooks using KubeVirt’s virt-freezer
  • Ensuring point-in-time consistency for VMs with high I/O activity

By combining Velero’s robust backup capabilities with proper filesystem quiescing techniques, you can establish a comprehensive disaster recovery strategy for your SUSE Virtualization infrastructure. Whether you need basic VM backups or require strict data consistency guarantees, this approach provides the flexibility to meet various enterprise backup and recovery requirements.
