Cluster Resources for a Highly Available NFS Service

A highly available NFS service consists of the following cluster resources:

DRBD Master/Slave Resource

This resource replicates the data and is switched between the Primary and Secondary roles as deemed necessary by the cluster resource manager.

NFS Kernel Server Resource

With this resource, Pacemaker ensures that the NFS server daemons are always available.

LVM and File System Resources

The LVM Volume Group is made available on whichever node currently holds the DRBD resource in the Primary role. In addition, you need resources for one or more file systems residing on Logical Volumes in that Volume Group; the cluster manager mounts them wherever the Volume Group is active.

NFSv4 Virtual File System Root

A virtual NFS root export (only needed for NFSv4 clients).

Non-root NFS Exports

One or more NFS exports, typically corresponding to the file systems mounted from the LVM Logical Volumes.

Resource for Floating IP Address

A virtual, floating cluster IP address, allowing NFS clients to connect to the service no matter which physical node it is running on.

How to configure these resources (using the crm shell) is covered in detail in the following sections.

Example NFS Scenario

The following configuration examples assume that 10.9.9.180 is the virtual IP address to use for an NFS server that serves clients in the 10.9.9.0/24 subnet.

The service is to provide an NFSv4 virtual file system root at /srv/nfs, with exports served from /srv/nfs/sales and /srv/nfs/engineering.

Into these export directories, the cluster will mount ext3 file systems from Logical Volumes named sales and engineering, respectively. Both of these Logical Volumes will be part of a highly available Volume Group, named nfs, which is hosted on a DRBD device.
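
For reference, the DRBD resource named nfs that backs this Volume Group could be defined along the lines of the following sketch. All specifics shown here are assumptions for this example: the hostnames (alice, bob), node addresses, replication port, backing disk (/dev/sdb1), and DRBD device (/dev/drbd0) must be adapted to your environment.

# /etc/drbd.d/nfs.res -- sketch only; adjust names, devices, and addresses
resource nfs {
  device    /dev/drbd0;
  disk      /dev/sdb1;
  meta-disk internal;
  on alice {
    address 10.9.9.1:7790;
  }
  on bob {
    address 10.9.9.2:7790;
  }
}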

DRBD Master/Slave Resource

To configure this resource, issue the following commands from the crm shell:

crm(live)# configure
crm(live)configure# primitive p_drbd_nfs \
  ocf:linbit:drbd \
  params drbd_resource="nfs" \
  op monitor interval="15" role="Master" \
  op monitor interval="30" role="Slave"
crm(live)configure# ms ms_drbd_nfs p_drbd_nfs \
  meta master-max="1" master-node-max="1" clone-max="2" \
  clone-node-max="1" notify="true"
crm(live)configure# commit

This will create a Pacemaker Master/Slave resource corresponding to the DRBD resource nfs. Pacemaker should now activate your DRBD resource on both nodes, and promote it to the Master role on one of them.

Check this with the crm_mon command, or by looking at the contents of /proc/drbd.
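
If the Volume Group and Logical Volumes do not exist yet, they can be created now on the node where DRBD holds the Primary role. The following shell sketch assumes the DRBD resource is accessible as /dev/drbd0 and uses arbitrary Logical Volume sizes; adjust both to your setup.

# Run on the current DRBD Primary only (sketch; device and sizes are examples)
pvcreate /dev/drbd0
vgcreate nfs /dev/drbd0
lvcreate -n sales -L 10G nfs
lvcreate -n engineering -L 10G nfs
mkfs.ext3 /dev/nfs/sales
mkfs.ext3 /dev/nfs/engineering
# Create the mount points on every cluster node
mkdir -p /srv/nfs/sales /srv/nfs/engineering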

NFS Kernel Server Resource

In the crm shell, the resource for the NFS server daemons must be configured as a clone of an lsb resource type, as follows:

crm(live)configure# primitive p_lsb_nfsserver \
  lsb:nfsserver \
  op monitor interval="30s"
crm(live)configure# clone cl_lsb_nfsserver p_lsb_nfsserver
crm(live)configure# commit

NOTE: Resource Type Name and NFS Server init Script

The name of the lsb resource type must exactly match the file name of the NFS server init script installed under /etc/init.d. SUSE Linux Enterprise ships this init script as /etc/init.d/nfsserver (package nfs-kernel-server), hence the resource must be of type lsb:nfsserver.

After you have committed this configuration, Pacemaker should start the NFS Kernel server processes on both nodes.
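
As a quick check, the following commands can be run on each node; this is a sketch only, and the exact status output depends on your distribution.

# Verify on each node (sketch)
/etc/init.d/nfsserver status
crm_mon -1 | grep nfsserver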

LVM and File System Resources

  1. Configure LVM and file system type resources as follows (but do not commit this configuration yet):

    crm(live)configure# primitive p_lvm_nfs \
      ocf:heartbeat:LVM \
      params volgrpname="nfs" \
      op monitor interval="30s"
    crm(live)configure# primitive p_fs_engineering \
      ocf:heartbeat:Filesystem \
      params device=/dev/nfs/engineering \
        directory=/srv/nfs/engineering \
        fstype=ext3 \
      op monitor interval="10s"
    crm(live)configure# primitive p_fs_sales \
      ocf:heartbeat:Filesystem \
      params device=/dev/nfs/sales \
        directory=/srv/nfs/sales \
        fstype=ext3 \
      op monitor interval="10s"
  2. Combine these resources into a Pacemaker resource group:

    crm(live)configure# group g_nfs \
      p_lvm_nfs p_fs_engineering p_fs_sales
  3. Add the following constraints to make sure that the group is started on the same node where the DRBD Master/Slave resource is in the Master role:

    crm(live)configure# order o_drbd_before_nfs inf: \
      ms_drbd_nfs:promote g_nfs:start
    crm(live)configure# colocation c_nfs_on_drbd inf: \
      g_nfs ms_drbd_nfs:Master
  4. Commit this configuration:

    crm(live)configure# commit

After these changes have been committed, Pacemaker does the following:

  • It activates all Logical Volumes of the nfs LVM Volume Group on the same node where DRBD is in the Primary role. Confirm this with vgdisplay or lvs.

  • It mounts the two Logical Volumes to /srv/nfs/sales and /srv/nfs/engineering on the same node. Confirm this with mount (or by looking at /proc/mounts); a short verification sketch follows this list.
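
Both checks can be combined into a short verification on the DRBD Primary node; this is a sketch only, and command output is omitted.

# Run on the node where DRBD is in the Primary role (sketch)
vgdisplay nfs            # the Volume Group should be active here
lvs nfs                  # both Logical Volumes (sales, engineering) should be listed
mount | grep /srv/nfs    # both ext3 file systems should be mounted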

NFS Export Resources

Once your DRBD, LVM, and file system resources are working properly, continue with the resources managing your NFS exports. To create highly available NFS export resources, use the exportfs resource type.

NFSv4 Virtual File System Root

If clients exclusively use NFSv3 to connect to the server, you do not need this resource. In this case, continue with Non-root NFS Exports.

  1. To enable NFSv4 support, configure one—and only one—NFS export whose fsid option is either 0 (as used in the example below) or the string root. This is the root of the virtual NFSv4 file system.

    crm(live)configure# primitive p_exportfs_root \
      ocf:heartbeat:exportfs \
      params fsid=0 \
        directory="/srv/nfs" \
        options="rw,crossmnt" \
        clientspec="10.9.9.0/255.255.255.0" \
      op monitor interval="30s"
    crm(live)configure# clone cl_exportfs_root p_exportfs_root

    This resource does not hold any actual NFS-exported data, merely the empty directory (/srv/nfs) that the other NFS exports are mounted into. Since there is no shared data involved here, we can safely clone this resource.

  2. Since any data should be exported only on nodes where this clone has been properly started, add the following constraints to the configuration:

    crm(live)configure# order o_root_before_nfs inf: \
      cl_exportfs_root g_nfs:start
    crm(live)configure# colocation c_nfs_on_root inf: \
      g_nfs cl_exportfs_root
    crm(live)configure# commit

    After this, Pacemaker should start the NFSv4 virtual file system root on both nodes.

  3. Check the output of the exportfs -v command to verify this.

Non-root NFS Exports

All NFS exports that do not represent an NFSv4 virtual file system root must set the fsid option to either a unique positive integer (as used in the example), or a UUID string (32 hex digits with arbitrary punctuation).

  1. Create NFS exports with the following commands:

    crm(live)configure# primitive p_exportfs_sales \
      ocf:heartbeat:exportfs \
        params fsid=1 \
          directory="/srv/nfs/sales" \
          options="rw,mountpoint" \
          clientspec="10.9.9.0/255.255.255.0" \
          wait_for_leasetime_on_stop=true \
      op monitor interval="30s"
    crm(live)configure# primitive p_exportfs_engineering \
      ocf:heartbeat:exportfs \
        params fsid=2 \
          directory="/srv/nfs/engineering" \
          options="rw,mountpoint" \
          clientspec="10.9.9.0/255.255.255.0" \
          wait_for_leasetime_on_stop=true \
      op monitor interval="30s"
  2. After you have created these resources, add them to the existing g_nfs resource group:

    crm(live)configure# edit g_nfs
  3. Edit the group configuration so it looks like this:

    group g_nfs \
      p_lvm_nfs p_fs_engineering p_fs_sales \
      p_exportfs_engineering p_exportfs_sales
  4. Commit this configuration:

    crm(live)configure# commit

    Pacemaker will now export the NFSv4 virtual file system root and the two other NFS exports.

  5. Confirm that the NFS exports are set up properly:

    exportfs -v

Resource for Floating IP Address

To enable smooth and seamless failover, your NFS clients connect to the NFS service via a floating cluster IP address, rather than via any of the hosts' physical IP addresses.

  1. Add the following resource to the cluster configuration:

    crm(live)configure# primitive p_ip_nfs \
      ocf:heartbeat:IPaddr2 \
      params ip=10.9.9.180 \
        cidr_netmask=24 \
      op monitor interval="30s"
  2. Add the IP address to the resource group (like you did with the exportfs resources):

    crm(live)configure# edit g_nfs

    This is the final setup of the resource group:

    group g_nfs \
      p_lvm_nfs p_fs_engineering p_fs_sales \
      p_exportfs_engineering p_exportfs_sales \
      p_ip_nfs
  3. Complete the cluster configuration:

    crm(live)configure# commit

    At this point Pacemaker will set up the floating cluster IP address.

  4. Confirm that the cluster IP is running correctly:

    ip address show

    The cluster IP should be added as a secondary address to whatever interface is connected to the 10.9.9.0/24 subnet.

NOTE: Connection of Clients

There is no way to make your NFS exports bind to just this cluster IP address; the kernel NFS server always binds to the wildcard address (0.0.0.0 for IPv4). However, your clients must connect to the NFS exports through the floating IP address only; otherwise, they will suffer service interruptions on cluster failover.
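
For example, a client in the 10.9.9.0/24 subnet might mount the exports through the floating IP address as sketched below; the client-side mount point (/mnt/sales) is an arbitrary example. Note that NFSv4 clients address exports relative to the virtual file system root, while NFSv3 clients use the full server-side path.

# NFSv4: path is relative to the virtual root exported with fsid=0 (sketch)
mount -t nfs4 10.9.9.180:/sales /mnt/sales

# NFSv3: use the full server-side path (sketch)
mount -t nfs -o vers=3 10.9.9.180:/srv/nfs/sales /mnt/sales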