16.14 Best Practice

16.14.1 Scanning for New Devices without Rebooting

If your system has already been configured for multipathing and you later need to add more storage to the SAN, you can use the rescan-scsi-bus.sh script to scan for the new devices. By default, this script scans all HBAs with typical LUN ranges. The general syntax for the command looks like the following:

rescan-scsi-bus.sh [options] [host [host ...]]

For most storage subsystems, the script can be run successfully without options. However, some special cases might need to use one or more options. Run rescan-scsi-bus.sh --help for details.
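For example, based on the syntax above, you can limit the scan to specific hosts by passing their numbers as arguments. Assuming the new LUNs are reachable through SCSI hosts 0 and 1 (illustrative host numbers), enter

sudo rescan-scsi-bus.sh 0 1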

WARNING: EMC PowerPath Environments

In EMC PowerPath environments, do not use the rescan-scsi-bus.sh utility provided with the operating system or the HBA vendor scripts for scanning the SCSI buses. To avoid potential file system corruption, EMC requires that you follow the procedure provided in the vendor documentation for EMC PowerPath for Linux.

Use the following procedure to scan the devices and make them available to multipathing without rebooting the system.

  1. On the storage subsystem, use the vendor’s tools to allocate the device and update its access control settings to allow the Linux system access to the new storage. Refer to the vendor’s documentation for details.

  2. Scan all targets for a host to make its new device known to the middle layer of the Linux kernel’s SCSI subsystem. At a terminal console prompt, enter

    sudo rescan-scsi-bus.sh

    Depending on your setup, you might need to run rescan-scsi-bus.sh with optional parameters. Refer to rescan-scsi-bus.sh --help for details.

  3. Check for scanning progress in the systemd journal (see the chapter journalctl: Query the systemd Journal in the Administration Guide for details). At a terminal console prompt, enter

    sudo journalctl -r

    This command displays the last lines of the log. For example:

    sudo journalctl -r
    Feb 14 01:03 kernel: SCSI device sde: 81920000
    Feb 14 01:03 kernel: SCSI device sdf: 81920000
    Feb 14 01:03 multipathd: sde: path checker registered
    Feb 14 01:03 multipathd: sdf: path checker registered
    Feb 14 01:03 multipathd: mpath4: event checker started
    Feb 14 01:03 multipathd: mpath5: event checker started
    Feb 14 01:03 multipathd: mpath4: remaining active paths: 1
    Feb 14 01:03 multipathd: mpath5: remaining active paths: 1
    [...]
  4. Repeat the previous steps to add paths through other HBAs on the Linux system that are connected to the new device.

  5. Run the multipath command to recognize the devices for DM-MPIO configuration. At a terminal console prompt, enter

    sudo multipath

    You can now configure the new device for multipathing.

16.14.2 Scanning for New Partitioned Devices without Rebooting

Use the example in this section to detect a newly added multipathed LUN without rebooting.

WARNING: EMC PowerPath Environments

In EMC PowerPath environments, do not use the rescan-scsi-bus.sh utility provided with the operating system or the HBA vendor scripts for scanning the SCSI buses. To avoid potential file system corruption, EMC requires that you follow the procedure provided in the vendor documentation for EMC PowerPath for Linux.

  1. Open a terminal console.

  2. Scan all targets for a host to make its new device known to the middle layer of the Linux kernel’s SCSI subsystem. At a terminal console prompt, enter

    sudo rescan-scsi-bus.sh

    Depending on your setup, you might need to run rescan-scsi-bus.sh with optional parameters. Refer to rescan-scsi-bus.sh --help for details.

  3. Verify that the device is seen (for example, the link has a new time stamp) by entering

    ls -lrt /dev/dm-*

    You can also verify the devices in /dev/disk/by-id by entering

    ls -l /dev/disk/by-id/
  4. Verify the new device appears in the log by entering

    sudo journalctl -r
  5. Use a text editor to add a new alias definition for the device in the /etc/multipath.conf file, such as data_vol3.

    For example, if the UUID is 36006016088d014006e98a7a94a85db11, make the following changes:

    defaults {
         user_friendly_names   yes
      }
    multipaths {
         multipath {
              wwid    36006016088d014006e98a7a94a85db11
              alias  data_vol3
              }
      }
  6. Create a partition table for the device by entering

    sudo fdisk /dev/disk/by-id/dm-uuid-mpath-<UUID>

    Replace UUID with the device WWID, such as 36006016088d014006e98a7a94a85db11.
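    Alternatively, to script this step, you can create a single partition spanning the whole device non-interactively with parted instead of the interactive fdisk session. This is a sketch using the example WWID; adjust the label type and partition boundaries to your needs:

    sudo parted -s /dev/disk/by-id/dm-uuid-mpath-36006016088d014006e98a7a94a85db11 mklabel gpt mkpart primary 0% 100%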

  7. Trigger udev by entering

    echo 'add' | sudo tee /sys/block/dm_device/uevent

    Note that a shell redirection such as sudo echo 'add' > .../uevent does not work, because the redirection is performed by the unprivileged shell; piping through sudo tee writes the file with root privileges.

    For example, to generate the device-mapper devices for the partitions on dm-8, enter

    echo 'add' | sudo tee /sys/block/dm-8/uevent
  8. Create a file system on the device /dev/disk/by-id/dm-uuid-mpath-UUID_partN. Depending on your choice of file system, you can use one of the following commands for this purpose: mkfs.btrfs, mkfs.ext3, mkfs.ext4, or mkfs.xfs. Refer to the respective man pages for details. Replace UUID_partN with the actual UUID and partition number, such as 36006016088d014006e98a7a94a85db11_part1.
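    For example, to create an Ext4 file system on the first partition of the example device (Ext4 is an illustrative choice):

    sudo mkfs.ext4 /dev/disk/by-id/dm-uuid-mpath-36006016088d014006e98a7a94a85db11_part1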

  9. Create a label for the new partition by entering the following command:

    sudo tune2fs -L LABELNAME /dev/disk/by-id/dm-uuid-mpath-UUID_partN

    Replace UUID_partN with the actual UUID and partition number, such as 36006016088d014006e98a7a94a85db11_part1. Replace LABELNAME with a label of your choice. Note that tune2fs applies to Ext file systems; for XFS, use xfs_admin -L, and for Btrfs, use btrfs filesystem label.

  10. Reconfigure DM-MPIO to let it read the aliases by entering

    sudo multipathd -k'reconfigure'
  11. Verify that the device is recognized by multipathd by entering

    sudo multipath -ll
  12. Use a text editor to add a mount entry in the /etc/fstab file.

    At this point, the label you created in a previous step does not yet appear in the /dev/disk/by-label directory. Add a mount entry for the device path (/dev/dm-9 in this example), then change the entry before the next time you reboot to

    LABEL=LABELNAME
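    For example, assuming an Ext4 file system and a mount point of /mnt/data_vol3 (both illustrative choices), the initial entry could look like

    /dev/dm-9   /mnt/data_vol3   ext4   defaults   0 2

    and, once the label is available, the updated entry like

    LABEL=LABELNAME   /mnt/data_vol3   ext4   defaults   0 2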
  13. Create a directory to use as the mount point, then mount the device.
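    For example, using the illustrative mount point from the previous step:

    sudo mkdir -p /mnt/data_vol3
    sudo mount /mnt/data_vol3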

16.14.3 Viewing Multipath I/O Status

Querying the multipath I/O status outputs the current status of the multipath maps.

The multipath -l option displays the current path status as of the last time that the path checker was run. It does not run the path checker.

The multipath -ll option runs the path checker, updates the path information, then displays the current status information. This command always displays the latest information about the path status.

sudo multipath -ll
3600601607cf30e00184589a37a31d911
[size=127 GB][features="0"][hwhandler="1 emc"]

\_ round-robin 0 [active][first]
  \_ 1:0:1:2 sdav 66:240  [ready ][active]
  \_ 0:0:1:2 sdr  65:16   [ready ][active]

\_ round-robin 0 [enabled]
  \_ 1:0:0:2 sdag 66:0    [ready ][active]
  \_ 0:0:0:2 sdc  8:32    [ready ][active]

For each device, it shows the device’s ID, size, features, and hardware handlers.

Paths to the device are automatically grouped into priority groups on device discovery. Only one priority group is active at a time. For an active/active configuration, all paths are in the same group. For an active/passive configuration, the passive paths are placed in separate priority groups.

The following information is displayed for each group:

  • Scheduling policy used to balance I/O within the group, such as round-robin

  • Whether the group is active, disabled, or enabled

  • Whether the group is the first (highest priority) group

  • Paths contained within the group

The following information is displayed for each path:

  • The physical address as host:bus:target:lun, such as 1:0:1:2

  • Device node name, such as sda

  • Major:minor numbers

  • Status of the device

16.14.4 Managing I/O in Error Situations

You might need to configure multipathing to queue I/O if all paths fail concurrently by enabling queue_if_no_path. Otherwise, I/O fails immediately if all paths are gone. In certain scenarios where the driver, the HBA, or the fabric experiences spurious errors, DM-MPIO should be configured to queue all I/O in case those errors lead to a loss of all paths, and to never propagate errors upward.

When you use multipathed devices in a cluster, you might choose to disable queue_if_no_path. This automatically fails the path instead of queuing the I/O, and escalates the I/O error to cause a failover of the cluster resources.

Because enabling queue_if_no_path leads to I/O being queued indefinitely unless a path is reinstated, ensure that multipathd is running and works for your scenario. Otherwise, I/O might be stalled indefinitely on the affected multipathed device until reboot or until you manually return to failover instead of queuing.

To test the scenario:

  1. Open a terminal console.

  2. Activate queuing instead of failover for the device I/O by entering

    sudo dmsetup message device_ID 0 queue_if_no_path

    Replace the device_ID with the ID for your device. The 0 value represents the sector and is used when sector information is not needed.

    For example, enter:

    sudo dmsetup message 3600601607cf30e00184589a37a31d911 0 queue_if_no_path
  3. Return to failover for the device I/O by entering

    sudo dmsetup message device_ID 0 fail_if_no_path

    This command immediately causes all queued I/O to fail.

    Replace the device_ID with the ID for your device. For example, enter

    sudo dmsetup message 3600601607cf30e00184589a37a31d911 0 fail_if_no_path

To set up queuing I/O for scenarios where all paths fail:

  1. Open a terminal console.

  2. Open the /etc/multipath.conf file in a text editor.

  3. Uncomment the defaults section and its ending bracket, then add the default_features setting, as follows:

    defaults {
      default_features "1 queue_if_no_path"
    }
  4. After you modify the /etc/multipath.conf file, you must run dracut -f to re-create the initrd on your system, then reboot for the changes to take effect.
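    For example:

    sudo dracut -f
    sudo reboot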

  5. When you are ready to return to failover for the device I/O, enter

    sudo dmsetup message mapname 0 fail_if_no_path

    Replace the mapname with the mapped alias name or the device ID for the device. The 0 value represents the sector and is used when sector information is not needed.

    This command immediately causes all queued I/O to fail and propagates the error to the calling application.

16.14.5 Resolving Stalled I/O

If all paths fail concurrently and I/O is queued and stalled, do the following:

  1. Enter the following command at a terminal console prompt:

    sudo dmsetup message mapname 0 fail_if_no_path

    Replace mapname with the correct device ID or mapped alias name for the device. The 0 value represents the sector and is used when sector information is not needed.

    This command immediately causes all queued I/O to fail and propagates the error to the calling application.

  2. Reactivate queuing by entering the following command:

    sudo dmsetup message mapname 0 queue_if_no_path

16.14.6 Configuring Default Settings for IBM z Systems Devices

Testing of IBM z Systems devices with multipathing has shown that the dev_loss_tmo parameter should be set to 90 seconds, and the fast_io_fail_tmo parameter should be set to 5 seconds. If you are using z Systems devices, modify the /etc/multipath.conf file to specify the values as follows:

defaults {
       dev_loss_tmo 90
       fast_io_fail_tmo 5
}

The dev_loss_tmo parameter sets the number of seconds to wait before marking a multipath link as bad. When the path fails, any current I/O on that failed path fails. The default value varies according to the device driver being used. The valid range of values is 0 to 600 seconds. To use the driver’s internal timeouts, set the value to zero (0) or to any value greater than 600.

The fast_io_fail_tmo parameter sets the length of time to wait before failing I/O when a link problem is detected. I/O that reaches the driver fails. If I/O is in a blocked queue, the I/O does not fail until the dev_loss_tmo time elapses and the queue is unblocked.

If you modify the /etc/multipath.conf file, the changes are not applied until you update the multipath maps, or until the multipathd daemon is restarted (systemctl restart multipathd).
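For example, to apply the changes without restarting the daemon, you can tell multipathd to reconfigure itself, as shown earlier in this chapter:

sudo multipathd -k'reconfigure'

Alternatively, restart the daemon:

sudo systemctl restart multipathd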

16.14.7 Using Multipath with NetApp Devices

When using multipath for NetApp devices, we recommend the following settings in the /etc/multipath.conf file (a combined example follows the list):

  • Set the default values for the following parameters globally for NetApp devices:

    max_fds max
    queue_without_daemon no
  • Set the default values for the following parameters for NetApp devices in the hardware table:

    dev_loss_tmo infinity
    fast_io_fail_tmo 5
    features "3 queue_if_no_path pg_init_retries 50"

16.14.8 Using --noflush with Multipath Devices

The --noflush option should always be used when reloading device mapper tables on multipath devices.

For example, in scripts where you perform a table reload, you use the --noflush option on resume to ensure that any outstanding I/O is not flushed, because you need the multipath topology information.

load
resume --noflush

16.14.9 SAN Timeout Settings When the Root Device Is Multipathed

A system with root (/) on a multipath device might stall when all paths have failed and are removed from the system because a dev_loss_tmo timeout is received from the storage subsystem (such as Fibre Channel storage arrays).

If the system device is configured with multiple paths and the multipath no_path_retry setting is active, you should modify the storage subsystem’s dev_loss_tmo setting accordingly to ensure that no devices are removed during an all-paths-down scenario. We strongly recommend that you set the storage subsystem’s dev_loss_tmo value to be equal to or higher than the I/O queuing time that results from the multipath no_path_retry and polling_interval settings.

The recommended setting for the storage subsystem’s dev_loss_tmo is

<dev_loss_tmo> = <no_path_retry> * <polling_interval>

where the following definitions apply for the multipath values:

  • no_path_retry is the number of retries for multipath I/O until the path is considered to be lost, and queuing of I/O is stopped.

  • polling_interval is the time in seconds between path checks.
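For example, with no_path_retry set to 5 and polling_interval set to 5 seconds (illustrative values), multipath queues I/O for up to 5 * 5 = 25 seconds after the last path fails, so the storage subsystem’s dev_loss_tmo should be set to at least 25 seconds. The corresponding multipath settings would be:

defaults {
       no_path_retry      5
       polling_interval   5
}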

Each of these multipath values should be set from the /etc/multipath.conf configuration file. For information, see Section 16.6, Creating or Modifying the /etc/multipath.conf File.