Multipath Drive Failed with queue_if_no_path after All Paths Failed

This document (7022310) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 12 Service Pack 3 (SLES 12 SP3)
SUSE Linux Enterprise Server 12 Service Pack 2 (SLES 12 SP2)
SUSE Linux Enterprise Server 12 Service Pack 1 (SLES 12 SP1)
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11

Situation

Even though queue_if_no_path is configured, the multipath device fails when all paths are lost, resulting in filesystem corruption. The expectation with queue_if_no_path is that all I/O remains queued until the paths are back online.
node:~ # multipath -ll red
red (360014056e9de179685e4baf9250a6708) dm-5 LIO-ORG,IBLOCK
size=1.0G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 4:0:0:1 sde 8:64 active ready running
  `- 3:0:0:1 sdd 8:48 active ready running

The multipath configuration entry in /etc/multipath.conf is:
    multipath {
        wwid 360014056e9de179685e4baf9250a6708
        alias "red"
        features "1 queue_if_no_path"
        no_path_retry 5
        path_grouping_policy "multibus"
        path_selector "round-robin 0"
        rr_min_io 1000
        rr_min_io_rq 1
    }

An error like the following is sometimes logged in /var/log/messages:
Oct 27 16:41:50 node multipathd[2002]: red: config error, ignoring 'queue_if_no_path' because no_path_retry=5
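
To find out which maps are affected, the log can be searched for that message. The following is a minimal sketch, not a supported tool: the function name find_ignored_queue_if_no_path is hypothetical, and it assumes the message format shown above and a log file passed as an argument (the article uses /var/log/messages).

```shell
#!/bin/sh
# Hypothetical helper: list the multipath maps for which multipathd
# reported that queue_if_no_path was ignored, with the offending
# no_path_retry value. Assumes the log line format shown in this article.
find_ignored_queue_if_no_path() {
    logfile=$1
    # Keep only the relevant messages, then reduce each line to
    # "<map alias> no_path_retry=<value>" and drop duplicates.
    grep -F "config error, ignoring 'queue_if_no_path' because no_path_retry=" "$logfile" |
        sed -E "s/.*multipathd\[[0-9]+\]: ([^:]+): config error.*no_path_retry=(.*)/\1 no_path_retry=\2/" |
        sort -u
}
```

For example, find_ignored_queue_if_no_path /var/log/messages prints one line per affected map; no output means the message was not found in that file.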

Resolution

1. Change no_path_retry 5 to no_path_retry "queue" in /etc/multipath.conf.
2. Save /etc/multipath.conf, then reload the configuration by running: echo reconfigure | multipathd -k

NOTE: no_path_retry defaults to fail in SLE12 and is undefined in SLE11. Do not simply delete the no_path_retry line; explicitly set it to no_path_retry "queue".
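
The edit can be sketched as a small shell function. This is an illustration under assumptions, not an official procedure: the function name rewrite_no_path_retry and the /tmp path are hypothetical, each no_path_retry setting is assumed to sit on its own line, and every uncommented no_path_retry line in the file is rewritten, including any in the defaults or devices sections. The result goes to stdout so it can be reviewed before the real file is replaced.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a multipath.conf-style file in
# which every uncommented no_path_retry line is set to "queue".
# Nothing is modified in place; the caller decides what to do with
# the output.
rewrite_no_path_retry() {
    conf=$1
    # Keep the original indentation and keyword, replace only the value.
    sed -E 's/^([[:space:]]*no_path_retry[[:space:]]+).*$/\1"queue"/' "$conf"
}

# Possible workflow (not run automatically):
#   rewrite_no_path_retry /etc/multipath.conf > /tmp/multipath.conf.new
#   diff -u /etc/multipath.conf /tmp/multipath.conf.new
#   cp /tmp/multipath.conf.new /etc/multipath.conf
#   echo reconfigure | multipathd -k
```

Reviewing the diff first matters because the substitution is not limited to a single multipath section.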

Cause

The no_path_retry setting takes precedence over queue_if_no_path.
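
A configuration can be checked for this conflict before it is loaded. The sketch below is a simplified, hypothetical lint (check_queue_conflict is not part of multipath-tools): it treats the whole file as one scope instead of parsing sections, uses the last no_path_retry value it sees, and only mirrors the rule stated above.

```shell
#!/bin/sh
# Hypothetical lint: warn when a multipath.conf-style file requests
# queue_if_no_path in "features" but does not set no_path_retry to
# queue. Simplification: sections are not parsed; the file is treated
# as a single scope and the last no_path_retry value wins.
check_queue_conflict() {
    awk '
        $1 == "features" && /queue_if_no_path/ { wants_queue = 1 }
        $1 == "no_path_retry" {
            retry = $2
            gsub(/"/, "", retry)    # accept both queue and "queue"
            seen = 1
        }
        END {
            if (wants_queue && !seen) {
                print "unset: no_path_retry is not set explicitly"
                exit 2
            }
            if (wants_queue && retry != "queue") {
                print "conflict: queue_if_no_path will be ignored because no_path_retry=" retry
                exit 1
            }
            print "ok"
        }
    ' "$1"
}
```

Run against the configuration entry from the Situation section, this reports a conflict with no_path_retry=5; after the change described under Resolution it reports ok.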

Review the options in the multipath.conf(5) man page. Search for no_path_retry and queue_if_no_path.

queue_if_no_path (Superseded by no_path_retry) Queue I/O if no path is active. Identical to the no_path_retry with queue value. See KNOWN ISSUES.

KNOWN ISSUES
   The usage of queue_if_no_path option can lead to D state processes being hung and not killable in situations where all the paths to the LUN go offline. It is advisable to use the no_path_retry option instead.

   The use of queue_if_no_path or no_path_retry might lead to a deadlock if the dev_loss_tmo setting results in a device being removed while I/O is still queued. The multipath daemon will update the dev_loss_tmo setting accordingly to avoid this deadlock. Hence if both values are specified the order of precedence is no_path_retry, queue_if_no_path, dev_loss_tmo.

Additional Information

2017-11-10: Jason Record - Initial document

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7022310
  • Creation Date: 10-Nov-2017
  • Modified Date: 03-Mar-2020
  • SUSE Linux Enterprise Server

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback@suse.com
