Delayed outgoing packets causing NFS timeouts

This document (000019943) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15 SP2
SUSE Linux Enterprise Server 12 SP5
 

Situation

This problem can manifest itself in several ways; the one observed in the field was a Linux NFS client timing out, with the following message logged in the system log:
        nfs: server *HOSTNAME* not responding, still trying
and after several minutes
        nfs: server *HOSTNAME* OK
Because of an in-kernel retransmit timer (or another packet being queued), the stuck packet will eventually be sent out, after a delay.
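
To check whether a system has logged this pattern, the system log can be searched for both messages. The following is only a minimal sketch: the helper name nfs_stall_events is hypothetical, and the log source (for example /var/log/messages or the kernel journal) is an assumption that depends on the local logging setup.

```shell
# Hypothetical helper: print only the NFS stall and recovery messages
# from log text supplied on standard input.
nfs_stall_events() {
    grep -E 'nfs: server .+ (not responding, still trying|OK)$'
}

# Example usage (the log source is an assumption; adjust as needed):
#   nfs_stall_events < /var/log/messages
#   journalctl -k | nfs_stall_events
```

A "not responding" line followed some minutes later by an "OK" line for the same server matches the behavior described above, although other causes of NFS timeouts produce the same messages.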

In tcpdump packet capture analysis, this problem can be identified by spurious resend attempts of the same packet (with equal TSVal) a long time apart.
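
As a rough illustration of that analysis, the sketch below scans the text output of tcpdump for data segments that appear more than once with the same endpoints, sequence range and TSval, and prints the time between the first and the repeated transmission. The helper name find_tsval_resends and the capture file name nfs.pcap are hypothetical; the sketch assumes IPv4 TCP traffic printed with "tcpdump -tt -n" and the standard NFS port 2049.

```shell
# Hypothetical helper: read `tcpdump -tt -n` text on standard input and report
# segments seen again with identical endpoints, sequence range and TSval.
find_tsval_resends() {
    LC_ALL=C awk '
    match($0, /seq [0-9]+:[0-9]+/) {
        seq = substr($0, RSTART, RLENGTH)
        if (!match($0, /TS val [0-9]+/)) next
        tsval = substr($0, RSTART, RLENGTH)
        # With -tt, field 1 is the epoch timestamp; fields 3-5 are "src > dst:".
        key = $3 " " $4 " " $5 " " seq " " tsval
        if (key in first)
            printf "%s resent after %.3f s\n", key, $1 - first[key]
        else
            first[key] = $1
    }'
}

# Example usage (file name is an assumption):
#   tcpdump -i "$devname" -w nfs.pcap 'tcp port 2049'
#   tcpdump -tt -n -r nfs.pcap 'tcp port 2049' | find_tsval_resends
```

Entries with a gap of many seconds or minutes are candidates for the delayed packets described in this document; very short gaps are more likely to be unrelated capture duplicates.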

Resolution

This problem has been reported upstream [3], and a proper fix is still being worked on by the Linux kernel community.

SUSE has released a kernel maintenance update that mitigates this problem by disabling the lockless optimization on the pfifo_fast qdisc (which is the only qdisc currently making use of this optimization) [4].
The issue is solved in the following kernel versions:
  • SLES15 SP2: 5.3.18-24.61
  • SLES12 SP5: 4.12.14-122.66
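
To compare a running kernel against the versions listed above, a version-aware sort can be used. This is a minimal sketch under stated assumptions: the helper name kernel_at_least is hypothetical, the release string reported by "uname -r" is assumed to carry a flavor suffix such as "-default", and "sort -V" from GNU coreutils is assumed to be available.

```shell
# Hypothetical helper: succeed if a kernel release is at least the fixed version.
#   $1 = fixed version from the list above, e.g. 5.3.18-24.61
#   $2 = release to check (defaults to the running kernel from `uname -r`)
kernel_at_least() {
    fixed=$1
    current=${2:-$(uname -r)}
    # Assumption: the release may end in a flavor suffix such as "-default".
    current=${current%-[a-z]*}
    # sort -V orders version strings; the fixed version must sort first (or be equal).
    lowest=$(printf '%s\n%s\n' "$fixed" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$fixed" ]
}

# Example usage on SLES15 SP2:
#   kernel_at_least 5.3.18-24.61 && echo "running kernel is at or above the fixed version"
```

The check only compares version strings; it does not confirm that the update was installed from the SUSE maintenance channel.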
Alternatively, the problem can be mitigated without a kernel update or reboot by switching away from the pfifo_fast qdisc. The following commands make the switch and ensure that it persists across reboots:
  echo 'net.core.default_qdisc = fq_codel' >>/etc/sysctl.conf
  sysctl -w net.core.default_qdisc=fq_codel
  tc qdisc add dev $devname root handle 1: mq
  tc qdisc del dev $devname root
If $devname is not a multiqueue-capable device, use the following commands instead:
  echo 'net.core.default_qdisc = fq_codel' >>/etc/sysctl.conf
  sysctl -w net.core.default_qdisc=fq_codel
  tc qdisc add dev $devname root handle 1: fq_codel
  tc qdisc del dev $devname root
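
The sketch below may help with choosing between the two command sets and with checking the result afterwards. Both helper names are hypothetical. Counting the tx-* directories in sysfs is only a heuristic for multiqueue capability, and the second helper assumes the usual "qdisc NAME HANDLE ..." line format printed by "tc qdisc show".

```shell
# Hypothetical helper: print the number of transmit queues sysfs exposes for a device.
#   $1 = device name, $2 = sysfs root (optional; defaults to /sys)
tx_queue_count() {
    ls -d "${2:-/sys}/class/net/$1/queues"/tx-* 2>/dev/null | wc -l
}

# Hypothetical helper: succeed if `tc qdisc show` output on standard input
# still lists a pfifo_fast qdisc.
uses_pfifo_fast() {
    grep -q '^qdisc pfifo_fast '
}

# Example usage:
#   [ "$(tx_queue_count "$devname")" -gt 1 ] && echo "$devname looks multiqueue"
#   tc qdisc show dev "$devname" | uses_pfifo_fast && echo "pfifo_fast still in use"
#   sysctl net.core.default_qdisc
```

After the commands above have been applied, the output of "tc qdisc show" for the device would be expected to list fq_codel rather than pfifo_fast.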

Cause

The Linux kernel implements various algorithms, called queuing disciplines (qdiscs), for scheduling outgoing network packets. Starting with Linux kernel 4.16, there is an optimization that allows these algorithms to process packets without acquiring any locks, with the ultimate goal of improving throughput. These changes were implemented upstream in commits [1] [2], and those commits have been backported by SUSE to the SLE12-SP5 and SLE15-SP2 codestreams.

However, the lockless optimization has a design flaw which (under certain very specific circumstances) opens a window for a race condition that causes the "last" packet in the queue to be stuck (and not sent out to the wire) for a potentially unbounded amount of time, causing network stalls.

Additional Information

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019943
  • Creation Date: 20-Apr-2021
  • Modified Date: 20-Apr-2021
    • SUSE Linux Enterprise Server

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com
