Behavior of automount daemon changed in respect to the "retry=n" NFS mount parameter
This document (000020409) is provided subject to the disclaimer at the end of this document.
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11
SLES 12 and 15 contain autofs packages based on the upstream Linux community version 5.0.9 (and newer).
The mount definitions and maps used by autofs can be set up in a variety of ways, and the behaviors described in this TID may not hold true in every possible configuration.
In this particular scenario, the automount daemon is controlling an NFS mount whose "retry=n" parameter is in use with a positive value. That is usually true, because even when "retry=n" is not explicitly set, it still defaults to a value of 2 (minutes). The automount is set up as a single, direct mount to an NFS Server.
With autofs v5.0.6 (in SLES 11 SP4), if a target NFS Server is not reachable on the network, the NFS mount option "retry=n" will cause the mount attempt to retry for the specified number of minutes before failing. As such, a process which attempts to access the mount point might stall while failures and retries occur. If the mount is eventually successful, the process can continue. If the NFS Server continues to be unreachable, an error will be returned after the retry time expires. In summary, the "old behavior" is to respect the "retry=n" value, even though it forces processes to wait.
With autofs v5.0.9 and newer (in SLES 12 and higher), when the target NFS Server is not reachable, autofs will almost immediately return an error. In other words, the "retry=n" value may not be used. Of course, if processes continue to make new attempts to access the mount path, new attempts to mount the NFS file system will be triggered. Therefore, new tries can still happen on demand even if retries are not automatically built in.
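The on-demand retries described above could be driven from a script by simply accessing the path again until it responds. The following is a minimal sketch, not part of autofs: the function name wait_for_path and its arguments are hypothetical, and it assumes that listing the directory is enough to trigger a mount attempt on the configuration in use.

```shell
#!/bin/sh
# Hypothetical helper (not part of autofs): poll a path until it becomes
# accessible or a deadline passes. Each access of an autofs-managed path
# can trigger a fresh mount attempt.
# Usage: wait_for_path PATH MAX_SECONDS INTERVAL_SECONDS
wait_for_path() {
    path=$1
    max_seconds=$2
    interval=$3
    elapsed=0
    while [ "$elapsed" -lt "$max_seconds" ]; do
        # Listing the directory is the access that triggers the automount.
        if ls "$path" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$interval"
        # Approximate: counts only the sleep time, not the time spent in ls.
        elapsed=$((elapsed + interval))
    done
    return 1
}
```

For example, `wait_for_path /some/automount/path 120 5` would keep trying for roughly two minutes, which approximates the default "retry=2" window from the caller's side.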
While this new behavior has some advantages, some administrators may prefer the old behavior (respecting "retry=n").
NOTE: This document pertains only to automounts accomplished with the autofs package. Systemd automounts are handled separately and with different code.
DEFAULT_MOUNT_WAIT=120 or some other positive value. This sets the number of seconds to wait for the mount command to return. After this number of seconds, the automount daemon will abort the attempt, even if the mount command has not yet finished its retries.
Extra caution is needed here, because the "retry" value is measured in minutes, while the "DEFAULT_MOUNT_WAIT" value is measured in seconds. BOTH parameters must be set to appropriate values in order to result in a period of retries. When one is set lower than the other, the lesser of the two time periods will take effect.
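Because the two settings use different units, it can help to convert both to seconds and compare them. The sketch below does only that arithmetic; the function name effective_retry_seconds is hypothetical, and it assumes both values are positive (a DEFAULT_MOUNT_WAIT of 0 has a special meaning and is not modeled here).

```shell
#!/bin/sh
# Hypothetical helper: given retry (minutes) and DEFAULT_MOUNT_WAIT (seconds),
# print the shorter of the two periods in seconds.
# Usage: effective_retry_seconds RETRY_MINUTES MOUNT_WAIT_SECONDS
effective_retry_seconds() {
    retry_seconds=$(( $1 * 60 ))   # retry=n is measured in minutes
    mount_wait_seconds=$2          # DEFAULT_MOUNT_WAIT is measured in seconds
    if [ "$retry_seconds" -lt "$mount_wait_seconds" ]; then
        echo "$retry_seconds"
    else
        echo "$mount_wait_seconds"
    fi
}

effective_retry_seconds 2 120    # both periods are 120 seconds
effective_retry_seconds 10 120   # the automount wait timer cuts retries off at 120 seconds
effective_retry_seconds 1 600    # mount gives up after 60 seconds, well before the wait timer
```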
This resolution should work on all SLES 12 and SLES 15 installations. There are other resolutions possible in certain situations. See the "Additional Information" section below for more details.
One example of execution being irrelevant came about in autofs 5.0.7, when a "probe" feature of autofs was modified, and is now used in more situations than it was previously. The probe will test to see whether a target NFS Server is reachable on the network. If the probe fails (i.e. if the NFS Server is down), "mount" is not executed, and therefore no mount options (such as "retry=n") can take effect. Automount may return with failure almost immediately, based on the probe results.
The DEFAULT_MOUNT_WAIT option in autofs serves two purposes: It allows the probe logic to be bypassed, so "mount" gets executed. It also tells the automount daemon how long to wait for the mount command to return results. Even then, it is not the "DEFAULT_MOUNT_WAIT" feature that is actually doing any retries. It is merely waiting on the return of the mount command, which could be doing retries.
2. If DEFAULT_MOUNT_WAIT is set to "0" (zero), automount will not use a wait timer of its own, but will still bypass the probe logic and allow the mount command to be executed. However, the exact behavior of "0" has changed over time, which is why "0" is not presented above as a universal resolution.
In autofs 5.0.9, DEFAULT_MOUNT_WAIT=0 means: The automount daemon will execute mount and wait for it to return. If the mount command returns with failure after retry=n minutes, automount will again execute mount and wait, up to 2 more times, for a total of 3 attempts and 3 retry periods. While this sequence does make use of the retry=n value, those retries continue for 3 times longer than originally implemented. It does not match the old behavior.
In autofs 5.1.3, DEFAULT_MOUNT_WAIT=0 means: The automount daemon will execute mount and wait for it to return. If the mount command returns with failure after retry=n minutes, automount will stop its efforts and return an error. Therefore, when using autofs 5.1.3, DEFAULT_MOUNT_WAIT=0 is another available resolution to accurately restore the old behavior.
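The difference between the two DEFAULT_MOUNT_WAIT=0 behaviors comes down to how many times mount is executed. The sketch below only restates that arithmetic for the two versions named above; the function name worst_case_minutes is hypothetical, and versions between 5.0.9 and 5.1.3 are deliberately not modeled because their behavior is not described here.

```shell
#!/bin/sh
# Hypothetical helper: worst-case wait (in minutes) with DEFAULT_MOUNT_WAIT=0
# while the NFS Server stays unreachable.
# Usage: worst_case_minutes AUTOFS_VERSION RETRY_MINUTES
worst_case_minutes() {
    case $1 in
        5.0.9) attempts=3 ;;   # mount is executed up to 3 times
        5.1.3) attempts=1 ;;   # mount is executed once
        *) echo "behavior not described for version: $1" >&2; return 2 ;;
    esac
    echo $(( attempts * $2 ))
}

worst_case_minutes 5.0.9 2   # 3 attempts x 2 minutes = 6
worst_case_minutes 5.1.3 2   # 1 attempt  x 2 minutes = 2
```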
3. Starting in autofs 5.1.3 (present in SLES 12 SP5 and also in SLES 15), it is also possible to set the autofs wait timer in /etc/autofs.conf. In fact, this is now the preferred location. However, in that file the syntax is:
mount_wait = 120
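To confirm which value is set in a given configuration file, the "mount_wait = N" syntax shown above can be extracted with a short script. This is a sketch only: the function name get_mount_wait is hypothetical, it reads just the file passed to it, and it makes no claim about which file the daemon gives precedence to when the setting appears in more than one place.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented
# "mount_wait = N" line in the given file (prints nothing if none exists).
# Usage: get_mount_wait FILE
get_mount_wait() {
    sed -n 's/^[[:space:]]*mount_wait[[:space:]]*=[[:space:]]*\([0-9-][0-9]*\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}
```

For example, `get_mount_wait /etc/autofs.conf` prints 120 when the file contains the line shown above, and prints nothing when the setting is absent or commented out.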
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020409
- Creation Date: 08-Oct-2021
- Modified Date: 08-Oct-2021
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com