SSSD Child was terminated by own WATCHDOG

This document (000021345) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15 all Service Packs
SUSE Linux Enterprise Server for SAP Applications all Service Packs
 

Situation

A child process of sssd.service times out and is terminated by its own watchdog. SSSD then fails to restart the child and exits, which disrupts the service:
● sssd.service - System Security Services Daemon
Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Sun 2024-01-07 15:49:49 UTC; 1 day 15h ago
Process: 2288 ExecStart=/usr/sbin/sssd -i ${DEBUG_LOGGER} (code=exited, status=1/FAILURE)
Main PID: 2288 (code=exited, status=1/FAILURE)

Jan 07 15:49:37 Hostname sssd[2288]: Child [2362] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
Jan 07 15:49:37 Hostname nss[48209]: Starting up
Jan 07 15:49:39 Hostname nss[48459]: Starting up
Jan 07 15:49:43 Hostname nss[49097]: Starting up
Jan 07 15:49:43 Hostname sssd[2288]: Exiting the SSSD. Could not restart critical service [nss]

Resolution

Identify which SSSD process is being terminated by the watchdog, then raise the timeout option in the matching section of /etc/sssd/sssd.conf ([domain/...], [nss], or [pam]).
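
To see which child the watchdog terminated, the journal messages shown in the Situation section can be filtered. The following is a minimal sketch that assumes the message format matches the sample log above. The function name watchdog_children is hypothetical and not part of SSSD.

```shell
# Hypothetical helper (not part of SSSD): reads journal lines on stdin and
# prints how often each SSSD child was terminated by its own watchdog.
# Assumes the message format shown in the sample log above.
watchdog_children() {
    sed -n "s/.*Child \[[0-9][0-9]*\] ('\([^']*\)':'[^']*') was terminated by own WATCHDOG.*/\1/p" |
        sort | uniq -c
}

# Typical use on an affected host (run as root):
#   journalctl -u sssd --no-pager | watchdog_children
```

The name printed next to each count (for example nss) indicates which section of /etc/sssd/sssd.conf is the candidate for a higher timeout.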

The default timeout is 10 seconds. Choose the new value based on the number of cached users and groups; values between 20 and 30 seconds are suggested.

Example for /etc/sssd/sssd.conf:

[domain/default]
timeout = 20

Restart the SSSD service after making the change:

# systemctl restart sssd

Cause

Raising the timeout is only a workaround; the underlying cause of the watchdog termination should still be investigated. A watchdog termination means that the affected process was blocked for longer than the timeout (10 seconds by default). For example, if the processing of group memberships is very slow, the watchdog may terminate the sssd_be process.

To pinpoint the issue, enable SSSD debugging (debug_level = 9) in /etc/sssd/sssd.conf, in particular in the [domain/$domain] section. After the problem recurs, note the timestamp of the termination event in /var/log/sssd/sssd_$domain.log and check the last operation logged before that timestamp to identify the cause. If the watchdog terminates sssd_nss or sssd_pam instead, follow the same procedure with the log of that service.
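
The last operations logged before the termination can be pulled out of a debug log with grep. The following is a minimal sketch that assumes GNU grep and a timestamp passed exactly as it appears in the log file. The function name lines_before and the default of 20 context lines are illustrative choices, not SSSD conventions.

```shell
# Hypothetical helper: print the first line containing a given timestamp
# string, plus the N lines logged immediately before it (default 20).
# usage: lines_before LOGFILE TIMESTAMP [N]
lines_before() {
    grep -F -m 1 -B "${3:-20}" -- "$2" "$1"
}

# Example (domain name and timestamp are placeholders; copy the timestamp
# from the log itself, as its format may differ between SSSD versions):
#   lines_before /var/log/sssd/sssd_default.log '15:49:37'
```

The lines printed just above the matching entry show what the process was doing before it was terminated.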

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021345
  • Creation Date: 01-Feb-2024
  • Modified Date: 08-Feb-2024
    • SUSE Linux Enterprise Server
    • SUSE Linux Enterprise Server for SAP Applications
