System freezes with a large number of tasks waiting for gsch_scan() to return

This document (000019767) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 12 SP5
SUSE Linux Enterprise Server 12 SP4
 

Situation

The system hangs or freezes while most tasks are in a sleep state, waiting for gsch_scan() to return and wake them up again. Most of these tasks show the following stack trace:
PID: 16901  TASK: ffff897e6f7f9140  CPU: 7   COMMAND: "sshd"
 #0 [ffffa6a8c6063bd8] __schedule at ffffffffa07209e2
 #1 [ffffa6a8c6063c60] schedule at ffffffffa0721002
 #2 [ffffa6a8c6063c70] gsch_scan at ffffffffc0887e53 [gsch]
 #3 [ffffa6a8c6063d30] gsch_policy_handle_close at ffffffffc0884cbf [gsch]
 #4 [ffffa6a8c6063d80] gsch_redirfs_release at ffffffffc0883cc0 [gsch]
 #5 [ffffa6a8c6063dc8] rfs_postcall_flts at ffffffffc081903e [redirfs]
 #6 [ffffa6a8c6063de8] rfs_release at ffffffffc08121ca [redirfs]
 #7 [ffffa6a8c6063e88] __fput at ffffffffa0268902
 #8 [ffffa6a8c6063ec8] task_work_run at ffffffffa00b09d8
 #9 [ffffa6a8c6063f00] exit_to_usermode_loop at ffffffffa008ce79
#10 [ffffa6a8c6063f30] do_syscall_64 at ffffffffa0004a25
#11 [ffffa6a8c6063f50] entry_SYSCALL_64_after_hwframe at ffffffffa08000b6
    RIP: 00007fb582a1d1d2  RSP: 00007fff0873aa68  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000001  RCX: 00007fb582a1d1d2
    RDX: 00007fff0873ac20  RSI: 0000000000000001  RDI: 0000000000000008
    RBP: 00007fff0873adc0   R8: 0000000000000000   R9: 00007fff0873a6b0
    R10: 0000000000000008  R11: 0000000000000202  R12: 00007fff0873af50
    R13: 0000000000000000  R14: 000055ecb8a56250  R15: 000055ecb6cbb1a0
    ORIG_RAX: 0000000000000003  CS: 0033  SS: 002b

In more detail, there are 1398 tasks in a sleep state waiting for gsch_scan():
crash> foreach IN bt | grep "#2\|#3" | awk '{print $3 }' | sort | uniq -c|sort -rn
   2713 do_wait
   2707 sys_wait4
   1398 gsch_scan
   1376 gsch_policy_handle_pre_open
    685 pipe_wait
    685 pipe_read
    254 kthread
    196 futex_wait_queue_me
    196 futex_wait
    103 rescuer_thread
     57 schedule_hrtimeout_range_clock

gsch_scan() and gsch_policy_handle_pre_open() belong to the third-party kernel module gsch (Trend Micro Deep Security Agent), which also taints the kernel:
crash> mod -t
NAME     TAINTS
tmhook   OE
redirfs  OE
gsch     OE
bmhook   OE 

Just before the system hung, the Trend Micro processes were failing to fork in the system slice:
kernel: [59696.369225] cgroup: fork rejected by pids controller in /system.slice/ds_agent.service
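
To check whether a system has logged this condition, the kernel log can be searched for the pids controller message shown above. The following is a minimal sketch, not a supported tool: the function name pids_rejections is hypothetical, and it assumes only the message format of the log line above.

```shell
# Hypothetical helper: read kernel log text on stdin and print one
# "count cgroup-path" line per cgroup whose forks were rejected.
pids_rejections() {
    grep 'fork rejected by pids controller in ' \
        | sed 's/.*fork rejected by pids controller in //' \
        | sort | uniq -c | sort -rn
}

# Typical use on a live system (reads the kernel ring buffer):
#   dmesg | pids_rejections
```

A line naming /system.slice/ds_agent.service in the output would match the pattern described in this document.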

Resolution

Trend Micro services run in the system slice, which on SLES 12 has a default TasksMax limit of 512. It is therefore highly recommended to increase DefaultTasksMax globally, so that it also applies to services in the system slice:
#/etc/systemd/system.conf
------------------------
[Manager]
DefaultTasksMax=infinity 

Then reload the new settings by running:
# systemctl daemon-reload 
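
After the reload, it may be useful to check how close the service is to its task limit. The following is a minimal sketch under stated assumptions: the unit name ds_agent.service is taken from the log message above, the installed systemd is assumed to expose the TasksCurrent and TasksMax properties through "systemctl show", and tasks_usage is a hypothetical helper name.

```shell
# Hypothetical helper: read "systemctl show" style KEY=VALUE lines on stdin
# and report how much of the task limit is in use.
tasks_usage() {
    awk -F= '
        $1 == "TasksCurrent" { cur = $2 }
        $1 == "TasksMax"     { max = $2 }
        END {
            if (cur == "" || max == "")
                print "unknown"
            # Assumption: some systemd versions print an unlimited TasksMax
            # as 18446744073709551615 instead of "infinity".
            else if (max == "infinity" || max == "18446744073709551615")
                printf "%s tasks, no limit\n", cur
            else if (max + 0 > 0)
                printf "%s of %s tasks (%d%%)\n", cur, max, cur * 100 / max
            else
                print "unknown"
        }'
}

# Typical use on a live system:
#   systemctl show ds_agent.service -p TasksCurrent -p TasksMax | tasks_usage
```

A value close to 100% would suggest that the service is still at risk of having forks rejected.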

If the same hang pattern occurs without the "fail to fork" error, we recommend involving Trend Micro support, as they might suggest a workaround or provide an updated version if the issue has been fixed in a newer Trend Micro Deep Security Agent release.

Cause

Trend Micro services run in the system slice, which on SLES 12 has a default TasksMax limit of 512. If the Trend Micro ds_agent services are unable to fork, the sleeping tasks that are waiting for gsch_scan() to return might never get a response.

 

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019767
  • Creation Date: 13-Dec-2021
  • Modified Date: 13-Dec-2021
    • SUSE Linux Enterprise Server
