SLES12 SP5 system panic in tcp_retransmit_skb()

This document (000019642) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 12 SP5 (SLES12 SP5)


Situation

A SLES12 SP5 system running kernel 4.12.14-122.12-default or older may panic intermittently. The console output or kernel log first shows the following warning:

[172193.273976] WARNING: CPU: 38 PID: 0 at ../net/ipv4/tcp_timer.c:434 tcp_retransmit_timer+0x985/0x9b0 
[172193.273977] Modules linked in: loop mmfs26(OEX) mmfslinux(OEX) tracedev(OEX) binfmt_misc nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache tcp_diag inet_diag scsi_transport_iscsi bonding iscsi_ibft iscsi_boot_sysfs rdma_ucm(OEX) ib_ucm(OEX) rdma_cm(OEX) iw_cm(OEX) configfs ib_ipoib(OEX) ib_cm(OEX) ib_umad(OEX) mlx5_ib(OEX) mlx5_core(OEX) mlxfw(OEX) mpt3sas raid_class mptctl mptbase msr intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp iTCO_wdt coretemp iTCO_vendor_support kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel pcbc nls_iso8859_1 nls_cp437 vfat fat cdc_ether aesni_intel ipmi_ssif usbnet aes_x86_64 joydev crypto_simd mii ses glue_helper cryptd igb ioatdma enclosure ch lpc_ich pcspkr scsi_transport_sas mfd_core dca wmi ipmi_si ipmi_devintf
[172193.274036]  ipmi_msghandler pcc_cpufreq button sunrpc mlx4_en(OEX) mlx4_ib(OEX) ptp pps_core ib_uverbs(OEX) ib_core(OEX) ext4 crc16 jbd2 mbcache sd_mod hid_generic usbhid mgag200 crc32c_intel i2c_algo_bit qla2xxx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm xhci_pci nvme_fc mlx4_core(OEX) xhci_hcd ehci_pci nvme_fabrics devlink ehci_hcd drm nvme_core mlx_compat(OEX) scsi_transport_fc usbcore drm_panel_orientation_quirks megaraid_sas(OEX) sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod efivarfs autofs4
[172193.274075] Supported: Yes, External
[172193.274079] CPU: 38 PID: 0 Comm: swapper/38 Tainted: G           OE      4.12.14-122.12-default #1 SLE12-SP5
[172193.274080] Hardware name: LENOVO x3950 X6 -[6241AC4]-/00WA086, BIOS -[A9E152FUS-4.71]- 12/05/2019
[172193.274081] task: ffff8b0a8152c0c0 task.stack: ffffa5dab142c000
[172193.274084] RIP: 0010:tcp_retransmit_timer+0x985/0x9b0
[172193.274085] RSP: 0018:ffff8bc6bf483e48 EFLAGS: 00010246
[172193.274086] RAX: 0000000000000001 RBX: ffff8bc680c08000 RCX: 000000000000001f
[172193.274087] RDX: 000000281784d847 RSI: 0000000000000004 RDI: ffff8bc680c08000
[172193.274088] RBP: ffff8bc680c08158 R08: 0000000000000004 R09: ffffecc2a5232d5f
[172193.274088] R10: 0000000000000004 R11: 0000000000000005 R12: 0000000000000100
[172193.274089] R13: ffffffff831ca400 R14: ffff8bc680c08000 R15: 0000000000000000
[172193.274090] FS:  0000000000000000(0000) GS:ffff8bc6bf480000(0000) knlGS:0000000000000000
[172193.274091] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[172193.274092] CR2: 00007fec7b9606e0 CR3: 000003c4c400a003 CR4: 00000000001606e0
[172193.274093] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[172193.274093] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[172193.274094] Call Trace:
[172193.274099]  <IRQ>
[172193.274104]  ? tcp_write_timer_handler+0x240/0x240
[172193.274105]  tcp_write_timer_handler+0xc6/0x240
[172193.274106]  tcp_write_timer+0x69/0x80
[172193.274114]  call_timer_fn+0x32/0x140
[172193.274116]  run_timer_softirq+0x1d6/0x420
[172193.274127]  ? timerqueue_add+0x54/0x80
[172193.274129]  ? enqueue_hrtimer+0x38/0x90
[172193.274133]  __do_softirq+0xce/0x28b
[172193.274141]  irq_exit+0xdb/0xf0
[172193.274147]  smp_apic_timer_interrupt+0x3f/0x60
[172193.274150]  apic_timer_interrupt+0x8f/0xa0
[172193.274151]  </IRQ>
[172193.274154] RIP: 0010:mwait_idle+0x7b/0x1a0
[172193.274154] RSP: 0018:ffffa5dab142fed8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[172193.274155] RAX: 0000000000000000 RBX: ffff8b0a8152c0c0 RCX: 0000000000000000
[172193.274156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[172193.274156] RBP: 0000000000000026 R08: 0000000000000004 R09: ffffecc2a5232d5f
[172193.274157] R10: 00000001028fba98 R11: ffff8bc1268b8880 R12: ffff8b0a8152c0c0
[172193.274158] R13: ffff8b0a8152c0c0 R14: 0000000000000000 R15: 0000000000000000
[172193.274166]  do_idle+0x160/0x1f0
[172193.274168]  cpu_startup_entry+0x5d/0x70
[172193.274174]  start_secondary+0x1a5/0x200
[172193.274180]  secondary_startup_64+0xa5/0xb0
[172193.274182] Code: ff ff 31 d2 44 89 f6 48 89 df e8 07 f1 ff ff 84 c0 0f 84 ca f9 ff ff 31 f6 e9 c8 f9 ff ff 84 c0 0f 84 ae f9 ff ff e9 b6 f9 ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 e9 c5 f7 ff ff 48 8b 7b 60 48 

The warning is then followed by a NULL pointer dereference with the following stack trace:

[172193.274207] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 
[172193.274213] IP: tcp_retransmit_skb+0x5c/0xc0
[172193.274216] PGD 0 P4D 0  
[172193.274218] Oops: 0000 [#1] SMP PTI
[172193.274220] CPU: 38 PID: 0 Comm: swapper/38 Tainted: G        W  OE      4.12.14-122.12-default #1 SLE12-SP5
[172193.274221] Hardware name: LENOVO x3950 X6 -[6241AC4]-/00WA086, BIOS -[A9E152FUS-4.71]- 12/05/2019
[172193.274222] task: ffff8b0a8152c0c0 task.stack: ffffa5dab142c000
[172193.274223] RIP: 0010:tcp_retransmit_skb+0x5c/0xc0
[172193.274224] RSP: 0018:ffff8bc6bf483e28 EFLAGS: 00010246
[172193.274225] RAX: 00000000fffffff5 RBX: ffff8bc680c08000 RCX: 0000000000000001
[172193.274226] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8bc680c08000
[172193.274227] RBP: 0000000000000000 R08: 0000000016c00000 R09: ffffecc2a5232d5f
[172193.274228] R10: 0000000000000004 R11: 0000000000000000 R12: 00000000fffffff5
[172193.274229] R13: ffffffff831ca400 R14: ffff8bc680c08000 R15: 0000000000000000
[172193.274230] FS:  0000000000000000(0000) GS:ffff8bc6bf480000(0000) knlGS:0000000000000000
[172193.274231] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[172193.274232] CR2: 0000000000000030 CR3: 000003c4c400a003 CR4: 00000000001606e0
[172193.274233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[172193.274234] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[172193.274235] Call Trace:
[172193.274236]  <IRQ>
[172193.274238]  tcp_retransmit_timer+0x3ff/0x9b0
[172193.274240]  ? tcp_write_timer_handler+0x240/0x240
[172193.274241]  tcp_write_timer_handler+0xc6/0x240
[172193.274243]  tcp_write_timer+0x69/0x80
[172193.274244]  call_timer_fn+0x32/0x140
[172193.274246]  run_timer_softirq+0x1d6/0x420
[172193.274248]  ? timerqueue_add+0x54/0x80
[172193.274250]  ? enqueue_hrtimer+0x38/0x90
[172193.274251]  __do_softirq+0xce/0x28b
[172193.274253]  irq_exit+0xdb/0xf0
[172193.274255]  smp_apic_timer_interrupt+0x3f/0x60
[172193.274257]  apic_timer_interrupt+0x8f/0xa0
[172193.274259]  </IRQ>
[172193.274260] RIP: 0010:mwait_idle+0x7b/0x1a0
[172193.274261] RSP: 0018:ffffa5dab142fed8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[172193.274262] RAX: 0000000000000000 RBX: ffff8b0a8152c0c0 RCX: 0000000000000000
[172193.274263] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[172193.274264] RBP: 0000000000000026 R08: 0000000000000004 R09: ffffecc2a5232d5f
[172193.274265] R10: 00000001028fba98 R11: ffff8bc1268b8880 R12: ffff8b0a8152c0c0
[172193.274265] R13: ffff8b0a8152c0c0 R14: 0000000000000000 R15: 0000000000000000

The most important line in this stack trace is:

RIP: 0010:tcp_retransmit_skb+0x5c/0xc0

The numbers in square brackets [..] will vary, as they are only the time in seconds since the system was booted.
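As a rough aid for checking a saved kernel log against this signature, the following Python sketch looks for a tcp_retransmit_timer warning followed by an oops whose instruction pointer is in tcp_retransmit_skb, and converts the bracketed timestamp into an uptime. The function names and regular expressions are illustrative assumptions based only on the log lines quoted above; this is not a SUSE tool, and a match is a hint rather than a confirmed diagnosis.

```python
import re
from datetime import timedelta

# Patterns derived from the log lines quoted in this document (assumptions,
# not an official signature definition).
WARNING_RE = re.compile(r"WARNING: .*tcp_retransmit_timer\+")
OOPS_RE = re.compile(r"IP: (?:[0-9a-f]{4}:)?tcp_retransmit_skb\+")
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def uptime_of(line):
    """Return the bracketed timestamp of a log line as a timedelta, or None."""
    match = TIMESTAMP_RE.match(line)
    if match is None:
        return None
    return timedelta(seconds=float(match.group(1)))


def find_signature(lines):
    """Return the first tcp_retransmit_skb oops line that follows a
    tcp_retransmit_timer warning, or None if the sequence is absent."""
    seen_warning = False
    for line in lines:
        if WARNING_RE.search(line):
            seen_warning = True
        elif seen_warning and OOPS_RE.search(line):
            return line
    return None


# Three lines copied from the traces above.
sample = [
    "[172193.273976] WARNING: CPU: 38 PID: 0 at ../net/ipv4/tcp_timer.c:434 tcp_retransmit_timer+0x985/0x9b0",
    "[172193.274207] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030",
    "[172193.274213] IP: tcp_retransmit_skb+0x5c/0xc0",
]
hit = find_signature(sample)
```

For the sample lines, `hit` is the `IP: tcp_retransmit_skb` line, and `uptime_of(hit)` shows that the system had been up for just under two days when the oops occurred. To scan a real log, pass an open file object to `find_signature` in place of `sample`.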

Resolution

This is a known problem. It is fixed in kernel 4.12.14-122.17.1 and in later maintenance update kernels. Install one of these kernels and reboot the system to resolve the issue.
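To check whether a given kernel release string is at or above the fixed version, a comparison along the following lines may help. It is a minimal sketch under stated assumptions: the helper names are hypothetical, the comparison is purely numeric, and it assumes that the release string reported by the running kernel (for example 4.12.14-122.12-default) may carry fewer numeric components than the package version 4.12.14-122.17.1, so only the components present are compared. The result is only meaningful on SLES12 SP5.

```python
import re

# Fixed kernel version named in this document.
FIXED_VERSION = (4, 12, 14, 122, 17, 1)


def release_numbers(release):
    """Extract the leading numeric components of a kernel release string,
    e.g. '4.12.14-122.12-default' -> (4, 12, 14, 122, 12)."""
    match = re.match(r"\d+(?:[.-]\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in re.split(r"[.-]", match.group(0)))


def has_fix(release):
    """Return True if the release is numerically at or above the fixed version.

    Assumption: the release string may omit trailing components of the
    package version, so the fixed version is truncated to the same length
    before comparing.
    """
    numbers = release_numbers(release)
    return numbers >= FIXED_VERSION[:len(numbers)]
```

On a live system, the running kernel release can be obtained with `platform.release()` from the Python standard library and passed to `has_fix`. The installed kernel package version remains the authoritative source; this check is only a quick first indication.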

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019642
  • Creation Date: 05-Jun-2020
  • Modified Date: 05-Jun-2020
    • SUSE Linux Enterprise Server
    • SUSE Linux Enterprise Server for SAP Applications

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback@suse.com