PPC64 - After updating to SLES11 SP4 kernel 3.0.101-108.81 system is crashing constantly

This document (7023571) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 11 Service Pack 4 (SLES 11 SP4)

Situation

After a PPC64 LPAR is updated to kernel 3.0.101-108.81-{bigmem,default}, the system crashes repeatedly with the following console output:

Oops: Bad kernel stack pointer, sig: 6 [#1]
SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: nfsd binfmt_misc nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse xfs loop ipv6 ipv6_lib ibmveth(X) nx_crypto(X) sg ext3 jbd mbcache dm_mirror dm_region_hash dm_log linear sd_mod crc_t10dif ibmvfc(X) scsi_transport_fc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc dm_service_time dm_least_pending dm_queue_length dm_round_robin dm_multipath scsi_dh dm_snapshot dm_mod ibmvscsic(X) scsi_transport_srp scsi_tgt scsi_mod
Supported: Yes, External
NIP: 0000000000003494 LR: 00007fff8e6b3e40 CTR: 00007fff7c340298
REGS: c00000001e9bfd40 TRAP: 0300   Tainted: G             X  (3.0.101-108.81-bigmem)
MSR: 8000000000001000 <ME>  CR: 22282484  XER: 20000001
DAR: 00007fff8ca96978, DSISR: 42000000
TASK = c000004de130e160[21452] 'jstart' THREAD: c000004d90db0000 CPU: 24
GPR00: 00007fff7c340298 00007fff8ca8b7a0 000000000000cafe 00007fff8ca8c9b0
GPR04: 0000000000000000 0000000000000010 00007fff8ca8d9a0 0000000000000001
GPR08: ffffffffffffffc0 00007fff8ebcf2dc 00007fff8ec9de70 000000000000c0de
GPR12: 00007fff8ecf9678 00007fff8ca968a0 0000000010f78800 00007fff8ed57b26
GPR16: 00007fff8ca8b810 00000000000003d0 0000000010f79c80 0000000010f798b0
GPR20: 00007fff7c3402ec 0000000010f79890 00007fff8eddc0a8 00007fff8ca8c9b0
GPR24: 0000000010f79840 00007fff7c340280 00007fff7c340298 00007fff7c3402cc
GPR28: 00007fff8eddc0ac 0000000000000080 00007fff8ec2d2e0 00007fff8ca8b7a0
NIP [0000000000003494] 0x3494
LR [00007fff8e6b3e40] 0x7fff8e6b3e40
Call Trace:
Instruction dump:
f98d0098 XXXXXXXX XXXXXXXX XXXXXXXX 7d7a02a6 XXXXXXXX XXXXXXXX XXXXXXXX
7d9b02a6 XXXXXXXX XXXXXXXX XXXXXXXX f92d00d8 XXXXXXXX XXXXXXXX XXXXXXXX
Sending IPI to other cpus...
I'm in purgatory
io_event_irq: No ibm,io-events on system! IO Event interrupt disabled.
doing fast boot
Starting multipathd
Creating device nodes with udev
Out of memory: Kill process 608 (udevadm) score 0 or sacrifice child
Killed process 608 (udevadm) total-vm:3520kB, anon-rss:640kB, file-rss:2112kB
boot/04-udev.sh: line 17:   608 Killed                  /sbin/udevadm settle --timeout=$udev_timeout
Out of memory: Kill process 879 (init) score 0 or sacrifice child
Killed process 880 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 881 (init) score 0 or sacrifice child
Killed process 882 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 883 (init) score 0 or sacrifice child
Killed process 884 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 883 (init) score 0 or sacrifice child
Killed process 883 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 885 (init) score 0 or sacrifice child
Killed process 886 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 894 (init) score 0 or sacrifice child
Killed process 894 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 895 (init) score 0 or sacrifice child
Killed process 895 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 896 (init) score 0 or sacrifice child
Killed process 896 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 897 (init) score 0 or sacrifice child
Killed process 897 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 900 (init) score 0 or sacrifice child
Killed process 901 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 902 (init) score 0 or sacrifice child
Killed process 903 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 902 (init) score 0 or sacrifice child
Killed process 902 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Out of memory: Kill process 904 (init) score 0 or sacrifice child
Killed process 904 (init) total-vm:4992kB, anon-rss:1216kB, file-rss:0kB
Kernel panic - not syncing: Out of memory and no killable processes...


Call Trace:
[c0000000165cf5b0] [c000000008014ae8] .show_stack+0x68/0x1b0 (unreliable)
[c0000000165cf660] [c00000000864b838] .panic+0xe0/0x288
[c0000000165cf700] [c0000000081752a0] .out_of_memory+0x380/0x390
[c0000000165cf7f0] [c00000000817cba4] .__alloc_pages_nodemask+0x9d4/0x9f0
[c0000000165cf9b0] [c0000000081c8a8c] .alloc_pages_vma+0x12c/0x2f0
[c0000000165cfa90] [c0000000081a1aa8] .do_wp_page+0x338/0xc20
[c0000000165cfb90] [c00000000863df14] .do_page_fault+0x3f4/0x770
[c0000000165cfe30] [c000000008006024] handle_page_fault+0x20/0xffc
Rebooting in 180 seconds..
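
When saved console logs from several LPARs need to be checked for this pattern, the two distinctive messages and the kernel release on the "Tainted:" line can be searched for with a small script. The following is a minimal sketch only: the function name scan_console_log and the keys of the returned dictionary are hypothetical, and a match indicates a similar signature, not a confirmed identical root cause.

```python
import re

# Messages taken from the console output shown in this document.
OOPS_MARKER = "Oops: Bad kernel stack pointer"
OOM_PANIC_MARKER = "Kernel panic - not syncing: Out of memory and no killable processes"
# The kernel release appears in parentheses at the end of the "Tainted:" line.
RELEASE_RE = re.compile(r"Tainted:.*\(([^)]+)\)")


def scan_console_log(text):
    """Summarize whether `text` contains the crash signature (hypothetical helper)."""
    match = RELEASE_RE.search(text)
    return {
        "bad_stack_pointer_oops": OOPS_MARKER in text,
        "oom_panic": OOM_PANIC_MARKER in text,
        "kernel_release": match.group(1) if match else None,
    }
```

The function takes the full log as a string, for example the contents of a captured console file, and returns a small summary that can be compared across systems.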

Resolution

Reverting to kernel 3.0.101-108.77.1 or updating to kernel 3.0.101-108.84.1 resolves the panics.
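
To see whether a system is currently running the affected build, the release string reported by `uname -r` can be compared against the version named above. The sketch below is an illustration under assumptions: the helper name is_affected is hypothetical, and it assumes the release string has the form 3.0.101-108.81 followed by a flavor suffix such as -default or -bigmem, as in the console output in this document.

```python
import platform

# Kernel version named in this document as affected.
AFFECTED_RELEASE = "3.0.101-108.81"


def is_affected(release):
    """Return True if `release` (as printed by `uname -r`) is the affected build.

    Assumption: the version is either the whole string or is followed by
    "-" (flavor suffix) or "." (package revision), so that a different
    build such as 3.0.101-108.810 does not match by accident.
    """
    return (
        release == AFFECTED_RELEASE
        or release.startswith(AFFECTED_RELEASE + "-")
        or release.startswith(AFFECTED_RELEASE + ".")
    )


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    state = "affected" if is_affected(running) else "not the affected build"
    print(running + ": " + state)
```

This only reports the running kernel. Which kernel is booted next still depends on the installed packages and the boot loader configuration.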

Cause


Additional Information


Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7023571
  • Creation Date: 10-Dec-2018
  • Modified Date: 03-Mar-2020
    • SUSE Linux Enterprise Server

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback@suse.com
