Kdump fails on IBM POWER systems when a crash is triggered after a CPU/memory add/remove operation

This document (7023750) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15 SP 3
SUSE Linux Enterprise Server 15 SP 2
SUSE Linux Enterprise Server 15 SP 1
SUSE Linux Enterprise Server 15
SUSE Linux Enterprise Server 12 SP 5
SUSE Linux Enterprise Server 12 SP 4
SUSE Linux Enterprise Server 12 SP 3
 

Situation

On IBM POWER systems, kdump fails when a crash is triggered after CPUs or DLPAR memory have been added or removed.

Resolution

The workaround for this issue is to restart the kdump service after every CPU or memory hot-add or hot-remove operation:
systemctl restart kdump.service
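
The restart can be followed by a quick check that a capture kernel is actually loaded. The following is a minimal sketch, not part of the original workaround: the function names are hypothetical, and it assumes the kernel exposes the /sys/kernel/kexec_crash_loaded flag (1 when a crash kernel is loaded, 0 otherwise).

```shell
#!/bin/sh
# Sketch: restart kdump after a hot-plug operation and confirm that a
# capture kernel is loaded. Function names are hypothetical.

# sysfs flag that reads 1 when a crash (capture) kernel is loaded.
# Overridable so the check can be exercised against a plain file.
CRASH_LOADED_FILE="${CRASH_LOADED_FILE:-/sys/kernel/kexec_crash_loaded}"

# Succeeds only if the flag file contains exactly "1".
crash_kernel_loaded() {
    [ "$(cat "$CRASH_LOADED_FILE" 2>/dev/null)" = "1" ]
}

# Applies the workaround from this document, then checks the flag.
# Must be run as root.
restart_kdump_and_verify() {
    systemctl restart kdump.service || return 1
    if crash_kernel_loaded; then
        echo "kdump: capture kernel loaded"
    else
        echo "kdump: capture kernel NOT loaded" >&2
        return 1
    fi
}
```

Calling restart_kdump_and_verify as root after a DLPAR operation runs both steps. The flag only shows that a capture kernel is loaded; it does not prove that the loaded kernel has an up-to-date view of CPUs and memory, which is why the restart itself is still required.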

 

Cause

When a CPU is hot-added or hot-removed, the udev rule that reloads the kdump kernel is triggered.
The kdump reload command (`kexec -p`) walks the device-tree nodes to determine the CPU topology. Some of these nodes are removed when CPUs are hot-removed.
A race between the kdump reload and the CPU hot-removal can cause the kdump kernel load to fail, so a subsequent crash does not produce a dump.

When memory is hot-added or hot-removed, the udev rule that reloads the capture kernel is triggered while the device tree is still being updated.
As a result, the capture kernel can occasionally end up with a stale memory map and fail to save the dump to disk.
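
Because both failures come from a reload that runs while the hot-plug operation is still in progress, one option is to wait for pending udev events before restarting kdump manually. The sketch below is an illustration under stated assumptions, not a confirmed fix: the function name and the 30-second default are hypothetical, and `udevadm settle` only waits for the udev event queue to drain, not for the device-tree update itself.

```shell
#!/bin/sh
# Sketch: wait for queued udev events, then restart kdump.
# Command names are overridable so the logic can be tested with stubs.
UDEVADM="${UDEVADM:-udevadm}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Usage: settle_then_restart_kdump [timeout_seconds]
# (hypothetical helper; the default timeout of 30 is an assumption)
settle_then_restart_kdump() {
    timeout="${1:-30}"
    # Returns non-zero if the udev queue is still busy after the timeout.
    "$UDEVADM" settle --timeout="$timeout" || {
        echo "udev queue did not settle within ${timeout}s" >&2
        return 1
    }
    # Workaround from this document: reload the capture kernel.
    "$SYSTEMCTL" restart kdump.service
}
```

Running settle_then_restart_kdump as root once the CPU or memory operation has been reported complete keeps the manual reload from overlapping with udev-triggered reloads that are still queued.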

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7023750
  • Creation Date: 27-Feb-2019
  • Modified Date: 19-May-2021
    • SUSE Linux Enterprise Server
