After you obtain the dump, it is time to analyze it. There are several options.
The original tool for analyzing dumps is GDB. You can still use it in current environments, but it has several disadvantages and limitations:
GDB was not specifically designed to debug kernel dumps.
GDB does not support ELF64 binaries on 32-bit platforms.
GDB does not understand formats other than ELF dumps (it cannot debug compressed dumps).
That is why the crash utility was implemented. It analyzes crash dumps and debugs the running system as well. It provides functionality specific to debugging the Linux kernel and is much more suitable for advanced debugging.
To debug the Linux kernel, you also need to install its debugging information package. Check whether the package is installed on your system with zypper se kernel | grep debug.
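The package check above can be narrowed to the running kernel. The following is a minimal sketch, not an official tool: the helper name debuginfo_pkg_for_release is hypothetical, and it assumes that the kernel release string ends in "-<flavor>" (for example "-default") and that the matching package is named kernel-<flavor>-debuginfo.

```shell
# Hypothetical helper: derive the debuginfo package name from a kernel release.
# Assumption: the release string ends in "-<flavor>", for example "-default".
debuginfo_pkg_for_release() {
    release=$1
    flavor=${release##*-}          # text after the last hyphen
    printf 'kernel-%s-debuginfo\n' "$flavor"
}

# Usage (not executed here):
#   pkg=$(debuginfo_pkg_for_release "$(uname -r)")
#   rpm -q "$pkg" || echo "$pkg is not installed"
```

If the release string on your system does not follow this pattern, fall back to the zypper search shown above.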
IMPORTANT: Repository for Packages with Debugging Information
If you subscribed your system for online updates, you can find debuginfo packages in the *-Debuginfo-Updates online installation repository relevant for SUSE Linux Enterprise Server 11 SP4. Use YaST to enable the repository.
To open the captured dump in crash on the machine that produced the dump, use a command like this:
crash /boot/vmlinux-188.8.131.52-0.1-default.gz /var/crash/2010-04-23-11\:17/vmcore
The first parameter represents the kernel image. The second parameter is the dump file captured by kdump. You can find this file under /var/crash by default.
The Linux kernel comes in Executable and Linkable Format (ELF). This file is usually called vmlinux and is directly generated in the compilation process. Not all boot loaders, especially on x86 (i386 and x86_64) architecture, support ELF binaries. The following solutions exist on different architectures supported by SUSE® Linux Enterprise Server.
Mostly for historic reasons, the Linux kernel consists of two parts: the Linux kernel itself (vmlinux) and the setup code run by the boot loader.
These two parts are linked together in a file called bzImage, which can be found in the kernel source tree. The file is now called vmlinuz (note z vs. x) in the kernel package.
The ELF image is never directly used on x86. Therefore, the main kernel package contains the vmlinux file in compressed form called vmlinux.gz.
To sum it up, an x86 SUSE kernel package has two kernel files:
vmlinuz which is executed by the boot loader.
vmlinux.gz, the compressed ELF image that is required by crash and GDB.
The elilo boot loader, which boots the Linux kernel on the IA64 architecture, supports loading ELF images (even compressed ones) out of the box. The IA64 kernel package contains only one file called vmlinuz. It is a compressed ELF image. vmlinuz on IA64 is the same as vmlinux.gz on x86.
The yaboot boot loader on PPC also supports loading ELF images, but not compressed ones. The PPC kernel package contains an ELF Linux kernel file called vmlinux. From the perspective of crash, this is the easiest architecture.
If you decide to analyze the dump on another machine, you must check both the architecture of the computer and the files necessary for debugging.
You can analyze the dump on another computer only if it runs a Linux system of the same architecture. To check the compatibility, use the command uname -i on both computers and compare the outputs.
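The comparison can be scripted. This is a minimal sketch; the helper name same_platform is hypothetical, and the remote host in the usage comment is a placeholder.

```shell
# Hypothetical helper: succeed only if both platform strings are
# non-empty and identical.
same_platform() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (not executed here; the host name is a placeholder):
#   local_hw=$(uname -i)
#   remote_hw=$(ssh tux@mercury uname -i)
#   same_platform "$local_hw" "$remote_hw" || echo "architectures differ" >&2
```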
If you are going to analyze the dump on another computer, you also need the appropriate files from the kernel and kernel debug packages.
Put the kernel dump, the kernel image from /boot, and its associated debugging info file from /usr/lib/debug/boot into a single empty directory.
Additionally, copy the kernel modules from /lib/modules/$(uname -r)/kernel/ and the associated debug info files from /usr/lib/debug/lib/modules/$(uname -r)/kernel/ into a subdirectory named modules.
In the directory with the dump, the kernel image, its debug info file, and the modules subdirectory, launch the crash utility: crash vmlinux-version vmcore.
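The copy steps above can be sketched as a small shell function. Treat it as an illustration under stated assumptions: the name stage_crash_files and the optional root prefix argument are hypothetical additions (the prefix only makes dry runs against a test tree possible), and the kernel image is assumed to be named vmlinux-<release> with an optional suffix such as .gz.

```shell
# Hypothetical helper: collect everything crash needs into one directory.
# Arguments: kernel release, path to vmcore, destination directory,
# and an optional root prefix (an assumption, useful for dry runs).
stage_crash_files() {
    release=$1 vmcore=$2 dest=$3 root=${4:-}
    mkdir -p "$dest/modules" || return 1
    cp "$vmcore" "$dest/" || return 1
    # Kernel image and its debug info file; the glob matches
    # vmlinux-<release> with or without a suffix such as .gz or .debug.
    cp "$root"/boot/vmlinux-"$release"* "$dest/" || return 1
    cp "$root"/usr/lib/debug/boot/vmlinux-"$release"* "$dest/" || return 1
    # Modules and their debug info files go into the modules subdirectory.
    cp -r "$root/lib/modules/$release/kernel/." "$dest/modules/" || return 1
    cp -r "$root/usr/lib/debug/lib/modules/$release/kernel/." "$dest/modules/" || return 1
}

# Usage (not executed here): run on the machine that produced the dump,
# where uname -r still reports the crashed kernel's release.
#   stage_crash_files "$(uname -r)" /path/to/vmcore ./analysis
```

Copy the resulting directory to the analysis machine and start crash from inside it.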
NOTE: Support for Kernel Images
Compressed kernel images (gzip, not the bzImage file) are supported by SUSE packages of crash since SUSE® Linux Enterprise Server 11. For older versions, you have to extract the vmlinux.gz (x86) or the vmlinuz (IA64) to vmlinux.
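For those older versions, the extraction can be done with gzip. A minimal sketch; the helper name extract_kernel is hypothetical, and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper: write an uncompressed copy of a gzip-compressed
# kernel image. Reading from standard input keeps the original file
# untouched and avoids any dependence on the file name suffix.
extract_kernel() {
    src=$1 dest=$2
    gzip -dc < "$src" > "$dest"
}

# Usage (not executed here; file names are placeholders):
#   extract_kernel vmlinux-version.gz vmlinux-version
#   crash vmlinux-version vmcore
```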
Regardless of the computer on which you analyze the dump, the crash utility will produce an output similar to this:
tux@mercury:~> crash /boot/vmlinux-184.108.40.206-0.1-default.gz /var/crash/2010-04-23-11\:17/vmcore
crash 4.0-7.6
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /boot/vmlinux-220.127.116.11-0.1-default.gz
   DEBUGINFO: /usr/lib/debug/boot/vmlinux-18.104.22.168-0.1-default.debug
    DUMPFILE: /var/crash/2009-04-23-11:17/vmcore
        CPUS: 2
        DATE: Thu Apr 23 13:17:01 2010
      UPTIME: 00:10:41
LOAD AVERAGE: 0.01, 0.09, 0.09
       TASKS: 42
    NODENAME: eros
     RELEASE: 22.214.171.124-0.1-default
     VERSION: #1 SMP 2010-03-31 14:50:44 +0200
     MACHINE: x86_64 (2999 Mhz)
      MEMORY: 1 GB
       PANIC: "SysRq : Trigger a crashdump"
         PID: 9446
     COMMAND: "bash"
        TASK: ffff88003a57c3c0 [THREAD_INFO: ffff880037168000]
         CPU: 1
       STATE: TASK_RUNNING (SYSRQ)

crash>
The output already contains useful data: There were 42 tasks running at the moment of the kernel crash. The cause of the crash was a SysRq trigger invoked by the task with PID 9446. It was a Bash process because the echo command used to trigger the crash is a built-in command of the Bash shell.
The crash utility builds upon GDB and provides many useful additional commands. If you enter bt without any parameters, the backtrace of the task running at the moment of the crash is printed:
crash> bt
PID: 9446   TASK: ffff88003a57c3c0  CPU: 1   COMMAND: "bash"
 #0 [ffff880037169db0] crash_kexec at ffffffff80268fd6
 #1 [ffff880037169e80] __handle_sysrq at ffffffff803d50ed
 #2 [ffff880037169ec0] write_sysrq_trigger at ffffffff802f6fc5
 #3 [ffff880037169ed0] proc_reg_write at ffffffff802f068b
 #4 [ffff880037169f10] vfs_write at ffffffff802b1aba
 #5 [ffff880037169f40] sys_write at ffffffff802b1c1f
 #6 [ffff880037169f80] system_call_fastpath at ffffffff8020bfbb
    RIP: 00007fa958991f60  RSP: 00007fff61330390  RFLAGS: 00010246
    RAX: 0000000000000001  RBX: ffffffff8020bfbb  RCX: 0000000000000001
    RDX: 0000000000000002  RSI: 00007fa959284000  RDI: 0000000000000001
    RBP: 0000000000000002   R8: 00007fa9592516f0   R9: 00007fa958c209c0
    R10: 00007fa958c209c0  R11: 0000000000000246  R12: 00007fa958c1f780
    R13: 00007fa959284000  R14: 0000000000000002  R15: 00000000595569d0
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
crash>
Now it is clear what happened: The built-in echo command of the Bash shell sent a character to /proc/sysrq-trigger. After the corresponding handler recognized this character, it invoked the crash_kexec() function. This function called panic() and kdump saved a dump.
In addition to the basic GDB commands and the extended version of bt, the crash utility defines many other commands related to the structure of the Linux kernel. These commands understand the internal data structures of the Linux kernel and present their contents in a human-readable format. For example, you can list the tasks running at the moment of the crash with ps. With sym, you can list all kernel symbols with their corresponding addresses, or look up the value of an individual symbol. With files, you can display all open file descriptors of a process. With kmem, you can display details about kernel memory usage. With vm, you can inspect the virtual memory of a process, even at the level of individual page mappings. The list of useful commands is very long, and many of them accept a wide range of options.
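To run several of these commands in one go, you can collect them in a command file. The following is a minimal sketch: the helper name write_crash_commands and the file name triage.cmd are hypothetical, and the usage comment assumes that your crash build accepts the -i option for reading commands from a file. The PID and symbol are taken from the example session above.

```shell
# Hypothetical helper: write a crash command file that runs the commands
# described above. Arguments: PID to inspect, kernel symbol to look up,
# output file.
write_crash_commands() {
    pid=$1 symbol=$2 out=$3
    cat > "$out" <<EOF
ps
sym $symbol
files $pid
kmem -i
vm $pid
exit
EOF
}

# Usage (not executed here; file names are placeholders):
#   write_crash_commands 9446 crash_kexec triage.cmd
#   crash -i triage.cmd vmlinux-version vmcore
```

Because the file ends with exit, crash terminates after printing the results instead of waiting at the prompt.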
The commands mentioned above mirror the functionality of common Linux commands, such as ps and lsof. To find out the exact sequence of events with the debugger, you need to know how to use GDB and to have strong debugging skills. Both of these are beyond the scope of this document. In addition, you need to understand the Linux kernel. Several useful reference information sources are given at the end of this document.