17.1 Tracing System Calls with strace

The strace command traces system calls of a process and signals received by the process. strace can either run a new command and trace its system calls, or you can attach strace to an already running command. Each line of the command's output contains the system call name, followed by its arguments in parenthesis and its return value.

To run a new command and start tracing its system calls, enter the command to be monitored as you normally do, and add strace at the beginning of the command line:

tux@mercury:~> strace ls
execve("/bin/ls", ["ls"], [/* 52 vars */]) = 0
brk(0)                                  = 0x618000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) \
	= 0x7f9848667000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) \
	= 0x7f9848666000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT \
(No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=200411, ...}) = 0
mmap(NULL, 200411, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f9848635000
close(3)                                = 0
open("/lib64/librt.so.1", O_RDONLY)     = 3
[...]
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) \
= 0x7fd780f79000
write(1, "Desktop\nDocuments\nbin\ninst-sys\n", 31Desktop
Documents
bin
inst-sys
) = 31
close(1)                                = 0
munmap(0x7fd780f79000, 4096)            = 0
close(2)                                = 0
exit_group(0)                           = ?

To attach strace to an already running process, you need to specify the -p with the process ID (PID) of the process that you want to monitor:

tux@mercury:~> strace -p `pidof mysqld`
Process 2868 attached - interrupt to quit
select(15, [13 14], NULL, NULL, NULL)   = 1 (in [14])
fcntl(14, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
accept(14, {sa_family=AF_FILE, NULL}, [2]) = 31
fcntl(14, F_SETFL, O_RDWR)              = 0
getsockname(31, {sa_family=AF_FILE, path="/var/run/mysql"}, [28]) = 0
fcntl(31, F_SETFL, O_RDONLY)            = 0
fcntl(31, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(31, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
[...]
setsockopt(31, SOL_IP, IP_TOS, [8], 4)  = -1 EOPNOTSUPP (Operation \
not supported)
clone(child_stack=0x7fd1864801f0, flags=CLONE_VM|CLONE_FS|CLONE_ \
FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_ \
PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fd1864809e0, \
tls=0x7fd186480910, child_tidptr=0x7fd1864809e0) = 21993
select(15, [13 14], NULL, NULL, NULL

The -e option understands several sub-options and arguments. For example, to trace all attempts to open or write to a particular file, use the following:

tux@mercury:~> strace -e trace=open,write ls ~
open("/etc/ld.so.cache", O_RDONLY)       = 3
open("/lib64/librt.so.1", O_RDONLY)      = 3
open("/lib64/libselinux.so.1", O_RDONLY) = 3
open("/lib64/libacl.so.1", O_RDONLY)     = 3
open("/lib64/libc.so.6", O_RDONLY)       = 3
open("/lib64/libpthread.so.0", O_RDONLY) = 3
[...]
open("/usr/lib/locale/cs_CZ.utf8/LC_CTYPE", O_RDONLY) = 3
open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
write(1, "addressbook.db.bak\nbin\ncxoffice\n"..., 311) = 311

To trace only network related system calls, use -e trace=network:

tux@mercury:~> strace -e trace=network -p 26520
Process 26520 attached - interrupt to quit
socket(PF_NETLINK, SOCK_RAW, 0)         = 50
bind(50, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(50, {sa_family=AF_NETLINK, pid=26520, groups=00000000}, \
[12]) = 0
sendto(50, "\24\0\0\0\26\0\1\3~p\315K\0\0\0\0\0\0\0\0", 20, 0,
{sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
[...]

The -c calculates the time the kernel spent on each system call:

tux@mercury:~> strace -c find /etc -name xorg.conf
/etc/X11/xorg.conf
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.38    0.000181         181         1           execve
 22.00    0.000123           0       576           getdents64
 19.50    0.000109           0       917        31 open
 19.14    0.000107           0       888           close
  4.11    0.000023           2        10           mprotect
  0.00    0.000000           0         1           write
[...]
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         3         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         4           fadvise64
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.000559                  3633        33 total

To trace all child processes of a process, use -f:

tux@mercury:~> strace -f rcapache2 status
execve("/usr/sbin/rcapache2", ["rcapache2", "status"], [/* 81 vars */]) = 0
brk(0)                                  = 0x69e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) \
= 0x7f3bb553b000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) \
= 0x7f3bb553a000
[...]
[pid  4823] rt_sigprocmask(SIG_SETMASK, [],  <unfinished ...>
[pid  4822] close(4 <unfinished ...>
[pid  4823] <... rt_sigprocmask resumed> NULL, 8) = 0
[pid  4822] <... close resumed> )       = 0
[...]
[pid  4825] mprotect(0x7fc42cbbd000, 16384, PROT_READ) = 0
[pid  4825] mprotect(0x60a000, 4096, PROT_READ) = 0
[pid  4825] mprotect(0x7fc42cde4000, 4096, PROT_READ) = 0
[pid  4825] munmap(0x7fc42cda2000, 261953) = 0
[...]
[pid  4830] munmap(0x7fb1fff10000, 261953) = 0
[pid  4830] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid  4830] open("/dev/tty", O_RDWR|O_NONBLOCK) = 3
[pid  4830] close(3)
[...]
read(255, "\n\n# Inform the caller not only v"..., 8192) = 73
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
exit_group(0)

If you need to analyze the output of strace and the output messages are too long to be inspected directly in the console window, use -o. In that case, unnecessary messages, such as information about attaching and detaching processes, are suppressed. You can also suppress these messages (normally printed on the standard output) with -q. To optionally prepend timestamps to each line with a system call, use -t:

tux@mercury:~> strace -t -o strace_sleep.txt sleep 1; more strace_sleep.txt
08:44:06 execve("/bin/sleep", ["sleep", "1"], [/* 81 vars */]) = 0
08:44:06 brk(0)                         = 0x606000
08:44:06 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, \
-1, 0) = 0x7f8e78cc5000
[...]
08:44:06 close(3)                       = 0
08:44:06 nanosleep({1, 0}, NULL)        = 0
08:44:07 close(1)                       = 0
08:44:07 close(2)                       = 0
08:44:07 exit_group(0)                  = ?

The behavior and output format of strace can be largely controlled. For more information, see the relevant manual page (man 1 strace).