systemd-journald coredumps on fsync() because of the watchdog timer

This document (000019921) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server for SAP Applications 12
SUSE Linux Enterprise Server 12
 

Situation

systemd-journald crashes. The core dump shows that it was killed by a SIGABRT signal while blocked in fsync(), because it had stopped sending the watchdog keep-alive ping to PID 1:
#0  0x00007effc3fc1340 in __fsync_nocancel () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x00005568cc92847b in journal_file_set_online (f=0x5568ccb5ee20) at src/journal/journal-file.c:108
No locals.
#2  0x00005568cc928da8 in journal_file_append_object (f=f@entry=0x5568ccb5ee20, type=type@entry=OBJECT_DATA, size=size@entry=74, ret=ret@entry=0x7ffe95cad3d8, offset=offset@entry=0x7ffe95cad3d0)
    at src/journal/journal-file.c:575
        r = <optimized out>
        p = <optimized out>
        tail = 0x5568ccb5ee20
        o = <optimized out>
        t = 0x57566d05cb922363
        __PRETTY_FUNCTION__ = "journal_file_append_object"
#3  0x00005568cc9296eb in journal_file_append_data (f=f@entry=0x5568ccb5ee20, data=0x7ffe95cad6d0, size=10, ret=ret@entry=0x7ffe95cad5d8, offset=offset@entry=0x7ffe95cad5d0) at src/journal/journal-file.c:1083
        hash = <optimized out>
        p = 3740872
        osize = 74
        o = 0x7effc2f054c8
        r = <optimized out>
        compression = 0
        eq = <optimized out>
        __PRETTY_FUNCTION__ = "journal_file_append_data"
        __func__ = "journal_file_append_data"
#4  0x00005568cc92b144 in journal_file_append_entry (f=f@entry=0x5568ccb5ee20, ts=0x7ffe95cad5e0, ts@entry=0x0, iovec=<optimized out>, n_iovec=n_iovec@entry=18, seqnum=seqnum@entry=0x7ffe95cade30,
    ret=ret@entry=0x0, offset=offset@entry=0x0) at src/journal/journal-file.c:1451
        p = 3740872
        o = 0x7effc2f054c8
        i = <optimized out>
        items = 0x7ffe95cad450
        r = <optimized out>
        xor_hash = 13695064817713584937
        _ts = {realtime = 1611087924359329, monotonic = 2638961987174}
        __PRETTY_FUNCTION__ = "journal_file_append_entry"
#5  0x00005568cc90f5fc in write_to_journal (priority=<optimized out>, n=18, iovec=<optimized out>, uid=<optimized out>, s=0x7ffe95cadd80) at src/journal/journald-server.c:538
        vacuumed = false
        f = 0x5568ccb5ee20
        r = <optimized out>
        f = <optimized out>
        vacuumed = <optimized out>
        r = <optimized out>
        _level = <optimized out>
        _e = <optimized out>
        _level = <optimized out>
        _e = <optimized out>
        _level = <optimized out>
        _e = <optimized out>
        _level = <optimized out>
        _e = <optimized out>
...

Resolution

The workaround is to raise the watchdog timeout from its current default of 3 minutes, for example to 30 minutes:

#/usr/lib/systemd/system/systemd-journald.service
------------------------------------------------
[Service]
WatchdogSec=30min

Alternatively, disable the watchdog timeout entirely:

[Service]
WatchdogSec=0 


Afterwards, please make sure to reload the systemd unit files:

# systemctl daemon-reload
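
Files under /usr/lib/systemd/system belong to the systemd package and can be replaced when the package is updated. A drop-in file under /etc/systemd/system is a commonly used alternative that carries the same setting. The following is a minimal sketch and not part of the original workaround: the function name write_watchdog_dropin, the file name watchdog.conf and the optional second argument (an alternative root directory, useful for a dry run) are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: write a drop-in that overrides WatchdogSec for systemd-journald.
# The helper name and the file name "watchdog.conf" are illustrative only.
write_watchdog_dropin() {
    value="$1"                        # e.g. 30min, or 0 to disable the watchdog
    root="${2:-/etc/systemd/system}"  # optional alternative root (assumption)
    dir="$root/systemd-journald.service.d"

    mkdir -p "$dir" || return 1
    # The drop-in contains only the section and the single overridden setting.
    printf '[Service]\nWatchdogSec=%s\n' "$value" > "$dir/watchdog.conf"
}
```

Run as root, `write_watchdog_dropin 30min` followed by `systemctl daemon-reload` would apply the override. The value systemd has loaded can then be inspected with `systemctl show systemd-journald.service -p WatchdogUSec`.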

 

Cause

This problem occurs when fsync() takes so long that journald cannot send its watchdog ping in time. systemd (PID 1) then kills journald with SIGABRT.
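
To get a rough idea of whether synced writes are slow on the file system that holds the journal, a small probe file can be written with an fsync and timed. This is a hedged diagnostic sketch, not a SUSE-provided tool: the function name time_synced_write, the probe file name and the default directory /var/log/journal (the usual location of a persistent journal) are assumptions, and a single 4 KiB write does not reproduce journald's real I/O pattern. It relies on GNU dd (conv=fsync) and GNU date (%N).

```shell
#!/bin/sh
# Sketch: time one small write that is followed by fsync() in a directory.
# Prints the elapsed time in milliseconds. All names here are illustrative.
time_synced_write() {
    dir="${1:-/var/log/journal}"   # assumed journal location
    probe="$dir/.fsync-probe.$$"

    start=$(date +%s%N)
    # conv=fsync makes dd call fsync() on the output file before it exits.
    dd if=/dev/zero of="$probe" bs=4k count=1 conv=fsync 2>/dev/null \
        || { rm -f "$probe"; return 1; }
    end=$(date +%s%N)

    rm -f "$probe"
    echo $(( (end - start) / 1000000 ))
}
```

Calling `time_synced_write` repeatedly while the system is under its normal I/O load gives more meaningful numbers than a single run on an idle system.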

Status

Reported to Engineering

Additional Information

Upstream fixed this issue in systemd v230 by flushing the journal asynchronously. SLES 15 ships newer versions that do not have this issue, but SLES 12 (even SP5 with the latest updates) does not include the fix. The upstream patch set that directly addresses this issue is:

fb42603752 journal: add void cast to fsync() calls
69a3a6fd3d journal: add void cast to journal_file_close() calls
ac2e41f510 journal: asynchronous journal_file_set_offline()
b58c888f30 journal: defer journal closes on rotate

However, porting these patches into v228 (the version used in SLES 12 SP5) would require other dependent changes, risking regressions. It was therefore decided not to port the fixes into SLES 12 SP5.
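
Whether a given system runs a systemd version from before or after the upstream fix can be checked from the first line of `systemctl --version`, which has the form "systemd 228". The helpers below are a minimal sketch with hypothetical names. They compare only the upstream version number and therefore cannot tell whether a distribution has backported individual patches.

```shell
#!/bin/sh
# Sketch: read the version number from `systemctl --version` output on stdin.
# The first line of that output looks like "systemd 228".
systemd_major_version() {
    awk 'NR == 1 { print $2; exit }'
}

# Succeed if the given version is v230 or newer, the upstream release that
# introduced the asynchronous journal flush according to this document.
has_async_journal_flush() {
    [ "$1" -ge 230 ]
}
```

A possible use is `has_async_journal_flush "$(systemctl --version | systemd_major_version)" || echo "watchdog workaround may be needed"`.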

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019921
  • Creation Date: 18-Mar-2021
  • Modified Date: 15-Apr-2025
    • SUSE Linux Enterprise Server
    • SUSE Linux Enterprise Server for SAP Applications
