System crash on NFS cluster over OCFS2

This document (000019759) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15 SP2
SUSE Linux Enterprise Server 15 SP1
SUSE Linux Enterprise Server 12 SP5
SUSE Linux Enterprise Server 12 SP4

Situation

When running an NFS server cluster over OCFS2, the system crashed on an invalid inode because ocfs2_test_inode_bit() wrongly assumed the inode had been obtained from a per-slot inode allocator instead of global_inode_alloc. The crash produced the following backtrace:
crash> bt
PID: 22426  TASK: ffff956e4ece44c0  CPU: 2   COMMAND: "nfsd"
 #0 [ffffb57ec87638b0] machine_kexec at ffffffffab061cc1
 #1 [ffffb57ec8763900] __crash_kexec at ffffffffab12602e
 #2 [ffffb57ec87639b8] crash_kexec at ffffffffab126d5d
 #3 [ffffb57ec87639d0] oops_end at ffffffffab03135f
 #4 [ffffb57ec87639f0] general_protection at ffffffffab801755
    [exception RIP: _raw_spin_lock+12]
    RIP: ffffffffab7246cc  RSP: ffffb57ec8763aa0  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: 00200067006e0069  RCX: 0000000000000000
    RDX: 0000000000000001  RSI: 0000000000000009  RDI: 00200067006e00f1
    RBP: 00200067006e00f1   R8: ffff957010325af8   R9: 0000000000000000
    R10: ffffb57ec3b5fda8  R11: 0000000000000001  R12: ffff956fdf40d7d0
    R13: ffff95700f4b3a90  R14: 000000000000ffff  R15: ffff95700f4b30d0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffffb57ec8763aa0] igrab at ffffffffab27f169
 #6 [ffffb57ec8763ab8] ocfs2_get_system_file_inode at ffffffffc0b04295 [ocfs2]
 #7 [ffffb57ec8763b38] ocfs2_test_inode_bit at ffffffffc0af1854 [ocfs2]
 #8 [ffffb57ec8763ba0] ocfs2_get_parent at ffffffffc0abf014 [ocfs2]
 #9 [ffffb57ec8763bf8] reconnect_path at ffffffffab2eae47
#10 [ffffb57ec8763c38] exportfs_decode_fh at ffffffffab2eb2fc
#11 [ffffb57ec8763d88] fh_verify at ffffffffc0d5f815 [nfsd]
#12 [ffffb57ec8763de8] nfsd4_proc_compound at ffffffffc0d6e274 [nfsd]
#13 [ffffb57ec8763e48] nfsd_dispatch at ffffffffc0d5c24a [nfsd]
#14 [ffffb57ec8763e78] svc_process_common at ffffffffc0857e34 [sunrpc]
#15 [ffffb57ec8763ed0] svc_process at ffffffffc0859234 [sunrpc]
#16 [ffffb57ec8763ef0] nfsd at ffffffffc0d5bc73 [nfsd]
#17 [ffffb57ec8763f10] kthread at ffffffffab0ae0dd
#18 [ffffb57ec8763f50] ret_from_fork at ffffffffab800235


In another crash pattern, the following messages may appear in the kernel ring buffer (dmesg):
[1348408.600951] (nfsd,16700,3):ocfs2_test_inode_bit:2928 ERROR: status = -22
[1348418.505151] (nfsd,16700,3):ocfs2_test_inode_bit:2902 ERROR: unable to get alloc inode in slot 65535
[1348418.505153] (nfsd,16700,3):ocfs2_test_inode_bit:2928 ERROR: status = -22
[1348418.508460] ------------[ cut here ]------------
[1348418.508462] kernel BUG at ../fs/ocfs2/inode.c:1387!
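To check saved kernel logs for this second pattern, a minimal Python sketch such as the following could be used. The signatures are copied from the messages above; the function name matching_lines is hypothetical, and the source line numbers after the function name (2902, 2928) are matched loosely on the assumption that they can differ between kernel builds. A match is only a hint, since "status = -22" is a generic error code.

```python
import re

# Signatures taken from the dmesg excerpt in this document.
# The line number after "ocfs2_test_inode_bit:" is matched loosely (assumption:
# it may differ between kernel builds).
SIGNATURES = [
    re.compile(r"ocfs2_test_inode_bit:\d+ ERROR: unable to get alloc inode in slot 65535"),
    re.compile(r"ocfs2_test_inode_bit:\d+ ERROR: status = -22"),
]


def matching_lines(log_text):
    """Return the log lines that contain one of the known error signatures."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in SIGNATURES)
    ]


if __name__ == "__main__":
    sample = (
        "[1348418.505151] (nfsd,16700,3):ocfs2_test_inode_bit:2902 ERROR: "
        "unable to get alloc inode in slot 65535\n"
        "[1348418.505153] (nfsd,16700,3):ocfs2_test_inode_bit:2928 ERROR: status = -22\n"
        "[1348418.508460] ------------[ cut here ]------------\n"
    )
    for hit in matching_lines(sample):
        print(hit)
```

The text passed to matching_lines() can come from any saved copy of the kernel log, for example the output of dmesg redirected to a file.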


 

Resolution

This has been fixed in the OCFS2 kernel module, starting with the following kernel releases:

SLES15 SP2 - 5.3.18-24.12.1
SLES15 SP1 - 4.12.14-197.56.1
SLES12 SP5 - 4.12.14-122.37.1
SLES12 SP4 - 4.12.14-95.60.1
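As an illustration, an installed kernel package version can be compared with the list above using a short Python sketch. The names FIXED, version_key and has_fix are hypothetical. The comparison simply compares the numeric fields in order, which is a simplification of RPM version ordering, and it assumes the full package version (for example as reported by rpm -q kernel-default) is passed in, since the string printed by uname -r may omit the last field.

```python
import re

# Fixed kernel versions copied from the list above.
FIXED = {
    "SLES15 SP2": "5.3.18-24.12.1",
    "SLES15 SP1": "4.12.14-197.56.1",
    "SLES12 SP5": "4.12.14-122.37.1",
    "SLES12 SP4": "4.12.14-95.60.1",
}


def version_key(version):
    """Turn '4.12.14-197.56.1' into (4, 12, 14, 197, 56, 1) for comparison."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def has_fix(release, installed):
    """Return True if the installed package version is at or above the fixed one.

    Simplification: numeric fields are compared in order; this is not a full
    implementation of RPM version comparison.
    """
    return version_key(installed) >= version_key(FIXED[release])


if __name__ == "__main__":
    # Versions taken from this document: an affected and a fixed SLES15 SP2 kernel.
    print(has_fix("SLES15 SP2", "5.3.18-24.9.1"))
    print(has_fix("SLES15 SP2", "5.3.18-24.12.1"))
```

Only versions belonging to the same service pack should be compared with each other, because each service pack has its own kernel base version and release numbering.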
 

Cause

The crashes were caused by a bug in the OCFS2 kernel module, in the ocfs2_test_inode_bit() code path described in the Situation section.

Additional Information

This issue may be encountered on the following SLES and kernel versions:
  • SUSE Linux Enterprise Server 15 SP2 with kernel 5.3.18-24.9.1 or older.
  • SUSE Linux Enterprise Server 15 SP1 with kernel 4.12.14-197.51.1 or older.
  • SUSE Linux Enterprise Server 12 SP5 with kernel 4.12.14-122.32.1 or older.
  • SUSE Linux Enterprise Server 12 SP4 with kernel 4.12.14-95.57.1 or older.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019759
  • Creation Date: 27-Oct-2020
  • Modified Date: 27-Oct-2020
    • SUSE Linux Enterprise Server
    • SUSE Linux Enterprise Server for SAP Applications

