How to recover from BTRFS errors

This document (7018181) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 11
SUSE Linux Enterprise Server 12

Situation

NOTE: This is a live document. Content may change without further notice!

Filesystem errors are not uncommon, but they need to be resolved to keep the system safe and stable.
This document concentrates on errors seen on the BTRFS filesystem on SUSE Linux Enterprise Server 11 and 12.

Note that for some of these errors there are two questions to investigate:

  1. What actually caused the corruption?
  2. Is there a bug in the repair tool that prevents it from fixing that corruption?

In any case, repairing a filesystem does not necessarily mean recovering the data. It means fixing the filesystem itself, not necessarily its contents.

Let's start with some best practices first. Whenever a btrfs filesystem contains errors, this TID is a good starting point.

Let's have a look at some typical errors seen in the past:

 WARNING: CPU: 2 PID: 452 at ../fs/btrfs/extent-tree.c:3731 btrfs_free_reserved_data_space_noquota+0xe8/0x100 [btrfs]()

The good news is that this is a WARNING, not a fatal error.
WARNINGs like this one, e.g. regarding quota, are typically runtime-only conditions that btrfs corrects by itself after the WARNING is issued. They are not a serious problem.

Yet, such an issue should be reported to Support for closer examination.
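
To attach the relevant messages to a support case, the WARNING lines can be filtered out of the kernel log. The following is a minimal sketch, not an official SUSE tool: the function name btrfs_warnings and the output path are illustrative, and the pattern assumes the WARNING line carries the "[btrfs]" module tag as in the example above.

```shell
# Print kernel WARNING lines that originate from the btrfs module.
# Reads the file(s) given as arguments, or stdin when none are given.
btrfs_warnings() {
    grep -E 'WARNING: .*\[btrfs\]' "$@"
}

# Typical use on a live system (reading dmesg may require root):
#   dmesg | btrfs_warnings > /tmp/btrfs-warnings.txt
```

The stack trace that follows each WARNING is not captured by this filter, so include the surrounding log lines as well when reporting the issue.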

____________________________________________
If you see a message like:
BTRFS: Transaction aborted (error -2)
followed by a stack trace which looks like:
[<ffffffffa041277b>] __btrfs_abort_transaction+0x4b/0x120 [btrfs]
[<ffffffffa0445f87>] __btrfs_unlink_inode+0x367/0x3c0 [btrfs]
[<ffffffffa04499e7>] btrfs_unlink_inode+0x17/0x40 [btrfs]
[<ffffffffa0449a76>] btrfs_unlink+0x66/0xb0 [btrfs]
kernel: BTRFS warning (device sdb3): __btrfs_unlink_inode:3802: Aborting unused transaction(No such entry).
and the filesystem is mounted read-only, a possible way to fix this issue is to try:
mount -t btrfs -o recovery,ro /dev/<device_name> /<mount_point>
If this does not work, start the repair procedure:
btrfs check --repair /dev/<device_name>
btrfs scrub start -Bf /dev/<device_name>
If this does not fix the problem, using the tools from a SLES 12 SP2 DVD might be a better choice.
As a last resort, the transaction log can be cleared with:
btrfs rescue zero-log /dev/<device_name>
____________________________________________

More serious issues show up when the filesystem floods the logs with messages, slows down considerably, or even goes read-only.
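
Whether a btrfs filesystem has gone read-only can be read from the kernel's mount table. The sketch below is an illustration only: the function name btrfs_readonly_mounts is made up for this example, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options).

```shell
# List btrfs filesystems that are currently mounted read-only.
# $1 = mounts table to read (defaults to /proc/mounts).
# Output: "<device> <mount point>" for each affected filesystem.
btrfs_readonly_mounts() {
    awk '$3 == "btrfs" && $4 ~ /(^|,)ro(,|$)/ { print $1, $2 }' "${1:-/proc/mounts}"
}
```

A filesystem that was deliberately mounted with "ro" is listed as well, so the output needs to be compared against the intended mount options.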


Resolution

If the root filesystem is affected, reboot the system into an independent rescue system from DVD, ISO image, USB pen drive, etc. Use the latest available rescue system; the more recent, the better.
SLES 12 SP2 comes with btrfsprogs v4.5.3, which is currently the best bet for fixing a corrupted btrfs filesystem.
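
Before attempting a repair, it can help to confirm which btrfsprogs version the rescue system actually provides. The following is a rough sketch under stated assumptions: the helper names are invented for this example, "sort -V" requires GNU coreutils or a compatible sort, and the output of "btrfs --version" is assumed to look like "btrfs-progs v4.5.3+20160729" (or "Btrfs v3.16" on older releases).

```shell
# Extract the dotted version number from "btrfs --version" style output,
# e.g. "btrfs-progs v4.5.3+20160729" -> "4.5.3".
btrfs_version_from() {
    printf '%s\n' "$1" | sed -n 's/^[Bb]trfs[^ ]* v\([0-9][0-9.]*\).*/\1/p'
}

# Succeed if version $1 is at least version $2 (dotted numeric versions).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Typical use:
#   v=$(btrfs_version_from "$(btrfs --version)")
#   version_at_least "$v" 4.5.3 || echo "btrfsprogs $v is older than 4.5.3"
```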

What to do if:
  • a bad tree root is found at mount time: mount with "-o recovery". This attempts to autocorrect that error.
  • weird ENOSPC issues are seen: mount with "-o clear_cache", which drops the btrfs free space cache.
  • quota issues prevent mounting: the latest available btrfsprogs are needed to fix that. See section "Additional Information".
  • quota issues are seen during normal operation: run 'btrfs quota rescan'.
  • Only if everything else fails, run 'btrfs check' and see whether the fsck could possibly fix this issue.

Note: If in doubt, open a case with Support. Running 'btrfs check --repair' with a version that cannot fix the particular issue might make things worse.
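
The list above can be summarized as a small lookup that prints the suggested first step for a given symptom. This is only a sketch: the function name and the symptom keywords (bad-tree-root, enospc, quota-mount, quota-runtime) are hypothetical labels chosen for this example, and nothing is executed.

```shell
# Print the suggested first step for a symptom keyword.
# The keywords are illustrative labels, not btrfs terminology.
btrfs_first_step() {
    case $1 in
        bad-tree-root) echo 'mount -o recovery /dev/<device_name> /<mount_point>' ;;
        enospc)        echo 'mount -o clear_cache /dev/<device_name> /<mount_point>' ;;
        quota-mount)   echo 'retry with the latest available btrfsprogs' ;;
        quota-runtime) echo 'btrfs quota rescan /<mount_point>' ;;
        *)             echo 'btrfs check /dev/<device_name>' ;;
    esac
}
```

The fallback deliberately prints the read-only 'btrfs check' without '--repair', in line with the note above.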

Additional Information

In some cases it may be necessary to use a more recent btrfs tools version to repair a damaged filesystem.
For SLES12 GA and SP1 it may work out to use the SP2 version of btrfsprogs.

Download the SP2 ISO image from the Customer Center and mount it on the /mnt directory on the system with the broken filesystem.
Extract the btrfs tool from the btrfsprogs rpm:

    rpm2cpio /mnt/suse/x86_64/btrfsprogs-4.5.3-14.4.x86_64.rpm | cpio -id ./usr/sbin/btrfs

Then use:

    ./usr/sbin/btrfs check --repair /dev/<defective btrfs device>

NOTE: This only works for a btrfs filesystem that is not mounted; in other words, it does not work for the root filesystem of a running system.
To repair the root filesystem, reboot the system into the rescue system from DVD.

Furthermore, as said above, use with care. If in doubt, run "./usr/sbin/btrfs check /dev/<defective btrfs device>" and send the output to SUSE Support for advice.
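
The read-only check and the "not mounted" precondition can be combined in a small wrapper. The sketch below is an assumption-laden illustration, not a SUSE-provided script: the function names and the BTRFS_BIN and MOUNTS variables are hypothetical, and the mount test only matches the exact device name as it appears in /proc/mounts (a device mounted under an alias such as a /dev/mapper name would not be detected).

```shell
# Succeed if device $1 appears as a mount source in the mounts table
# $2 (defaults to /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Run a read-only "btrfs check" on device $1 and save its output to $2.
# BTRFS_BIN and MOUNTS are hypothetical overrides used by this sketch.
check_to_report() {
    if is_mounted "$1" "${MOUNTS:-/proc/mounts}"; then
        echo "refusing: $1 is mounted" >&2
        return 1
    fi
    "${BTRFS_BIN:-./usr/sbin/btrfs}" check "$1" > "$2" 2>&1
}

# Typical use from the directory where the rpm was unpacked:
#   check_to_report /dev/<defective btrfs device> /tmp/btrfs-check.txt
```

The wrapper never passes '--repair', so the resulting report file can be sent to SUSE Support before any change is made to the filesystem.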

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7018181
  • Creation Date: 24-Oct-2016
  • Modified Date: 03-Mar-2020
    • SUSE Linux Enterprise Server


For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback@suse.com
