Filestore directory split error related log entries

This document (000020242) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Enterprise Storage 6

Situation

An OSD logs filestore directory split error events, indicated by the following log messages:

  _created {DIR} has {NUM} objects, starting split in pg {PG}_head
  _created {DIR} split completed in pg {PG}_head


followed by

  filestore(/var/lib/ceph/osd/ceph-{ID}) error creating {OID} ({PATH}) in index

Further investigation shows a "Medium Error" being logged by the SCSI subsystem for the disk belonging to that OSD (see the output of "journalctl -t kernel") prior to the directory split error event.
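The disk-level error can be confirmed by filtering the kernel log on the node hosting the OSD. The sample message below is hypothetical and only illustrates the kind of line to look for; the real check is the journalctl pipeline shown in the comment:

```shell
# Real check on the OSD node:
#   journalctl -t kernel | grep -i "medium error"
# The line below is a hypothetical kernel message of the kind to look for.
sample='kernel: sd 1:0:0:0: [sdb] Sense Key : Medium Error [current]'
printf '%s\n' "$sample" | grep -i "medium error"
```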

Resolution

When a hardware issue is detected for the OSD, replace the OSD and wait for all PGs to complete their backfill to the replacement OSD. Afterwards, run a deep-scrub on these PGs.

IMPORTANT: To avoid data loss, do not add new OSDs or change OSD weights until the initial deep-scrub is complete.
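The deep-scrub step above can be sketched as a small helper, assuming the ceph CLI and jq are available and that the JSON output of "ceph pg ls-by-osd" carries a pg_stats array (as in Ceph Nautilus, the base of SES 6); the OSD id is a placeholder:

```shell
# Hypothetical id of the replacement OSD whose PGs should be deep-scrubbed
# once backfill has completed.
OSD_ID=12

deep_scrub_pgs_of_osd() {
  # List the PGs mapped to the OSD, then request a deep-scrub for each one.
  ceph pg ls-by-osd "osd.${OSD_ID}" -f json \
    | jq -r '.pg_stats[].pgid' \
    | while read -r pgid; do
        ceph pg deep-scrub "$pgid"
      done
}
```

Progress can be followed with "ceph -s"; wait until all requested deep-scrubs have finished before making any further changes to the cluster.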

Cause

An error during a filestore directory split, such as one caused by a failing disk, is not handled as a scenario requiring manual intervention; in particular, it does not terminate the OSD process.

Status

Top Issue

Additional Information

SUSE can build and provide a PTF (Program Temporary Fix) that changes the log output and intentionally causes ceph-osd processes to terminate. Please open a SUSE support case to obtain the PTF package.

After applying the PTF, the OSD process will log one of the following error messages when encountering a filestore directory split error,

  error starting split {DIR} in pg {PG}
or
  error completing split {DIR} in pg {PG}

and terminate itself.

The intentional termination of the OSD process in this situation makes the problem much easier to detect and avoids introducing inconsistencies. Depending on the systemd configuration, if the OSD terminates repeatedly, systemd will stop restarting it permanently; the OSD will then be marked out and data migration will start.
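The restart behavior involved is governed by systemd's standard rate-limit settings. The excerpt below only illustrates the shape of such a unit; the values are examples and not necessarily those shipped with the ceph-osd@.service unit in SES 6:

```ini
# Illustrative excerpt of a ceph-osd@.service unit (example values).
[Unit]
# Stop restarting the OSD if it fails 3 times within 30 minutes.
StartLimitBurst=3
StartLimitIntervalSec=30min

[Service]
Restart=on-failure
RestartSec=10s
```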

The expectation is that the cluster administrator will notice the OSD termination (e.g. "ceph status" provides information about down OSDs and recent crashes) and will perform the necessary steps to investigate the cause.
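A minimal set of such checks is sketched below, assuming the standard ceph CLI (the crash commands are available from Ceph Nautilus onward). The commands are wrapped in a function so they are shown without being executed here; the crash id is a placeholder to be taken from the "ceph crash ls-new" output:

```shell
# Hypothetical id taken from the output of "ceph crash ls-new".
CRASH_ID="<id-from-crash-ls>"

check_osd_failures() {
  ceph status                  # overall health, including down OSDs
  ceph osd tree down           # which OSDs are down, and on which hosts
  ceph crash ls-new            # recent daemon crashes not yet acknowledged
  ceph crash info "$CRASH_ID"  # backtrace and metadata for one crash
}
```

Run the commands individually on a cluster node that holds an admin keyring.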

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020242
  • Creation Date: 01-Jun-2021
  • Modified Date: 01-Jun-2021
  • Product: SUSE Enterprise Storage