SCSI errors resulting in down Object Storage Daemons (OSDs)

This document (7023562) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 12 Service Pack 3 (SLES 12 SP3)
SUSE Enterprise Storage 5

LSI MPT Fusion SAS 3.0 Device Driver
tuned

Situation

With the default driver version "15.100.00.00", SCSI commands time out, producing the following SCSI errors and SUSE Enterprise Storage (SES) errors:

2018-12-01T05:01:20.649519+01:00 server_name kernel: [40811.416245] sd 0:0:X:0: timing out command, waited 180s
2018-12-01T05:01:20.649528+01:00 server_name kernel: [40811.416279] sd 0:0:X:0: [sdX] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2018-12-01T05:01:20.649528+01:00 server_name kernel: [40811.416290] sd 0:0:X:0: [sdX] tag#8 Sense Key : Not Ready [current]
...
osd.xxx failed (root=xxxx,disktype=xxx,host=serverxxx) (xx reporters from different host after 100.964986 >= grace 100.287216)
we have enough reporters to mark osd.xxx down
...
map e422871 wrongly marked me down at e422870
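
To see which devices are affected and how often, the timeout messages above can be counted per SCSI address. The following is a minimal sketch, not a supported tool: the function name "count_scsi_timeouts" is hypothetical, and it assumes the kernel messages are written to /var/log/messages in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print
# "<scsi-address> <count>" for every "timing out command" message.
count_scsi_timeouts() {
    awk '/timing out command/ {
        for (i = 1; i < NF; i++) {
            if ($i == "sd") {            # the SCSI address follows the "sd" token
                addr = $(i + 1)
                sub(/:$/, "", addr)      # drop the trailing colon
                n[addr]++
                break
            }
        }
    }
    END { for (a in n) print a, n[a] }' | sort
}

# Assumption: kernel messages are logged to /var/log/messages.
if [ -r /var/log/messages ]; then
    count_scsi_timeouts < /var/log/messages
fi
```

A device that appears repeatedly in this output is a good candidate for the checks in the sections below.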

Resolution

The current workaround is to:

1. Make sure the latest HBA firmware is applied.
2. Update to the latest upstream mpt3sas driver version 26.00.00.00 available from the hardware vendor.

NOTE: See also the additional information section regarding "tuned".
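
To confirm whether the updated driver is actually in use, the version of the loaded module can be compared with the version named above. This is a sketch under assumptions: the helper name "version_ge" is hypothetical, the comparison relies on "sort -V" from GNU coreutils, and the loaded module is assumed to expose its version in /sys/module/mpt3sas/version.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to
# version $2, using the version ordering of "sort -V".
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="26.00.00.00"   # driver version named in this document

# Assumption: the loaded module reports its version through sysfs.
loaded="$(cat /sys/module/mpt3sas/version 2>/dev/null)"

if [ -z "$loaded" ]; then
    echo "mpt3sas does not appear to be loaded"
elif version_ge "$loaded" "$required"; then
    echo "mpt3sas $loaded is at or above $required"
else
    echo "mpt3sas $loaded is older than $required"
fi
```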

Cause


Additional Information

Since SES 5.5 with DeepSea 0.8.6, tuned profiles are also implemented and enabled. With the default tuned settings, power management is enabled for spinning disks. This can cause disks hosting OSDs to spin down, which is undesirable.

To disable power management on the drives hosting OSDs and to disable tuned for now, take the following steps:

# hdparm -B 255 -S 0 /dev/sd<X>

# tuned-adm off
# systemctl stop tuned.service
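
On a node with many disks, the hdparm command above has to be repeated for each drive. The following sketch lists the spinning disks and prints the matching commands as a dry run. It rests on assumptions that need checking on each system: the function name "list_rotational_disks" is hypothetical, every rotational "sd" device is assumed to host an OSD, and the kernel's queue/rotational flag is assumed to be reported correctly for the attached drives.

```shell
#!/bin/sh
# Hypothetical helper: print the names of sd* block devices that the kernel
# reports as rotational. $1 is the sysfs block directory (default /sys/block).
list_rotational_disks() {
    sysfs_block="${1:-/sys/block}"
    for dev in "$sysfs_block"/sd*; do
        # Skip entries without the flag (also covers an unmatched glob).
        [ -r "$dev/queue/rotational" ] || continue
        # queue/rotational holds 1 for spinning disks and 0 for SSDs.
        if [ "$(cat "$dev/queue/rotational")" = "1" ]; then
            basename "$dev"
        fi
    done
}

# Dry run: print the commands instead of executing them.
for disk in $(list_rotational_disks); do
    echo "hdparm -B 255 -S 0 /dev/$disk"
done
```

Review the printed list first and remove any disk that does not host an OSD before running the commands as root.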

To see the current disk status:

# hdparm -C /dev/sd<X>

Using "smartctl", which is part of the "smartmontools" package, can also be useful, for example to view disk information (including "Start_Stop_Count"):

# smartctl -a /dev/sd<X>
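
To check whether a disk is still spinning down, the raw "Start_Stop_Count" value can be recorded and compared again later. The sketch below extracts that value from the attribute table. It assumes the ATA attribute layout printed by "smartctl -A", where the raw value is the tenth column; SAS drives report start-stop cycles in a different format that this does not parse. The function name "start_stop_count" is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "smartctl -A" output on stdin and print the raw
# value of the Start_Stop_Count attribute (tenth column of the ATA table).
start_stop_count() {
    awk '$2 == "Start_Stop_Count" { print $10 }'
}

# Usage: pass the device as the first argument, for example /dev/sda.
if [ -n "$1" ]; then
    smartctl -A "$1" | start_stop_count
fi
```

If the value keeps rising between two checks on an otherwise idle OSD disk, power management is likely still active for that drive.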

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7023562
  • Creation Date: 05-Dec-2018
  • Modified Date: 03-Mar-2020
    • SUSE Enterprise Storage
    • SUSE Linux Enterprise Server
