DeepSea stage 3 fails at "cephprocesses.wait" when deploying new OSD hosts into an existing cluster

This document (000019911) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Enterprise Storage 6
Dell PowerEdge R740xd Servers
 

Situation

When deploying new OSD servers, DeepSea stage 3 fails. Excerpt from the DeepSea stage summary:
 
Total states run: 1
Total run time: 275.271 ms
new-storage-node.my.company.ex:
----------
ID: wait for osd processes
Function: module.run
Name: cephprocesses.wait
Result: False
Comment: Module function cephprocesses.wait executed
Started: 13:56:37.095587
Duration: 932504.842 ms
Changes:
----------
ret:
False

Summary for new-storage-node.my.company.ex
------------
Succeeded: 0 (changed=1)
Failed: 1

Errors similar to the following appear in "/var/log/salt/minion" on the affected OSD host(s). Excerpt:
 
stderr: Error reading device /dev/sde at 0 length 512.
stderr: Error reading device /dev/sde at 0 length 4096.
...
 --> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.123 --yes-i-really-mean-it
...
stderr: purged osd.123
 -->  RuntimeError: command returned non-zero exit status: 5
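To see which disks are affected on a host, the minion log can be scanned for the read errors shown above. The following is a minimal sketch, not part of DeepSea or Salt; the function name count_read_errors is hypothetical, and the pattern assumes the message format matches the excerpt exactly.

```python
import re
from collections import Counter

# Matches lines such as:
#   stderr: Error reading device /dev/sde at 0 length 512.
# The pattern is an assumption based on the excerpt above.
READ_ERROR = re.compile(r"Error reading device (/dev/\S+) at (\d+) length (\d+)")


def count_read_errors(lines):
    """Count 'Error reading device' messages per device path."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

On an OSD host, this could be called with an open file handle for "/var/log/salt/minion"; every device that appears in the result is a candidate for the firmware problem described in this document.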

Resolution

Install the latest firmware for the Dell PowerEdge R740xd servers.

Cause

A firmware problem causes reads from the disks to return unexpected errors, so OSD creation fails.
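Whether a given disk returns read errors can be checked with a single non-destructive read, similar to the 512-byte and 4096-byte reads at offset 0 that fail in the minion log. The following is a minimal sketch under stated assumptions: the helper try_read is hypothetical, it must be run as root to open a block device, and because the read goes through the page cache, a successful read does not prove the disk is healthy.

```python
import os


def try_read(path, offset=0, length=512):
    """Attempt one read of `length` bytes at `offset`.

    Returns None if the read succeeds, or the OSError that was raised.
    The device is opened read-only, so nothing is written to it.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return exc
    try:
        os.pread(fd, length, offset)
        return None
    except OSError as exc:
        return exc
    finally:
        os.close(fd)
```

For example, try_read("/dev/sde", 0, 4096) returning an OSError would be consistent with the errors described in this document.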

Additional Information

To verify that the latest firmware is in use, consult the Dell support site.

The following errors may also be logged in "/var/log/messages" on the new OSD hosts:
 
2021-03-16T14:04:41.396107+01:00 new-storage-node kernel: [515288.323451] mpt3sas_cm0: log_info(0x3112043b): originator(PL), code(0x12), sub_code(0x043b)
2021-03-16T14:04:41.396125+01:00 new-storage-node kernel: [515288.323471] sd 0:0:1:0: [sde] tag#4 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_SENSE
2021-03-16T14:04:41.396127+01:00 new-storage-node kernel: [515288.323476] sd 0:0:1:0: [sde] tag#4 Sense Key : Illegal Request [current] 
2021-03-16T14:04:41.396129+01:00 new-storage-node kernel: [515288.323481] sd 0:0:1:0: [sde] tag#4 Add. Sense: Logical block reference tag check failed
2021-03-16T14:04:41.396130+01:00 new-storage-node kernel: [515288.323485] sd 0:0:1:0: [sde] tag#4 CDB: Read(32)
2021-03-16T14:04:41.396132+01:00 new-storage-node kernel: [515288.323489] sd 0:0:1:0: [sde] tag#4 CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 06
2021-03-16T14:04:41.396134+01:00 new-storage-node kernel: [515288.323492] sd 0:0:1:0: [sde] tag#4 CDB[10]: 3c bf ff 80 3c bf ff 80 00 00 00 00 00 00 00 08
2021-03-16T14:04:41.396173+01:00 new-storage-node kernel: [515288.323496] print_req_error: protection error, dev sde, sector 26789019520
...
2021-03-16T14:04:41.250174+01:00 new-storage-node kernel: [515755.159101] print_req_error: 400 callbacks suppressed
2021-03-16T14:04:41.250175+01:00 new-storage-node kernel: [515755.159106] print_req_error: protection error, dev sde, sector 0
 ...
2021-03-16T14:05:41.012024+01:00 new-storage-node kernel: [515756.919719] Buffer I/O error on dev sde, logical block 3348627440, async page read
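The kernel messages above can be summarized per device to confirm that a host shows the same symptom. The following is a minimal sketch; the function name protection_errors is hypothetical, and the pattern matches only the "print_req_error: protection error" format shown in this excerpt, which other kernel versions may word differently.

```python
import re

# Matches lines such as:
#   ... print_req_error: protection error, dev sde, sector 26789019520
# The pattern is an assumption based on the excerpt above.
PROTECTION_ERROR = re.compile(
    r"print_req_error: protection error, dev (\w+), sector (\d+)"
)


def protection_errors(lines):
    """Return a dict mapping device name to the list of failing sectors."""
    result = {}
    for line in lines:
        match = PROTECTION_ERROR.search(line)
        if match:
            result.setdefault(match.group(1), []).append(int(match.group(2)))
    return result
```

Passing an open file handle for "/var/log/messages" to this function lists the devices and sectors for which the kernel reported protection errors.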

 

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019911
  • Creation Date: 16-Mar-2021
  • Modified Date: 16-Mar-2021
    • SUSE Enterprise Storage
