SES5.5: Stage.2 fails with "Exception occurred in runner advise.osds".

This document (000019657) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Enterprise Storage 5.5

Situation

SES5.5: Stage.2 fails with "Exception occurred in runner advise.osds".

Below is a portion of the stage.2 output:
          ID: advise OSDs
    Function: salt.runner
        Name: advise.osds
      Result: False
     Comment: Exception occurred in runner advise.osds: Traceback (most recent call last):
                File "/usr/lib/python2.7/site-packages/salt/client/mixins.py", line 395, in _low                      
                  data['return'] = self.functions[fun](*args, **kwargs)                                               
                File "/srv/modules/runners/advise.py", line 144, in osds                                              
                  unconfigured = _tidy('unconfigured', report)                                                        
                File "/srv/modules/runners/advise.py", line 179, in _tidy                                             
                  if report[minion][key]:                                                                             
              TypeError: string indices must be integers, not str                                                     
     Started: 18:02:57.779402
    Duration: 15817.362 ms
     Changes:   

Resolution

Run the following command on the admin node:
# salt -I roles:storage osd.report human=False

Most likely a minion is not replying, causing stage.2 to fail. 

Example of success:
ceph-osd4:~ # salt -I roles:storage osd.report human=False
ceph-osd11.ceph.example.com:
    ----------
    changed:
    unconfigured:
    unmounted:
ceph-osd12.ceph.example.com:
    ----------
    changed:
    unconfigured:
    unmounted:
ceph-osd15.ceph.example.com:
    ----------
    changed:
    unconfigured:
    unmounted:
ceph-osd14.ceph.example.com:
    ----------
    changed:
    unconfigured:
    unmounted:
ceph-osd10.ceph.example.com:
    ----------
    changed:
    unconfigured:
    unmounted:
ceph-osd13.ceph.example.com:
    ----------
    changed:
    unconfigured:
    unmounted:

Each minion should report back the triplet of 'unconfigured', 'changed' and 'unmounted'.
A minion that does not reply, or that replies with anything other than this triplet, is the likely cause of the failure.
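The TypeError in the stage.2 traceback is consistent with this: indexing a string by key fails in exactly that way, so a minion whose return value is an error string instead of a dictionary would trigger it. The following is a minimal Python sketch of how such minions could be picked out. It assumes the report is already available as a dict mapping minion ID to return value (for example, loaded from output produced with Salt's `--out=json` outputter). The function name `find_bad_minions` and the sample data are hypothetical and are not part of DeepSea.

```python
EXPECTED_KEYS = {"unconfigured", "changed", "unmounted"}


def find_bad_minions(report):
    """Return {minion_id: reason} for every minion whose osd.report
    return value is not a dict holding the expected triplet."""
    bad = {}
    for minion, result in report.items():
        if not isinstance(result, dict):
            # An error string (or any other non-dict value) lands here.
            # Indexing such a value by key raises the TypeError seen above.
            bad[minion] = "non-dict return: {!r}".format(result)[:80]
        elif not EXPECTED_KEYS.issubset(result):
            missing = sorted(EXPECTED_KEYS - set(result))
            bad[minion] = "missing keys: " + ", ".join(missing)
    return bad


# Hypothetical sample data modelled on the outputs in this document.
report = {
    "ceph-osd11.ceph.example.com": {
        "changed": "", "unconfigured": "", "unmounted": "",
    },
    "ceph-osd16.ceph.example.com": "The minion function caused an exception: ...",
}

for minion, reason in sorted(find_bad_minions(report).items()):
    print(minion, "->", reason)
```

Any minion listed by such a check is a candidate for the troubleshooting steps in this document.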

Example of a failure:
ceph-osd4:~ # salt -I roles:storage osd.report human=False
ceph-osd16.ceph.example.com:
   The minion function caused an exception: Traceback (most recent call last):
     File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1455, in _thread_return
       return_data = executor.execute()
     File "/usr/lib/python2.7/site-packages/salt/executors/direct_call.py", line 28, in execute
       return self.func(*self.args, **self.kwargs)
     File "/var/cache/salt/minion/extmods/modules/osd.py", line 2352, in report
       un1, ch1 = _report_pillar(active)
     File "/var/cache/salt/minion/extmods/modules/osd.py", line 2412, in _report_pillar
       unconfigured = __pillar__['ceph']['storage']['osds'].keys()
   KeyError: 'storage'

The remaining nodes report back as expected.
In this case, the failing node had been removed from the cluster in the past, with the intention of reintroducing it at some point in the future.
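The KeyError in the failing output indicates that the minion's pillar has no 'storage' entry under 'ceph', so the chained lookup stops there. A minimal Python sketch of that lookup, assuming the pillar is an ordinary nested dict; the helper `first_missing_key` and the sample pillar contents are hypothetical illustrations, not DeepSea code:

```python
def first_missing_key(pillar, path=("ceph", "storage", "osds")):
    """Walk `path` through nested dicts and return the first key that is
    absent, or None if the whole path exists."""
    node = pillar
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return key
        node = node[key]
    return None


# Hypothetical pillar for a node that still has a minion key
# but no longer carries any storage configuration.
pillar = {"ceph": {}}

print(first_missing_key(pillar))
```

With the sample pillar above, the first missing key is 'storage', matching the key named in the KeyError.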

Address the minion that is not reporting correctly, in this case "ceph-osd16.ceph.example.com".
In some cases there may be an invalid minion key, which needs to be removed.
List the salt keys:
  salt-key --list-all

Delete a salt key:
  salt-key -d <minion_name>

In other cases, restarting the minion may resolve the issue.
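When many keys are accepted, it can help to compare them against the nodes that are actually meant to be in the cluster before deleting anything. The following minimal Python sketch assumes both lists are supplied by the operator, for instance copied from the `salt-key --list-all` output and from the cluster's own inventory. The function name `unexpected_keys` and the sample hostnames are hypothetical.

```python
def unexpected_keys(accepted_keys, expected_nodes):
    """Return accepted minion keys that are not in the list of nodes
    expected to be part of the cluster. These are candidates for review,
    not for automatic deletion."""
    return sorted(set(accepted_keys) - set(expected_nodes))


# Hypothetical sample data: osd16 was removed from the cluster,
# but its key is still accepted on the salt-master.
accepted = [
    "ceph-osd10.ceph.example.com",
    "ceph-osd11.ceph.example.com",
    "ceph-osd16.ceph.example.com",
]
expected = [
    "ceph-osd10.ceph.example.com",
    "ceph-osd11.ceph.example.com",
]

print(unexpected_keys(accepted, expected))
```

Each key reported this way should be checked by hand before it is removed with `salt-key -d <minion_name>`.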
 

Cause

The customer removed an OSD node from the cluster but did not delete that node's minion key from the salt-master configuration.

Status

Top Issue

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019657
  • Creation Date: 29-Jun-2020
  • Modified Date: 29-Jun-2020
  • Product: SUSE Enterprise Storage
