Performance Degradation Observed After SUSE Enterprise Storage 6 Patch Cycle
This document (000019829) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Enterprise Storage 6
Situation
A noticeable drop in cluster performance is observed after updating SUSE Enterprise Storage 6 to version 14.2.9.969 or later.
Resolution
To restore the previous behaviour, set 'bluefs_buffered_io' back to 'true'. This can be accomplished via the following methods:
-------------------------------------------------------------------------------------------------
Without restarting OSDs (temporary setting until OSD is restarted):
-------------------------------------------------------------------------------------------------
On the OSD node: # ceph daemon osd.<id> config set bluefs_buffered_io true
From the monitor: # ceph tell osd.<id> injectargs '--bluefs_buffered_io=true' # single osd
e.g. # ceph tell osd.56 injectargs '--bluefs_buffered_io=true'
# ceph tell osd.* injectargs '--bluefs_buffered_io=true' # all osds in cluster
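To confirm the running value on a given OSD, the admin socket can be queried on the OSD node (an illustrative check; osd.56 is only an example id, and output similar to the following is expected):
# ceph daemon osd.<id> config get bluefs_buffered_io
e.g. # ceph daemon osd.56 config get bluefs_buffered_io
{
    "bluefs_buffered_io": "true"
}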
---------------------------------------------------------------------------------------------------------------------------------------
Permanent setting for entire cluster (and without running the more invasive 'stage' commands):
---------------------------------------------------------------------------------------------------------------------------------------
On salt master, create/edit: /srv/salt/ceph/configuration/files/ceph.conf.d/osd.conf
Add the following line to osd.conf:
bluefs_buffered_io=true
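For example, the line can be appended from a shell on the salt master (a simple illustration; any editor can be used instead):
# echo "bluefs_buffered_io=true" >> /srv/salt/ceph/configuration/files/ceph.conf.d/osd.conf
# cat /srv/salt/ceph/configuration/files/ceph.conf.d/osd.conf
bluefs_buffered_io=true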
The following command builds the ceph.conf to be pushed out, incorporating osd.conf:
# salt '<salt_master_node_name>' state.apply ceph.configuration.create
e.g. # salt 'mon1.suse.com' state.apply ceph.configuration.create
------------------------
mon1:/srv/salt/ceph/configuration/files/ceph.conf.d # salt 'mon1.suse.com' state.apply ceph.configuration.create
mon1.suse.com:
Name: /var/cache/salt/minion/files/base/ceph/configuration - Function: file.absent - Result: Changed Started: - 18:13:46.846499 Duration: 11.657 ms
Name: /srv/salt/ceph/configuration/cache/ceph.conf - Function: file.managed - Result: Changed Started: - 18:13:46.858342 Duration: 5640.245 ms
Name: find /var/cache/salt/master/jobs -user root -exec chown salt:salt {} ';' - Function: cmd.run - Result: Changed Started: - 18:13:52.525985 Duration: 43.848 ms
Summary for mon1.suse.com
------------
Succeeded: 3 (changed=3)
Failed: 0
------------
Total states run: 3
Total run time: 5.696 s
------------------------
The next command pushes out the ceph.conf to all nodes in the cluster:
# salt '*' state.apply ceph.configuration
e.g.
mon1:/srv/salt/ceph/configuration/files/ceph.conf.d # salt '*' state.apply ceph.configuration
osdnode3.suse.com:
Name: /etc/ceph/ceph.conf - Function: file.managed - Result: Changed Started: - 18:16:47.268270 Duration: 101.657 ms
Summary for osdnode3.suse.com
------------
Succeeded: 1 (changed=1)
Failed: 0
------------
Total states run: 1
Total run time: 101.657 ms
....... [text removed to shorten example - there should be one 'entry' for each node in the cluster]
Summary for osdnode2.suse.com
------------
Succeeded: 1 (changed=1)
Failed: 0
------------
Total states run: 1
Total run time: 131.195 ms
mon1:/srv/salt/ceph/configuration/files/ceph.conf.d #
Check the /etc/ceph/ceph.conf on some of the nodes to make sure the change has been made.
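For example, this can be checked from the salt master for all minions at once (an illustrative command using Salt's cmd.run module; output will look similar to the line shown):
# salt '*' cmd.run 'grep bluefs_buffered_io /etc/ceph/ceph.conf'
osdnode3.suse.com:
    bluefs_buffered_io=true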
----------------------------------------------------------------
NOTE: If only the 'permanent' change is made, each OSD will have to be restarted in order to pick up the parameter change.
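One way to restart an individual OSD daemon is via systemd on the node hosting it (an illustrative example; osd.56 is only an example id). Restart OSDs one at a time and wait for the cluster to return to HEALTH_OK before restarting the next one:
# systemctl restart ceph-osd@56.service
# ceph -s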
Cause
Because SUSE Enterprise Storage 6 is based on the Ceph 'Nautilus' release, the default setting for 'bluefs_buffered_io' was also 'true' (enabled) prior to SUSE Enterprise Storage 6 version 14.2.9.969.
In version 14.2.9.969 and later, the default was changed to 'false' (disabled).
A decision was made that the parameter defaulting to 'true' was essentially a regression in the Nautilus release and that it should again default to 'false'. Having 'bluefs_buffered_io' set to 'true' has also been linked with performance issues for files over 2GB.
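To see which Ceph version the daemons in the cluster are currently running (and therefore which default applies), the following can be run from a node with an admin keyring (illustrative):
# ceph versions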
Status
Additional Information
The current recommendation is that if no problems have been observed with 'bluefs_buffered_io' enabled, it is safe to continue using it, but kernel swap usage should be monitored regularly for any sign of thrashing.
See also: https://www.spinics.net/lists/ceph-users/msg64164.html
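Kernel swap activity can be watched with standard tools; for example, the 'si' (swap in) and 'so' (swap out) columns reported by vmstat should remain at or near zero (an illustrative check, not specific to Ceph):
# vmstat 5
# free -m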
Evidence suggests that this issue only occurs where most or all of the OSDs are on slower hardware (spinning drives), as opposed to environments based mostly on SSD / NVMe.
NOTE: This setting never defaulted to 'true' for SUSE Enterprise Storage 5.x
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000019829
- Creation Date: 15-Jan-2021
- Modified Date: 20-Jan-2021
- SUSE Enterprise Storage
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com