I/O performance impact due to weekly mdcheck-start.service
This document (000022038) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server for SAP Applications 15 SP4
SUSE Linux Enterprise Server for SAP Applications 15 SP5
SUSE Linux Enterprise Server for SAP Applications 15 SP6
SUSE Linux Enterprise Server for SAP Applications 15 SP7
SUSE Linux Enterprise Server 15 SP4
SUSE Linux Enterprise Server 15 SP5
SUSE Linux Enterprise Server 15 SP6
SUSE Linux Enterprise Server 15 SP7
Situation
While the weekly mdcheck-start.service is running, system I/O performance is drastically impacted by the additional I/O load. This issue may be encountered on systems running SAP HANA workloads on a software RAID (kernel MD RAID) built on enterprise-grade, multiqueue-capable NVMe drives.
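To confirm that the performance drop coincides with a running array check, the current action of every MD array can be read from sysfs. A minimal sketch (standard sysfs layout assumed; device names differ per system):

import glob

# Report whether any MD array is currently running a check/resync/recovery,
# which is when the I/O impact described above is observed.
for path in glob.glob("/sys/block/md*/md/sync_action"):
    dev = path.split("/")[3]            # e.g. "md0"
    with open(path) as f:
        action = f.read().strip()       # "idle", "check", "resync", "recover", ...
    print(f"{dev}: sync_action = {action}")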
Resolution
Increase "/sys/block/md<X>/md/group_thread_cnt" to the number of queues available to the NVMe drives, which can be seen in "/sys/block/nvmeXYZ/mq/" (see the sketch after the explanation below).
Enterprise-grade NVMe devices have several I/O queues that can be used in parallel. By default, the MD RAID5 implementation uses a single thread to handle RAID5 stripe I/O, which creates a bottleneck on devices with several queues. Setting "group_thread_cnt" instructs MD to create several of these threads, removing the bottleneck.
Single-queue devices (those that have only one queue in /sys/block/nvmeXYZ/mq) do not benefit from changing "group_thread_cnt". I/O submission is handled by a single CPU, so having several threads calculate the RAID5 syndrome makes no difference to overall I/O performance.
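A minimal sketch of the tuning step, assuming a RAID5 array named "md0" built on NVMe namespaces such as "nvme0n1" (both names are examples only and must be adjusted to the system; root privileges are required):

import os

NVME_DEV = "nvme0n1"   # example NVMe namespace, not taken from this document
MD_DEV = "md0"         # example MD device, not taken from this document

# Each subdirectory below .../mq represents one hardware queue of the namespace.
queues = len(os.listdir(f"/sys/block/{NVME_DEV}/mq"))
print(f"{NVME_DEV} exposes {queues} queues")

# Instruct MD to use the same number of stripe-handling threads for the array.
with open(f"/sys/block/{MD_DEV}/md/group_thread_cnt", "w") as f:
    f.write(str(queues))

Note that values written to sysfs do not persist across reboots, so the setting has to be reapplied after each boot.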
Additional Information
Test results:
I/O baseline:
WRITE: bw=106MiB/s (111MB/s), 106MiB/s-106MiB/s (111MB/s-111MB/s)
READ: bw=424MiB/s (444MB/s), 424MiB/s-424MiB/s (444MB/s-444MB/s)
I/O performance during recovery:
WRITE: bw=43.2MiB/s (45.3MB/s), 43.2MiB/s-43.2MiB/s (45.3MB/s-45.3MB/s)
READ: bw=172MiB/s (181MB/s), 172MiB/s-172MiB/s (181MB/s-181MB/s)
I/O performance during recovery with "/sys/block/md<X>/md/group_thread_cnt" set to 8:
WRITE: bw=105MiB/s (110MB/s), 105MiB/s-105MiB/s (110MB/s-110MB/s)
READ: bw=420MiB/s (441MB/s), 420MiB/s-420MiB/s (441MB/s-441MB/s)
I/O performance during recovery with "/sys/block/md<X>/md/group_thread_cnt" set to 16:
WRITE: bw=107MiB/s (112MB/s), 107MiB/s-107MiB/s (112MB/s-112MB/s)
READ: bw=427MiB/s (447MB/s), 427MiB/s-427MiB/s (447MB/s-447MB/s)
As the results above show, setting "/sys/block/md<X>/md/group_thread_cnt" drastically increases the system's I/O performance during recovery by switching stripe handling from a single thread to multiple threads.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000022038
- Creation Date: 08-Sep-2025
- Modified Date: 09-Sep-2025
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com