How compression is implemented at the BlueStore level

This document (000019629) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Enterprise Storage 6

Situation

An explanation of how BlueStore pool and OSD compression works.

Resolution

BlueStore compresses data on a per-block basis, not a per-file basis. The block size is determined by the input data block of the write (i.e. what the client provides in write requests), the BlueStore allocation unit size, and the min/max compression blob sizes.

For HDD OSDs the allocation unit is 64K and compression block sizes fall within the [128K, 512K] range.
If an input block is smaller than 128K, it is not compressed. If it is larger than 512K, it is split into multiple chunks and each chunk is compressed independently (small tails < 128K bypass compression, as noted above).
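The size rules above can be sketched in a few lines of Python. This is a simplified model of the description in this article, not BlueStore's actual code path; the function name `plan_chunks` is hypothetical and the defaults are the HDD values quoted above.

```python
KIB = 1024


def plan_chunks(write_len, min_blob=128 * KIB, max_blob=512 * KIB):
    """Split one write into chunks of at most max_blob bytes.

    Returns a list of (chunk_len, is_candidate) tuples, where is_candidate
    is True when the chunk is large enough to be considered for compression.
    Simplified model only: assumes chunks are cut at max_blob boundaries.
    """
    chunks = []
    remaining = write_len
    while remaining > 0:
        chunk = min(remaining, max_blob)
        chunks.append((chunk, chunk >= min_blob))
        remaining -= chunk
    return chunks


print(plan_chunks(64 * KIB))              # one chunk, below the minimum
print(plan_chunks(128 * KIB))             # one chunk, compression candidate
print(plan_chunks(1024 * KIB + 4 * KIB))  # two full chunks plus a small tail
```

Under this model a 1028K write yields two 512K candidates and a 4K tail that bypasses compression.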

Now imagine a 128K write that compresses down to 32K. To keep that block on disk, BlueStore still allocates a 64K block (because of the allocation unit size). Hence, even though the compression ratio on the raw data is good (0.25), the resulting on-disk ratio is 0.5.
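The rounding effect in this example can be reproduced with a short calculation. The helper names below are hypothetical; the only assumption is that allocated space is the compressed length rounded up to a whole number of allocation units.

```python
KIB = 1024


def allocated_bytes(compressed_len, alloc_unit=64 * KIB):
    """Round the compressed length up to a whole number of allocation units."""
    units = -(-compressed_len // alloc_unit)  # ceiling division
    return units * alloc_unit


def effective_ratio(original_len, compressed_len, alloc_unit=64 * KIB):
    """On-disk ratio: allocated bytes divided by original bytes."""
    return allocated_bytes(compressed_len, alloc_unit) / original_len


raw_ratio = (32 * KIB) / (128 * KIB)
disk_ratio = effective_ratio(128 * KIB, 32 * KIB)
print(raw_ratio, disk_ratio)  # 0.25 0.5
```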

For example, the following stats illustrate this with real numbers:
osd.1
"bluestore_compressed": 92508886525,
"bluestore_compressed_allocated": 236632408064,
"bluestore_compressed_original": 473264816128,

The original data volume (about 473 GB) was compressed to about 92.5 GB, but BlueStore needs about 236.6 GB to keep it on disk.
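As a quick check, the ratios implied by the osd.1 counters can be computed directly. This is only arithmetic on the three values quoted above; the variable names are illustrative.

```python
# Counter values copied from the osd.1 stats quoted in this article.
compressed = 92508886525   # bluestore_compressed
allocated = 236632408064   # bluestore_compressed_allocated
original = 473264816128    # bluestore_compressed_original

raw_ratio = compressed / original   # ratio achieved by the compressor
disk_ratio = allocated / original   # ratio after allocation-unit rounding

print(f"raw ratio:     {raw_ratio:.3f}")
print(f"on-disk ratio: {disk_ratio:.3f}")
```

The compressor alone achieves a ratio just under 0.2, while the space actually allocated is exactly half of the original volume.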

Cause

BlueStore compression explained

Additional Information

Simple Examples:
SES6, Pool=cephfs_data
c1boyd1:/mnt/compression # ceph osd pool get cephfs_data compression_algorithm
compression_algorithm: snappy
c1boyd1:/mnt/compression # ceph osd pool get cephfs_data compression_mode
compression_mode: aggressive

The OSD-level BlueStore compression values are at their defaults and are not set in ceph.conf for the test:
    "bluestore_compression_algorithm": "snappy",
    "bluestore_compression_max_blob_size": "0",
    "bluestore_compression_max_blob_size_hdd": "524288",
    "bluestore_compression_max_blob_size_ssd": "65536",
    "bluestore_compression_min_blob_size": "0",
    "bluestore_compression_min_blob_size_hdd": "131072",
    "bluestore_compression_min_blob_size_ssd": "8192",
    "bluestore_compression_mode": "none",
    "bluestore_compression_required_ratio": "0.875000"

Testing with "dd" and "/dev/zero" and bs=512, 65536, 128000, 131072.
File sizes were verified with "ls -ahl".
Files smaller than 128k did not compress; files of exactly 128k did compress.
Note: "compress_rejected_count" remained "0" in all tests.

When compression did not occur:
        "compress_success_count": 0,
        "compress_rejected_count": 0,

When compression did occur, the results were similar as well:
        "compress_success_count": 494,
        "compress_rejected_count": 0,

Testing with "dd" and "/dev/urandom" and bs=131072, with compression mode "force".
The 128k files did not compress.
Note: "compress_rejected_count" is now non-zero.
        "compress_success_count": 0,
        "compress_rejected_count": 497,

Conclusion: with the HDD defaults shown above, files need to be at least 128k in size to compress.
Whether such files are actually compressed depends on the data in the files, compression_algorithm, compression_required_ratio, etc.
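The difference between the /dev/zero and /dev/urandom results can be illustrated with a minimal sketch of a required-ratio check. Two assumptions apply: zlib from the Python standard library stands in for snappy, and the acceptance test is reduced to a plain size comparison (the real BlueStore check may also take allocation-unit rounding into account, which is not modelled here). The function name `would_accept` is hypothetical.

```python
import os
import zlib

# bluestore_compression_required_ratio from the settings listed above.
REQUIRED_RATIO = 0.875


def would_accept(data, required_ratio=REQUIRED_RATIO):
    """Return True if the compressed size meets the required ratio.

    Assumption: zlib is used here only as a stand-in for snappy.
    """
    compressed = zlib.compress(data)
    return len(compressed) <= len(data) * required_ratio


zeros = bytes(128 * 1024)        # like a 128k block read from /dev/zero
noise = os.urandom(128 * 1024)   # like a 128k block read from /dev/urandom

print("zeros accepted:", would_accept(zeros))
print("noise accepted:", would_accept(noise))
```

Zero-filled data shrinks to a tiny fraction of its size and passes the check, while random data does not shrink at all, which matches the rejected counts seen in the urandom test.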
Data:
===========
5000 64k files, with bs=512. No compression occurred. 
# for i in {1..5000}; do echo zero$i;dd if=/dev/zero bs=512 count=$((128)) of=zero$i status=progress; done

c1boyd7:~ # ceph daemon osd.9 perf reset all
{
    "success": "perf reset all"
}
c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 109117440,
        "bluestore_stored": 24379269,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 144900096,
        "bluestore_stored": 60137910,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

==============
5000 64k files, with bs=65536. No compression occurred. 
# for i in {1..5000}; do echo zero$i;dd if=/dev/zero bs=65536 count=1 of=zero$i status=progress; done

c1boyd7:~ # ceph daemon osd.9 perf reset all
{
    "success": "perf reset all"
}
c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 118358016,
        "bluestore_stored": 33559086,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 154337280,
        "bluestore_stored": 69599109,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

==============
5000 128k files, with bs=512.  Compression occurred. 
# for i in {1..5000}; do echo zero$i;dd if=/dev/zero bs=512 count=$((256)) of=zero$i status=progress; done

c1boyd7:~ # ceph daemon osd.9 perf reset all
{
    "success": "perf reset all"
}
c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 117571584,
        "bluestore_stored": 32767913,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 460,
        "compress_rejected_count": 0,
        "bluestore_allocated": 147718144,
        "bluestore_stored": 93061033,
        "bluestore_compressed": 2834520,
        "bluestore_compressed_allocated": 30146560,
        "bluestore_compressed_original": 60293120,

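The counters from the second dump of this test can be broken down per compressed blob. The sketch below only divides the reported values by the success count; that every blob is the same size is an assumption, although the exact divisibility suggests it.

```python
# Values copied from the second perf dump of the 128k / bs=512 test above.
success_count = 460
compressed = 2834520          # bluestore_compressed
allocated = 30146560          # bluestore_compressed_allocated
original = 60293120           # bluestore_compressed_original

# divmod returns (quotient, remainder); a zero remainder means the
# counter is an exact multiple of the number of successful compressions.
per_blob_original, rem_o = divmod(original, success_count)
per_blob_allocated, rem_a = divmod(allocated, success_count)
per_blob_compressed, rem_c = divmod(compressed, success_count)

print(per_blob_original, per_blob_allocated, per_blob_compressed)
print(rem_o, rem_a, rem_c)
```

Each successful compression corresponds to one 128K (131072-byte) blob that compresses to 6162 bytes but still occupies one 64K (65536-byte) allocation unit, so the on-disk ratio stays at 0.5.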
==================
5000 128k files, with bs=131072. Compression occurred.
# for i in {1..5000}; do echo zero$i;dd if=/dev/zero bs=131072 count=1 of=zero$i status=progress; done

c1boyd7:~ # ceph daemon osd.9 perf reset all
{
    "success": "perf reset all"
}
c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 130088960,
        "bluestore_stored": 45350789,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 494,
        "compress_rejected_count": 0,
        "bluestore_allocated": 162463744,
        "bluestore_stored": 110100357,
        "bluestore_compressed": 3044028,
        "bluestore_compressed_allocated": 32374784,
        "bluestore_compressed_original": 64749568,

==============
5000 128k files, with bs=65536. Compression did occur.
# for i in {1..5000}; do echo zero$i;dd if=/dev/zero bs=65536 count=2 of=zero$i status=progress; done

c1boyd7:~ # ceph daemon osd.9 perf reset all
{
    "success": "perf reset all"
}
c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 129826816,
        "bluestore_stored": 45081435,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 456,
        "compress_rejected_count": 0,
        "bluestore_allocated": 164429824,
        "bluestore_stored": 109576069,
        "bluestore_compressed": 2809872,
        "bluestore_compressed_allocated": 29884416,
        "bluestore_compressed_original": 59768832,

==================
5000 125k files, with bs=128000. Compression did NOT occur.
# for i in {1..5000}; do echo zero$i;dd if=/dev/zero bs=128000 count=1 of=zero$i status=progress; done

c1boyd7:~ # ceph daemon osd.9 perf reset all
{
    "success": "perf reset all"
}
c1boyd7:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 125894656,
        "bluestore_stored": 41156485,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

osd07:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 200343552,
        "bluestore_stored": 113958789,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

==================
if=/dev/urandom
5000 128k files, with bs=131072. Compression did NOT occur (all attempts were rejected).
# for i in {1..5000}; do echo zero$i;dd if=/dev/urandom bs=131072 count=1 of=zero$i status=progress; done

osd07:~ # ceph daemon osd.9 perf reset all
{
    "success": "perf reset all"
}
osd07:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 0,
        "bluestore_allocated": 130088960,
        "bluestore_stored": 45350789,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

osd07:~ # ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count"
        "compress_success_count": 0,
        "compress_rejected_count": 497,
        "bluestore_allocated": 186843136,
        "bluestore_stored": 102104965,
        "bluestore_compressed": 0,
        "bluestore_compressed_allocated": 0,
        "bluestore_compressed_original": 0,

Expanded command that could be useful:
# ceph daemon osd.9 perf dump | egrep -i "bluestore_compressed|bluestore_allocated|bluestore_stored|compress_.*_count|bluestore_write_big|bluestore_write_small"
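As an alternative to egrep, the same counters can be pulled out of the JSON that "perf dump" prints. The following is a minimal sketch: the function name `collect_counters` is hypothetical, and the sample input (including the "bluestore" section name) is an assumed shape for illustration only. The search is recursive so it does not rely on that layout.

```python
import json

WANTED = (
    "compress_success_count",
    "compress_rejected_count",
    "bluestore_allocated",
    "bluestore_stored",
    "bluestore_compressed",
    "bluestore_compressed_allocated",
    "bluestore_compressed_original",
)


def collect_counters(node, wanted=WANTED, found=None):
    """Walk parsed JSON and collect numeric values whose key is in wanted."""
    if found is None:
        found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if key in wanted and isinstance(value, (int, float)):
                found[key] = value
            else:
                collect_counters(value, wanted, found)
    return found


# Assumed sample shape; the numbers are taken from the bs=131072 test above.
sample = json.loads(
    '{"bluestore": {"compress_success_count": 494,'
    ' "compress_rejected_count": 0,'
    ' "bluestore_compressed": 3044028,'
    ' "bluestore_compressed_allocated": 32374784,'
    ' "bluestore_compressed_original": 64749568,'
    ' "unrelated_counter": 1}}'
)
counters = collect_counters(sample)
print(counters)
```

In practice the input would come from the command output (for example by piping it to a script that calls json.load on standard input) rather than from an embedded sample.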

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000019629
  • Creation Date: 29-Oct-2020
  • Modified Date: 29-Oct-2020
    • SUSE Enterprise Storage
