How to migrate non-LVM OSD DB volume to another disk or partition

This document (000020276) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Enterprise Storage 6

Situation

An OSD is deployed with a standalone DB volume residing on a plain disk partition (not an LVM logical volume). This usually applies to legacy clusters originally deployed before the "ceph-volume" era (e.g. SES5.5) and later upgraded to SES6.

The goal is to move the OSD's RocksDB data from the underlying BlueFS volume to another location, e.g. to gain more space, while continuing to use the same OSD.

Resolution

1) Create a new partition of the desired size using the 'parted' tool. Create a GPT partition table with parted's 'mktable gpt' command first.
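
As a minimal sketch of this step, the helper below only prints the parted commands for review instead of running them, because creating a new partition table discards the existing one. The disk name /dev/vdg, the 30GiB end offset and the function name plan_db_partition are assumptions made for illustration and are not part of the original procedure.

```shell
# Print (do not execute) the parted commands for creating a new DB partition.
# $1: whole disk, e.g. /dev/vdg (assumed name)
# $2: partition end offset, e.g. 30GiB (assumed size)
plan_db_partition() {
    disk=$1
    end=$2
    # mktable replaces the existing partition table; skip this command on a
    # disk that already carries a GPT table with partitions in use.
    printf 'parted -s %s mktable gpt\n' "$disk"
    printf 'parted -s %s mkpart primary 1MiB %s\n' "$disk" "$end"
}

plan_db_partition /dev/vdg 30GiB
```

Review the printed commands, then run them manually as root once the disk name and size are confirmed.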

Please note the resulting partition name under /dev, e.g. /dev/vdg1.
Please also note the partition UUID, e.g. by looking up the partition name under /dev/disk/by-partuuid:

 $ ls -l /dev/disk/by-partuuid
lrwxrwxrwx 1 root root 10 Jun 4 16:45 4f02c107-73a2-42f4-9be6-e609ba2a9f45 -> ../../vdg1
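
The lookup above can also be scripted. The following is a sketch only: the function name partuuid_of is hypothetical, and it assumes GNU 'readlink -f' is available on the host. It scans the by-partuuid directory for the symlink that resolves to the given partition.

```shell
# Print the PARTUUID whose by-partuuid symlink resolves to the given partition.
# $1: partition device, e.g. /dev/vdg1
# $2: directory to scan (optional, defaults to /dev/disk/by-partuuid)
partuuid_of() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-partuuid}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            basename "$link"
            return 0
        fi
    done
    echo "no PARTUUID found for $1" >&2
    return 1
}

# Usage on the OSD host:
#   partuuid_of /dev/vdg1
```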

2) Set the noout flag to avoid data rebalancing when OSDs go down

To set noout for a specific OSD, e.g. osd.12, use

  $ ceph osd set-group noout osd.12

To set noout for the whole device class named 'hdd', use

  $ ceph osd set-group noout hdd


3) Stop the OSD in question

  $ systemctl stop ceph-osd@12

4) Migrate bluefs data using ceph-bluestore-tool

  $ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-12 --devs-source /var/lib/ceph/osd/ceph-12/block --devs-source /var/lib/ceph/osd/ceph-12/block.db --command bluefs-bdev-migrate --dev-target /dev/vdg1
  inferring bluefs devices from bluestore path
  device removed:1 /var/lib/ceph/osd/ceph-12/block.db
  device added: 1 /dev/vdg1
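
Before running the migration, a conservative pre-flight check can confirm that the target partition is at least as large as the current DB volume. This is a sketch under assumptions: the function name db_target_big_enough is hypothetical, and the "not smaller than the old volume" rule is a cautious choice made here, not a requirement stated by this document.

```shell
# Succeed if the target partition is at least as large as the current DB volume.
# $1: size of the current DB volume in bytes
# $2: size of the target partition in bytes
db_target_big_enough() {
    [ "$2" -ge "$1" ]
}

# Usage on the OSD host (as root, before step 4; paths from the example above):
#   old=$(blockdev --getsize64 "$(readlink -f /var/lib/ceph/osd/ceph-12/block.db)")
#   new=$(blockdev --getsize64 /dev/vdg1)
#   db_target_big_enough "$old" "$new" || echo "target is smaller than current DB" >&2
```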

5) Update OSD's config under /etc/ceph/osd/<osd_id>_<osd_fsid>.json

E.g. the original one could look like this:

  $ cat /etc/ceph/osd/12-cd805c98-41ff-400a-b3af-c5783cc1ae4d.json
  {
    "active": "ok",
    "block": {
      "path": "/dev/disk/by-partuuid/974e9dec-3121-4079-b2e9-3c929fc20e45",
      "uuid": "974e9dec-3121-4079-b2e9-3c929fc20e45"
    },
    "block.db": {
      "path": "/dev/disk/by-partuuid/17ffc366-ef7a-4577-9f1e-d541e18db8f3",
      "uuid": "17ffc366-ef7a-4577-9f1e-d541e18db8f3"
    },
    "block_uuid": "974e9dec-3121-4079-b2e9-3c929fc20e45",
    "bluefs": 1,
    "ceph_fsid": "4b9be45b-e09b-3c47-a00b-2b16d8151d75",
    "cluster_name": "ceph",
    "data": {
      "path": "/dev/vdb1",
      "uuid": "cd805c98-41ff-400a-b3af-c5783cc1ae4d"
    },
    "fsid": "cd805c98-41ff-400a-b3af-c5783cc1ae4d",
    "keyring": "AQAdPrdg0oQvExAAmvGRDoRzsVCH6QA0hIWHIQ==",
    "kv_backend": "rocksdb",
    "magic": "ceph osd volume v026",
    "mkfs_done": "yes",
    "ready": "ready",
    "require_osd_release": "",
    "systemd": "",
    "type": "bluestore",
    "whoami": 12
  }


Replace the original partition UUIDs in "block.db" section with the new one noted at step 1), i.e. replace the following lines

  "block.db": {
    "path": "/dev/disk/by-partuuid/17ffc366-ef7a-4577-9f1e-d541e18db8f3",
    "uuid": "17ffc366-ef7a-4577-9f1e-d541e18db8f3"
  },


with

  "block.db": {
    "path": "/dev/disk/by-partuuid/4f02c107-73a2-42f4-9be6-e609ba2a9f45",
    "uuid": "4f02c107-73a2-42f4-9be6-e609ba2a9f45"
  },
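
The replacement above can be scripted instead of edited by hand. The following is a sketch only: the function name swap_db_partuuid is hypothetical, it assumes GNU sed, and it relies on the old DB partition UUID appearing only in the "block.db" section (as in the example file). A backup copy with a .bak suffix is kept next to the original.

```shell
# Replace the old DB partition UUID with the new one in an OSD JSON config.
# $1: config file, $2: old PARTUUID, $3: new PARTUUID
swap_db_partuuid() {
    file=$1
    old=$2
    new=$3
    uuid_re='^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
    # Refuse anything that is not a plain UUID, so the sed expression stays safe.
    for u in "$old" "$new"; do
        printf '%s\n' "$u" | grep -Eq "$uuid_re" || {
            echo "not a UUID: $u" >&2
            return 1
        }
    done
    grep -q "$old" "$file" || {
        echo "old UUID not found in $file" >&2
        return 1
    }
    # Keep a backup before editing in place.
    cp -p "$file" "$file.bak" || return 1
    sed -i "s/$old/$new/g" "$file"
}

# Usage on the OSD host (values from the example above):
#   swap_db_partuuid /etc/ceph/osd/12-cd805c98-41ff-400a-b3af-c5783cc1ae4d.json \
#       17ffc366-ef7a-4577-9f1e-d541e18db8f3 4f02c107-73a2-42f4-9be6-e609ba2a9f45
```

Compare the edited file against the .bak copy (e.g. with diff) before activating the OSD.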

6) Activate the OSD with the new partition

The OSD's ID and FSID must be provided as parameters:

  $ ceph-volume simple activate 12 cd805c98-41ff-400a-b3af-c5783cc1ae4d

  Running command: /usr/bin/ln -snf /dev/vdb2 /var/lib/ceph/osd/ceph-12/block
  Running command: /usr/bin/chown -R ceph:ceph /dev/vdb2
  Running command: /usr/bin/ln -snf /dev/vdg1 /var/lib/ceph/osd/ceph-12/block.db
  Running command: /usr/bin/chown -R ceph:ceph /dev/vdg1
  Running command: /usr/bin/systemctl enable ceph-volume@simple-12-cd805c98-41ff-400a-b3af-c5783cc1ae4d
  Running command: /usr/bin/ln -sf /dev/null /etc/systemd/system/ceph-disk@.service
  --> All ceph-disk systemd units have been disabled to prevent OSDs getting triggered by UDEV events
  Running command: /usr/bin/systemctl enable --runtime ceph-osd@12
  Running command: /usr/bin/systemctl start ceph-osd@12
  --> Successfully activated OSD 12 with FSID cd805c98-41ff-400a-b3af-c5783cc1ae4d

7) Restart OSD and make sure it's running

  $ systemctl start ceph-osd@12
  $ sleep 10
  $ systemctl status ceph-osd@12

  ceph-osd@12.service - Ceph object storage daemon osd.12
  Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
  Active: active (running) since Fri 2021-06-04 19:18:33 CEST; 2min 40s ago
  Process: 4606 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 12 (code=exited, status=0/SUCCESS)
  Main PID: 4610 (ceph-osd)
  Tasks: 61
  CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@12.service
  └─4610 /usr/bin/ceph-osd -f --cluster ceph --id 12 --setuser ceph --setgroup ceph


Note: We recommend rebooting the relevant host right after the very first OSD has been migrated to its new DB volume. This confirms that the procedure works correctly for the cluster in question. A further reboot is recommended once every OSD on this host has been migrated. See step 8) below.

8) Reboot the host when every OSD on it has been migrated (optional)

Make sure all the OSDs are back online after the reboot and that they are using the new DB volume. This step is largely redundant and is intended primarily to provide additional data safety.
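
One way to confirm that an OSD uses the new DB volume is to check where its block.db symlink resolves. This is a sketch under assumptions: the function name db_points_to is hypothetical, and GNU 'readlink -f' is assumed to be available.

```shell
# Succeed if the OSD's block.db symlink resolves to the expected device.
# $1: OSD directory, e.g. /var/lib/ceph/osd/ceph-12
# $2: expected device, e.g. /dev/vdg1
db_points_to() {
    osd_dir=$1
    expected=$2
    actual=$(readlink -f "$osd_dir/block.db") || return 1
    [ "$actual" = "$(readlink -f "$expected")" ]
}

# Usage on the OSD host (values from the example above):
#   db_points_to /var/lib/ceph/osd/ceph-12 /dev/vdg1 && echo "osd.12 uses /dev/vdg1"
```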

9) Clear noout flags once migration is complete

To clear noout for a specific OSD, e.g. osd.12, use

  $ ceph osd unset-group noout osd.12

To clear noout for the whole device class named 'hdd', use

  $ ceph osd unset-group noout hdd

Important Note: If any of the commands above reports something unexpected or suspicious during the migration, stop the process immediately and consult SUSE support engineers. Please DO NOT try to restart or recover the OSD in question, and do not proceed with migrating other OSDs. Please collect the failing command and all the prior commands together with their output. If the OSD fails to restart (i.e. trouble occurs at step 7), please also share the OSD log from /var/log/ceph/ceph-osd.OSD_ID.log

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020276
  • Creation Date: 09-Jun-2021
  • Modified Date: 11-Jun-2021
  • SUSE Enterprise Storage


For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback@suse.com
