Scope/Description

This guide will detail the process to add storage drives to a cluster. Although, not necessary it is best practice to add the same amount of drives to each node and drives of the same capacity.

Prerequisites

First make sure the status of the cluster is in a healthy state, and mark down the current number of OSD’s for reference.

ceph osd set norebalance

ceph osd set noout

ceph osd set norecover

Physically add the new drives to the cluster.
Access each OSD node.
Ensure disks are inserted and are showing up unpartitioned – you can use 45Drives Tools: lsdev or use lsblk
Ensure the disks you are trying to add match the physical slots where you inserted the disks.
Take note of the linux device name of the new disks you wish to add to the cluster for each node (ex. /dev/sdh /dev/sdi)
Run a report for each linux device to ensure they’re free of any partitions. This will also test for any errors.

[root@osd~]# ceph-volume lvm batch --report /dev/sdh

Run the command with all the necessary drives included to ensure it works properly.

[root@osd~]# ceph-volume lvm batch --report /dev/sdh /dev/sdi /dev/sdj

[root@osd~]# ceph-volume lvm batch /dev/sdh /dev/sdi /dev/sdj

Wait for OSD’s to be picked up and Placement Groups to peer, which can be seen by running (OSD’s recorded for reference in previous step):

ceph -s

To edit the rate the cluster distributes the data to the new drives, go to the ceph dashboard and edit the values for “osd_recovery_max_active” and “osd_max_backfills”. For the recovery setting, there is a global or per HDD/SSD option. Change these values for each use case/depending on the speed you want to integrate the new drives to the cluster.

Unset the flags that were previously set. This does not need to be done right away if you wanted to delay the backfill to a later time to have less IO on the system.

ceph osd unset norebalance

ceph osd unset noout

ceph osd unset norecover

Run a Ceph -s command to display the status of the cluster:

[root@osd~]# ceph -s

If no issues appear and everything is running smoothly, run a ceph health detail and confirm all data has been redistributed and that backfills have been complete:

[root@osd~]# ceph health detail

In the Ceph Dashboard check the OSD usage to ensure data is evenly distributed between drives.
If data isn’t distributed look at the ceph balancer status, confirm that the mode is “ceph-compat” and active is “true”

[root@osd~]#  ceph balancer status

In the Ceph Dashboard check the PG’s on each OSD, they should all be between 100-150
In the Ceph Dashboard check the Normal Distribution in the OSD overall performance tab
Lastly, if the Dashboard is unavailable run a “ceph osd df” and check the PG’s:

[root@osd~]# ceph osd df

Return OSD Backfill to the default value by navigating in the Dashboard to Cluster–>Configuration–>Search “backfill”–>OSD_max_backfill and set it to 1

If there are already volumes present on the drives, be sure to use wipedev on them. Before running wipedev on drives, ensure you are targeting the correct drive and there is no critical data on the device. Example to wipe a specific drive.

/opt/45drives/tools/wipedev -d 1-1

Was this article helpful?