
KB450419 – Offlining a Ceph Storage Node for Maintenance

Scope/Description

  • This article walks you through safely taking a Ceph storage node offline for maintenance, then bringing it back online and returning the cluster to a healthy state.

Prerequisites

  • Ceph Cluster.
  • SSH Access to a Ceph Node.

Steps

Setting Maintenance Options

  • SSH into the node you want to take down.
  • Run the following 3 commands to set the cluster flags that keep OSDs from being marked out and pause rebalancing and recovery while the node is offline.
root@osd1:~# ceph osd set noout 
root@osd1:~# ceph osd set norebalance 
root@osd1:~# ceph osd set norecover

  • Run ceph -s to confirm that the cluster is in a warning state (HEALTH_WARN) and that the 3 flags have been set.
root@osd1:~# ceph -s

  • Run shutdown now on the node you wish to turn off.
root@osd1:~# shutdown now
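
The flag-setting and check steps above can also be scripted so that the node is only shut down once all 3 flags are confirmed. The following is a minimal sketch rather than part of the official procedure: the function names set_maintenance_flags and flags_are_set and the CEPH variable are hypothetical helpers, and it assumes the "flags" line printed by ceph osd dump lists the active flags as a comma-separated string.

```shell
#!/bin/sh
# Hypothetical helpers. CEPH can be overridden (e.g. CEPH="echo ceph") for a dry run.
CEPH="${CEPH:-ceph}"
FLAGS="noout norebalance norecover"

# Set each maintenance flag in turn and stop at the first failure.
set_maintenance_flags() {
    for flag in $FLAGS; do
        $CEPH osd set "$flag" || return 1
    done
}

# Read `ceph osd dump` output on stdin and succeed only if every
# maintenance flag appears on its "flags" line (assumed format:
# "flags noout,norebalance,...").
flags_are_set() {
    active=$(awk '$1 == "flags" { print $2 }' | tr ',' ' ')
    for flag in $FLAGS; do
        case " $active " in
            *" $flag "*) ;;
            *) return 1 ;;
        esac
    done
}

# Example usage on the node being taken down:
#   set_maintenance_flags && $CEPH osd dump | flags_are_set && shutdown now
```

Chaining the steps with && means the shutdown is skipped if any flag fails to set or is missing from the dump.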

Disabling Maintenance Options

  • Once the node is back up and has rejoined the cluster, unset the 3 flags that were set earlier.
root@osd1:~# ceph osd unset noout
root@osd1:~# ceph osd unset norebalance
root@osd1:~# ceph osd unset norecover

  • Run ceph -s again to confirm that the flags are unset and that the cluster is returning to a healthy state.
root@osd1:~# ceph -s
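
The unset step can be scripted in the same way, with a check that reports any maintenance flag left behind. This is a sketch under the same assumptions as the manual steps: unset_maintenance_flags, remaining_flags and the CEPH variable are hypothetical names, and the "flags" line of ceph osd dump is assumed to be a comma-separated list.

```shell
#!/bin/sh
# Hypothetical helpers. CEPH can be overridden (e.g. CEPH="echo ceph") for a dry run.
CEPH="${CEPH:-ceph}"
FLAGS="noout norebalance norecover"

# Unset each maintenance flag in turn and stop at the first failure.
unset_maintenance_flags() {
    for flag in $FLAGS; do
        $CEPH osd unset "$flag" || return 1
    done
}

# Read `ceph osd dump` output on stdin. Print any maintenance flag that
# is still set and return non-zero if at least one remains.
remaining_flags() {
    active=$(awk '$1 == "flags" { print $2 }' | tr ',' ' ')
    left=""
    for flag in $FLAGS; do
        case " $active " in
            *" $flag "*) left="$left $flag" ;;
        esac
    done
    [ -z "$left" ] && return 0
    echo "still set:$left"
    return 1
}

# Example usage once the node has rejoined the cluster:
#   unset_maintenance_flags && $CEPH osd dump | remaining_flags
```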

Verification

  • Running ceph -s shows a healthy state (HEALTH_OK) and shows all nodes online.
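
Because the cluster may take some time to settle after the flags are unset, the verification can be polled instead of checked once. The sketch below assumes that ceph health prints a line beginning with HEALTH_OK when the cluster is healthy. The function name wait_for_health_ok, its default attempt count and delay, and the CEPH variable are illustrative choices, not values from this article.

```shell
#!/bin/sh
# Hypothetical helper. CEPH can be overridden for testing.
CEPH="${CEPH:-ceph}"

# Poll `ceph health` until it reports HEALTH_OK or the attempts run out.
# $1 = maximum number of checks (default 30, an arbitrary choice)
# $2 = seconds to wait between checks (default 10, an arbitrary choice)
wait_for_health_ok() {
    attempts="${1:-30}"
    delay="${2:-10}"
    i=0
    status=""
    while [ "$i" -lt "$attempts" ]; do
        status=$($CEPH health 2>/dev/null)
        case "$status" in
            HEALTH_OK*) echo "cluster healthy"; return 0 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "cluster not healthy after $attempts checks: $status" >&2
    return 1
}

# Example usage: wait_for_health_ok 60 5
```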

Troubleshooting

  • Ensure you are on a Ceph node that has permission to run Ceph commands.
  • If you are receiving a slow ops error, run the following on the node reporting the error, replacing hostname with that node's hostname.
systemctl restart ceph-mon@hostname
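
To avoid typing the hostname by hand, the unit name can be built from the node's own hostname. This is only a sketch: mon_unit is a hypothetical helper, and it assumes the monitor ID matches the node's short hostname, which is common but should be verified on your cluster before restarting anything.

```shell
#!/bin/sh
# Build the systemd unit name for a monitor. With no argument, this
# assumes the monitor ID equals the short hostname (an assumption;
# `systemctl list-units 'ceph-mon@*'` shows the units actually present).
mon_unit() {
    printf 'ceph-mon@%s\n' "${1:-$(hostname -s)}"
}

# Example usage on the affected node:
#   systemctl restart "$(mon_unit)"
```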