KB450156 – Replacing Failed OSDs

Last modified: October 23, 2019

OSD location

  • In the UI:
    • Cluster → OSDs. Sort by Status.
  • In the terminal:
    • ceph osd tree
    • Look for OSDs marked down, out, or both.
  • In this example, osd.23 and osd.17 have failed on host vosd01.
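
On a large cluster, reading the full tree by eye can be slow. The following is a minimal sketch of a filter for the terminal step above. The helper name down_osds is hypothetical, and it assumes the default text layout of ceph osd tree, in which the status follows the osd.N name directly.

```shell
# Hypothetical helper: print the names of OSDs whose status reads "down".
# Reads `ceph osd tree` text output on stdin. Assumes the default column
# layout, where the status immediately follows the osd.N name.
down_osds() {
  awk '{
    for (i = 1; i < NF; i++)
      if ($i ~ /^osd\.[0-9]+$/ && $(i + 1) == "down")
        print $i
  }'
}
```

Usage: ceph osd tree | down_osds. The sketch only reports OSDs marked down, so OSDs that are out but still up need to be checked in the UI or by reading the tree output directly.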

Resolve Linux Name of OSD devices

  • In the UI:
    • Select the failed OSD → Metadata. Look for the devices field.
  • In the terminal:
    • ceph-volume lvm list
      • Find the failed OSD. Read the device name.
  • In this example, osd.23 is /dev/sdi and osd.17 is /dev/sdg.
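
The terminal lookup above can be condensed into one line per OSD. This is a rough sketch only: the helper name osd_devices is hypothetical, and it assumes the plain-text report of ceph-volume lvm list starts each OSD with a header such as "====== osd.17 =======" and contains a "devices" row naming the disk. If the output on your release differs, adjust the patterns.

```shell
# Hypothetical helper: summarize `ceph-volume lvm list` text output as
# osd.N=/dev/sdX pairs. Assumes each OSD block starts with a header line
# like "====== osd.17 =======" and contains a row beginning with "devices".
osd_devices() {
  awk '
    /^=+ osd\.[0-9]+ =+/ { osd = $2 }                   # remember the current OSD
    $1 == "devices" && osd != "" { print osd "=" $2 }   # emit its device list
  '
}
```

Usage: ceph-volume lvm list | osd_devices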

Find Physical Location of Failed Devices

  • Remote into the host(s) with failed OSDs.
    • ssh root@vosd01
  • Map the Linux device name to its physical slot alias.
    • ls -al /dev/ | grep sdg
      • lrwxrwxrwx.  1 root       root              3 Jan  9 23:58 1-7 -> sdg
        brw-rw----.  1 root       disk         8,  96 Jan  9 23:58 sdg
    • ls -al /dev/ | grep sdi
      • lrwxrwxrwx.  1 root       root              3 Jan  9 23:58 1-6 -> sdi
        brw-rw----.  1 root       disk         8, 128 Jan  9 23:58 sdi
  • OSD 17 is in slot 1-7 on host vosd01
  • OSD 23 is in slot 1-6 on host vosd01
  • DO NOT PHYSICALLY REMOVE THESE DRIVES YET. SAFELY REMOVE THEM FROM THE CLUSTER FIRST.
  • DO NOT REMOVE FAILED OSDs FROM THE CLUSTER UNTIL YOU HAVE PHYSICALLY LOCATED THEM.
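
The ls and grep lookup above also prints the device node itself, so the alias has to be picked out by eye. As a sketch of the same lookup, the hypothetical helper slot_alias below prints only the names of symlinks that point at a given device. It assumes the slot aliases are symlinks directly under /dev, as in the listing above. The optional second argument is an illustrative addition for searching another directory.

```shell
# Hypothetical helper: print the alias symlink(s) that resolve to a device.
# $1: device name, for example sdg
# $2: directory to search (optional, defaults to /dev)
slot_alias() {
  dev=$1
  dir=${2:-/dev}
  for link in "$dir"/*; do
    [ -L "$link" ] || continue                  # only consider symlinks
    target=$(readlink "$link")
    if [ "$(basename "$target")" = "$dev" ]; then
      basename "$link"                          # for example 1-7
    fi
  done
}
```

Usage: slot_alias sdg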

Replace Failed OSDs

  • Remote into the host(s) with failed OSDs.
    • ssh root@vosd01
  • Destroy the OSDs
    • ceph osd destroy 17 --yes-i-really-mean-it
    • ceph osd destroy 23 --yes-i-really-mean-it
  • Remove the failed disks physically from the system
  • Insert the new drives into the same slots. If you use different slots, note their alias names and use those names in the commands below.
  • Wipe the new disks
    • ceph-volume lvm zap /dev/1-7
    • ceph-volume lvm zap /dev/1-6
  • Recreate each OSD with the create command, reusing the old OSD ID, with the new disk present
    • ceph-volume lvm create --osd-id 17 --data /dev/1-7
    • ceph-volume lvm create --osd-id 23 --data /dev/1-6
  • Observe data migration
    • watch -n1 ceph -s
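
Because the destroy and zap steps are destructive, it can help to print the exact commands for one OSD and review them before typing anything. The sketch below does only that. The function name replace_osd_plan is hypothetical, it runs nothing against the cluster, and it uses only the commands shown in the steps above.

```shell
# Hypothetical helper: print (do not run) the replacement commands for one OSD.
# $1: OSD ID, for example 17
# $2: device path of the new disk, for example /dev/1-7
replace_osd_plan() {
  id=$1
  dev=$2
  printf 'ceph osd destroy %s --yes-i-really-mean-it\n' "$id"
  printf '# swap the physical disk now\n'
  printf 'ceph-volume lvm zap %s\n' "$dev"
  printf 'ceph-volume lvm create --osd-id %s --data %s\n' "$id" "$dev"
}
```

Usage: replace_osd_plan 17 /dev/1-7. Check the printed OSD ID and slot alias against the values located earlier before running each line by hand.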