This article describes the process of repairing damaged objects on your Ceph Cluster.
Configured Ceph Cluster
Experiencing HEALTH_ERR state with damaged objects
- This page walks through how to repair a damaged object when the cluster reports the health status below.
  cluster:
    id:     cec9ca98-b59f-4d91-8ddd-43802195c735
    health: HEALTH_ERR
            1 scrub errors
            Possible data damage: 1 pg inconsistent
  data:
    pools:   10 pools, 1120 pgs
    objects: 29.66 k objects, 99 GiB
    usage:   320 GiB used, 7.7 TiB / 8.0 TiB avail
    pgs:     1119 active+clean
             1    active+clean+inconsistent
To find the damaged PG, run the following command:
ceph health detail
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 13.6 is active+clean+inconsistent, acting [4,18,14]
As seen in the output above, the PG with the damaged object is 13.6. Attempt to repair the PG:
ceph pg repair 13.6
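When more than one PG is reported as inconsistent, the lookup and repair steps above can be scripted. The following is a minimal sketch, not an official Ceph tool: the helper name `inconsistent_pgs` is hypothetical, and it assumes `ceph health detail` prints lines in the `pg <id> is ...inconsistent...` form shown in this article.

```shell
# List the IDs of PGs that `ceph health detail` reports as inconsistent.
# Reads the command output on stdin, so it can also be tried against saved output.
# Assumption: damaged PGs appear on lines of the form
#   pg 13.6 is active+clean+inconsistent, acting [4,18,14]
inconsistent_pgs() {
    awk '$1 == "pg" && /inconsistent/ { print $2 }'
}

# Usage (assumes the ceph CLI is configured on this host):
#   ceph health detail | inconsistent_pgs | while read -r pg; do
#       ceph pg repair "$pg"
#   done
```

Reading from stdin keeps the parsing separate from the cluster call, so the PG list can be reviewed before any repair is issued.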
- Confirm that the repair has begun, in either the UI or the terminal.
  data:
    pools:   10 pools, 1120 pgs
    objects: 29.66 k objects, 99 GiB
    usage:   320 GiB used, 7.7 TiB / 8.0 TiB avail
    pgs:     1119 active+clean
             1    active+clean+scrubbing+deep+inconsistent+repair
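Rather than re-running the status command by hand, the wait can be sketched as a polling loop. This is an illustrative helper under stated assumptions: the function name `wait_for_pg_repair`, the poll interval, and the poll limit are all hypothetical, and the check relies on the PG no longer appearing as inconsistent in `ceph health detail`.

```shell
# Poll `ceph health detail` until the given PG is no longer listed as inconsistent.
# Arguments: PG ID, seconds between polls (default 10), maximum polls (default 60).
# Caveat: if the ceph command itself fails and prints nothing, this sketch
# cannot tell that apart from a repaired PG, so verify the result afterwards.
wait_for_pg_repair() {
    pgid=$1
    interval=${2:-10}
    max_polls=${3:-60}
    i=0
    while [ "$i" -lt "$max_polls" ]; do
        # awk exits 0 only if a "pg <id> ... inconsistent" line is present.
        if ! ceph health detail | awk -v id="$pgid" \
            '$1 == "pg" && $2 == id && /inconsistent/ { found = 1 } END { exit !found }'; then
            echo "pg $pgid is no longer reported as inconsistent"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "pg $pgid is still inconsistent after $max_polls polls" >&2
    return 1
}

# Usage: wait_for_pg_repair 13.6
```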
- If the repair succeeds, the cluster should return to a healthy state.
Once the PG has been repaired, run the following two commands to confirm that the cluster is in a healthy state:
ceph -s
ceph health detail
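For use in a script or monitoring check, the final verification can be reduced to an exit status. The sketch below is an assumption-laden example rather than part of Ceph: the function name `cluster_is_healthy` is hypothetical, and it relies on `ceph health` printing a line that begins with `HEALTH_OK` when the cluster is healthy.

```shell
# Return 0 if `ceph health` reports HEALTH_OK, non-zero otherwise.
# Any other output (HEALTH_WARN, HEALTH_ERR, or no output at all) counts as unhealthy.
cluster_is_healthy() {
    case "$(ceph health 2>/dev/null)" in
        HEALTH_OK*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Usage:
#   if cluster_is_healthy; then
#       echo "cluster is healthy"
#   else
#       ceph health detail
#   fi
```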