KB450157 – Repairing Inconsistent Placement Groups with Damaged Objects

Scope/Description

  • This article describes the process of repairing inconsistent PGs/damaged objects on your Ceph Cluster.

Prerequisites

  • Ceph Cluster
  • Experiencing HEALTH_ERR state with damaged objects
  • PGs that are inconsistent

Steps

Identifying damaged PGs

  • The output of ceph -s shows a scrub error, one inconsistent PG, and possible data damage.
  cluster:
    id:     cec9ca98-b59f-4d91-8ddd-43802195c735
    health: HEALTH_ERR
            1 scrub errors
            Possible data damage: 1 pg inconsistent
  data:
    pools:   10 pools, 1120 pgs
    objects: 29.66 k objects, 99 GiB
    usage:   320 GiB used, 7.7 TiB / 8.0 TiB avail
    pgs:     1119 active+clean
             1    active+clean+inconsistent
  • To find the damaged PG, run ceph health detail.
ceph health detail

OSD_SCRUB_ERRORS 1 scrub errors 
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 13.6 is active+clean+inconsistent, acting [4,18,14]
  • As the output above shows, the PG with the damaged object is 13.6.
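  • When several PGs are affected, picking the IDs out of the output by eye gets tedious. The sketch below is one possible way to extract them. list_inconsistent_pgs is a hypothetical helper name, not a Ceph command, and it assumes the "pg <ID> is <state>" line format shown in the output above.

```shell
# Hypothetical helper: print the ID of every PG that "ceph health detail"
# reports as inconsistent. It reads the command's text output on stdin.
list_inconsistent_pgs() {
    # Match lines such as:
    #     pg 13.6 is active+clean+inconsistent, acting [4,18,14]
    awk '$1 == "pg" && $3 == "is" && $4 ~ /inconsistent/ { print $2 }'
}

# Usage on a live cluster (not run here):
#   ceph health detail | list_inconsistent_pgs
```

  • Because the helper only parses text, it can be tried against saved output before it is used on a live cluster.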

Repairing Inconsistent PGs

  • Repair the PG by running ceph pg repair <PG ID>, using the ID identified in the previous step.
ceph pg repair 13.6
  • Confirm that the repair has begun, either in the Ceph Dashboard or in a terminal with watch ceph -s.
  data:
    pools:   10 pools, 1120 pgs
    objects: 29.66 k objects, 99 GiB
    usage:   320 GiB used, 7.7 TiB / 8.0 TiB avail
    pgs:     1119 active+clean
             1    active+clean+scrubbing+deep+inconsistent+repair
  • If the repair succeeds, the cluster returns to a healthy state.
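  • The repair runs in the background, so the steps above can also be scripted as "start the repair, then poll until the PG is no longer flagged". The following is a minimal sketch under that assumption. pg_is_inconsistent and repair_and_wait are hypothetical helper names, and the 30-second interval and default of 60 attempts are arbitrary values chosen for illustration.

```shell
# Hypothetical helper: succeed (exit 0) if the given PG is listed as
# inconsistent in "ceph health detail" output supplied on stdin.
pg_is_inconsistent() {
    awk -v pg="$1" '
        $1 == "pg" && $2 == pg && $4 ~ /inconsistent/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Hypothetical wrapper: start a repair, then poll until the PG is no longer
# reported as inconsistent or the attempts run out.
#   $1 = PG ID, $2 = maximum number of polls (assumed default: 60)
repair_and_wait() {
    pgid="$1"
    tries="${2:-60}"
    ceph pg repair "$pgid" || return 1
    while [ "$tries" -gt 0 ]; do
        # Stop if the health query itself fails, rather than
        # mistaking empty output for a repaired PG.
        detail=$(ceph health detail) || return 1
        if ! printf '%s\n' "$detail" | pg_is_inconsistent "$pgid"; then
            echo "pg $pgid is no longer reported as inconsistent"
            return 0
        fi
        tries=$((tries - 1))
        sleep 30
    done
    echo "pg $pgid is still inconsistent; manual repair may be needed" >&2
    return 1
}

# Usage on a live cluster (not run here):
#   repair_and_wait 13.6
```

  • The attempt limit keeps the loop from running forever when the repair cannot fix the damaged object.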

Verification

  • Once the PG has been repaired, run the following two commands to confirm that the cluster is in a healthy state.
ceph -s
ceph health detail
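  • For unattended checks, the same verification can be reduced to an exit code. This sketch assumes that ceph health prints the status word (HEALTH_OK, HEALTH_WARN or HEALTH_ERR) as the first word of its first line. health_is_ok is a hypothetical helper name.

```shell
# Hypothetical helper: read "ceph health" output on stdin and succeed
# (exit 0) only if the first word of the first line is HEALTH_OK.
health_is_ok() {
    awk 'NR == 1 { ok = ($1 == "HEALTH_OK") } END { exit (ok ? 0 : 1) }'
}

# Usage on a live cluster (not run here):
#   if ceph health | health_is_ok; then
#       echo "cluster is healthy"
#   else
#       ceph health detail
#   fi
```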

Troubleshooting

  • If the process above fails to clear the HEALTH_ERR state, you may have to repair the damaged objects manually. See here for more details.