Resolve Scrub Errors on Ceph

While monitoring the Ceph cluster through the dashboard, a health warning appeared: HEALTH_ERR 1 scrub errors / Possible data damage: 1 pg inconsistent.

Situation

Running ceph -s gives the following output:

  cluster:
    id:     0260f99a-117e-4c7e-8fbe-86c483bcd7e9
    health: HEALTH_ERR
            1 scrub errors
            Possible data damage: 1 pg inconsistent

  services:
    mon: 3 daemons, quorum mon01,mon02,mon03 (age 10w)
    mgr: mon01(active, since 7w), standbys: mon02, mon03
    mds: cephfs:1 {0=mds01=up:active} 2 up:standby
    osd: 285 osds: 285 up (since 43h), 285 in (since 2w)
    rgw: 3 daemons active (cephrgw01, cephrgw02, cephrgw03)

  data:
    pools:   8 pools, 4328 pgs
    objects: 294.96M objects, 463 TiB
    usage:   694 TiB used, 1.3 PiB / 2.0 PiB avail
    pgs:     4320 active+clean
             7    active+clean+scrubbing+deep
             1    active+clean+scrubbing+deep+inconsistent

  io:
    client:   3.8 MiB/s rd, 188 MiB/s wr, 11 op/s rd, 732 op/s wr

Resolution

Run ceph health detail to find the ID of the inconsistent PG:

HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 5.6f1 is active+clean+scrubbing+deep+inconsistent, acting [7,141,208,199,70,37,182,131,120,259]
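
On a cluster with several damaged PGs, reading the IDs off by eye gets tedious. The following is a minimal sketch of a filter that pulls the PG ID out of output shaped like the lines above. The function name inconsistent_pgs is hypothetical, and the parsing assumes the "pg <id> is ...inconsistent..." line format shown here, which may differ between Ceph releases.

```shell
# Hypothetical helper: print the ID of every PG that `ceph health detail`
# reports as inconsistent. Reads the command output from stdin.
# Assumes detail lines look like: "pg 5.6f1 is active+...+inconsistent, acting [...]"
inconsistent_pgs() {
  awk '$1 == "pg" && /inconsistent/ { print $2 }'
}

# Usage (on a node with an admin keyring):
#   ceph health detail | inconsistent_pgs
```

Fed the output above, this prints 5.6f1, which is the ID used in the next step.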

Repair the PG with the command ceph pg repair $pgid:

ceph pg repair 5.6f1
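
The repair command only schedules the repair; the PG stays flagged until the repair scrub has finished. As a rough sketch, the check below re-reads ceph health detail and reports whether a given PG is still listed as inconsistent. The function name pg_still_inconsistent and the 60-second polling interval are assumptions for illustration, and the parsing relies on the same line format as the health detail output shown earlier.

```shell
# Hypothetical helper: succeed (exit 0) while the given PG is still listed
# as inconsistent in `ceph health detail`, fail (exit 1) once it is gone.
pg_still_inconsistent() {
  ceph health detail |
    awk -v pg="$1" '$1 == "pg" && $2 == pg && /inconsistent/ { found = 1 }
                    END { exit !found }'
}

# Usage: poll until the repair has cleared the flag.
#   ceph pg repair 5.6f1
#   while pg_still_inconsistent 5.6f1; do sleep 60; done
#   ceph -s
```

If the PG is still reported as inconsistent long after the repair was issued, the underlying OSD or disk probably needs a closer look before retrying.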