20 Ceph PG count

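If a pool has too few placement groups (PGs), both the performance of the Ceph cluster and the evenness of data distribution suffer.
This is one of the main causes of the nearfull osds error message.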
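
As a rough guide to what "enough" means, a commonly quoted rule of thumb is about 100 PGs per OSD, divided by the pool's replica count and rounded to a power of two. The sketch below assumes that rule; the helper name `suggest_pg_num`, the replica size of 3, and the choice to round to the nearest power of two are assumptions for illustration, not something this note prescribes. The figure is a budget for all pools on the same OSDs together.

```shell
# Hypothetical helper: estimate pg_num from OSD count and replica size.
# Assumes a target of 100 PGs per OSD unless a third argument is given.
suggest_pg_num() {
  osds=$1
  size=$2
  target_per_osd=${3:-100}
  raw=$(( osds * target_per_osd / size ))

  # Largest power of two that does not exceed the raw estimate.
  low=1
  while [ $(( low * 2 )) -le "$raw" ]; do
    low=$(( low * 2 ))
  done
  high=$(( low * 2 ))

  # Pick whichever neighbouring power of two is closer to the estimate.
  if [ $(( raw - low )) -le $(( high - raw )) ]; then
    echo "$low"
  else
    echo "$high"
  fi
}

suggest_pg_num 4 3   # 4 OSDs, 3 replicas: prints 128
```

For the 4-OSD cluster shown later in this note, the estimate is 4 * 100 / 3 = 133, which rounds to 128.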

Check the current PG count

ceph osd pool get TEST-pool pg_num
32

1. Preparation before the change

Stop scrubbing

ceph osd set noscrub
ceph osd set nodeep-scrub
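
To confirm that the flags took effect, the cluster flags can be read back. This is a minimal sketch that assumes `ceph osd dump` prints a line of the form `flags a,b,c`; the helper name `scrub_flags_set` is hypothetical.

```shell
# Print whichever of the noscrub / nodeep-scrub flags are currently set.
# Exit status is 0 if at least one of them is set, non-zero otherwise.
# Assumes `ceph osd dump` contains a line like "flags noscrub,sortbitwise".
scrub_flags_set() {
  ceph osd dump | awk '$1 == "flags" { print $2 }' | tr ',' '\n' |
    grep -E '^(noscrub|nodeep-scrub)$'
}

# Usage: scrub_flags_set || echo "scrubbing is not disabled"
```

The same check should print nothing once the flags are unset in step 4.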

Lower the backfill limits

ceph tell 'osd.*' injectargs --osd-max-backfills=1 --osd-recovery-max-active=3
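
Because injectargs changes only the running daemons, it can be worth reading the values back from each OSD. The sketch below assumes `ceph osd ls` and `ceph config show osd.<id> <option>` are available on this release; the helper name `show_backfill_limits` is hypothetical.

```shell
# Print the running backfill/recovery limits of every OSD, one line per OSD.
# Assumes `ceph config show <daemon> <option>` reports the running value.
show_backfill_limits() {
  for id in $(ceph osd ls); do
    printf 'osd.%s max_backfills=%s recovery_max_active=%s\n' "$id" \
      "$(ceph config show "osd.$id" osd_max_backfills)" \
      "$(ceph config show "osd.$id" osd_recovery_max_active)"
  done
}

# Usage: show_backfill_limits
```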

2. Change pg_num

ceph osd pool set TEST-pool pg_num 128
ceph osd pool set TEST-pool pgp_num 128
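
The two commands above can be wrapped so that pg_num and pgp_num are always set to the same value. This is a sketch only: the helper names are hypothetical, and the power-of-two check is an added safeguard based on the usual recommendation, not a step from this procedure.

```shell
# Succeeds only for positive powers of two (1, 2, 4, ..., 128, ...).
is_power_of_two() {
  n=$1
  [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

# Set pg_num and pgp_num of a pool to the same value.
set_pool_pg_count() {
  pool=$1
  count=$2
  if ! is_power_of_two "$count"; then
    echo "refusing: $count is not a power of two" >&2
    return 1
  fi
  ceph osd pool set "$pool" pg_num "$count" &&
    ceph osd pool set "$pool" pgp_num "$count"
}

# Usage: set_pool_pg_count TEST-pool 128
```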

3. Verify the change

# ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP    META     AVAIL    %USE  VAR   PGS  STATUS
 3    hdd  0.81870   1.00000  838 GiB   54 GiB   53 GiB  10 KiB  986 MiB  784 GiB  6.43  0.92   89      up
 0    hdd  0.81870   1.00000  838 GiB   54 GiB   53 GiB  11 KiB  555 MiB  785 GiB  6.38  0.91   89      up
 1    hdd  0.81870   1.00000  838 GiB   63 GiB   62 GiB   4 KiB  712 MiB  775 GiB  7.50  1.07  102      up
 2    hdd  0.81870   1.00000  838 GiB   65 GiB   64 GiB   9 KiB  435 MiB  774 GiB  7.71  1.10  107      up
                       TOTAL  3.3 TiB  235 GiB  232 GiB  37 KiB  2.6 GiB  3.0 TiB  7.01 


# ceph -s
  cluster:
    id:     82c91e96-51db-4813-8e53-0c0044a958f1
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph001,ceph002,ceph003 (age 16h)
    mgr: ceph001(active, since 16h), standbys: ceph003, ceph004, ceph002
    osd: 4 osds: 4 up (since 16h), 4 in (since 17h); 22 remapped pgs
 
  data:
    pools:   2 pools, 129 pgs
    objects: 20.27k objects, 78 GiB
    usage:   236 GiB used, 3.0 TiB / 3.3 TiB avail
    pgs:     0.775% pgs not active
             3006/60795 objects misplaced (4.944%)
             107 active+clean
             17  active+remapped+backfill_wait
             4   active+remapped+backfilling
             1   peering
 
  io:
    client:   11 KiB/s wr, 0 op/s rd, 4 op/s wr
    recovery: 44 MiB/s, 11 objects/s
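
Rather than rerunning ceph -s by hand, the misplaced percentage can be polled until it disappears from the status. The sketch below parses the "objects misplaced (N%)" line in the format shown above; the helper names and the polling interval are hypothetical.

```shell
# Extract the misplaced percentage from `ceph -s` text on stdin.
# Prints nothing when there is no "objects misplaced" line.
misplaced_percent() {
  sed -n 's/.*objects misplaced (\([0-9.]*\)%).*/\1/p'
}

# Poll `ceph -s` until it no longer reports misplaced objects.
# $1 is the polling interval in seconds (default 60).
wait_for_remap() {
  interval=${1:-60}
  while :; do
    status=$(ceph -s) || return 1   # stop if the cluster is unreachable
    pct=$(printf '%s\n' "$status" | misplaced_percent)
    [ -z "$pct" ] && break
    echo "misplaced: ${pct}%"
    sleep "$interval"
  done
  echo "no misplaced objects reported"
}

# Usage: wait_for_remap 30
```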

4. After completion

ceph osd unset noscrub
ceph osd unset nodeep-scrub

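Notes on behavior during the PG change

After pg_num is changed, Ceph keeps remapping misplaced PGs. The number of backfills that run at once appears to be governed by osd_max_backfills and osd_recovery_max_active.

Once the misplaced ratio falls below the value shown below (default: 5%), the next batch of PGs starts being remapped.

Note: the process can look as if it will never finish, but it does complete once all PGs have been remapped.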

target_max_misplaced_ratio

# ceph config get mgr target_max_misplaced_ratio
0.050000
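
The throttling can be checked by hand against the ceph -s output above, where 3006 of 60795 object copies are misplaced. The sketch below only does that arithmetic; the helper name `below_misplaced_target` is hypothetical, and the comparison is a simplification of whatever the mgr does internally.

```shell
# Succeed when misplaced/total is below the target ratio.
# $1 = misplaced object copies, $2 = total object copies, $3 = target ratio
below_misplaced_target() {
  awk -v m="$1" -v t="$2" -v r="$3" 'BEGIN { exit !(t > 0 && m / t < r) }'
}

# Figures from the status output above: 3006 / 60795 is about 0.0494.
if below_misplaced_target 3006 60795 0.05; then
  echo "below target: more PGs can be remapped"
fi
```

With these figures the ratio sits just under 0.05, which is consistent with the misplaced percentage hovering near 5% for most of the change.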
