Ceph: too many PGs per OSD

Date: 2016-11-23 17:55:19

Tags: ceph

I configured Ceph with the recommended values (using the formula from the documentation). I have 3 OSDs, and my config (which I have put on the monitor node and on all 3 OSDs) includes:

osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 150
osd pool default pgp num = 150

When I run ceph status, I get:

 health HEALTH_WARN
        too many PGs per OSD (1042 > max 300)

This is confusing for two reasons. First, because the recommended formula did not satisfy Ceph. Second, and most puzzling, because it says I have 1042 PGs per OSD when my configuration says 150.

What am I doing wrong?

2 answers:

Answer 0 (score: 9)

Before setting the PG count, you need to know 3 things.

1. Number of OSDs

ceph osd ls

Sample Output:
 0
 1
 2

 Here the total number of OSDs is three.

<强> 2。池数

ceph osd pool ls
rados lspools

Sample Output:
  rbd
  images
  vms
  volumes
  backups

Here the total number of pools is five.

3. Replication count

ceph osd dump | grep repli

Sample Output:
 pool 0 'rbd' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 38 flags hashpspool stripe_width 0
 pool 1 'images' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 40 flags hashpspool stripe_width 0
 pool 2 'vms' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 42 flags hashpspool stripe_width 0
 pool 3 'volumes' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 36 flags hashpspool stripe_width 0
 pool 4 'backups' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 44 flags hashpspool stripe_width 0

You can see that each pool has a replication count of two.

Now let's get to the calculation.

Calculation:

Total PGs calculation:

Total PGs = (Total_number_of_OSD * 100) / max_replication_count

This result must be rounded up to the nearest power of 2.

Example:

Number of OSDs: 3
Replication count: 2

Total PGs = (3 * 100) / 2 = 150. Rounding 150 up to the nearest power of 2 gives 256.

So the maximum recommended PG count is 256.
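
If you want that rounding in a script, here is a minimal shell sketch (the variable names are mine, not from the answer):

pgs=$(( 3 * 100 / 2 ))   # 150, from the formula above
pow=1
while [ "$pow" -lt "$pgs" ]; do pow=$(( pow * 2 )); done
echo "$pow"              # prints 256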

You can set the PG count for each pool.

Total PG calculation per pool:

Total PGs = ((Total_number_of_OSD * 100) / max_replication_count) / pool_count

This result must be rounded up to the nearest power of 2.

Example:

Number of OSDs: 3
Replication count: 2
Number of pools: 5

Total PGs = ((3 * 100) / 2) / 5 = 150 / 5 = 30. Rounding 30 up to the nearest power of 2 gives 32.

So the total PG count per pool is 32.
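
Applied to the five pools from the sample output above, that would look something like the sketch below. Note that before the Nautilus release pg_num could only be increased, so a pool already above 32 (such as 'rbd' with 64 here) cannot be shrunk this way:

for pool in rbd images vms volumes backups; do
    ceph osd pool set "$pool" pg_num 32
    ceph osd pool set "$pool" pgp_num 32
done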

Power-of-2 table:

2^0     1
2^1     2
2^2     4
2^3     8
2^4     16
2^5     32
2^6     64
2^7     128
2^8     256
2^9     512
2^10    1024

Useful commands

ceph osd pool create <pool-name> <pg-number> <pgp-number> - To create a new pool

ceph osd pool get <pool-name> pg_num - To get the number of PGs in a pool

ceph osd pool get <pool-name> pgp_num - To get the number of PGPs in a pool

ceph osd pool set <pool-name> pg_num <number> - To increase the number of PGs in a pool

ceph osd pool set <pool-name> pgp_num <number> - To increase the number of PGPs in a pool

* Usually the pg and pgp numbers are the same.
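
For example, to create a hypothetical pool named test with the 32 PGs calculated above and verify it:

ceph osd pool create test 32 32   # pool name, pg-number, pgp-number
ceph osd pool get test pg_num     # -> pg_num: 32
ceph osd pool get test pgp_num    # -> pgp_num: 32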

Answer 1 (score: 0)

How I fixed it on Luminous 12.2.4:

Too many PGs per OSD (380 > max 200) may cause many blocked requests.

First you need to set:

[global]

mon_max_pg_per_osd = 800  # < depends on your number of PGs
osd max pg per osd hard ratio = 10 # < default is 2; try to set at least 5
mon allow pool delete = true # without it you can't remove a pool

Then restart all MONs and OSDs, one by one.
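
On a systemd-based install that restart would look something like this (the daemon IDs ceph2 and 3 are taken from the check commands below; yours will differ):

systemctl restart ceph-mon@ceph2   # on the monitor host
systemctl restart ceph-osd@3       # on each OSD host, one OSD at a time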

Check the values:

ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok config get  mon_max_pg_per_osd
ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok config get osd_max_pg_per_osd_hard_ratio

Now look here:

rados lspools
ceph osd pool get .users.email pg_num

In my case the default pg_num was 128 or something like that (my cluster is 4 years old and has been through many upgrades and many changes). You can reduce it like this.

Be careful:

ceph osd pool create .users.email.new 8
rados cppool .users.email .users.email.new
ceph osd pool delete .users.email .users.email --yes-i-really-really-mean-it
ceph osd pool rename .users.email.new .users.email
ceph osd pool application enable .users.email rgw

If that's still not enough, try to find another pool you can cut down.
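
To list every pool with its pg_num and spot candidates (the output format varies by release):

ceph osd pool ls detail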