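I set up a small test cluster running version 1.4.4 on 3 nodes, with one 20G OSD per node. I deployed the cluster with databaseSizeMB and journalSizeMB set as described above. I then created a replicated pool and set its target size ratio to 95%. The deployment succeeded. Next I tested it by filling the replicated pool, and the result was bad: all 3 OSD pods crashed and could not start again, with the following in the logs: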
debug 2020-10-06T07:38:30.108+0000 7f8e0c5a2700 -1 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write failed to allocate 0x10000 allocated 0x 0 min_alloc_size 0x10000 available 0x 0
debug 2020-10-06T07:38:30.108+0000 7f8e0c5a2700 -1 bluestore(/var/lib/ceph/osd/ceph-0) _do_write _do_alloc_write failed with (28) No space left on device
debug 2020-10-06T07:38:30.108+0000 7f8e0c5a2700 -1 bluestore(/var/lib/ceph/osd/ceph-0) _txc_add_transaction error (28) No space left on device not handled on operation 10 (op 2, counting from 0)
debug 2020-10-06T07:38:30.108+0000 7f8e0c5a2700 -1 bluestore(/var/lib/ceph/osd/ceph-0) ENOSPC from bluestore, misconfigured cluster
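It looks like there is not enough space left to write metadata. This is a test cluster, so I can easily reset it, but I don't want to run into this problem on a production cluster. What is the best way to avoid it?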