
Ceph don'ts - replication size of 2

Joachim Kraftmayer

Again and again I come across people who use a replication size of 2 for a replicated Ceph pool.

If you know exactly what you are doing, you can do this, but I would strongly advise against it.

One simple reason for this is that you cannot form a clear majority with only two parties; there always has to be at least a third.
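
As a sketch of what that means in practice, this is how a replicated pool with three copies would typically be set up (the pool name `mypool` and the PG count are just example values):

```bash
# Create a replicated pool with an example PG count of 128
ceph osd pool create mypool 128 replicated

# Three copies in total; writes require at least two of them
ceph osd pool set mypool size 3
ceph osd pool set mypool min_size 2
```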

There are failure scenarios in which it can quickly happen that both OSDs (osdA and osdB) of a placement group (replication size = 2) become unavailable. If osdA fails, the cluster only has one copy of the object left, and with the default value (min_size = 2) on the pool the cluster no longer allows any write operations on that object.
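
To see how a pool is currently configured and whether placement groups are already running in that degraded state, something like the following should work (again assuming the example pool `mypool`):

```bash
# Check the replication settings of the pool
ceph osd pool get mypool size
ceph osd pool get mypool min_size

# After osdA fails, affected PGs show up as undersized/degraded;
# with min_size = 2 and only one copy left, I/O to them is paused
ceph health detail
ceph pg dump_stuck undersized
```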

With min_size = 1 (not recommended), writes continue on osdB while osdA is offline. If osdB then goes down for a short time and osdA comes back, osdA does not know whether further write operations were carried out on osdB during its offline phase.
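
In that state the placement group usually cannot go active, because peering has to probe the down osdB for the newer writes first. A rough sketch of how to inspect this, with `<pgid>` as a placeholder for the affected placement group:

```bash
# List PGs that are stuck inactive after the failure
ceph pg dump_stuck inactive

# Query a PG to see which down OSDs peering still wants to probe
ceph pg <pgid> query

# Show which OSDs are blocking peering for others
ceph osd blocked-by
```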

You can then only hope that all OSDs come back, or you have to decide manually which data set is the most current, while in the background more and more blocked requests accumulate in the cluster from clients that want to access the data.
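
For completeness, a hedged sketch of what that manual decision looks like; these commands accept possible data loss and should only be used as a last resort, with `<osd-id>` and `<pgid>` as placeholders:

```bash
# Depending on the release, the stuck I/O is reported as
# blocked or slow requests in the health output
ceph health detail

# Declare the dead OSD permanently lost so peering can proceed
ceph osd lost <osd-id> --yes-i-really-mean-it

# Revert unfound objects to their last known good copy
# (the manual choice of the "most current" data set)
ceph pg <pgid> mark_unfound_lost revert
```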