
Properly remove a Ceph OSD from a production cluster

· One min read
Joachim Kraftmayer

In a production cluster, removing OSDs or entire hosts can affect regular operations for users, depending on the load. It is therefore recommended to take an OSD or a host out of production gradually, so that full replication is maintained throughout the process.

You can execute the commands manually, step by step, waiting each time until the data has been completely redistributed in the cluster.

ceph osd crush reweight osd.<ID> 8.0
ceph osd crush reweight osd.<ID> 6.0
ceph osd crush reweight osd.<ID> 4.0
ceph osd crush reweight osd.<ID> 2.0
ceph osd crush reweight osd.<ID> 1.0
ceph osd crush reweight osd.<ID> 0.0
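The stepwise reweighting above can be sketched as a small shell function that lowers the CRUSH weight one step at a time and waits for the cluster to settle before the next step. This is a minimal sketch, not the script mentioned below: the weight steps, the 60-second poll interval, and the `backfill|recover` health check are assumptions you should adapt to your cluster.

```shell
#!/usr/bin/env bash
# drain_osd <id> <weight>... : hypothetical sketch of gradually lowering the
# CRUSH weight of one OSD, waiting for rebalancing to finish between steps.
# Set DRY_RUN=1 to print the commands instead of executing them.
drain_osd() {
  local id="$1"; shift
  local w
  for w in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "ceph osd crush reweight osd.${id} ${w}"
    else
      ceph osd crush reweight "osd.${id}" "${w}"
      # Poll until no PGs are backfilling or recovering any more
      # (assumed health-check pattern; adjust to your Ceph release).
      while ceph health detail | grep -Eq 'backfill|recover'; do
        sleep 60
      done
    fi
  done
}

# Example dry run, printing the command sequence for osd.3:
# DRY_RUN=1 drain_osd 3 6.0 4.0 2.0 1.0 0.0
```

The dry-run switch makes it easy to review the exact command sequence before letting it loose on a live cluster.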

We wrote our own script for this automation years ago, so it should also work with earlier Ceph releases such as Hammer, Jewel, Kraken and Luminous.

ceph osd out <ID>
# on init-based systems:
service ceph stop osd.<ID>
# on systemd-based systems:
systemctl stop ceph-osd@<ID>
ceph osd crush remove osd.<ID>
ceph auth del osd.<ID>
ceph osd rm <ID>
Caution: as soon as elements are deleted from the CRUSH map, the Ceph cluster starts rebalancing the data distribution.
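The removal steps above can be wrapped in a short function as well. This is a hypothetical sketch, not our original script: it assumes a systemd-based host (swap in `service ceph stop osd.<ID>` on init-based systems) and offers a `DRY_RUN=1` mode so the command sequence can be inspected first.

```shell
#!/usr/bin/env bash
# remove_osd <id> : hypothetical sketch of the OSD removal steps from this
# post, for a systemd-based host. Set DRY_RUN=1 to print instead of execute.
remove_osd() {
  local id="$1"
  local cmds=(
    "ceph osd out ${id}"
    "systemctl stop ceph-osd@${id}"
    "ceph osd crush remove osd.${id}"
    "ceph auth del osd.${id}"
    "ceph osd rm ${id}"
  )
  local c
  for c in "${cmds[@]}"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$c"
    else
      # Word-splitting is intentional: each entry is a plain command line.
      $c
    fi
  done
}

# Example dry run for osd.5:
# DRY_RUN=1 remove_osd 5
```

Remember that the OSD should already have been drained to CRUSH weight 0.0 before running these steps, otherwise the `crush remove` triggers a full rebalance.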