This guide covers migrating Ceph Monitors (MONs) from worker nodes to control-plane (master) nodes. It starts with the standard, automated approach and falls back to "Advanced Manual Ops" if the automated process gets stuck on a scheduling conflict.
## Phase 1: Normal MON Migration
Under normal circumstances, updating the CephCluster custom resource (CR) should trigger Rook to drain and move the MONs automatically.
1. **Update the CephCluster Placement**

   Apply your new placement rules and tolerations to the CephCluster YAML:

   ```yaml
   spec:
     placement:
       mon:
         nodeAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
             nodeSelectorTerms:
               - matchExpressions:
                   - key: node-role.kubernetes.io/control-plane
                     operator: Exists
         tolerations:
           - key: "node-role.kubernetes.io/control-plane"
             operator: "Exists"
             effect: "NoSchedule"
   ```
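   A minimal way to apply this, assuming the default cluster CR name `rook-ceph` (adjust if yours differs):

   ```sh
   # Open the CephCluster CR and add the placement block shown above under spec
   kubectl -n rook-ceph edit cephcluster rook-ceph
   ```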
2. **Monitor the Transition**

   Watch the pods. Rook should terminate one MON at a time and attempt to recreate it on a node matching the new affinity:

   ```sh
   kubectl -n rook-ceph get pods -l app=rook-ceph-mon -w
   ```

   - **Success:** If the pods move to master nodes and reach `Running` status, no further action is needed.
   - **Failure:** If a pod remains `Pending` for more than 5 minutes, proceed to Phase 2.
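   To see why a pod is stuck `Pending`, check its scheduling events (`<stuck-mon-pod>` is a placeholder for the actual pod name):

   ```sh
   # The Events section at the bottom typically names the unsatisfied affinity or selector
   kubectl -n rook-ceph describe pod <stuck-mon-pod>
   ```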
## Phase 2: Advanced Manual Ops (The Recovery)
If a MON is stuck, it is likely due to a conflict where the Deployment carries both the new node affinity (master) and an old nodeSelector (worker).
1. **Freeze the Operator**

   Stop the operator from making further changes while you perform surgery:

   ```sh
   kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas=0
   ```
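   Before proceeding, confirm the operator pod has actually terminated (the label below is the one the standard Rook deployment uses, but verify it in your install):

   ```sh
   # Should return no running pods once the scale-down completes
   kubectl -n rook-ceph get pods -l app=rook-ceph-operator
   ```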
2. **Identify and Clear the Conflict**

   The goal is to ensure the `nodeSelector` is not forcing the MON onto a node that the `nodeAffinity` forbids.

   Check for the conflict:

   ```sh
   kubectl -n rook-ceph get deploy <stuck-mon-name> -o \
     jsonpath='{.spec.template.spec.nodeSelector}'
   ```

   **The Conflict Rule:** If the command above returns a selector your new affinity cannot satisfy (e.g., `kubernetes.io/hostname: worker-1` while the affinity requires a master node), the Pod will never schedule.
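   If you want to unblock the Pod immediately instead of waiting for the recreate in Step 4, one option is to strip the stale selector with a JSON patch (a sketch; `<stuck-mon-name>` is the same placeholder as above):

   ```sh
   # Remove the conflicting nodeSelector from the pod template
   kubectl -n rook-ceph patch deploy <stuck-mon-name> --type=json \
     -p='[{"op":"remove","path":"/spec/template/spec/nodeSelector"}]'
   ```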
3. **Manually Re-map the MON Endpoints**

   Rook uses a ConfigMap to remember where each MON belongs. If the MON is stuck, you must update its "home" manually.

   Edit the ConfigMap:

   ```sh
   kubectl -n rook-ceph edit cm rook-ceph-mon-endpoints
   ```

   **Modify the entry:** Locate the stuck MON's ID (e.g., `a`, `b`, or `c`) and change its recorded IP address and node name to match the target master node, cleaning up the conflicting assignment (or amend the Deployment's affinity and nodeSelector fields, as in Step 2, if required). A sketch of the layout follows below.

   > **Note:** Be extremely careful to keep the syntax intact. Only change the values for the specific MON being moved.
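   For orientation, the ConfigMap typically looks roughly like the sketch below (illustrative names, IPs, and ports; the exact keys can vary by Rook version). The `mapping` key records each MON's assigned node:

   ```yaml
   # Sketch of rook-ceph-mon-endpoints (all values are examples only)
   data:
     data: a=10.96.0.10:6789,b=10.96.0.11:6789,c=10.96.0.12:6789
     maxMonId: "2"
     mapping: '{"node":{"a":{"Name":"worker-1","Hostname":"worker-1","Address":"10.0.0.4"},"b":{"Name":"master-1","Hostname":"master-1","Address":"10.0.0.1"},"c":{"Name":"master-2","Hostname":"master-2","Address":"10.0.0.2"}}}'
   ```

   To move MON `a` to a master node in this sketch, you would update its `Name`, `Hostname`, and `Address` under `mapping` (and, if needed, its endpoint in `data`) to the target node's values.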
4. **Delete the Conflicted Deployment**

   Delete the stuck MON Deployment entirely. Because you updated the ConfigMap in Step 3, Rook will recreate this Deployment with the correct settings when it restarts:

   ```sh
   kubectl -n rook-ceph delete deploy <stuck-mon-deployment-name>
   ```
5. **Resume Reconciliation**

   Scale the operator back up:

   ```sh
   kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas=1
   ```
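   You can follow the reconciliation as the operator recreates the MON (exact log phrasing varies by Rook version):

   ```sh
   # Watch the operator bring the cluster back in line with the CR
   kubectl -n rook-ceph logs deploy/rook-ceph-operator -f
   ```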
6. **Final Verification**

   Ensure the new Pod has cleared the nodeSelector conflict and is running on a master node:

   ```sh
   kubectl -n rook-ceph get pod -l app=rook-ceph-mon -o wide
   ```
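   Beyond Pod placement, it is worth confirming the moved MON has rejoined quorum. A quick check, assuming the Rook toolbox (`rook-ceph-tools`) is deployed:

   ```sh
   # All MON IDs should appear in the quorum list and health should be HEALTH_OK
   kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status
   ```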