Manually Removing an OSD From Your Ceph Cluster (without moving all your data multiple times)

The process of manually removing an OSD from your ceph cluster, as documented in the current official ceph docs (Luminous release as of this writing), will result in the data being rebalanced twice. Luckily, Sébastien Han told us how to do it correctly on his blog back in 2015. This is mainly a rehash of what he said (so I can always find it) with minor updates. Note: this is for planned disk removal, not post-failure cleanup.
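
Before starting, it can help to confirm exactly which OSD ID you are about to remove and which host it lives on. ceph osd tree prints the whole crush hierarchy, and ceph osd find <ID> (with the numeric ID you plan to remove) reports the host and location of a single OSD:

$ ceph osd tree
$ ceph osd find <ID>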

First, reweight the OSD to 0 in the crush map to begin draining all of its data to other disks. Note that we are talking about the crush weight (based on disk capacity), not the reweight (a 0-1 value used to help balance disks that fill up ahead of their friends).

$ ceph osd crush reweight osd.<ID> 0.0
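
If you want to watch the data drain off, ceph -s in a watch loop shows the recovery/backfill progress (the 30-second interval is arbitrary), and ceph osd df tree shows per-OSD utilization, including the WEIGHT and REWEIGHT columns mentioned above, so you can see the OSD's usage drop toward zero:

$ watch -n 30 ceph -s
$ ceph osd df tree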

This will take a little while, depending on the size and utilization of your cluster. Just wait until ceph -s reports the cluster healthy again. Then:

$ ceph osd out <ID>

I usually use one node as a ceph-deploy/management server for most of these commands. On the server actually hosting the OSD:

# systemctl stop ceph-osd@<ID>
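
If you want to make sure the daemon is actually down before proceeding, systemctl can confirm it (the OSD will also show as down in ceph -s and ceph osd tree):

# systemctl status ceph-osd@<ID>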

Back on your management host:

$ ceph osd crush remove osd.<ID>
$ ceph auth del osd.<ID>
$ ceph osd rm <ID>
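
At this point osd.<ID> should be gone from both the crush map and the OSD map, and ceph osd tree is the quickest sanity check. The auth grep (use ceph auth list instead of ceph auth ls on older releases) should come back empty:

$ ceph osd tree
$ ceph auth ls | grep osd.<ID>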

If this is the only or last OSD on a host, I have found that the host can hang out in your crush map even when empty. To get rid of it:

$ ceph osd crush remove <hostname>
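
The <hostname> here is the crush bucket name exactly as it appears in ceph osd tree (usually the short hostname rather than the FQDN), so if the removal complains, check the tree for the exact spelling:

$ ceph osd tree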

This keeps your ceph osd tree output nice and neat.
