Manually Removing an OSD From Your Ceph Cluster (without moving all your data multiple times)

The process of manually removing an OSD from your Ceph cluster, as documented in the current official Ceph docs (the Luminous release as of this writing), results in your data being rebalanced twice. Luckily, Sébastien Han explained how to do it correctly on his blog back in 2015. This is mainly a rehash of what he said (so I can always find it) with minor updates. Note that this is for planned disk removal, not post-failure recovery. Continue reading “Manually Removing an OSD From Your Ceph Cluster (without moving all your data multiple times)”
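For quick reference, here is a minimal sketch of the quieter sequence, assuming osd.&lt;ID&gt; stands in for the OSD being retired and the cluster is otherwise healthy. The key idea is to drain the OSD by reweighting it to zero in CRUSH first, so data only migrates once; the exact steps in the full post may differ slightly:

# Drain the OSD first; this triggers the only rebalance
$ ceph osd crush reweight osd.<ID> 0
# Wait until the cluster is back to HEALTH_OK and the OSD holds no PGs, then:
$ ceph osd out <ID>
# Stop the daemon on the host that owns the disk
$ sudo systemctl stop ceph-osd@<ID>
# Remove it from the CRUSH map, delete its key, and remove the OSD entry
$ ceph osd crush remove osd.<ID>
$ ceph auth del osd.<ID>
$ ceph osd rm <ID>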

Ceph: key for mgr.HOST exists but cap mds does not match

Somewhere along the line, maybe during the upgrade to Luminous, one of my larger Ceph clusters got borked up. Everything was running fine, but my two dedicated MDSes, which also act as MONs, weren’t running the MGR daemon. Easy enough to fix with ceph-deploy:

$ ceph-deploy mgr create HOST
.....
[HOST][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.HOST mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-HOST/keyring
[HOST][ERROR ] Error EINVAL: key for mgr.HOST exists but cap mds does not match
[HOST][ERROR ] exit code from command was: 22
[ceph_deploy.mgr][ERROR ] could not create mgr
[ceph_deploy][ERROR ] GenericError: Failed to create 1 MGRs
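The likely culprit, assuming the existing mgr.HOST key was created at some point with different capabilities than ceph-deploy now asks for, is a stale set of caps on that key. One possible fix (a sketch, not necessarily the exact resolution in the full post) is to bring the key’s caps in line with what the get-or-create call above expects, then re-run ceph-deploy:

# Inspect the caps currently attached to the existing key
$ ceph auth get mgr.HOST
# Update them to match what ceph-deploy's get-or-create expects
$ ceph auth caps mgr.HOST mon 'allow profile mgr' osd 'allow *' mds 'allow *'
# Then retry: ceph-deploy mgr create HOST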

Continue reading “Ceph: key for mgr.HOST exists but cap mds does not match”