Hello Ceph community! Here at Clyso we’ve been thinking quite a bit about the tuning defaults and hardware/software recommendations we will be making for users of our upcoming Clyso Enterprise Storage (CES) product based on Ceph. We decided that given how useful some of this information is both for CES and for the upstream project, we’d open the document up to the community for feedback and to help us build a better product. We’ll be adding more content as time goes on. Feel free to reach out at mark.nelson at clyso.com if you have thoughts or questions!
ceph osd migrate DB to larger ssd/flash device
First we wanted to use ceph-bluestore-tool bluefs-bdev-new-wal. However, it turned out that it is not possible to ensure that the second DB is actually used. For this reason, we decided to migrate the entire BlueFS of the OSD to an SSD/flash device.
Verify the current osd bluestore setup
ceph-bluestore-tool show-label --dev <device> [...]
Verify the current size of the osd bluestore DB
ceph-bluestore-tool bluefs-bdev-sizes --path <osd path>
Migrate the osd bluestore DB to the new ssd/flash device
ceph-bluestore-tool bluefs-bdev-migrate --path <osd path> --dev-target <new-device> --devs-source <device1> [--devs-source <device2>]
Verify the size of the osd bluestore DB after the migration
ceph-bluestore-tool bluefs-bdev-sizes --path <osd path>
If the size does not correspond to the new target size, execute the following command:
ceph-bluestore-tool bluefs-bdev-expand --path <osd path>
Instruct BlueFS to check the size of its block devices and, if they have expanded, make use of the additional space. Please note that only the new files created by BlueFS will be allocated on the preferred block device if it has enough free space, and the existing files that have spilled over to the slow device will be gradually removed when RocksDB performs compaction. In other words, if there is any data spilled over to the slow device, it will be moved to the fast device over time. https://docs.ceph.com/en/octopus/man/8/ceph-bluestore-tool/#commands
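To make the size check above concrete, here is a small Python sketch that parses the per-device lines printed by ceph-bluestore-tool bluefs-bdev-sizes and reports how full each BlueFS device is. The sample output embedded below is illustrative only (the exact line format varies between Ceph releases, so the regex may need adjusting for your version):

```python
import re

# Illustrative sample output from `ceph-bluestore-tool bluefs-bdev-sizes`;
# the hex values are made up, and the line format differs between releases.
sample = """\
1 : device size 0x1dcf000000 : using 0x4ca00000(1.2 GiB)
2 : device size 0x6fc7bfe000 : using 0x18ba00000(6.2 GiB)
"""

pattern = re.compile(
    r"^(?P<dev>\d+) : device size 0x(?P<size>[0-9a-f]+) : using 0x(?P<used>[0-9a-f]+)",
    re.M,
)

for m in pattern.finditer(sample):
    size = int(m.group("size"), 16)
    used = int(m.group("used"), 16)
    print(f"device {m.group('dev')}: {size / 2**30:.1f} GiB total, "
          f"{used / size * 100:.1f}% used")
```

If the reported total size of the DB device does not match the new target device after the migration, that is the cue to run bluefs-bdev-expand as described above.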
Verify the new osd bluestore setup
ceph-bluestore-tool show-label --dev <device> [...]
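ceph-bluestore-tool show-label emits JSON, which makes the final verification easy to script. The sketch below picks out the device carrying the "bluefs db" label; the paths, UUID, and sizes in the embedded sample are made up for illustration, and on a real node you would feed in the actual show-label output instead:

```python
import json

# Illustrative `ceph-bluestore-tool show-label` output; paths, uuid and
# sizes are invented examples, not from a real cluster.
label_json = """
{
  "/var/lib/ceph/osd/ceph-0/block": {
    "osd_uuid": "11111111-2222-3333-4444-555555555555",
    "size": 12000138625024,
    "description": "main"
  },
  "/var/lib/ceph/osd/ceph-0/block.db": {
    "osd_uuid": "11111111-2222-3333-4444-555555555555",
    "size": 64424509440,
    "description": "bluefs db"
  }
}
"""

labels = json.loads(label_json)
# The device whose label description is "bluefs db" holds the DB after
# the migration; it should now be the new ssd/flash device.
db_devices = [path for path, label in labels.items()
              if label.get("description") == "bluefs db"]
print("DB device(s):", db_devices)
```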
Update
You might be interested in a migration method at a higher layer with ceph-volume lvm:
docs.clyso.com/blog/ceph-volume-ceph-osd-migrate-db-to-larger-ssd-flash-device/
Appendix
I'm trying to figure out the appropriate process for adding a separate SSD block.db to an existing OSD. From what I gather the two steps are: 1. Use ceph-bluestore-tool bluefs-bdev-new-db to add the new db device 2. Migrate the data with ceph-bluestore-tool bluefs-bdev-migrate. I followed this and got both executed fine without any error. Yet when the OSD got started up, it keeps on using the integrated block.db instead of the new db. The block.db link to the new db device was deleted. Again, no error, just not using the new db.
www.spinics.net/lists/ceph-users/msg62357.html
Sources
docs.ceph.com/en/octopus/man/8/ceph-bluestore-tool
ceph-volume - ceph osd migrate DB to larger ssd/flash device
But, as I already mentioned (for a slightly different case), in newer versions there is ceph-volume lvm migrate [1], which I think allows doing the same in a much simpler way. I have not tried it yet and the documentation is not very clear to me, so one needs to experiment with this before writing exact instructions. We might also need to use the new-db [2] and new-wal [3] commands before running migrate, but I am not sure they are needed for this particular case.
[1] https://docs.ceph.com/en/latest/ceph-volume/lvm/migrate/
verify ceph osd DB and WAL setup
When configuring OSDs in a mixed setup with DB and WAL colocated on a flash device (SSD or NVMe), there has always been confusion about where the DB and the WAL really end up.
With a simple test it can be checked:
The location of the DB for the respective OSD can be verified via
ceph osd metadata osd.<id>
and the variable "bluefs_dedicated_db": "1".
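Since ceph osd metadata also returns JSON, the check can be scripted. The sketch below inspects an illustrative excerpt of the metadata (only the keys used here are shown, and the values are examples, not from a real cluster):

```python
import json

# Illustrative excerpt of `ceph osd metadata osd.<id>` output; the values
# are examples. "1" means the OSD has a dedicated device for that role.
metadata_json = """
{
  "bluefs": "1",
  "bluefs_dedicated_db": "1",
  "bluefs_dedicated_wal": "0"
}
"""

meta = json.loads(metadata_json)
has_dedicated_db = meta.get("bluefs_dedicated_db") == "1"
has_dedicated_wal = meta.get("bluefs_dedicated_wal") == "1"
print("dedicated DB :", has_dedicated_db)
print("dedicated WAL:", has_dedicated_wal)
```

In this example the OSD has a dedicated DB device but no dedicated WAL device, which in a mixed setup means the WAL lives together with the DB.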
In earlier Ceph versions the WAL was created separately; in later versions it is automatically placed on the same device as the DB.
The WAL can be easily tested by using the ceph tell osd.<id> bench command.
First you check larger write operations with the command:
ceph tell osd.0 bench 65536 409600
Second, you check with writes that are smaller than bluestore_prefer_deferred_size_hdd (64k):
ceph tell osd.0 bench 65536 4096
If you compare the IOPS of the two tests, one result should correspond to the IOPS of an SSD, while the other should be quite low, matching the HDD. From this you can tell whether the WAL is on the HDD or on the flash device.
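The comparison logic can be sketched as follows. The IOPS numbers and the HDD ceiling below are made-up example values, not measurements; in practice you would plug in the iops fields reported by the two bench runs:

```python
# Rough interpretation of the two bench runs. All numbers here are
# invented examples; substitute the values reported by `ceph tell osd.N bench`.
HDD_IOPS_CEILING = 500   # assumed rough upper bound for small writes on an HDD

large_block_iops = 180   # e.g. from the large-block run (blocks above 64k)
small_block_iops = 4200  # e.g. from the small-block run (4k blocks)

# Writes below bluestore_prefer_deferred_size_hdd (64k) are deferred: they
# are acknowledged once they hit the WAL. So if the small-block IOPS are far
# beyond what an HDD can deliver, the WAL must sit on the flash device.
if small_block_iops > HDD_IOPS_CEILING:
    print("WAL appears to be on the flash device")
else:
    print("WAL appears to be on the HDD")
```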