Skip to main content

4 posts tagged with "db"

View All Tags

· 2 min read
Joachim Kraftmayer

ceph-volume can be used to create for a existing OSD a new WAL/DB on a faster device without the need to recreate the OSD.

ceph-volume lvm new-db --osd-id 15 --osd-fsid FSID --target cephdb/cephdb1
--> NameError: name 'get_first_lv' is not defined
this is a bug in ceph-volume v16.2.7 that will be fixed in v16.2.8
[https://github.com/ceph/ceph/pull/44209](https://github.com/ceph/ceph/pull/44209)

First create a new Logical Volume on the Device that will hold the new WAL/DB

vgcreate cephdb /dev/sdb
Volume group "cephdb" successfully created
lvcreate -L 100G -n cephdb1 cephdb
Logical volume "cephdb1" created.

Now stop running OSD and if it was deactivated ( cephadm ) then activate it on the host

systemctl stop ceph-FSID@osd.0.service
ceph-volume lvm activate --all --no-systemd

Create new WAL/DB on new Device

ceph-volume lvm new-db --osd-id 0 --osd-fsid OSD-FSID --target cephdb/cephdb1
--> Making new volume at /dev/cephdb/cephdb1 for OSD: 0 (/var/lib/ceph/osd/ceph-0)
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block.db
Running command: /bin/chown -R ceph:ceph /dev/dm-1
--> New volume attached.

Migrate existing WAL/DB to new Device

ceph-volume lvm migrate --osd-id 0 --osd-fsid OSD-FSID --from data --target cephdb/cephdb1
--> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-0/block'] Target: /var/lib/ceph/osd/ceph-0/block.db
--> Migration successful.

Deactivate OSD and start it

ceph-volume lvm deactivate 0
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-0
stderr: umount: /var/lib/ceph/osd/ceph-0 unmounted
systemctl start ceph-FSID@osd.0.service

· 2 min read
Joachim Kraftmayer

First we wanted to use ceph-bluestore-tool bluefs-bdev-new-wal. However, it turned out that it is not possible to ensure that the second DB is actually used. For this reason, we decided to migrate the entire bluefs of the osd to an ssd/flash.

bluestore

Verify the current osd bluestore setup

ceph-bluestore-tool show-label –dev device []

Verify the current size of the osd bluestore DB

ceph-bluestore-tool  bluefs-bdev-sizes –path <osd path>
ceph-bluestore-tool bluefs-bdev-migrate –path osd path –dev-target new-device –devs-source device1 [–devs-source device2]

Verify the size of the osd bluestore DB after the migration

ceph-bluestore-tool  bluefs-bdev-sizes –path <osd path>

if the size does not correspond to the new target size execute the following command:

ceph-bluestore-tool bluefs-bdev-expand –path osd path

Instruct BlueFS to check the size of its block devices and, if they have expanded, make use of the additional space. Please note >that only the new files created by BlueFS will be allocated on the preferred block device if it has enough free space, and the >existing files that have spilled over to the slow device will be gradually removed when RocksDB performs compaction. In other >words, if there is any data spilled over to the slow device, it will be moved to the fast device over time. https://docs.>ceph.com/en/octopus/man/8/ceph-bluestore-tool/#commands

Verify the new osd bluestore setup

ceph-bluestore-tool show-label –dev device []

Update

You might be interested in a migration method on a higher layer with ceph-volume lvm.

docs.clyso.com/blog/ceph-volume-ceph-osd-migrate-db-to-larger-ssd-flash-device/

Appendix

I'm trying to figure out the appropriate process for adding a separate SSD block.db to an existing OSD. From what I gather the two steps are: 1. Use ceph-bluestore-tool bluefs-bdev-new-db to add the new db device 2. Migrate the data ceph-bluestore-tool bluefs-bdev-migrate I followed this and got both executed fine without any error. Yet when the OSD got started up, it keeps on using the integrated block.db instead of the new db. The block.db link to the new db device was deleted. Again, no error, just not using the new db www.spinics.net/lists/ceph-users/msg62357.html

Sources

docs.ceph.com/en/octopus/man/8/ceph-bluestore-tool

tracker.ceph.com/attachments/download/4478/bluestore.png

www.suse.com/support/kb/doc/?id=000020276

· One min read
Joachim Kraftmayer

But, I already mentioned it (for a bit different case) in newer versions there is ceph-volume lvm migrate [1] which I think allows to do the same but in much simpler way. I have not tried it to and the documentation is not very clear to me so one need to experiment with this before writing exact instructions. We might also need to use new-db [2] and new-wal [3] commands before running migrate but I am not sure they are needed for this particular case.

[1] https://docs.ceph.com/en/latest/ceph-volume/lvm/migrate/

[2] https://docs.ceph.com/en/latest/ceph-volume/lvm/newdb/

[3] https://docs.ceph.com/en/latest/ceph-volume/lvm/newwal/

· One min read
Joachim Kraftmayer

When configuring osd in mixed setup with db and wal colocated on a flash device, ssd or NVMe. There were always changes and irritations where the DB and the WAL are really located. With a simple test it can be checked: The location of the DB for the respective OSD can be verified via ceph osd metadata osd.<id> and the variable "bluefs_dedicated_db": "1".

The WAL was created separately in earlier Ceph versions and automatically on the same device as the DB in later Ceph versions. The WAL can be easily tested by using the ceph osd.<id> tell bench command.

First you check larger write operations with the command:

ceph tell osd.0 bench 65536 409600

Second, you check with smaller objects that are smaller than the bluestore_prefer_deferred_size_hdd (64k).

ceph tell osd.0 bench 65536 4096

If you compare the IOPs of the two tests, one result should correspond to the IOPs of an SSD and the other result should be quite low for the HDD. From this you can know if the WAL is on the HDD or the flash device.