101 posts tagged with "ceph"

· One min read
Joachim Kraftmayer

The Ceph Octopus release 15.2.5 introduces a new feature: snapshot-based RBD mirroring.

The new method no longer uses the journal to synchronize the data. It synchronizes the data between two snapshots using the fast-diff and delta-export features.

This type of synchronization requires fewer IOPS and, unlike journal-based mirroring, does not directly affect the performance of the running system.

The new implementation works directly with the kernel features of Ceph and does not depend on external libraries such as librbd or rbd-nbd.
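As a rough illustration of how snapshot-based mirroring is switched on, the sketch below only assembles the `rbd` command lines for one image; it does not run them. The pool name `rbd` and image name `vm-disk-1` are hypothetical, and peer bootstrap and the rbd-mirror daemon setup are deliberately left out.

```python
def snapshot_mirror_commands(pool, image):
    """Return the rbd CLI calls (as argument lists) for snapshot-based mirroring."""
    spec = f"{pool}/{image}"
    return [
        # Enable mirroring on the pool in per-image mode.
        ["rbd", "mirror", "pool", "enable", pool, "image"],
        # Enable mirroring for this image and select the snapshot mode.
        ["rbd", "mirror", "image", "enable", spec, "snapshot"],
        # Create a mirror snapshot, the point the peer synchronizes to.
        ["rbd", "mirror", "image", "snapshot", spec],
    ]


if __name__ == "__main__":
    # Hypothetical pool and image names, printed for review only.
    for cmd in snapshot_mirror_commands("rbd", "vm-disk-1"):
        print(" ".join(cmd))
```

Returning argument lists instead of executing them keeps the sketch safe to run anywhere and lets the commands be reviewed first.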

sources

docs.ceph.com/en/latest/rbd/rbd-mirroring/

github.com/ceph/ceph/pull/34032/files

· One min read
Joachim Kraftmayer

This guide describes how to add OSD nodes to an existing cluster running Red Hat Ceph Storage 4 (Nautilus). The process can be completed without taking the cluster out of production.

Put the Ceph cluster into maintenance mode

ceph osd set norebalance

ceph osd set nobackfill

ceph osd set norecover

Verify ceph cluster status

ceph status

Make sure that the new Ceph node is defined in the /etc/hosts file, then add it to the [osds] group of the ceph-ansible inventory.

vim /usr/share/ceph-ansible/hosts
[mons]
...
[mgrs]
...
[osds]
ceph-node1
ceph-node2
ceph-node3
ceph-node4
...

Run a ping test before executing the Ansible playbook


ansible-playbook site-container.yml --limit ceph-node4

Unset maintenance mode

ceph osd unset nobackfill

ceph osd unset norecover

ceph osd unset norebalance
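The set and unset steps above can be tied together so that the flags are always cleared again, even if the work in between fails. This is a minimal sketch, not part of ceph-ansible: `maintenance_mode` and `run_ceph` are hypothetical helper names, and unsetting in reverse order is a choice of this sketch rather than a requirement.

```python
import subprocess
from contextlib import contextmanager

# The three flags used in this guide.
MAINTENANCE_FLAGS = ("norebalance", "nobackfill", "norecover")


def run_ceph(args):
    """Run a ceph CLI command and raise if it exits non-zero."""
    subprocess.run(["ceph", *args], check=True)


@contextmanager
def maintenance_mode(run=run_ceph, flags=MAINTENANCE_FLAGS):
    """Set the OSD flags on entry and unset the applied ones on exit."""
    applied = []
    try:
        for flag in flags:
            run(["osd", "set", flag])
            applied.append(flag)
        yield
    finally:
        # Only flags that were actually set are cleared again.
        for flag in reversed(applied):
            run(["osd", "unset", flag])
```

On an admin node, the playbook run would go inside a `with maintenance_mode():` block. The `run` parameter exists so the sequence can be dry-run with a stub instead of the real CLI.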

Check that all OSDs with their hard drives have been added as expected

ceph osd tree
ceph osd crush tree
ceph osd df
ceph -s

Verify that all services use the same version

ceph versions

sources

docs.ceph.com/projects/ceph-ansible/en/latest/day-2/osds.html

docs.ceph.com/projects/ceph-ansible/en/latest/

· 2 min read
Joachim Kraftmayer

Perhaps someone has already thought about using EC (erasure coding) for ceph pools, so that the overhead for the secure storage of data is not too high. This was already a topic in many of the trainings we have held in recent years.

But what most people forget after creating EC pools is how to get all the information about an existing pool.

ceph osd pool ls

or

ceph osd pool ls detail

don't really give information about the configuration of erasure coding pools. However, there is a small option that lets ceph spill the beans a bit more.

ceph osd pool ls detail --format=json

With that, you might get more information than you want.

But with

ceph osd pool ls detail --format=json | jq '.'

the whole thing looks much more friendly to the eyes.

And here we find more information about the erasure coded pools:

ceph osd pool ls detail --format=json | jq '.' | grep erasure_code_profile
"erasure_code_profile": "clyso-costum-profile",

If you want to list all defined profiles, then use

ceph osd erasure-code-profile ls

You can get detailed information about an erasure code profile with:

ceph osd erasure-code-profile get clyso-costum-profile
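Instead of grepping the JSON, the same lookup can be done per pool. The sketch below assumes the JSON output is a list of pool objects with `pool_name` and `erasure_code_profile` fields and that replicated pools carry an empty profile. The sample data is a trimmed, made-up stand-in for real output.

```python
import json


def erasure_coded_pools(pool_detail_json):
    """Map pool name -> erasure code profile for every pool that has one."""
    pools = json.loads(pool_detail_json)
    return {
        pool["pool_name"]: pool["erasure_code_profile"]
        for pool in pools
        # Replicated pools are assumed to have an empty or missing profile.
        if pool.get("erasure_code_profile")
    }


# Made-up sample, reduced to the two fields used here.
SAMPLE = """[
  {"pool_name": "rbd", "erasure_code_profile": ""},
  {"pool_name": "ec-data", "erasure_code_profile": "clyso-costum-profile"}
]"""

if __name__ == "__main__":
    print(erasure_coded_pools(SAMPLE))
```

In practice the input would be the text printed by `ceph osd pool ls detail --format=json`, and each profile name found this way can be passed to `ceph osd erasure-code-profile get`.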

· One min read
Joachim Kraftmayer

We had the problem of getting the correct authorizations for the Ceph CSI user on the pools.

We then found the following bug, which affects versions prior to 14.2.12.

https://github.com/ceph/ceph/pull/36413/files#diff-1ad4853f970880c78ea0e52c81e621b4

It was fixed in version 14.2.12.

https://tracker.ceph.com/issues/46321

· One min read
Joachim Kraftmayer
monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

We see this error again and again with customers who copy their keyring file directly from the output of:

 ceph auth ls

In the client.<name>.keyring file, the entity name is enclosed in square brackets and the key is separated from its value by an equals sign, whereas the output of ceph auth ls prints the name without brackets and uses a colon.
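To make the difference between the two formats concrete, here is a small sketch that rewrites `ceph auth ls`-style text into keyring syntax. `auth_ls_to_keyring` is a hypothetical helper, the entity name and key in the usage are placeholders, and the line layout it expects (name, `key:`, `caps: [type] value`) is an assumption about typical output.

```python
import re

# Matches lines such as: caps: [mon] allow r
CAPS_LINE = re.compile(r"caps:\s*\[(\w+)\]\s*(.*)")


def auth_ls_to_keyring(text):
    """Convert 'ceph auth ls' style entries into keyring file syntax."""
    out = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        caps = CAPS_LINE.fullmatch(line)
        if caps:
            # caps: [mon] allow r  ->  caps mon = "allow r"
            out.append(f'\tcaps {caps.group(1)} = "{caps.group(2)}"')
        elif line.startswith("key:"):
            # key: <secret>  ->  key = <secret>
            out.append(f"\tkey = {line[len('key:'):].strip()}")
        else:
            # Any other line is treated as an entity name.
            out.append(f"[{line}]")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Placeholder entity and key, not real credentials.
    print(auth_ls_to_keyring("client.demo\n\tkey: AQAA==\n\tcaps: [mon] allow r\n"))
```

A simpler route that avoids manual conversion is `ceph auth get client.<name>`, which prints the entry in keyring format.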

· One min read
Joachim Kraftmayer
  • osd max backfills: This is the maximum number of backfill operations allowed to or from a single OSD. The higher the number, the quicker the recovery, which might impact overall cluster performance until recovery finishes.
  • osd recovery max active: This is the maximum number of active recovery requests per OSD. The higher the number, the quicker the recovery, which might impact overall cluster performance until recovery finishes.
  • osd recovery op priority: This is the priority set for recovery operations, relative to osd client op priority. The higher the number, the higher the recovery priority. A higher recovery priority might cause performance degradation until recovery completes.
ceph tell 'osd.*' injectargs --osd-max-backfills=2 --osd-recovery-max-active=2

Recommendation

Start in small steps, observe the Ceph status, client IOPS and throughput, and then continue to increase in small steps.

In production, with regard to the applications and the hardware infrastructure, we recommend resetting these settings to their defaults as soon as possible.
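The "small steps" advice can be sketched as a generator that emits one injectargs command per step, so that each one is applied and observed before moving on. The start value of 2 mirrors the command above. The upper bound of 4 is an arbitrary assumption, not a recommendation.

```python
def recovery_steps(start=2, stop=4, step=1):
    """Yield one injectargs command per step, raising both limits together."""
    for value in range(start, stop + 1, step):
        yield (
            "ceph tell 'osd.*' injectargs "
            f"--osd-max-backfills={value} --osd-recovery-max-active={value}"
        )


if __name__ == "__main__":
    # Print the commands; apply them one at a time while watching the cluster.
    for command in recovery_steps():
        print(command)
```

Between steps, check `ceph status` and the client IOPS and throughput before applying the next command.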

Sources

https://www.suse.com/support/kb/doc/?id=000019693