
speed up or slow down ceph recovery

· One min read
Joachim Kraftmayer
Managing Director at Clyso
  • osd max backfills: This is the maximum number of concurrent backfill operations allowed to or from a single OSD. The higher the number, the faster the recovery, which may impact overall cluster performance until recovery finishes.
  • osd recovery max active: This is the maximum number of active recovery requests per OSD. The higher the number, the faster the recovery, which may impact overall cluster performance until recovery finishes.
  • osd recovery op priority: This is the priority assigned to recovery operations, relative to client operations (osd client op priority). The higher the value, the more recovery is favoured over client I/O, which may cause performance degradation until recovery completes.
ceph tell 'osd.*' injectargs --osd-max-backfills=2 --osd-recovery-max-active=2
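
To verify what is actually in effect, the values can be read back via the OSD admin socket; the OSD id below is only an example:

# run on the host of the OSD in question
ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery_max_active|osd_recovery_op_priority'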

Recommendation

Start in small steps, observe the Ceph status, client IOPs and throughput and then continue to increase in small steps.

In production, with regard to the applications and the hardware infrastructure, we recommend setting these options back to their defaults as soon as possible.
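
A sketch of reverting to the defaults, assuming the pre-Quincy defaults of osd_max_backfills=1 and osd_recovery_max_active=3 (check the defaults of your release):

# restore the default recovery/backfill settings on all OSDs
ceph tell 'osd.*' injectargs --osd-max-backfills=1 --osd-recovery-max-active=3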

Sources

https://www.suse.com/support/kb/doc/?id=000019693

official debian packages for ceph nautilus available

· One min read
Joachim Kraftmayer
Managing Director at Clyso

For some time now there have been no official packages for Debian on the ceph.io site. The reason for this is the switch to a newer C++ version, which Debian only supports starting with Buster. It is therefore all the more pleasing that Bernd Zeimetz has been working on the Ceph package for Debian since 28.11.2019 and is currently maintaining it for the current Nautilus releases, starting with 14.2.4-1, for Bullseye and as Buster backports.

See changelog of the project:

https://packages.debian.org/de/bullseye/ceph

https://packages.debian.org/de/buster-backports/ceph

Extension of a Ceph cluster

· 2 min read
Joachim Kraftmayer
Managing Director at Clyso

Before the Luminous Release

  • Ceph Cluster is in status HEALTH_OK
  • Add all OSDs with weight 0 to the Ceph cluster
  • Gradually increase the weight of the new OSDs in steps of 0.1 up to 1.0, depending on the base load of the cluster (see the example commands after this list).
  • Wait until the Ceph cluster has reached the status HEALTH_OK again or all PGs have reached the status active+clean
  • Repeat the weight increase for the new OSDs until you have achieved the desired weighting.
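
A minimal sketch of this procedure for a single new OSD (the OSD id 42 and the target weight of 1.0 are assumptions):

# new OSDs can enter the cluster with weight 0, e.g. via osd_crush_initial_weight = 0
# then raise the CRUSH weight step by step and wait for HEALTH_OK / active+clean in between
ceph osd crush reweight osd.42 0.1
ceph -s
ceph osd crush reweight osd.42 0.2
ceph -s
# repeat until the target weight (here 1.0) is reached
ceph osd crush reweight osd.42 1.0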

Since the Luminous Release

  • Ceph cluster is in HEALTH_OK status
  • Set the 'norebalance' flag (and normally also nobackfill), as shown in the example after this list
  • Add the new OSDs to the cluster
  • Wait until the PGs start peering with each other (this can take a few minutes)
  • Remove the norebalance and nobackfill flags
  • Wait until the Ceph cluster has reached the HEALTH_OK status again
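
A sketch of the flag handling with the standard CLI commands (the deployment step itself depends on your tooling):

ceph osd set norebalance
ceph osd set nobackfill
# add the new OSDs with your deployment tool, then wait until peering has finished (ceph -s)
ceph osd unset nobackfill
ceph osd unset norebalance
ceph -s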

Since the Nautilus Release

With the Nautilus release, PG splitting and merging were introduced and the following default values were set:

"osd_pool_default_pg_num": "8"

"osd_pool_default_pgp_num": "0"

ceph.com/rados/new-in-nautilus-pg-merging-and-autotuning/

Furthermore, the osd_pool_default_pg_num should be set to a value that makes sense for the respective Ceph cluster.

A value of 0 for osd_pool_default_pgp_num indicates that pgp_num is now monitored automatically by the Ceph cluster and adjusted according to the following criteria:

Starting in Nautilus, this second step is no longer necessary: as long as pgp_num and pg_num currently match, pgp_num will automatically track any pg_num changes. More importantly, the adjustment of pgp_num to migrate data and (eventually) converge to pg_num is done gradually to limit the data migration load on the system based on the new target_max_misplaced_ratio config option (which defaults to .05, or 5%). That is, by default, Ceph will try to have no more than 5% of the data in a “misplaced” state and queued for migration, limiting the impact on client workloads. ceph.com/rados/new-in-nautilus-pg-merging-and-autotuning/
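
To illustrate (the pool name is hypothetical): since Nautilus a single pg_num change is sufficient, and pgp_num follows automatically while the amount of misplaced data stays below target_max_misplaced_ratio:

ceph osd pool set mypool pg_num 128
# pgp_num is adjusted gradually in the background
ceph osd pool get mypool pg_num
ceph osd pool get mypool pgp_num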

note

Before the Nautilus release, the number of PGs had to be adjusted manually for the respective pools. With Nautilus, the Ceph Manager module pg_autoscaler can take over.

ceph.com/rados/new-in-nautilus-pg-merging-and-autotuning/
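
A short sketch of enabling the autoscaler (the pool name is hypothetical; the module is enabled once per cluster and the mode is set per pool):

ceph mgr module enable pg_autoscaler
ceph osd pool set mypool pg_autoscale_mode on
ceph osd pool autoscale-status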

Ceph (I/O latency) for RBD

· One min read
Joachim Kraftmayer
Managing Director at Clyso

When commissioning new Ceph clusters, our standard tests also include measuring the I/O latency for RBD.

We also always measure the performance values for the entire stack. Over the years, we have seen the results of our hard work on improving the Ceph OSD in various tests.

For our tests, we create a temporary work file and read random blocks with non-cached read operations from it.
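
The post does not name the tool used; a minimal sketch of such a measurement with fio (file path, size, and runtime are assumptions), using direct I/O so reads are not served from the page cache:

# random 4k reads from a work file on an RBD-backed mount, bypassing the page cache
fio --name=rbd-randread --filename=/mnt/rbd/testfile --size=1G \
    --rw=randread --bs=4k --direct=1 --ioengine=libaio --iodepth=1 \
    --runtime=60 --time_based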

We are now measuring latencies of 300 to 600 microseconds.