Ceph Quincy (v17)

This document lists known critical bugs affecting Ceph Quincy (v17) releases.

PG Splitting/Merging Causes OSD Out-Of-Memory

Severity: High
Affected Versions: 17.2.0, 17.2.1, 17.2.2, 17.2.3
Bug Report: https://tracker.ceph.com/issues/53729

Description

A bug in the PG splitting and merging code can cause an OSD to run out of memory, and the condition persists even after the OSD is restarted. Fixed releases include offline tools to work around the issue.

Recommendation

  • Do not change pg_num for any pool until you have upgraded to a fixed release
  • Disable the PG autoscaler
  • Upgrade to v17.2.4 or later, which contains the fix
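
Disabling the autoscaler can be scripted per pool. The following is a minimal sketch, not an official procedure: it assumes the ceph CLI is installed and authenticated on the host, and the helper names (autoscaler_off_commands, list_pools, disable_autoscaler) are hypothetical. It defaults to a dry run that only prints the commands it would execute.

```python
import json
import subprocess


def autoscaler_off_commands(pools):
    """Return one `ceph osd pool set ... pg_autoscale_mode off` argv per pool."""
    return [
        ["ceph", "osd", "pool", "set", pool, "pg_autoscale_mode", "off"]
        for pool in pools
    ]


def list_pools():
    """Ask the cluster for its pool names (needs a working `ceph` CLI)."""
    out = subprocess.run(
        ["ceph", "osd", "pool", "ls", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def disable_autoscaler(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in autoscaler_off_commands(list_pools()):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Review the printed commands first, then call disable_autoscaler(dry_run=False). Pools created afterwards take their mode from the osd_pool_default_pg_autoscale_mode option, so that default may also need to be set to off.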

BlueStore Potential Corruption

Severity: Critical
Affected Versions: 17.2.8
Bug Report: https://tracker.ceph.com/issues/69764

Description

Ceph 17.2.8 was released with a bug that may cause OSDs to crash and corrupt on-disk data.

Recommendation

Upgrade to a fixed version (17.2.9 or 18.2.7) as soon as possible.
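
To see whether any daemons are still running the affected release, the JSON printed by the ceph versions command can be checked. This is a rough sketch under stated assumptions: it assumes the output maps each daemon type to a dictionary of version banners and counts, with banners of the form "ceph version X.Y.Z (...)". The function name affected_daemons and the AFFECTED set are illustrative, not part of Ceph.

```python
import json
import re

# Taken from the "Affected Versions" field of this bug entry.
AFFECTED = {"17.2.8"}

# Assumed banner shape: "ceph version 17.2.8 (<hash>) quincy (stable)".
VERSION_RE = re.compile(r"ceph version (\d+\.\d+\.\d+)")


def affected_daemons(versions_json):
    """Map daemon type -> number of daemons running an affected release."""
    report = json.loads(versions_json)
    hits = {}
    for daemon_type, builds in report.items():
        if daemon_type == "overall":
            continue  # aggregate of the other sections; skip to avoid double counting
        for banner, count in builds.items():
            match = VERSION_RE.search(banner)
            if match and match.group(1) in AFFECTED:
                hits[daemon_type] = hits.get(daemon_type, 0) + count
    return hits
```

Feed it the captured output of ceph versions. A non-empty result, in particular an "osd" entry, means the upgrade is still pending for those daemons.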

RadosGW --bypass-gc Data Loss Bug

Severity: Critical
Affected Versions: 17.2.x
Bug Report: https://tracker.ceph.com/issues/73348

Description

A long-standing data loss bug in radosgw-admin bucket rm --bypass-gc deletes the data of copied objects. If any of the objects deleted with --bypass-gc had been copied to or from other buckets, the data of those copies is deleted too. The copies remain visible to ListObjects requests, but GetObject requests for them fail with NoSuchKey.

Note: This bug also affects Ceph Reef (v18) and Squid (v19). See the Reef known bugs and Squid known bugs pages for details.

Recommendation

  • Use radosgw-admin bucket rm without --bypass-gc, which correctly handles copied objects
  • Follow the bug tracker for fix and backport updates
  • If you've used --bypass-gc in the past, audit your buckets for objects with missing data
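
One possible way to audit a bucket is to list its objects and attempt a small read of each one, flagging keys that are listed but answer NoSuchKey. The sketch below is an illustration, not a supported tool. It assumes a boto3-style S3 client pointed at the RadosGW endpoint; the function name find_missing_data is hypothetical. It also assumes that requesting only the last byte is enough to reach the missing data; a full GetObject of each object is the more conservative check.

```python
def find_missing_data(client, bucket):
    """Yield keys that are listed in `bucket` but whose data cannot be read."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for entry in page.get("Contents", []):
            key, size = entry["Key"], entry["Size"]
            if size == 0:
                continue  # no data to read; a byte-range GET would be invalid
            last = size - 1
            try:
                resp = client.get_object(
                    Bucket=bucket, Key=key, Range=f"bytes={last}-{last}"
                )
                resp["Body"].read()
            except Exception as exc:
                # botocore's ClientError exposes the S3 error code via .response
                code = getattr(exc, "response", {}).get("Error", {}).get("Code")
                if code != "NoSuchKey":
                    raise  # unrelated failure; do not report it as missing data
                yield key


# Example use (endpoint and bucket name are placeholders):
#   import boto3
#   s3 = boto3.client("s3", endpoint_url="https://rgw.example.com")
#   for key in find_missing_data(s3, "my-bucket"):
#       print(key)
```

The listing covers only current object versions, so versioned buckets would need a similar pass over list_object_versions.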