CES 24.11.0
Release Date: November 2024
Based on: Ceph Quincy 17.2.7
Previous Version: 1.0.0
Summary
CES 24.11.0 introduces significant performance optimizations, enhanced RGW configurations, and improved cluster management features. This release includes comprehensive upstream backports and downstream optimizations.
What's New
- Enhanced RGW performance with optimized thread pool and bucket shard configurations
- Improved OSD performance with increased inflight thresholds
- Better cluster balancing with updated deviation settings
- Enhanced scrubbing capabilities
- RocksDB performance improvements
- Dashboard branding updates
Downstream Patches
common/options: Reduce rgw_thread_pool_size
- Change: 512 to 128
- Reason: Short-term patch until https://github.com/ceph/ceph/pull/57167 can be backported.
Increase default bucket shard count
- Change: from 11 to 31
- Benefit: Dramatically improves per-bucket PUT performance on initial ingest and delays bucket resharding. Listing small buckets is somewhat slower, but listing large buckets is not affected.
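As a rough illustration of why a higher default shard count delays resharding, the sketch below multiplies the shard count by a per-shard object threshold. The threshold of 100000 objects per shard is an assumption (it is the commonly cited upstream default for rgw_max_objs_per_shard); the function name is hypothetical, and the value on a given cluster should be checked before relying on these numbers.

```python
# Assumption: resharding is considered once a bucket averages more than
# OBJS_PER_SHARD objects per index shard. 100000 is assumed here; verify
# the rgw_max_objs_per_shard value on your own cluster.
OBJS_PER_SHARD = 100_000


def objects_before_reshard(shard_count: int, objs_per_shard: int = OBJS_PER_SHARD) -> int:
    """Approximate number of objects a bucket can hold before resharding is considered."""
    return shard_count * objs_per_shard


for shards in (11, 31):
    print(f"{shards} shards: about {objects_before_reshard(shards):,} objects")
```

Under that assumption, a new bucket reaches the resharding threshold at roughly 3.1 million objects instead of 1.1 million.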
common/options: Increase objecter inflight thresholds
- Max in-flight data in bytes: 100_M to 1_G
- Max in-flight operations: 1_K to 8_K
- Benefit: Increases performance of RGW workloads and is a common optimization. Allows 8192 128K objects, similar object size target relative to upstream defaults.
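The object size target mentioned above can be checked with a little arithmetic. This sketch assumes the `_K`, `_M`, and `_G` suffixes are binary multiples (1024-based); the helper name is hypothetical.

```python
# Assumption: Ceph's _K/_M/_G option suffixes are binary (1024-based) multiples.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def bytes_per_inflight_op(inflight_bytes: int, inflight_ops: int) -> float:
    """Average object size at which both in-flight limits are reached together."""
    return inflight_bytes / inflight_ops


upstream = bytes_per_inflight_op(100 * MIB, 1 * KIB)  # 100_M bytes, 1_K ops
ces = bytes_per_inflight_op(1 * GIB, 8 * KIB)         # 1_G bytes, 8_K ops

print(f"upstream defaults: {upstream / KIB:.0f} KiB per op")
print(f"CES 24.11.0:       {ces / KIB:.0f} KiB per op")
```

Both limits scale together, so the per-operation size target only moves from about 100 KiB to 128 KiB while the total in-flight budget grows roughly tenfold.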
ceph-objectstore-tool: make 'rm-omap' command support remove many keys
- Feature: rm-omap can remove multiple keys
- Based on: https://github.com/ceph/ceph/pull/22379
- Fixes: https://tracker.ceph.com/issues/38215
osd: increase osd_max_pg_per_osd_hard_ratio to 10
- Issue: The hard ratio of 3 was hit too often during normal maintenance tasks such as adding a new host
- Problem: OSDs that reach the limit refuse PG peering, leaving many PGs stuck in the activating state
- Solution: Increased to 10 to prevent "maybe_wait_for_max_pg withhold creation of pg" warnings
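To make the effect of the ratio concrete, the sketch below computes the per-OSD hard PG limit as the soft limit multiplied by the hard ratio. It assumes a soft limit (mon_max_pg_per_osd) of 250, which is the usual upstream default; that value and the function name are not taken from these notes.

```python
# Assumption: mon_max_pg_per_osd is 250 (verify on your cluster).
MON_MAX_PG_PER_OSD = 250


def hard_pg_limit(hard_ratio: float, mon_max_pg_per_osd: int = MON_MAX_PG_PER_OSD) -> int:
    """PG count per OSD above which the OSD withholds creation of new PGs."""
    return int(mon_max_pg_per_osd * hard_ratio)


print("ratio 3: ", hard_pg_limit(3))
print("ratio 10:", hard_pg_limit(10))
```

With these assumed values the hard limit rises from 750 to 2500 PGs per OSD, which leaves more headroom for the temporary PG overlap seen while data is remapped onto a new host.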
mgr/balancer: set upmap_max_deviation to 1
- Issue: The default upmap_max_deviation of 5 does not produce well-balanced clusters
- Problem: Especially evident on clusters with many pools, where per-pool deviations pile up and OSD utilization varies significantly
- Solution: Set to 1 for better balance
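The following is a simplified sketch of what the deviation setting tolerates, not the balancer's actual algorithm. It assumes equal-weight OSDs and a single pool, and the PG counts are made-up illustration data.

```python
from statistics import mean


def osds_outside_deviation(pg_counts: dict[int, int], max_deviation: int) -> list[int]:
    """Return OSD ids whose PG count differs from the mean by more than max_deviation.

    Simplification: all OSDs are assumed to have equal weight, so the
    target PG count is just the mean.
    """
    target = mean(pg_counts.values())
    return sorted(osd for osd, pgs in pg_counts.items() if abs(pgs - target) > max_deviation)


# Hypothetical PG counts per OSD id for one pool.
pg_counts = {0: 96, 1: 100, 2: 104, 3: 100}

print("deviation 5:", osds_outside_deviation(pg_counts, 5))
print("deviation 1:", osds_outside_deviation(pg_counts, 1))
```

With a deviation of 5 this distribution already counts as balanced, so nothing is moved. With a deviation of 1, OSDs 0 and 2 are still candidates for rebalancing. Across many pools, the tolerated per-pool differences can add up on the same OSD.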
osd: default osd_op_queue wpq
- Issue: mclock is still unstable in corner cases in Quincy
- Solution: Defaulting the op queue to wpq avoids some of these issues
Dashboard Enhancements
- Update logo on create cluster page
- CES dashboard branding (logo, constants, favicon, login page links)
Images
- Change: Container images now use harbor.clyso.com instead of quay.io
Upstream Backports
os/bluestore: get rid off resulting lba alignment in allocators
- Fixes: https://tracker.ceph.com/issues/63618, https://tracker.ceph.com/issues/62815
- Based on: https://github.com/ceph/ceph/pull/54877
osd/scrub: increasing max_osd_scrubs from 1 to 3
- Issue: The current default value of '1' for osd_max_scrubs is too low
- Problem: The cluster is susceptible to scrub scheduling delays caused by local issues
- Solution: Increased to 3 for better resilience
cmake/modules: Fix Debian/Ubuntu RocksDB Performance Issues
- Fix: Sets the CXXFLAGS environment variable so that the bundled RocksDB used by BlueStore is built with optimizations
osd/ECTransaction: Remove incorrect asserts in generate_transactions
- Fixes: https://tracker.ceph.com/issues/65509
- Issue: Incorrect asserts from EC Overwrites implementation
common/options: Set LZ4 compression for bluestore RocksDB
- Benefit: Extremely positive results in field testing
- Reference: https://ceph.io/en/news/blog/2022/rocksdb-tuning-deep-dive/
common/options: Update RocksDB CF Tuning
- max_write_buffer_number: 128 to 64
- min_write_buffer_number_to_merge: 16 to 6
- write_buffer_size: 8388608 to 16777216
- Removed: ttl
- L and P column families: min_write_buffer_number_to_merge=32
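The tuning values above can be compared with a short calculation. This sketch assumes the standard RocksDB meaning of the options: write_buffer_size is the size of one memtable, max_write_buffer_number caps how many memtables a column family may hold, and min_write_buffer_number_to_merge is how many memtables are merged before a flush. The helper name is hypothetical.

```python
MIB = 1024 * 1024


def memtable_budget(write_buffer_size: int, max_write_buffer_number: int,
                    min_write_buffer_number_to_merge: int) -> dict[str, int]:
    """Memtable sizing per column family, in bytes."""
    return {
        # Upper bound on memtable memory before writes stall.
        "max_total": write_buffer_size * max_write_buffer_number,
        # Amount of memtable data merged together before a flush.
        "flush_batch": write_buffer_size * min_write_buffer_number_to_merge,
    }


old = memtable_budget(8388608, 128, 16)
new = memtable_budget(16777216, 64, 6)

for name, cfg in (("old", old), ("new", new)):
    print(f"{name}: max {cfg['max_total'] // MIB} MiB, flush batch {cfg['flush_batch'] // MIB} MiB")
```

Under these assumptions the maximum memtable footprint stays at 1024 MiB, while the data merged per flush drops from 128 MiB to 96 MiB using fewer, larger buffers.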
common/options: increase mds_cache_trim_threshold 2x
- Change: 256K to 512K
- Benefit: MDS trims LRU more actively, keeping cache size under configured limit
mgr/dashboard: disable dashboard v3 in quincy
cephadm: disable ms_bind_ipv4 if we will enable ms_bind_ipv6
- Issue: IPv6 cluster bootstrap left ms_bind_ipv4 enabled
- Fixes: https://tracker.ceph.com/issues/66436
mgr/k8sevents: update V1Events to CoreV1Events
mgr/prometheus: s/pkg_resources.packaging/packaging/
Known Issues
- None at time of release