Ceph RGW Multisite Deployment Tutorial

This tutorial guides you through setting up a Ceph RGW multisite deployment with a single realm and two zones.

Prerequisites

  • Two clusters with Ceph installed:
    • Cluster A: xx.xx.1.18
    • Cluster B: xx.xx.1.19
  • Administrative access to both clusters.
  • Cephadm installed for managing RGW services.

Procedure

Configure "Cluster A" (Master Zone)

  1. Create a Realm

    Run the following command on Cluster A to create a realm named acme:

    radosgw-admin realm create --rgw-realm=acme --default

    Example output:

    {
        "id": "ae173ce5-6bf9-4f87-9480-bxxxx",
        "name": "acme",
        "current_period": "4cf57995-b15b-4627-b2ed-xxxx",
        "epoch": 1
    }
  2. Create a Master Zone Group

    Create a zone group (zg1) with Cluster A's endpoint:

    radosgw-admin zonegroup create \
        --rgw-zonegroup=zg1 \
        --endpoints=http://xx.xx.1.18:8080 \
        --rgw-realm=acme --master --default
  3. Create a Master Zone

    Run this command on a node in "Cluster A" to create the master zone zone-a:

    radosgw-admin zone create \
        --rgw-zonegroup=zg1 \
        --rgw-zone=zone-a \
        --master --default \
        --endpoints=http://xx.xx.1.18:8080
  4. Create a System User for Synchronization

    Create a system user for multisite synchronization:

    radosgw-admin user create --uid="sync-user" --display-name="Sync User" --system

    Note the access_key and secret_key from the output; a scripted way to capture them is sketched after this procedure.

  5. Bind the System User to the Zone

    Set the system user's credentials for zone-a:

    ACCESS_KEY=<your_access_key>
    SECRET_KEY=<your_secret_key>
    radosgw-admin zone modify --rgw-zone=zone-a --access-key="$ACCESS_KEY" --secret="$SECRET_KEY"
  6. Commit the Period

    Commit the changes to the realm's period:

    radosgw-admin period update --commit
  7. Deploy RGW on "Cluster A"

    Deploy the RGW service for zone-a using cephadm:

    ceph orch apply rgw acme zone-a --placement="1 cephadm-test-2" --port=8080
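
If you prefer not to copy the keys by hand in steps 4 and 5, you can capture them from radosgw-admin's JSON output. This is a minimal sketch, assuming jq is installed on the node and that the user was created with --uid=sync-user as above:

    # Capture the sync user's keys from radosgw-admin's JSON output (assumes jq is available).
    ACCESS_KEY=$(radosgw-admin user info --uid=sync-user | jq -r '.keys[0].access_key')
    SECRET_KEY=$(radosgw-admin user info --uid=sync-user | jq -r '.keys[0].secret_key')
    echo "access: $ACCESS_KEY  secret: $SECRET_KEY"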

Configure "Cluster B" (Secondary Zone)

  1. Pull the Realm and Period from "Cluster A"

    Reuse the system user's credentials to pull the realm and period:

    ACCESS_KEY=<your_access_key>
    SECRET_KEY=<your_secret_key>
    radosgw-admin realm pull \
        --url=http://xx.xx.1.18:8080 \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY"
  2. Set the Realm as Default

    Set the pulled realm as the default:

    radosgw-admin realm default --rgw-realm=acme
  3. Create a Secondary Zone

    Create a secondary zone (zone-b) in the same zone group:

    radosgw-admin zone create \
        --rgw-zonegroup=zg1 \
        --rgw-zone=zone-b \
        --endpoints=http://xx.xx.1.19:8080 \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY"
  4. Commit the Period

    Commit the changes to the realm's period:

    radosgw-admin period update --commit
  5. Deploy RGW on "Cluster B"

    Deploy the RGW service for zone-b using cephadm (a quick verification sketch follows this procedure):

    ceph orch apply rgw acme zone-b --placement="1 cephadm-test-3" --port=8080
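
Before relying on the new zone, it is worth confirming that Cluster B sees the pulled realm and that its RGW service is running. A quick verification sketch, run on a node in Cluster B:

    # Confirm the realm and zonegroup pulled from Cluster A are visible locally.
    radosgw-admin realm list
    radosgw-admin zonegroup get --rgw-zonegroup=zg1

    # Confirm the RGW service is deployed, then check replication state.
    ceph orch ls rgw
    radosgw-admin sync status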

Configure Sync Policy (Optional)

  1. To explicitly sync all buckets between zones, create a sync group with a symmetrical flow and a wildcard pipe, enable it, and commit the period (a way to inspect the resulting policy is sketched after this step):

    radosgw-admin sync group create --group-id=group1 --status=enabled
    radosgw-admin sync group flow create --group-id=group1 \
        --flow-id=flow-mirror --flow-type=symmetrical --zones='*'
    radosgw-admin sync group pipe create --group-id=group1 --pipe-id=pipe1 \
        --source-zones='*' --source-bucket='*' \
        --dest-zones='*' --dest-bucket='*'
    radosgw-admin period update --commit
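
To confirm the policy took effect, you can dump the sync group and ask RGW how the policy resolves for a specific bucket; <bucket> is a placeholder:

    # Show the sync group as stored in the period, then the resolved
    # sync sources and targets for a specific bucket.
    radosgw-admin sync group get --group-id=group1
    radosgw-admin sync info --bucket=<bucket>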

Topology Overview

The final topology will look like this:

Realm: acme
└── Zonegroup: zg1
    ├── zone-a (master, endpoints http://xx.xx.1.18:8080)
    └── zone-b (secondary, endpoints http://xx.xx.1.19:8080)

You now have a fully configured Ceph RGW multisite deployment with a single realm and two zones. In a multisite deployment, data synchronization between the clusters provides consistency and high availability: the RGW multisite feature replicates metadata and data across all zones within the same realm.

Key Reasons for Data Synchronization

  1. Disaster Recovery: By replicating data between zones, the system ensures that data remains accessible even if one cluster experiences a failure.
  2. Geographical Redundancy: Synchronization allows users to access data from the nearest zone, reducing latency and improving performance.
  3. Consistency Across Zones: The synchronization mechanism ensures that all zones within the same realm have the same data, maintaining consistency for applications and users.
  4. Load Balancing: With synchronized data, read and write operations can be distributed across zones, balancing the load and improving scalability.
  5. Compliance and Backup: Synchronization provides an additional layer of data protection, meeting compliance requirements and serving as a backup mechanism.

Synchronization is driven by the system user, whose credentials allow the zones to authenticate to each other securely and replicate data. The period update --commit command ensures that all changes to the realm, zone group, and zones are committed and propagated across the deployment.
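
If you want to see exactly what was committed, you can dump the current period on either cluster; it lists the realm, the zone group, both zones, and their endpoints:

    # Show the current period (realm, zonegroup map, zones, endpoints) and its epoch.
    radosgw-admin period get --rgw-realm=acme

    # List all periods known to this cluster.
    radosgw-admin period list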

You can check the sync status with radosgw-admin sync status; on the master zone (Cluster A) the output will look like this:

          realm xxx (acme)
      zonegroup xxx (zg1)
           zone xxx (zone-a)
   current time 2025-08-21T20:40:28Z
zonegroup features enabled: resharding
                   disabled: compress-encrypted
  metadata sync no sync (zone is master)
      data sync source: xxx (zone-b)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

Dynamic Resharding

Dynamic resharding is a process in which RGW automatically reshards a bucket index without manual intervention. In a multisite deployment, the feature must be enabled on both clusters to take effect. If you need to increase or decrease the shard count manually, run radosgw-admin bucket reshard --bucket=<bucket> --num-shards=<shards> --yes-i-really-mean-it during a maintenance window. Once the reshard has completed and the zones have resynchronized, verify with radosgw-admin bucket sync status --bucket=<bucket>.
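
A few commands that can help when monitoring or driving a reshard; <bucket> is a placeholder for the bucket name:

    # Current shard count of the bucket index.
    radosgw-admin bucket stats --bucket=<bucket> | grep num_shards

    # Buckets queued for dynamic resharding, and the progress of one of them.
    radosgw-admin reshard list
    radosgw-admin reshard status --bucket=<bucket>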

If you want to enable and disable dynamic resharding on a schedule, you can add two entries to crontab:

    # Turn dynamic resharding on at 10 PM and off at 4 AM.
    0 22 * * * ceph config set client.rgw rgw_dynamic_resharding true
    0 4 * * * ceph config set client.rgw rgw_dynamic_resharding false

If you want to throttle the sync process, lower rgw_max_objs_per_shard from its default of 100000; this reduces the amount of work performed in each sync cycle when dynamic resharding is enabled.
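
A minimal sketch of adjusting that threshold; 50000 is only an illustrative value, so tune it for your object counts:

    # Lower the objects-per-shard threshold from its default of 100000 (illustrative value).
    ceph config set client.rgw rgw_max_objs_per_shard 50000

    # Verify the value RGW will use.
    ceph config get client.rgw rgw_max_objs_per_shard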