Overview
Chorus is data replication software designed for multiple S3 storage systems. It works as follows:
- Users enter storage credentials into the Chorus configuration.
- One storage is selected as the main, while the others become followers.
- Once configured and started, Chorus's S3 API can be used instead of the main storage's API.
- Chorus proxies requests to the main storage and asynchronously replicates the data to the follower storages.
- All existing data is also replicated from main to follower in the background.
- Data replication can be configured, paused, and resumed per bucket by the user through the web admin UI or CLI.
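The storage roles described above can be modeled in a few lines. This is a minimal sketch, not Chorus's actual configuration format: the Storage and ReplicationConfig names, their fields, and the example endpoints are all hypothetical. It only illustrates the rule that exactly one storage acts as main and every other storage is a follower.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Storage:
    """Hypothetical description of one S3 storage (not Chorus's real schema)."""
    name: str
    endpoint: str
    is_main: bool = False


@dataclass
class ReplicationConfig:
    """Hypothetical container for the configured storages."""
    storages: list

    def main(self) -> Storage:
        # Exactly one storage must be marked as main.
        mains = [s for s in self.storages if s.is_main]
        if len(mains) != 1:
            raise ValueError(f"expected exactly one main storage, found {len(mains)}")
        return mains[0]

    def followers(self) -> list:
        self.main()  # validate the configuration before listing followers
        return [s for s in self.storages if not s.is_main]


# Placeholder endpoints, for illustration only.
config = ReplicationConfig(storages=[
    Storage("one", "http://s3-one.example.internal", is_main=True),
    Storage("two", "http://s3-two.example.internal"),
])
```

Calling config.main() returns the storage named "one", and config.followers() returns the remaining storages. A configuration with zero or several main storages raises ValueError.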
Components
Chorus is structured around two main web services: Chorus Proxy and Chorus Worker.
The Chorus Proxy operates as an intermediary for the main S3 storage, which also means Chorus provides an S3 API.
Using Chorus Proxy involves:
- Sending a request to the Chorus S3 API (-1-).
- The Chorus Proxy routes the request to the main storage according to the routing policy in the config (-2-3-4-7-).
- For write requests (POST, PUT, DELETE), the proxy creates a task to copy changes from the main to the follower storages according to the replication policy (-5-6-).
- The Chorus Worker retrieves the task and syncs changes from the main to the follower (-8-9-10-).
All changes generated by the proxy are stored in an event queue.
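The request flow above can be sketched with in-memory stand-ins. This is an illustrative model under stated assumptions, not Chorus's implementation: storages are plain dictionaries, the event queue is a collections.deque, and the Proxy class and run_worker function are hypothetical names. It shows the key property that writes go to main immediately, while followers are updated later by the worker.

```python
from collections import deque

WRITE_METHODS = {"POST", "PUT", "DELETE"}


class Proxy:
    """Toy proxy: serves requests from main and queues replication tasks."""

    def __init__(self, main, queue):
        self.main = main    # dict: (bucket, key) -> content
        self.queue = queue  # deque of (bucket, key) tasks

    def handle(self, method, bucket, key, body=None):
        result = None
        if method == "GET":
            result = self.main.get((bucket, key))
        elif method in ("PUT", "POST"):
            self.main[(bucket, key)] = body
        elif method == "DELETE":
            self.main.pop((bucket, key), None)
        else:
            raise ValueError(f"unsupported method: {method}")
        if method in WRITE_METHODS:
            # Replication is deferred: the client does not wait for followers.
            self.queue.append((bucket, key))
        return result


def run_worker(queue, main, followers):
    """Drain the queue, making each follower match main for every task."""
    processed = 0
    while queue:
        ident = queue.popleft()
        for follower in followers:
            if ident in main:
                follower[ident] = main[ident]  # copy or overwrite
            else:
                follower.pop(ident, None)      # propagate a delete
        processed += 1
    return processed


main, follower, queue = {}, {}, deque()
proxy = Proxy(main, queue)
proxy.handle("PUT", "photos", "cat.jpg", b"meow")
follower_before_sync = dict(follower)  # still empty: replication is async
run_worker(queue, main, [follower])
```

In this sketch the worker reads the current state of main at sync time rather than replaying the original request body, so several queued writes to the same object converge on the latest version.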
Chorus also has an initial migration feature for cases where the main S3 storage isn't initially empty.
This allows Chorus to transfer existing data to followers in the background.
The initial migration process involves:
- Listing all buckets in the main.
- Listing all objects for all listed buckets in the main.
- Creating a task for each object to sync it from the main to the follower.
- The worker processes tasks in the background, copying or updating files as needed.
Features
- routing & replication per bucket, PAUSE & RESUME
- defining custom S3 credentials for Chorus Proxy
- sync obj/bucket meta, content, tags, ACL
- migrate existing data in background
- track replication lag
- worker rate-limit
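To make the pause/resume, replication lag, and rate-limit features more concrete, here is a rough sketch of how per-bucket state could be tracked. Everything in it is an assumption for illustration: the BucketReplication class, the definition of lag as the age of the oldest unreplicated event, and the max_events cap standing in for a rate limit are not taken from Chorus.

```python
class BucketReplication:
    """Hypothetical per-bucket replication state with pause/resume and lag."""

    def __init__(self):
        self.paused = False
        # Timestamps (seconds) of events not yet replicated, oldest first.
        self.pending = []

    def record_event(self, timestamp):
        # Assumes events are recorded in non-decreasing timestamp order.
        self.pending.append(timestamp)

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def process(self, max_events):
        """Replicate up to max_events oldest events; do nothing while paused."""
        if self.paused:
            return 0
        done = min(max_events, len(self.pending))
        del self.pending[:done]
        return done

    def lag(self, now):
        """Age in seconds of the oldest unreplicated event, or 0.0 if none."""
        return now - self.pending[0] if self.pending else 0.0


state = BucketReplication()
for t in (10.0, 11.0, 12.0):
    state.record_event(t)

state.pause()
done_while_paused = state.process(max_events=2)  # paused: nothing replicated
lag_while_paused = state.lag(now=20.0)

state.resume()
done_after_resume = state.process(max_events=2)  # capped at two events
lag_after_resume = state.lag(now=20.0)
```

While the bucket is paused, events keep accumulating and the lag grows. After resuming, each process call handles at most max_events events, so the lag shrinks at a bounded rate.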