Operations
Resource requirements
Proxy
- stateless - horizontally scalable
- low-memory
- low-CPU
- high-network
Worker
- stateless - horizontally scalable
- high-memory
- low-CPU
- high-network
Set the following worker configuration parameters to limit the worker-instance network and memory consumption:
concurrency: 10 # max number of simultaneously processed tasks (copied objects)
Strict worker limits can delay replication and increase the number of objects waiting in the work queue. A large backlog of unprocessed tasks increases Redis RAM consumption. Balance worker limits against your workload and the replication delay you can tolerate.
Storage Rate Limiting
Each storage can have its own rate limit (requests per minute):
storage:
  storages:
    my_storage:
      rateLimit:
        enabled: true
        rpm: 60
Use this when destination storages have API rate limits or limited capacity.
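As a rough planning aid, the lower bound that an rpm value places on replication time can be estimated in a few lines of Python. This is a minimal sketch, not a description of how the limiter works internally: it assumes every object costs a fixed number of requests against the limited storage, and the `requests_per_object` parameter is a hypothetical simplification (multipart uploads and metadata calls need more requests).

```python
import math


def min_replication_minutes(object_count: int, rpm: int, requests_per_object: int = 1) -> int:
    """Lower bound, in whole minutes, for copying object_count objects under an rpm limit.

    requests_per_object is an assumed average, not a value taken from the configuration.
    """
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    total_requests = object_count * requests_per_object
    return math.ceil(total_requests / rpm)


if __name__ == "__main__":
    # With rpm: 60, 10,000 single-request objects need at least 167 minutes.
    print(min_replication_minutes(10_000, rpm=60))
```

If the result is longer than the replication delay you can accept, the destination's rate limit, not the worker settings, is the bottleneck.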
Work Queue (Redis)
- scale: Redis Cluster, Redis Sentinel.
- persistence: Redis AOF, Redis snapshots (RDB).
- fault tolerance: not critical. In case of data loss, bucket replication can be restarted.
- memory: 1M migrated objects ~ 105 MB; 1M tasks in the queue ~ 700 MB
- low CPU: 100-1000 rps
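The memory figures above can be turned into a quick sizing estimate. The sketch below assumes that Redis memory grows linearly with the number of migrated objects and queued tasks, which is an extrapolation from the two data points given here, and it leaves out any headroom for Redis itself; the function and constant names are illustrative.

```python
MB_PER_MILLION_OBJECTS = 105  # from the list above: 1M migrated objects ~ 105 MB
MB_PER_MILLION_TASKS = 700    # from the list above: 1M queued tasks ~ 700 MB


def estimate_redis_mb(migrated_objects: int, queued_tasks: int) -> float:
    """Estimate Redis memory in MB, assuming linear scaling of both figures."""
    total = migrated_objects * MB_PER_MILLION_OBJECTS + queued_tasks * MB_PER_MILLION_TASKS
    return total / 1_000_000


if __name__ == "__main__":
    # 5M migrated objects plus a backlog of 200k tasks: 525 MB + 140 MB.
    print(estimate_redis_mb(5_000_000, 200_000))  # 665.0
```

Because queued tasks cost several times more memory than migrated objects, the size of the backlog usually dominates the estimate when worker limits are strict.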
Monitoring
All components expose Prometheus metrics on the :9090/metrics endpoint. The
metrics can be used in a Grafana dashboard to monitor workload, task rate,
error rate, S3 objects, and bytes uploaded and downloaded.
Tracing
All components support the OpenTracing protocol. So far this has been tested only with Jaeger.
To enable tracing, set enabled to true in the trace section of the configuration file:
trace:
  enabled: false
  endpoint: # URL of Jaeger or another OpenTracing provider
Logging
All components support structured logging. To enable JSON logs, set json to true in the log section of the configuration file:
log:
  json: false # false: console logger for development; true: JSON logs for production, for export to Grafana and Loki
  level: info