Operations

Resource requirements

Proxy

  • stateless - horizontally scalable
  • low-memory
  • low-CPU
  • high-network

Worker

  • stateless - horizontally scalable
  • high-memory
  • low-CPU
  • high-network

Set the following worker configuration parameter to limit the network and memory consumption of each worker instance:

concurrency: 10 # maximum number of tasks (copied objects) processed simultaneously

Strict worker limits can delay replication and increase the number of tasks waiting in the work queue. A large number of unprocessed tasks increases Redis RAM consumption. Balance worker limits against your workload and the replication delay you can tolerate.
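The trade-off above can be estimated with simple arithmetic. The sketch below is not part of the product; it assumes a steady task arrival rate and a roughly constant mean task duration, and it approximates throughput as concurrency divided by task duration. The function names and the sample numbers are hypothetical.

```python
def worker_throughput(concurrency: int, mean_task_seconds: float, workers: int = 1) -> float:
    """Approximate tasks per second that the worker pool can complete."""
    return workers * concurrency / mean_task_seconds


def backlog_after(arrival_rate: float, throughput: float, seconds: float) -> float:
    """Tasks left in the queue after `seconds`, starting from an empty queue."""
    return max(0.0, (arrival_rate - throughput) * seconds)


# Hypothetical workload: one worker, concurrency 10, 2 s per copied object,
# 8 new tasks arriving per second.
tput = worker_throughput(concurrency=10, mean_task_seconds=2.0)
backlog = backlog_after(arrival_rate=8.0, throughput=tput, seconds=3600)
print(f"throughput: {tput} tasks/s, backlog after 1 h: {backlog} tasks")
```

If the estimated backlog keeps growing, either raise concurrency, add worker instances, or accept a longer replication delay.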

Storage Rate Limiting

Each storage can have its own rate limit (requests per minute):

storage:
  storages:
    my_storage:
      rateLimit:
        enabled: true
        rpm: 60

Use this when destination storages have API rate limits or limited capacity.
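To illustrate what a requests-per-minute limit means in practice, here is a minimal sliding-window limiter. It is only a sketch of the general technique and makes no claim about how the rate limit is implemented internally; the class name `RpmLimiter` and the injectable `clock` parameter are hypothetical.

```python
import time
from collections import deque


class RpmLimiter:
    """Allow at most `rpm` requests in any rolling 60-second window."""

    def __init__(self, rpm: int, clock=time.monotonic):
        self.rpm = rpm
        self.clock = clock
        self.sent = deque()  # timestamps of requests inside the current window

    def try_acquire(self) -> bool:
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            self.sent.append(now)
            return True
        return False  # caller should wait and retry later


limiter = RpmLimiter(rpm=60)
print(limiter.try_acquire())
```

With `rpm: 60`, a limiter like this admits the first 60 requests in a window immediately and rejects further ones until the oldest request is more than a minute old.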

Work Queue (Redis)

  • scale: Redis Cluster, Redis Sentinel.
  • persistence: Redis AOF, Redis snapshots (RDB).
  • fault tolerance: not critical. If data is lost, bucket replication can be restarted.
  • memory: 1M migrated objects ~ 105 MB, 1M tasks in the queue ~ 700 MB.
  • low CPU: 100-1000 rps.
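The memory figures above can be turned into a rough sizing estimate. The sketch below assumes that memory use scales linearly with the number of migrated objects and queued tasks, which the figures suggest but do not guarantee; the function name and the sample workload are hypothetical.

```python
# Figures taken from the list above.
MB_PER_MILLION_MIGRATED_OBJECTS = 105
MB_PER_MILLION_QUEUED_TASKS = 700


def estimate_redis_mb(migrated_objects: int, queued_tasks: int) -> float:
    """Rough Redis memory estimate in MB, assuming linear scaling."""
    return (
        migrated_objects * MB_PER_MILLION_MIGRATED_OBJECTS
        + queued_tasks * MB_PER_MILLION_QUEUED_TASKS
    ) / 1_000_000


# Hypothetical workload: 50M migrated objects, 2M tasks waiting in the queue.
print(estimate_redis_mb(50_000_000, 2_000_000))
```

Treat the result as a lower bound and leave headroom for Redis overhead and for persistence (AOF rewrite or RDB snapshot) activity.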

Monitoring

All components expose Prometheus metrics on the :9090/metrics endpoint. The metrics can be used in a Grafana dashboard to monitor workload, task rate, error rate, S3 objects, and bytes uploaded and downloaded.
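For a quick check without a full Prometheus and Grafana setup, the endpoint can be read directly. The sketch below parses the Prometheus text exposition format using only the standard library. It is a simplification: it assumes samples carry no trailing timestamp, and the host name in the usage note is an assumption, as metric names are not listed here.

```python
import urllib.request


def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition into {series: value}.

    Simplification: assumes samples have no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        # The value is the last whitespace-separated token on the line.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


def scrape(url: str) -> dict[str, float]:
    """Fetch a metrics endpoint and return its parsed samples."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

For example, `scrape("http://localhost:9090/metrics")` would return the current samples of a component running locally, assuming it is reachable on that host.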

Tracing

All components support the OpenTracing protocol. So far, this has been tested only with Jaeger.

Tracing is configured in the trace section of the configuration file. To enable it, set enabled to true and provide an endpoint:

trace:
  enabled: false # set to true to enable tracing
  endpoint: # URL of Jaeger or another OpenTracing provider

Logging

All components support structured logging. Logging is configured in the log section of the configuration file. To enable JSON logs, set json to true:

log:
  json: false # false: console logger for development; true: JSON logs for production, for export to Grafana and Loki
  level: info