S3 API Lifecycle Policy to Delete Incomplete Multipart Uploads
Problem
During normal operation, multipart uploads are regularly left incomplete, and nothing removes them unless cleanup is configured.
Users usually notice this because the total number of objects reported for a bucket is higher than the number of S3 objects they can see in it. The difference comes from multipart uploads in the bucket that were neither completed nor properly aborted.
Solution
Objects larger than a threshold value (default: 8MB) are divided into several chunks and uploaded as a multipart upload. An incomplete multipart upload is what remains when such an upload is interrupted. The storage does not know whether the upload will be continued or how long it should hold the data that has already arrived. This leaves incomplete multipart upload artifacts that have to be cleaned up by the owner.
The owner of the bucket is responsible for creating a lifecycle policy or canceling the multipart uploads manually.
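The chunking described above can be illustrated with a minimal Python sketch. It assumes that the 8MB threshold is also used as the part size, which is an assumption made for illustration only (real clients usually let both values be configured separately). The function name `plan_parts` is hypothetical.

```python
MB = 1024 * 1024

def plan_parts(object_size, threshold=8 * MB, part_size=8 * MB):
    """Return a list of (part_number, offset, length) tuples.

    An empty list means the object fits in a single PUT and no
    multipart upload is started. Assumption: threshold and part
    size are both 8MB, for illustration only.
    """
    if object_size <= threshold:
        return []
    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        # The last part may be shorter than part_size.
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts

if __name__ == "__main__":
    for part in plan_parts(20 * MB):
        print(part)
```

If the client stops after sending only some of these parts, the parts already received stay in the bucket until the upload is completed or aborted.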
Define an S3 lifecycle policy to delete (abort) incomplete multipart uploads after 3 days
abort-lc-mp-3days.xml
<LifecycleConfiguration>
  <Rule>
    <ID>abort-multipartupload-3days</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>3</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
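If the same policy is needed with a different retention period, the file can also be generated instead of edited by hand. The following is a minimal sketch using only the Python standard library. The function name `build_abort_policy` and the ID naming scheme are assumptions modeled on the file above, not part of any tool.

```python
import xml.etree.ElementTree as ET

def build_abort_policy(days, prefix=""):
    """Build a lifecycle configuration with one abort rule as an XML string."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    # Hypothetical naming scheme, mirroring abort-multipartupload-3days.
    ET.SubElement(rule, "ID").text = f"abort-multipartupload-{days}days"
    ET.SubElement(rule, "Prefix").text = prefix  # empty prefix: whole bucket
    ET.SubElement(rule, "Status").text = "Enabled"
    abort = ET.SubElement(rule, "AbortIncompleteMultipartUpload")
    ET.SubElement(abort, "DaysAfterInitiation").text = str(days)
    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    with open("abort-lc-mp-3days.xml", "w") as handle:
        handle.write(build_abort_policy(3) + "\n")
```

The resulting file has the same elements as abort-lc-mp-3days.xml and can be uploaded with s3cmd setlifecycle in the same way.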
Check whether a lifecycle (lc) configuration already exists
root@ceph-clyso # s3cmd getlifecycle s3://quota-clyso
ERROR: S3 error: 404 (NoSuchLifecycleConfiguration)
root@ceph-clyso #
The 404 (NoSuchLifecycleConfiguration) error shows that no lifecycle configuration exists for this bucket yet.
Upload the lifecycle policy file and read it back to verify
root@ceph-clyso # s3cmd setlifecycle abort-lc-mp-3days.xml s3://quota-clyso
s3://quota-clyso/: Lifecycle Policy updated
root@ceph-clyso #
root@ceph-clyso # s3cmd getlifecycle s3://quota-clyso
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Rule>
<ID>abort-multipartupload-3days</ID>
<Prefix/>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>3</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
root@ceph-clyso #
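When many buckets have to be checked, the XML returned by s3cmd getlifecycle can be inspected programmatically. The sketch below is a minimal example under stated assumptions: it expects the XML text as a string (for example captured from the command above), and the function name `abort_rules` is hypothetical. It ignores the XML namespace so that it accepts both the uploaded file and the namespaced output.

```python
import xml.etree.ElementTree as ET

def _local(tag):
    """Strip an XML namespace such as {http://...}Rule down to Rule."""
    return tag.rsplit("}", 1)[-1]

def abort_rules(xml_text):
    """Return (rule_id, days) for every enabled abort-multipart rule."""
    root = ET.fromstring(xml_text.strip())
    rules = []
    for rule in root:
        if _local(rule.tag) != "Rule":
            continue
        # Collect the text of all descendants, keyed by local tag name.
        fields = {
            _local(el.tag): (el.text or "").strip()
            for el in rule.iter()
            if el is not rule
        }
        if fields.get("Status") == "Enabled" and "DaysAfterInitiation" in fields:
            rules.append((fields.get("ID", ""), int(fields["DaysAfterInitiation"])))
    return rules
```

An empty result means that the bucket has a lifecycle configuration, but no enabled rule in it aborts incomplete multipart uploads.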
Manual commands to delete (abort) incomplete multipart uploads
List incomplete multipart uploads for a bucket
root@ceph-clyso# s3cmd multipart s3://quota-clyso/10g-multipart.bin
s3://quota-clyso/10g-multipart.bin
Initiated Path Id
2024-06-20T22:16:43.567Z s3://quota-clyso/10-9 2~W2b888tDQMnMy0g8n6XPQeiQSC1pQ9w
root@ceph-clyso#
Abort incomplete multipart uploads for a bucket
root@ceph-clyso# s3cmd abortmp s3://quota-clyso/10-9 2~W2b888tDQMnMy0g8n6XPQeiQSC1pQ9w
s3://quota-clyso/10-9
root@ceph-clyso#
Verify that no incomplete multipart uploads remain for the bucket
root@ceph-clyso# s3cmd multipart s3://quota-clyso/10g-multipart.bin
s3://quota-clyso/10g-multipart.bin
Initiated Path Id
root@ceph-clyso#
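For buckets with many incomplete uploads, the listing shown above can be filtered by age before aborting anything. The following minimal sketch parses the text output of s3cmd multipart and returns the uploads older than a given number of days. It assumes the three-column layout (Initiated, Path, Id) seen in the transcript and paths without whitespace. The function name `stale_uploads` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def stale_uploads(listing, now, max_age_days=3):
    """Return (path, upload_id) for uploads initiated at least max_age_days ago."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # bucket URI line or blank line
        initiated, path, upload_id = fields
        try:
            started = datetime.strptime(
                initiated, "%Y-%m-%dT%H:%M:%S.%fZ"
            ).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # the "Initiated Path Id" header row
        if started <= cutoff:
            stale.append((path, upload_id))
    return stale

if __name__ == "__main__":
    import sys
    # Reads the listing from stdin and only prints the commands.
    for path, upload_id in stale_uploads(sys.stdin.read(), datetime.now(timezone.utc)):
        print(f"s3cmd abortmp {path} {upload_id}")
```

The sketch prints the abortmp commands instead of running them, so the list can be reviewed first.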
For Ceph/radosgw administrators, the following commands are useful for checking the status of lifecycle policies and for triggering their processing inside radosgw:
- radosgw-admin lc list
- radosgw-admin lc process