S3 API Lifecycle Policy to Delete Incomplete Multipart Uploads
Problem
During normal operation, multipart uploads are sometimes left incomplete and are never cleaned up.
As a result, the total number of objects reported for a bucket can be higher than the number of S3 objects it actually contains. The difference comes from multipart uploads in the bucket that were neither completed nor properly aborted.
Solution
Incomplete multipart uploads are caused by interruptions while uploading objects that are larger than the client's multipart threshold (default: 8 MB). Above this threshold, the client splits an object into multiple chunks (parts) before uploading it. If an upload is interrupted, the storage cannot know whether it will be resumed or how long it should keep the parts it has already received. The parts therefore remain in the bucket as incomplete multipart upload artifacts until the owner cleans them up.
The bucket owner is responsible for this cleanup, either by creating a lifecycle policy or by aborting the multipart uploads manually. Either approach resolves the problem.
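To get a feel for how many parts an interrupted upload can leave behind, the chunking described above can be sketched as simple arithmetic. The helper name part_count and the 8 MiB chunk size are assumptions for illustration only; the real chunk size depends on the client configuration.

```shell
# part_count SIZE_BYTES CHUNK_BYTES
# Hypothetical helper: prints how many parts a multipart upload is split into
# (integer division, rounded up).
part_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# A 10 GiB object with an assumed chunk size of 8 MiB:
part_count 10737418240 8388608   # prints 1280
```

If such an upload is interrupted halfway, several hundred parts can remain in the bucket until they are aborted.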
Define an S3 lifecycle policy to delete (abort) incomplete multipart uploads after 3 days
abort-lc-mp-3days.xml
<LifecycleConfiguration>
  <Rule>
    <ID>abort-multipartupload-3days</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>3</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
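If the retention period needs to differ between buckets, the policy file can be generated instead of edited by hand. The following is a minimal sketch: write_abort_policy is a hypothetical helper (not part of s3cmd), and the rule ID simply follows the naming used in the example above.

```shell
# write_abort_policy DAYS OUTFILE
# Hypothetical helper: writes a lifecycle policy that aborts incomplete
# multipart uploads DAYS days after they were initiated.
write_abort_policy() {
    days="$1"
    outfile="$2"
    # Reject empty input, anything containing a non-digit, and a literal 0.
    case "$days" in
        ''|*[!0-9]*|0)
            echo "days must be a positive integer" >&2
            return 1
            ;;
    esac
    cat > "$outfile" <<EOF
<LifecycleConfiguration>
  <Rule>
    <ID>abort-multipartupload-${days}days</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>${days}</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
EOF
}
```

Calling write_abort_policy 3 abort-lc-mp-3days.xml would produce a file equivalent to the one shown above.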
Check whether a lifecycle configuration already exists
root@ceph-clyso # s3cmd getlifecycle s3://quota-clyso
ERROR: S3 error: 404 (NoSuchLifecycleConfiguration)
root@ceph-clyso #
This means that no lifecycle configuration exists for this bucket.
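For scripted use, the same check can be wrapped in a small function. This is only a sketch: lifecycle_state is a hypothetical helper, it recognizes a missing configuration by the NoSuchLifecycleConfiguration text shown in the output above, and it assumes that s3cmd returns a non-zero exit status for any other failure.

```shell
# lifecycle_state BUCKET
# Hypothetical helper: prints "present", "missing", or "error".
lifecycle_state() {
    # Capture stdout and stderr, and keep the exit status without aborting
    # the calling script when s3cmd fails.
    out=$(s3cmd getlifecycle "s3://$1" 2>&1) && rc=0 || rc=$?
    case "$out" in
        *NoSuchLifecycleConfiguration*) echo "missing" ;;
        *) if [ "$rc" -eq 0 ]; then echo "present"; else echo "error"; fi ;;
    esac
}

# Example (not executed here): upload the policy only when none exists yet.
# if [ "$(lifecycle_state quota-clyso)" = "missing" ]; then
#     s3cmd setlifecycle abort-lc-mp-3days.xml s3://quota-clyso
# fi
```

Distinguishing "missing" from "error" avoids overwriting an existing policy just because the endpoint was temporarily unreachable.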
Upload the lifecycle policy file and verify it
root@ceph-clyso # s3cmd setlifecycle abort-lc-mp-3days.xml s3://quota-clyso
s3://quota-clyso/: Lifecycle Policy updated
root@ceph-clyso #
root@ceph-clyso # s3cmd getlifecycle s3://quota-clyso
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Rule>
<ID>abort-multipartupload-3days</ID>
<Prefix/>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>3</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
root@ceph-clyso #
Manual commands for deleting (aborting) incomplete multipart uploads
List incomplete multipart uploads for a bucket
root@ceph-clyso# s3cmd multipart s3://quota-clyso/10g-multipart.bin
s3://quota-clyso/10g-multipart.bin
Initiated Path Id
2024-06-20T22:16:43.567Z s3://quota-clyso/10-9 2~W2b888tDQMnMy0g8n6XPQeiQSC1pQ9w
root@ceph-clyso#
Abort incomplete multipart uploads for a bucket
root@ceph-clyso# s3cmd abortmp s3://quota-clyso/10-9 2~W2b888tDQMnMy0g8n6XPQeiQSC1pQ9w
s3://quota-clyso/10-9
root@ceph-clyso#
Verify that no incomplete multipart uploads remain in the bucket
root@ceph-clyso# s3cmd multipart s3://quota-clyso/10g-multipart.bin
s3://quota-clyso/10g-multipart.bin
Initiated Path Id
root@ceph-clyso#
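When many uploads are pending, the list and abort steps above can be combined in a loop. The following is a rough sketch under several assumptions: stale_uploads and abort_stale_uploads are hypothetical helpers, the listing is assumed to have the three whitespace-separated columns shown above (object keys containing spaces would not be parsed correctly), and the cutoff calculation relies on GNU date.

```shell
# stale_uploads CUTOFF
# Reads "s3cmd multipart" output on stdin and prints "PATH ID" for every
# upload initiated before CUTOFF. UTC timestamps in ISO 8601 format sort
# lexicographically, so a plain string comparison is sufficient.
stale_uploads() {
    awk -v cutoff="$1" '
        $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ && $1 < cutoff {
            print $2, $3
        }'
}

# abort_stale_uploads BUCKET DAYS
# Aborts every incomplete multipart upload in BUCKET older than DAYS days.
abort_stale_uploads() {
    bucket="$1"
    days="$2"
    # GNU date syntax (assumption); BSD date needs different options.
    cutoff=$(date -u -d "$days days ago" +%Y-%m-%dT%H:%M:%S)
    s3cmd multipart "s3://$bucket" | stale_uploads "$cutoff" |
    while read -r path id; do
        # Detach stdin so the command cannot consume the remaining list.
        s3cmd abortmp "$path" "$id" < /dev/null
    done
}
```

A call such as abort_stale_uploads quota-clyso 3 would then mirror what the lifecycle policy does automatically. It is worth running only the listing part first to review which uploads would be aborted.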
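The following commands help Ceph/RadosGW administrators track the status and processing of lifecycle policies inside RadosGW:
- radosgw-admin lc list (shows the lifecycle processing status of each bucket that has a lifecycle configuration)
- radosgw-admin lc process (starts lifecycle processing manually instead of waiting for the next scheduled run)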