S3 API Lifecycle Policy to Delete Incomplete Multipart Uploads

Problem

During normal operation, multipart uploads are regularly left incomplete, and without a suitable cleanup policy they are never removed.

Sometimes the total number of objects reported for a bucket is higher than the number of S3 objects it actually contains. The difference comes from multipart uploads in the bucket that were neither completed nor properly aborted.

Solution

Incomplete multipart uploads are caused by interruptions while uploading objects that are larger than the client's multipart threshold (default: 8 MB). Above this threshold, the client splits an object into multiple chunks (parts) before uploading it. The storage cannot know whether an interrupted upload will be resumed, nor how long it should keep the parts it has already received. The leftover parts therefore remain as incomplete multipart upload artifacts until the owner cleans them up.
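
To make the threshold behaviour concrete, the following is a minimal sketch of how a client might decide between a single upload and a multipart upload. The function name plan_upload and the 8 MiB chunk size are assumptions for illustration only; real clients use their own defaults and configuration options.

```python
import math

MIB = 1024 * 1024


def plan_upload(object_size: int,
                threshold: int = 8 * MIB,
                chunk_size: int = 8 * MIB) -> int:
    """Return how many parts a client would upload for an object.

    A result of 1 means a plain single-request upload. Both defaults are
    assumptions taken from the 8 MB figure in this article, not the
    settings of any particular client.
    """
    if object_size <= threshold:
        # At or below the threshold the object is sent in one request.
        return 1
    # Above the threshold the object is split into fixed-size chunks;
    # the last chunk may be smaller than chunk_size.
    return math.ceil(object_size / chunk_size)


if __name__ == "__main__":
    for size in (1 * MIB, 8 * MIB, 8 * MIB + 1, 10 * 1024 * MIB):
        print(size, "bytes ->", plan_upload(size), "part(s)")
```

If an upload planned with many parts is interrupted, the parts that were already transferred are what remains in the bucket as an incomplete multipart upload.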

The bucket owner is responsible for cleaning up these artifacts, either by creating a lifecycle policy or by aborting the multipart uploads manually. Either approach corrects the problem.

Define an S3 lifecycle policy to delete (abort) incomplete multipart uploads after 3 days

abort-lc-mp-3days.xml

<LifecycleConfiguration>
  <Rule>
    <ID>abort-multipartupload-3days</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>3</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
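
If the same policy is needed with a different retention period, the file can also be generated instead of edited by hand. The sketch below uses only the Python standard library; the function name build_abort_rule and the ID naming scheme are assumptions modelled on the file above, not part of any S3 tooling.

```python
import xml.etree.ElementTree as ET


def build_abort_rule(days: int, rule_id: str = "") -> str:
    """Build a lifecycle configuration that aborts incomplete multipart uploads.

    The element names mirror abort-lc-mp-3days.xml. The default rule ID
    follows the naming used in that file (an assumption, any ID is valid).
    """
    if days < 1:
        raise ValueError("days must be a positive integer")

    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    ET.SubElement(rule, "ID").text = rule_id or f"abort-multipartupload-{days}days"
    ET.SubElement(rule, "Prefix")  # empty prefix: the rule applies to the whole bucket
    ET.SubElement(rule, "Status").text = "Enabled"
    abort = ET.SubElement(rule, "AbortIncompleteMultipartUpload")
    ET.SubElement(abort, "DaysAfterInitiation").text = str(days)

    # Serialized on a single line; whitespace is not significant here.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Write a 3-day policy equivalent to the hand-written file.
    with open("abort-lc-mp-3days.xml", "w", encoding="utf-8") as handle:
        handle.write(build_abort_rule(3))
```

The resulting file can be applied to a bucket with s3cmd setlifecycle in the same way as the hand-written one.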

Check whether a lifecycle configuration (lc) already exists

root@ceph-clyso # s3cmd getlifecycle s3://quota-clyso
ERROR: S3 error: 404 (NoSuchLifecycleConfiguration)
root@ceph-clyso #

This means that no lifecycle configuration exists for this bucket.

Apply the lifecycle policy file to the bucket and verify it

root@ceph-clyso # s3cmd setlifecycle abort-lc-mp-3days.xml s3://quota-clyso
s3://quota-clyso/: Lifecycle Policy updated
root@ceph-clyso #
root@ceph-clyso # s3cmd getlifecycle s3://quota-clyso
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Rule>
<ID>abort-multipartupload-3days</ID>
<Prefix/>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>3</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
root@ceph-clyso #
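
When many buckets have to be checked, the output of s3cmd getlifecycle can be inspected programmatically. The following sketch parses such output and reports which enabled rules abort incomplete multipart uploads and after how many days. The function name abort_days is hypothetical; the namespace is the one shown in the output above.

```python
import xml.etree.ElementTree as ET

# Namespace as it appears in the getlifecycle output.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def abort_days(lifecycle_xml: str) -> dict:
    """Map rule ID -> DaysAfterInitiation for every enabled abort rule."""
    root = ET.fromstring(lifecycle_xml)
    result = {}
    for rule in root.iter(f"{S3_NS}Rule"):
        if rule.findtext(f"{S3_NS}Status") != "Enabled":
            continue  # disabled rules are not processed
        days = rule.findtext(
            f"{S3_NS}AbortIncompleteMultipartUpload/{S3_NS}DaysAfterInitiation"
        )
        if days is not None:
            result[rule.findtext(f"{S3_NS}ID")] = int(days)
    return result


if __name__ == "__main__":
    import sys

    # Example: s3cmd getlifecycle s3://quota-clyso | python3 this_script.py
    print(abort_days(sys.stdin.read()))
```

An empty result means that the bucket has a lifecycle configuration, but no enabled rule that cleans up incomplete multipart uploads.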

Manual commands for deleting (aborting) incomplete multipart uploads

List incomplete multipart uploads for a bucket

root@ceph-clyso# s3cmd multipart s3://quota-clyso/10g-multipart.bin
s3://quota-clyso/10g-multipart.bin
Initiated Path Id
2024-06-20T22:16:43.567Z s3://quota-clyso/10-9 2~W2b888tDQMnMy0g8n6XPQeiQSC1pQ9w
root@ceph-clyso#

Abort incomplete multipart uploads for a bucket

root@ceph-clyso# s3cmd abortmp s3://quota-clyso/10-9 2~W2b888tDQMnMy0g8n6XPQeiQSC1pQ9w
s3://quota-clyso/10-9
root@ceph-clyso#

Verify that no incomplete multipart uploads remain for the bucket

root@ceph-clyso# s3cmd multipart s3://quota-clyso/10g-multipart.bin
s3://quota-clyso/10g-multipart.bin
Initiated Path Id
root@ceph-clyso#
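
For buckets with many incomplete uploads, the manual steps above can be prepared with a small script. The sketch below parses the text printed by s3cmd multipart and returns the uploads that were initiated more than a given number of days ago. It assumes the three whitespace-separated columns shown above (Initiated, Path, Id) and paths without whitespace; the function name stale_uploads is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_uploads(listing: str, now: datetime, max_age_days: int = 3) -> list:
    """Return (path, upload_id) pairs for uploads older than max_age_days.

    `listing` is the raw output of `s3cmd multipart`. Lines that do not
    consist of exactly three columns with an s3:// path in the middle
    (bucket line, column header) are skipped.
    """
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[1].startswith("s3://"):
            continue
        # Timestamps are printed in UTC, e.g. 2024-06-20T22:16:43.567Z
        initiated = datetime.strptime(
            fields[0], "%Y-%m-%dT%H:%M:%S.%fZ"
        ).replace(tzinfo=timezone.utc)
        if initiated <= cutoff:
            stale.append((fields[1], fields[2]))
    return stale


if __name__ == "__main__":
    import sys

    # Example: s3cmd multipart s3://quota-clyso | python3 this_script.py
    for path, upload_id in stale_uploads(sys.stdin.read(),
                                         datetime.now(timezone.utc)):
        # Print the commands for review instead of running them.
        print(f"s3cmd abortmp {path} {upload_id}")
```

The script only prints the abortmp commands so that they can be reviewed before anything is deleted.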

Tip

The following commands are useful to Ceph/RadosGW administrators who want to track the status and processing of lifecycle policies inside RadosGW:

  • radosgw-admin lc list (shows the lifecycle processing status per bucket)
  • radosgw-admin lc process (starts lifecycle processing manually)