How to Use AWS On-Demand S3 Batch Operations

If you’re working with Amazon S3 and need to run batch operations quickly, there’s good news. Amazon S3 now has a feature called the manifest generator that lets you create and run batch jobs instantly, without waiting for inventory reports. You can access this feature through the S3 Console or using the AWS CLI and SDKs. It was available through the command line before, but it was only added to the user interface last month.

Here’s how to use the S3 Console for on-demand batch operations:

Start by opening the Amazon S3 service in the AWS Management Console. From the side menu, select Batch Operations. Click on Create Job and choose the region where you want the job to run. Instead of using a pre-existing inventory report, under the manifest section, select “Generate an object list using filters.” Then, choose your source bucket, such as s3://your-source-bucket.

Next, apply filters to specify exactly which objects you want to include. You can filter by prefix to target specific folders (like “2024/jan-24/”), storage classes such as Glacier or Deep Archive, object size ranges, or creation dates. This way, you can focus only on the data you want to process.
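For CLI or SDK users, these console filters correspond to a Filter object inside the manifest generator section of the job request. The following is a minimal sketch, assuming the field names used by the s3control create-job request; the file name filter.json, the size threshold, and the date are placeholder values chosen for illustration.

```shell
#!/bin/sh
# Sketch: write a manifest-generator filter fragment to filter.json.
# The prefix comes from the example above; size and date are placeholders.
cat > filter.json <<'EOF'
{
  "KeyNameConstraint": {
    "MatchAnyPrefix": ["2024/jan-24/"]
  },
  "MatchAnyStorageClass": ["GLACIER", "DEEP_ARCHIVE"],
  "ObjectSizeGreaterThanBytes": 1048576,
  "CreatedAfter": "2024-01-01T00:00:00Z"
}
EOF
echo "wrote filter.json"
```

All conditions in the fragment are combined, so an object must satisfy every one of them to be included in the generated manifest.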

After setting your filters, choose the type of operation you want to perform, such as copying objects, restoring archived objects, or replacing or deleting object tags. Note that Batch Operations has no built-in operation for deleting the objects themselves. Configure the specific details for that operation, like the destination bucket and encryption options for copying, or the number of days a restored copy should stay available for restores. You can choose to stick with default API settings or customize your own.

Once your operation setup is complete, give your job a name and description. Set the job priority as a non-negative integer, where higher numbers mean higher priority. Decide whether the job should run immediately after creation or if you prefer to review the settings first. Specify where to store the manifest output for auditing, and set reporting preferences to track all tasks or only failed tasks. Finally, assign an IAM role with the necessary permissions to run the job.
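The IAM role has to be assumable by the S3 Batch Operations service. The sketch below writes a trust policy for that purpose and wraps role creation in a helper; the helper name create_batch_role and the file name trust-policy.json are hypothetical, and the role still needs a separate permissions policy covering the source bucket, the report bucket, and the chosen operation.

```shell
#!/bin/sh
# Sketch: trust policy that lets S3 Batch Operations assume the job role.
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "batchoperations.s3.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Hypothetical helper: $1 = role name of your choosing.
# Defined here but not called, so nothing is created until you invoke it.
create_batch_role() {
  aws iam create-role \
    --role-name "$1" \
    --assume-role-policy-document file://trust-policy.json
}
```

Calling create_batch_role with a role name of your choice creates the role; attach the permissions policy afterward.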

Review all your configurations carefully on the summary page. When everything looks good, click Create Job to start it. If you selected to review before running, you can trigger the job manually afterward.
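A job created with review enabled waits for confirmation before it runs. From the command line, confirmation can be sketched as below, assuming the s3control update-job-status command; the helper name confirm_job is hypothetical, and the account ID and job ID are values you supply.

```shell
#!/bin/sh
# Sketch: confirm a job that is waiting for review.
# $1 = AWS account ID, $2 = job ID returned when the job was created.
confirm_job() {
  aws s3control update-job-status \
    --account-id "$1" \
    --job-id "$2" \
    --requested-job-status Ready
}
```

Passing Cancelled instead of Ready as the requested status cancels the job rather than starting it.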

The main benefits of using the manifest generator are that you can run tasks right away without waiting for inventory reports, filter objects dynamically based on the current state of your bucket, and combine multiple filters for precise targeting.

For example, if you need to restore archived backups right away, say after a disaster, you can filter for only Glacier or Deep Archive objects within a specific folder, set up the restore operation with suitable retrieval settings, and create the job immediately. This avoids the usual 24-48 hour wait for an inventory report and allows for quick, large-scale responses. The retrieval itself still takes as long as the chosen restore tier requires.
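A job request for this restore scenario might look like the sketch below. It assumes the request shape of s3control create-job; the account ID, role ARN, report bucket, and file name restore-job.json are placeholders, and manifest output is switched off here only to keep the example short.

```shell
#!/bin/sh
# Sketch: job definition that restores archived objects under one prefix.
# Account ID, role ARN, and report bucket are placeholders to replace.
cat > restore-job.json <<'EOF'
{
  "AccountId": "111122223333",
  "ConfirmationRequired": false,
  "Description": "Restore archived backups under 2024/jan-24/",
  "Priority": 10,
  "RoleArn": "arn:aws:iam::111122223333:role/example-batch-role",
  "Operation": {
    "S3InitiateRestoreObject": {
      "ExpirationInDays": 7,
      "GlacierJobTier": "STANDARD"
    }
  },
  "Report": {
    "Enabled": true,
    "Bucket": "arn:aws:s3:::example-report-bucket",
    "Format": "Report_CSV_20180820",
    "Prefix": "batch-reports",
    "ReportScope": "FailedTasksOnly"
  },
  "ManifestGenerator": {
    "S3JobManifestGenerator": {
      "SourceBucket": "arn:aws:s3:::your-source-bucket",
      "EnableManifestOutput": false,
      "Filter": {
        "KeyNameConstraint": { "MatchAnyPrefix": ["2024/jan-24/"] },
        "MatchAnyStorageClass": ["GLACIER", "DEEP_ARCHIVE"]
      }
    }
  }
}
EOF
echo "wrote restore-job.json"
```

To keep the generated object list for auditing, set EnableManifestOutput to true and add a ManifestOutputLocation block that names the bucket where the manifest should be written.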

If you prefer working via command line, it’s best to generate a JSON input file. First, run this command to create a skeleton configuration:

aws s3control create-job --generate-cli-skeleton

This command provides a template with all possible settings. You then modify the JSON to include your chosen operation, filters, and other options. Make sure to keep only the parts you need, remove unnecessary fields, and fill in the required information.

Once your JSON file is ready, create the batch job with this command:

aws s3control create-job --cli-input-json file://YOUR_FILE_NAME.json
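After submitting, you will usually want the job ID and a way to check progress. The helpers below are a sketch using the CLI's generic --query and --output options together with s3control describe-job; the function names submit_job and job_status are hypothetical.

```shell
#!/bin/sh
# Sketch: submit a job file and print only the new job's ID.
# $1 = path to the JSON job definition.
submit_job() {
  aws s3control create-job \
    --cli-input-json "file://$1" \
    --query JobId --output text
}

# Sketch: print the current status of a job.
# $1 = AWS account ID, $2 = job ID.
job_status() {
  aws s3control describe-job \
    --account-id "$1" --job-id "$2" \
    --query 'Job.Status' --output text
}
```

A typical sequence is to capture the output of submit_job YOUR_FILE_NAME.json in a variable and then call job_status with your account ID and that value until the job reaches a final state.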

Be aware that the manifest generator currently works only within a single region. It doesn't support cross-region jobs, so create the job in the same region as the source bucket.

For detailed information about each configuration option, check the official AWS CLI reference for s3control create-job. If you have any questions or need further assistance, don’t hesitate to reach out.

A comprehensive blog post with more details is also in the works, along with a knowledge center article to help you get the most out of this new feature.