How to Scale ECS with SQS Standard on AWS

If you’re running a workload on Amazon ECS that relies on SQS, optimizing your auto-scaling setup can help ensure you have the right capacity without overspending or risking message loss. Here’s how you can improve your ECS auto-scaling to better match your application’s actual processing needs.

First, raw SQS metrics such as the number of visible messages (ApproximateNumberOfMessagesVisible) aren’t enough on their own, because they don’t reflect how much processing capacity your containers currently provide. Instead, create a custom CloudWatch metric called ‘backlog per instance’ or ‘backlog per task.’ It is calculated by dividing the number of messages in your queue by the number of running ECS tasks. The result shows whether the service needs to scale out or in relative to the capacity it already has, rather than relative to queue depth alone.
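As a rough illustration of that division, the sketch below reads the queue depth and the running task count and returns the ratio. It assumes boto3-style SQS and ECS clients are passed in; the function name and the choice to treat zero running tasks as one are illustrative assumptions, not part of any AWS API.

```python
from typing import Any


def backlog_per_task(sqs: Any, ecs: Any, queue_url: str, cluster: str, service: str) -> float:
    """Return visible messages divided by running tasks for one ECS service."""
    # SQS returns attribute values as strings, so convert to int.
    attrs = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )["Attributes"]
    visible = int(attrs["ApproximateNumberOfMessages"])

    # runningCount is the number of tasks currently in the RUNNING state.
    described = ecs.describe_services(cluster=cluster, services=[service])
    running = described["services"][0]["runningCount"]

    # Assumption: with zero running tasks, divide by one so the metric stays
    # finite and still signals that work is waiting.
    return visible / max(running, 1)
```

In a real deployment the clients would typically come from boto3.client("sqs") and boto3.client("ecs"), and the function would run on a schedule, for example once per minute.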

To find the target value for this metric, use this simple formula:

Acceptable backlog per task = (Your acceptable latency) / (Average processing time per message).

For example, if your application can tolerate up to 60 seconds of delay and each message takes about 5 seconds to process, your target backlog per task would be 60 / 5 = 12 messages. If your workload also has a concurrency ceiling, such as a maximum of 200 messages being processed at once, factor that limit into your scaling decisions too, for example when choosing the maximum task count for the service.
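The arithmetic above can be sketched in a few lines. The helper names, and the min_tasks and max_tasks bounds, are hypothetical and chosen only for illustration. The second function shows one plausible way to turn the target into a task count; it is not something the formula itself prescribes.

```python
import math


def acceptable_backlog_per_task(acceptable_latency_s: float, avg_processing_time_s: float) -> float:
    """Acceptable latency divided by average processing time per message."""
    if avg_processing_time_s <= 0:
        raise ValueError("average processing time must be positive")
    return acceptable_latency_s / avg_processing_time_s


def desired_task_count(queue_depth: int, target_backlog: float,
                       min_tasks: int = 1, max_tasks: int = 100) -> int:
    """Tasks needed to keep backlog per task at or below the target, within bounds."""
    needed = math.ceil(queue_depth / target_backlog)
    return max(min_tasks, min(needed, max_tasks))


target = acceptable_backlog_per_task(60, 5)
print(target)                           # 12.0
print(desired_task_count(600, target))  # 50
```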

Here are some specific steps to address common issues:

Choosing the right metric: Instead of just queue depth, track the actual number of requests your containers are processing at any moment. Your application can send custom metrics to CloudWatch indicating current concurrency levels, and you can average these across all tasks to inform your autoscaling rules.
Preventing over-provisioning: Queue depth alone says nothing about how much work your running tasks can absorb, so scaling on it tends to add more capacity than you need. A backlog-per-task metric corrects for this. You can combine the queue length and the running task count with CloudWatch metric math to get a single workload indicator, which avoids unnecessary scale-out actions and the cost that comes with them.
Avoiding message loss and abrupt scaling down: For tasks that process messages for a long time, use ECS task scale-in protection. When a task begins processing a message, enable scale-in protection so that a scale-in event does not terminate it mid-work. Once it finishes, disable protection. This keeps in-flight messages from being interrupted when the service scales down.
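The enable-then-disable pattern for scale-in protection can be sketched as a context manager. This assumes the task runs on ECS with the container agent's task protection endpoint available through the ECS_AGENT_URI environment variable; the helper names and the 60-minute default expiry are illustrative assumptions.

```python
import json
import os
import urllib.request
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, Optional


def protection_payload(enabled: bool, expires_in_minutes: Optional[int] = None) -> Dict[str, object]:
    """Build the JSON body for the task protection endpoint."""
    body: Dict[str, object] = {"ProtectionEnabled": enabled}
    if enabled and expires_in_minutes is not None:
        body["ExpiresInMinutes"] = expires_in_minutes
    return body


def send_to_agent(body: Dict[str, object]) -> None:
    """PUT the body to the ECS agent's task protection endpoint."""
    url = os.environ["ECS_AGENT_URI"] + "/task-protection/v1/state"
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        response.read()


@contextmanager
def scale_in_protection(send: Callable[[Dict[str, object]], None] = send_to_agent,
                        expires_in_minutes: int = 60) -> Iterator[None]:
    """Enable protection before the work starts and always disable it afterwards."""
    send(protection_payload(True, expires_in_minutes))
    try:
        yield
    finally:
        send(protection_payload(False))
```

A worker loop would wrap each message in "with scale_in_protection(): ..." so that protection is released even if processing raises an exception. The expiry acts as a safety net in case the task hangs before it can disable protection itself.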

For implementation, if your workload is predictable, step scaling based on your custom backlog metric works well: you add a specific number of tasks at each backlog threshold. With variable processing times, a target tracking policy on the backlog-per-task metric, which combines queue backlog and running task count, is usually more stable and responsive, because it adjusts the task count continuously toward the target value.
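A target tracking configuration that divides queue backlog by running tasks with metric math might look like the sketch below. It assumes Container Insights is enabled, so that RunningTaskCount is available in the ECS/ContainerInsights namespace. The function name, policy name, and cooldown values are hypothetical.

```python
from typing import Any, Dict


def backlog_target_tracking_config(queue_name: str, cluster: str, service: str,
                                   target_backlog: float,
                                   scale_in_cooldown: int = 300,
                                   scale_out_cooldown: int = 60) -> Dict[str, Any]:
    """Build a TargetTrackingScalingPolicyConfiguration using metric math."""
    return {
        "TargetValue": float(target_backlog),
        "CustomizedMetricSpecification": {
            "Metrics": [
                {   # Messages waiting in the queue.
                    "Id": "visible",
                    "MetricStat": {
                        "Metric": {
                            "Namespace": "AWS/SQS",
                            "MetricName": "ApproximateNumberOfMessagesVisible",
                            "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
                        },
                        "Stat": "Sum",
                    },
                    "ReturnData": False,
                },
                {   # Running tasks (assumes Container Insights is enabled).
                    "Id": "running",
                    "MetricStat": {
                        "Metric": {
                            "Namespace": "ECS/ContainerInsights",
                            "MetricName": "RunningTaskCount",
                            "Dimensions": [
                                {"Name": "ClusterName", "Value": cluster},
                                {"Name": "ServiceName", "Value": service},
                            ],
                        },
                        "Stat": "Average",
                    },
                    "ReturnData": False,
                },
                {   # Only the expression result is used by the policy.
                    "Id": "backlog",
                    "Expression": "visible / running",
                    "ReturnData": True,
                },
            ]
        },
        "ScaleInCooldown": scale_in_cooldown,
        "ScaleOutCooldown": scale_out_cooldown,
    }


def apply_policy(autoscaling: Any, cluster: str, service: str, config: Dict[str, Any]) -> None:
    """Attach the policy to the service via an Application Auto Scaling client."""
    autoscaling.put_scaling_policy(
        PolicyName="backlog-per-task",  # hypothetical name
        ServiceNamespace="ecs",
        ResourceId=f"service/{cluster}/{service}",
        ScalableDimension="ecs:service:DesiredCount",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration=config,
    )
```

The autoscaling argument would typically be boto3.client("application-autoscaling"), and the service must already be registered as a scalable target. How the expression behaves when no tasks are running is worth testing in your own environment before relying on it.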

To set this up, you’ll need to publish your custom metrics to CloudWatch using the AWS CLI or SDKs. Then, create a scaling policy that aims to maintain your desired backlog-per-task value.
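Publishing the metric through the SDK could look like the following minimal sketch. It assumes a boto3-style CloudWatch client is passed in; the namespace Custom/SQSWorkers and the metric name BacklogPerTask are hypothetical and can be anything you choose, as long as the scaling policy references the same names.

```python
from typing import Any


def publish_backlog_per_task(cloudwatch: Any, value: float, cluster: str, service: str,
                             namespace: str = "Custom/SQSWorkers") -> None:
    """Publish one backlog-per-task data point as a custom CloudWatch metric."""
    cloudwatch.put_metric_data(
        Namespace=namespace,  # hypothetical namespace
        MetricData=[
            {
                "MetricName": "BacklogPerTask",  # hypothetical metric name
                "Dimensions": [
                    {"Name": "ClusterName", "Value": cluster},
                    {"Name": "ServiceName", "Value": service},
                ],
                "Value": value,
                "Unit": "Count",
            }
        ],
    )
```

The cloudwatch argument would typically be boto3.client("cloudwatch"). Publishing on a fixed interval, such as once per minute, gives the scaling policy a steady stream of data points to act on.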

This method will help your ECS service scale more smoothly, reduce costs by avoiding over-provisioning, and protect in-flight messages during scale-in events. Keeping your auto-scaling aligned with your actual processing capacity helps your workload run efficiently even when processing times fluctuate.