AWS Lambda Error – Throttling: Rate Exceeded

 


A practical diagnostic guide for resolving Lambda throttling caused by concurrency exhaustion, burst limit saturation, or runaway retry storms.





Problem

Your Lambda suddenly fails under load with errors like:

{
  "errorMessage": "Rate Exceeded",
  "errorType": "ThrottlingException"
}

Or CloudWatch logs show:

Task timed out
Throttling: Rate Exceeded
Execution is currently limited by concurrent execution limit

This means AWS Lambda tried to scale up to meet demand — but hit a concurrency or burst limit before it could execute your function.

Possible causes:

  • Account concurrency limit exceeded
  • Function-level reserved concurrency set too low
  • SQS consumers scaling aggressively and consuming all concurrency
  • Async retry storms from EventBridge/SNS/S3
  • Region burst limit saturated during sudden traffic spikes

Clarifying the Issue

There are two different throttling mechanisms, and fixing the wrong one wastes hours:

1. Concurrency Limit Throttling
Lambda cannot exceed your account’s concurrency limit (default 1,000 per region; higher limits can be requested through Service Quotas).
All functions share this pool unless reserved.

2. Burst Limit Throttling (First-Second Scaling)
Lambda cannot scale up instantly without limit. Each region allows:

  • Immediate burst capacity, then
  • A steady ramp-up rate

If your traffic spikes faster than Lambda can ramp, you are throttled even if plenty of total concurrency is available.


Why It Matters

Throttling causes:

  • API failures
  • Lost or delayed messages
  • SQS queues backing up
  • Retry storms that multiply the load
  • Customer-facing outages

You must distinguish concurrency-limit throttling from burst-limit throttling to fix the issue quickly.


Key Terms

  • Concurrent Executions – Number of function invocations running at the same time
  • Reserved Concurrency – Hard cap + guaranteed minimum for one function
  • Burst Limit – Immediate concurrency Lambda provides during sudden spikes
  • Ramp Rate – Additional concurrency Lambda adds per minute after bursting
  • Retry Storm – Exponential retries from event sources overwhelming Lambda

Steps at a Glance

  1. Confirm throttling in CloudWatch
  2. Check account concurrency limits
  3. Inspect reserved concurrency settings
  4. Validate burst scaling behavior
  5. Check SQS redrive & scaling behavior
  6. Look for async retry storms
  7. Apply architectural fixes

Detailed Steps

Step 1: Confirm throttling in CloudWatch

Run:

aws logs tail /aws/lambda/my-function --since 5m

Look for:

Rate Exceeded
Execution is currently limited

Now check account-level concurrency usage:

aws lambda get-account-settings

Determine whether the issue is account-wide or function-specific.
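The Throttles CloudWatch metric gives a cleaner signal than log scraping. A quick sketch, assuming the function name from the examples above (the date arithmetic uses GNU date):

```shell
# Sum of throttled invocations for one function over the last 15 minutes
# (date -u -d '...' is GNU-specific; on macOS/BSD use date -u -v-15M instead)
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Throttles \
  --dimensions Name=FunctionName,Value=my-function \
  --start-time "$(date -u -d '15 minutes ago' +%Y-%m-%dT%H:%M:%S)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
  --period 60 \
  --statistics Sum
```

If Throttles is nonzero while ConcurrentExecutions sits well below your limit, suspect burst-limit throttling rather than concurrency exhaustion.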


Step 2: Check account concurrency limits

Run:

aws lambda get-account-settings --query 'AccountLimit.ConcurrentExecutions'
aws lambda get-account-settings --query 'AccountUsage.ConcurrentExecutions'

If usage equals the limit, no additional invocations can start anywhere in the account.

Fix by:

  • Requesting a concurrency limit increase
  • Reducing consumer scaling (SQS/Kinesis/DynamoDB Streams)
  • Moving functions to separate accounts/environments
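A limit increase goes through Service Quotas. A hedged sketch — the quota code below is believed to be Lambda’s “Concurrent executions” quota, but verify it in your account before relying on it:

```shell
# Look up the quota code first to be safe
aws service-quotas list-service-quotas --service-code lambda \
  --query "Quotas[?QuotaName=='Concurrent executions']"

# Then request the increase (L-B99A9384 is the commonly documented code)
aws service-quotas request-service-quota-increase \
  --service-code lambda \
  --quota-code L-B99A9384 \
  --desired-value 2000
```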

Step 3: Inspect reserved concurrency settings

Run:

aws lambda get-function-concurrency --function-name my-function

If ReservedConcurrentExecutions is 0 → the function cannot run at all.
If it is set too low → the function throttles while others run normally.

Fix by:

aws lambda put-function-concurrency \
  --function-name my-function \
  --reserved-concurrent-executions 50

Important:
Apply Reserved Concurrency to isolate critical functions — especially SQS consumers, which will automatically scale to consume ALL available concurrency if not capped.


Step 4: Validate burst scaling behavior

Lambda has immediate burst capacity, then a ramp-up rate:

Region behavior:

  • us-east-1: 3,000 immediate concurrent executions, then +500 per minute
  • Most other regions: 500 immediate concurrent executions, then +500 per minute

If traffic spikes faster than Lambda can ramp, even with plenty of total concurrency, you still get throttling.

Mitigate by:

  • Smoothing spikes with SQS/EventBridge
  • Using Provisioned Concurrency for sustained throughput
  • Pre-warming during known traffic events
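Provisioned Concurrency is configured per published version or alias, never $LATEST. A minimal sketch, assuming an alias named prod:

```shell
# Keep 100 execution environments initialized and ready on the prod alias
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod \
  --provisioned-concurrent-executions 100
```

Provisioned environments do not need to cold-start or burst-scale, so they absorb spikes that would otherwise hit the ramp limit.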

Step 5: Check SQS scaling + redrive policy

SQS → Lambda is poll-based.
The poller will scale your Lambda aggressively until it hits your account concurrency limit.

To prevent runaway scaling, check redrive settings:

aws sqs get-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/my-queue \
  --attribute-names All

Ensure the queue’s RedrivePolicy specifies:

  • maxReceiveCount
  • deadLetterTargetArn

Fix by setting a redrive policy:

aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/my-queue \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123:my-dlq\",\"maxReceiveCount\":\"5\"}"}'
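Beyond the redrive policy, SQS event source mappings also support a per-mapping concurrency cap, which stops the poller from consuming the whole account pool. A sketch — the mapping UUID is a placeholder; find yours with list-event-source-mappings:

```shell
# Find the event source mapping for this function
aws lambda list-event-source-mappings --function-name my-function

# Cap how far the SQS poller can scale this consumer (minimum is 2)
aws lambda update-event-source-mapping \
  --uuid a1b2c3d4-5678-90ab-cdef-EXAMPLE11111 \
  --scaling-config '{"MaximumConcurrency": 50}'
```

Unlike reserved concurrency, this caps only the SQS consumer without reserving capacity away from other functions.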

Step 6: Look for async retry storms

EventBridge, SNS, and S3 retry aggressively if your function fails.
You may be overwhelmed by retries, not real traffic.

Check recent async failures:

aws lambda list-function-event-invoke-configs --function-name my-function

Check DLQs for volume spikes.

Fix by:

  • Sending failed async events to DLQ
  • Fixing failing code first before re-enabling triggers
  • Using Reserved Concurrency to contain blast radius
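For async invocations, retry behavior can be dialed down directly on the function. A sketch (the DLQ ARN is a placeholder):

```shell
# Cut built-in async retries to zero and route failures to a DLQ so a
# failing function cannot multiply its own load
aws lambda put-function-event-invoke-config \
  --function-name my-function \
  --maximum-retry-attempts 0 \
  --maximum-event-age-in-seconds 3600 \
  --destination-config '{"OnFailure":{"Destination":"arn:aws:sqs:us-east-1:123:my-dlq"}}'
```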

Step 7: Apply architectural fixes

To permanently eliminate throttling:

  • Use Reserved Concurrency for workload isolation
  • Use Provisioned Concurrency for predictable latency
  • Insert buffering layers (SQS)
  • Add backoff + jitter to callers
  • Split high-traffic functions into multiple specialized functions
  • Add rate limiting upstream
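Backoff with jitter on the caller side can be sketched in a few lines. This is the “full jitter” variant; the base and cap values are hypothetical tuning knobs:

```python
import random

def backoff_with_jitter(attempt, base=0.5, cap=30.0):
    """Return a sleep time (seconds) using full-jitter exponential backoff.

    attempt is the 0-based retry count; base and cap are illustrative values.
    The sleep is drawn uniformly from [0, min(cap, base * 2**attempt)], which
    spreads retries out instead of letting all callers retry in lockstep.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Callers sleep(backoff_with_jitter(attempt)) before each retry.
```

Spreading retries randomly prevents synchronized retry waves, which are exactly what turns a brief throttle into a retry storm.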

Pro Tips

  • Use CloudWatch Metric Math to graph concurrency vs throttles
  • Enable Lambda Insights to watch cold starts + duration
  • Log throttles with alarms for early detection
  • For SQS, cap concurrency using reserved-concurrent-executions
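A basic throttle alarm can be sketched like this (the SNS topic ARN is a placeholder):

```shell
# Fire as soon as any throttle occurs in a 1-minute window
aws cloudwatch put-metric-alarm \
  --alarm-name my-function-throttles \
  --namespace AWS/Lambda \
  --metric-name Throttles \
  --dimensions Name=FunctionName,Value=my-function \
  --statistic Sum \
  --period 60 \
  --evaluation-periods 1 \
  --threshold 0 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123:alerts
```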

Conclusion

Throttling is not a Lambda malfunction — it is a signal that traffic exceeded your burst, ramp, or concurrency limits.
By confirming the source, adjusting reserved concurrency, smoothing spikes, and isolating aggressive consumers like SQS, you can restore stable, predictable Lambda scaling.


Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.
