When DynamoDB Throttles: Restoring Stability Under Load (Lambda → DynamoDB)
Throttling is DynamoDB’s way of saying “breathe.”
You’ve got a smooth-running serverless workflow — until DynamoDB starts rejecting requests faster than you can scale them.
We’ll build the pipeline, trigger throttling on purpose, surface the failure, and fix it with smarter retries, exponential backoff, and jitter.
Problem
Your Lambda writes records into DynamoDB flawlessly during light traffic. Then a burst hits — maybe a thousand S3 events in a minute — and suddenly, ProvisionedThroughputExceededException appears in your logs.
Some writes succeed, others fail silently. Data drifts.
Clarifying the Issue
DynamoDB protects itself under load by throttling requests that exceed the table’s capacity limits. When that happens:
- The affected request receives a ProvisionedThroughputExceededException.
- Lambda logs the error but doesn’t automatically retry unless you code for it.
- S3 (or any async invoker) assumes success once Lambda accepts the event.
Result: incomplete persistence. The table looks fine, but it’s missing items that never made it through throttling.
Why It Matters
Throttling isn’t a crash — it’s a quiet slow bleed.
Your table doesn’t lose data; your application does.
Without backoff, you waste capacity retrying too fast. Without autoscaling, you never catch up.
In a production system, that means broken metrics, lost transactions, and user-facing inconsistencies.
Key Terms
- ProvisionedThroughputExceededException — The official signal that DynamoDB is throttling requests.
- Exponential Backoff — A retry pattern that doubles the wait time after each failure.
- Jitter — Random variation added to retry delays to prevent simultaneous retries from overwhelming the service.
- Autoscaling — DynamoDB’s ability to automatically adjust read/write capacity based on demand.
- Idempotent Write — A write pattern that ensures retries don’t duplicate data.
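To make backoff and jitter concrete, here is a minimal sketch (plain Python, no AWS calls) of the delay schedule those two terms describe. The base, cap, and jitter bounds are illustrative assumptions, not DynamoDB requirements:

```python
import random

def backoff_delays(base=0.1, cap=5.0, attempts=6):
    """Exponential backoff schedule: each wait doubles (capped), plus random jitter."""
    delays = []
    wait = base
    for _ in range(attempts):
        delays.append(wait + random.uniform(0, 0.1))  # jitter spreads out concurrent retries
        wait = min(wait * 2, cap)                     # double the wait, up to the cap
    return delays

print(backoff_delays())  # roughly 0.1, 0.2, 0.4, 0.8, 1.6, 3.2 -- each nudged by up to 0.1s of jitter
```

Without the jitter term, every throttled client would sleep for identical intervals and retry in lockstep, re-creating the very spike that caused the throttling.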
Steps at a Glance
- Create the baseline pipeline.
- Test the pipeline under normal load.
- Trigger throttling deliberately.
- Observe the throttling errors in CloudWatch.
- Implement exponential backoff with jitter.
- Enable DynamoDB autoscaling.
- Re-test to confirm stability.
- Clean up all resources.
Step 1 – Create the Baseline Pipeline
We’ll create a simple Lambda that writes to a DynamoDB table.
aws dynamodb create-table \
--table-name throttle-demo-table \
--attribute-definitions AttributeName=id,AttributeType=S \
--key-schema AttributeName=id,KeyType=HASH \
--billing-mode PROVISIONED \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
aws iam create-role \
--role-name throttle-demo-role \
--assume-role-policy-document file://trust-policy.json
aws iam put-role-policy \
--role-name throttle-demo-role \
--policy-name throttle-demo-policy \
--policy-document file://inline-policy.json
aws iam attach-role-policy \
--role-name throttle-demo-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
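The trust-policy.json and inline-policy.json files referenced above aren’t shown in the commands; a minimal sketch of each might look like this (in production, scope the Resource to your own region and account rather than copying these placeholders):

trust-policy.json:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

inline-policy.json:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:PutItem", "dynamodb:Scan", "dynamodb:DescribeTable"],
      "Resource": "arn:aws:dynamodb:<region>:<account-id>:table/throttle-demo-table"
    }
  ]
}
```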
Create the Lambda function code and save it as lambda_handler.py:
import hashlib

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('throttle-demo-table')

def lambda_handler(event, context):
    for i in range(100):
        item_id = hashlib.md5(str(i).encode()).hexdigest()
        table.put_item(Item={'id': item_id, 'data': f'record-{i}'})
    print("Batch write complete.")
Deploy it:
zip function.zip lambda_handler.py
aws lambda create-function \
--function-name throttle-demo-func \
--runtime python3.12 \
--role arn:aws:iam::<account-id>:role/throttle-demo-role \
--handler lambda_handler.lambda_handler \
--timeout 60 \
--zip-file fileb://function.zip
The default 3-second timeout would cut off the 100-write loop once throttling and retries slow it down, so we set it to 60 seconds up front. (If create-function fails with an assume-role error, wait a few seconds for IAM propagation and retry.)
✅ Lambda and DynamoDB ready.
Step 2 – Test Under Normal Load
aws lambda invoke \
--function-name throttle-demo-func \
output.json
aws dynamodb scan \
--table-name throttle-demo-table \
--select COUNT
Expected output:
{"Count": 100, "ScannedCount": 100, "ConsumedCapacity": null}
✅ Normal operation confirmed.
Next, overload it.
Step 3 – Trigger Throttling
We’ll drop the write capacity to 1 to force throttling.
aws dynamodb update-table \
--table-name throttle-demo-table \
--provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1
Run the same function again.
aws lambda invoke \
--function-name throttle-demo-func \
output.json
Step 4 – Observe the Throttling Errors
aws logs filter-log-events \
--log-group-name /aws/lambda/throttle-demo-func \
--filter-pattern "ProvisionedThroughputExceededException"
Expected output:
botocore.exceptions.ClientError: An error occurred (ProvisionedThroughputExceededException)
✅ Throttling reproduced.
Next, implement retry logic.
Step 5 – Implement Exponential Backoff with Jitter
import hashlib
import random
import time

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('throttle-demo-table')

def lambda_handler(event, context):
    for i in range(100):
        item_id = hashlib.md5(str(i).encode()).hexdigest()
        backoff = 0.1
        # Retry each item until it lands; the Lambda timeout bounds total retry time.
        while True:
            try:
                table.put_item(Item={'id': item_id, 'data': f'record-{i}'})
                break
            except ClientError as e:
                if e.response['Error']['Code'] == 'ProvisionedThroughputExceededException':
                    print(f"Throttled on {item_id}. Retrying after {backoff}s")
                    time.sleep(backoff + random.uniform(0, 0.1))  # add jitter
                    backoff = min(backoff * 2, 5)  # double the wait, capped at 5s
                else:
                    raise
✅ Backoff with jitter added.
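If you want to verify the retry shape without touching AWS, here is a small offline simulation. FlakyTable and ThrottleError are stand-ins invented for this sketch (a real run raises botocore’s ClientError instead), and the shorter delays just keep the test fast:

```python
import random
import time

class ThrottleError(Exception):
    """Stand-in for ProvisionedThroughputExceededException in this offline sketch."""

class FlakyTable:
    """Fake table that throttles the first N put_item calls, then accepts writes."""
    def __init__(self, throttle_first=2):
        self.calls = 0
        self.throttle_first = throttle_first
        self.items = {}

    def put_item(self, Item):
        self.calls += 1
        if self.calls <= self.throttle_first:
            raise ThrottleError()
        self.items[Item['id']] = Item

def put_with_backoff(table, item, base=0.01, cap=0.1):
    """Same retry shape as the Lambda above: double the wait, add jitter, cap it."""
    backoff = base
    while True:
        try:
            table.put_item(Item=item)
            return
        except ThrottleError:
            time.sleep(backoff + random.uniform(0, 0.01))  # jitter
            backoff = min(backoff * 2, cap)

table = FlakyTable()
put_with_backoff(table, {'id': 'a1', 'data': 'record-0'})
print(table.calls)  # 3: two throttled attempts, then success
```

Because the item id is deterministic (the md5 of the loop index), a retried write overwrites the same item rather than creating a duplicate — that is the idempotent-write pattern from the Key Terms in action.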
Next, scale dynamically.
Step 6 – Enable DynamoDB Autoscaling
aws application-autoscaling register-scalable-target \
--service-namespace dynamodb \
--resource-id table/throttle-demo-table \
--scalable-dimension dynamodb:table:WriteCapacityUnits \
--min-capacity 1 \
--max-capacity 50
aws application-autoscaling put-scaling-policy \
--service-namespace dynamodb \
--resource-id table/throttle-demo-table \
--scalable-dimension dynamodb:table:WriteCapacityUnits \
--policy-name scale-writes \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration file://autoscale-config.json
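The autoscale-config.json file referenced above isn’t shown; a plausible target-tracking configuration might look like this (70% target utilization and 60-second cooldowns are common starting points, not requirements):

```json
{
  "TargetValue": 70.0,
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
  },
  "ScaleInCooldown": 60,
  "ScaleOutCooldown": 60
}
```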
✅ Table now self-scales with load.
Next, confirm it works.
Step 7 – Re-Test to Confirm Stability
aws lambda invoke \
--function-name throttle-demo-func \
output.json
aws dynamodb describe-table \
--table-name throttle-demo-table \
--query 'Table.ProvisionedThroughput'
Expected output (exact numbers vary, and target tracking can take a few minutes to react — re-run the invoke if capacity hasn’t moved yet):
{"ReadCapacityUnits": 1, "WriteCapacityUnits": <value greater than 5>}
✅ Autoscaling engaged, writes stable.
Next, clean up.
Step 8 – Clean Up All Resources
aws lambda delete-function \
--function-name throttle-demo-func
aws dynamodb delete-table \
--table-name throttle-demo-table
Don’t forget the IAM role we created in Step 1:
aws iam detach-role-policy \
--role-name throttle-demo-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
aws iam delete-role-policy \
--role-name throttle-demo-role \
--policy-name throttle-demo-policy
aws iam delete-role \
--role-name throttle-demo-role
✅ All resources deleted.
Pro Tips
• Backoff needs both delay and jitter — randomizing wait times reduces retry collisions.
• Use on-demand mode for unpredictable traffic patterns.
• Monitor ThrottledRequests CloudWatch metric for early detection.
• Keep retry logic lightweight to avoid cascading slowdowns.
Conclusion
Throttling is DynamoDB’s way of saying “breathe.”
It’s not failure — it’s feedback.
Designing with backoff, jitter, and autoscaling turns transient rejection into graceful resilience.
Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.