When DynamoDB Throttles: Restoring Stability Under Load (Lambda → DynamoDB)

Throttling is DynamoDB’s way of saying “breathe.”





You’ve got a smooth-running serverless workflow — until DynamoDB starts rejecting requests faster than you can react.

We’ll build the pipeline, trigger throttling on purpose, surface the failure, and fix it with exponential backoff, jitter, and autoscaling.


Problem

Your Lambda writes records into DynamoDB flawlessly during light traffic. Then a burst hits — maybe a thousand S3 events in a minute — and suddenly, ProvisionedThroughputExceededException appears in your logs.

Some writes succeed, others fail silently. Data drifts.


Clarifying the Issue

DynamoDB protects itself under load by throttling requests that exceed the table’s capacity limits. When that happens:

  • The affected request receives a ProvisionedThroughputExceededException.
  • Lambda logs the error, and the AWS SDK retries a few times on its own, but once those retries are exhausted the exception surfaces and the write is lost unless your code handles it.
  • S3 (or any async invoker) assumes success once Lambda accepts the event.

Result: incomplete persistence. The table looks fine, but it’s missing items that never made it through throttling.


Why It Matters

Throttling isn’t a crash — it’s a quiet, slow bleed.

Your table doesn’t lose data; your application does.

Without backoff, you waste capacity retrying too fast. Without autoscaling, you never catch up.

In a production system, that means broken metrics, lost transactions, and user-facing inconsistencies.


Key Terms

  • ProvisionedThroughputExceededException — The official signal that DynamoDB is throttling requests.
  • Exponential Backoff — A retry pattern that doubles the wait time after each failure.
  • Jitter — Random variation added to retry delays to prevent simultaneous retries from overwhelming the service (combined with backoff in the sketch after this list).
  • Autoscaling — DynamoDB’s ability to automatically adjust read/write capacity based on demand.
  • Idempotent Write — A write pattern that ensures retries don’t duplicate data.
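
To make the backoff and jitter terms concrete, here's a minimal sketch of how the retry delay is typically computed (the base, cap, and jitter values are illustrative, not prescribed):

import random

def retry_delay(attempt, base=0.1, cap=5.0, jitter=0.1):
    # Double the delay for each failed attempt, but never exceed the cap.
    delay = min(base * (2 ** attempt), cap)
    # Add a random offset so concurrent clients don't retry in lockstep.
    return delay + random.uniform(0, jitter)

# attempt 0 -> ~0.1s, attempt 1 -> ~0.2s, attempt 2 -> ~0.4s, ...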

Steps at a Glance

  1. Create the baseline pipeline.
  2. Test the pipeline under normal load.
  3. Trigger throttling deliberately.
  4. Observe the throttling errors in CloudWatch.
  5. Implement exponential backoff with jitter.
  6. Enable DynamoDB autoscaling.
  7. Re-test to confirm stability.
  8. Clean up all resources.

Step 1 – Create the Baseline Pipeline

We’ll create a simple Lambda that writes to a DynamoDB table.

aws dynamodb create-table \
  --table-name throttle-demo-table \
  --attribute-definitions AttributeName=id,AttributeType=S \
  --key-schema AttributeName=id,KeyType=HASH \
  --billing-mode PROVISIONED \
  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
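
Table creation takes a few seconds. If you're scripting these steps, you can block until the table is ACTIVE:

aws dynamodb wait table-exists \
  --table-name throttle-demo-table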

aws iam create-role \
  --role-name throttle-demo-role \
  --assume-role-policy-document file://trust-policy.json

aws iam put-role-policy \
  --role-name throttle-demo-role \
  --policy-name throttle-demo-policy \
  --policy-document file://inline-policy.json
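
For reference, the two policy files might look like this. trust-policy.json lets Lambda assume the role; inline-policy.json grants write access to the table (the region and account ID in the ARN are placeholders):

trust-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "lambda.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}

inline-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "dynamodb:PutItem",
    "Resource": "arn:aws:dynamodb:<region>:<account-id>:table/throttle-demo-table"
  }]
}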

aws iam attach-role-policy \
  --role-name throttle-demo-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

Create the Lambda function code and save it as lambda_handler.py:

import boto3, hashlib

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('throttle-demo-table')

def lambda_handler(event, context):
    # Write 100 items with deterministic IDs so re-runs overwrite
    # the same records instead of creating duplicates.
    for i in range(100):
        item_id = hashlib.md5(str(i).encode()).hexdigest()
        table.put_item(Item={'id': item_id, 'data': f'record-{i}'})
    print("Batch write complete.")

Package and deploy it. The 60-second timeout gives the retry logic we add in Step 5 room to sleep and retry:

zip function.zip lambda_handler.py
aws lambda create-function \
  --function-name throttle-demo-func \
  --runtime python3.9 \
  --role arn:aws:iam::<account-id>:role/throttle-demo-role \
  --handler lambda_handler.lambda_handler \
  --timeout 60 \
  --zip-file fileb://function.zip

✅ Lambda and DynamoDB ready.


Step 2 – Test Under Normal Load

Invoke the function once, then count the items in the table:

aws lambda invoke \
  --function-name throttle-demo-func \
  output.json

aws dynamodb scan \
  --table-name throttle-demo-table \
  --select COUNT

Expected output (abridged):

{
    "Count": 100,
    "ScannedCount": 100
}

✅ Normal operation confirmed.
Next, overload it.


Step 3 – Trigger Throttling

We’ll drop the write capacity to 1 to force throttling.

aws dynamodb update-table \
  --table-name throttle-demo-table \
  --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1

Run the same function again. With only 1 WCU, the burst of 100 writes will far exceed what the table can absorb.

aws lambda invoke \
  --function-name throttle-demo-func \
  output.json

Step 4 – Observe the Throttling Errors

Search the function's logs for the throttling exception:

aws logs filter-log-events \
  --log-group-name /aws/lambda/throttle-demo-func \
  --filter-pattern "ProvisionedThroughputExceededException"

Expected output (an excerpt from the matched log events):

botocore.exceptions.ClientError: An error occurred (ProvisionedThroughputExceededException)

✅ Throttling reproduced.
Next, implement retry logic.


Step 5 – Implement Exponential Backoff with Jitter

Update lambda_handler.py to catch the throttling error and retry with a growing, randomized delay:

import boto3, hashlib, random, time
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('throttle-demo-table')

def lambda_handler(event, context):
    for i in range(100):
        item_id = hashlib.md5(str(i).encode()).hexdigest()
        backoff = 0.1  # initial delay in seconds
        while True:
            try:
                table.put_item(Item={'id': item_id, 'data': f'record-{i}'})
                break
            except ClientError as e:
                if e.response['Error']['Code'] == 'ProvisionedThroughputExceededException':
                    print(f"Throttled on {item_id}. Retrying after {backoff}s")
                    # Sleep for the current backoff plus jitter, then double
                    # the delay, capped at 5 seconds. The deterministic ID
                    # makes each retry idempotent.
                    time.sleep(backoff + random.uniform(0, 0.1))
                    backoff = min(backoff * 2, 5)
                else:
                    raise
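
Redeploy the updated code so the new logic takes effect:

zip function.zip lambda_handler.py
aws lambda update-function-code \
  --function-name throttle-demo-func \
  --zip-file fileb://function.zip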

✅ Backoff with jitter added.
Next, scale dynamically.


Step 6 – Enable DynamoDB Autoscaling

Register the table's write capacity as a scalable target, then attach a target-tracking policy:

aws application-autoscaling register-scalable-target \
  --service-namespace dynamodb \
  --resource-id table/throttle-demo-table \
  --scalable-dimension dynamodb:table:WriteCapacityUnits \
  --min-capacity 1 \
  --max-capacity 50

aws application-autoscaling put-scaling-policy \
  --service-namespace dynamodb \
  --resource-id table/throttle-demo-table \
  --scalable-dimension dynamodb:table:WriteCapacityUnits \
  --policy-name scale-writes \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration file://autoscale-config.json
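
For reference, autoscale-config.json might look like this: a target-tracking configuration that aims to keep write utilization near 70 percent (a common starting point; tune the target and cooldowns for your workload):

{
  "TargetValue": 70.0,
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
  },
  "ScaleOutCooldown": 60,
  "ScaleInCooldown": 60
}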

✅ Table now self-scales with load.
Next, confirm it works.


Step 7 – Re-Test to Confirm Stability

Invoke the function again, then check whether write capacity has scaled:

aws lambda invoke \
  --function-name throttle-demo-func \
  output.json
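
Target tracking reacts to sustained consumption over several minutes, so a single invocation may not trigger a scale-out. A quick shell loop keeps pressure on the table:

for i in $(seq 1 10); do
  aws lambda invoke \
    --function-name throttle-demo-func \
    output.json
done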

aws dynamodb describe-table \
  --table-name throttle-demo-table \
  --query 'Table.ProvisionedThroughput'

Expected output once autoscaling has reacted (the exact value varies; the point is that WriteCapacityUnits has climbed above the original 5):

{
    "NumberOfDecreasesToday": 1,
    "ReadCapacityUnits": 1,
    "WriteCapacityUnits": 15
}

✅ Autoscaling engaged, writes stable.
Next, clean up.


Step 8 – Clean Up All Resources

Delete everything we created: the autoscaling registration, the function, the table, and the IAM role.

aws application-autoscaling deregister-scalable-target \
  --service-namespace dynamodb \
  --resource-id table/throttle-demo-table \
  --scalable-dimension dynamodb:table:WriteCapacityUnits

aws lambda delete-function \
  --function-name throttle-demo-func

aws dynamodb delete-table \
  --table-name throttle-demo-table

aws iam delete-role-policy \
  --role-name throttle-demo-role \
  --policy-name throttle-demo-policy

aws iam detach-role-policy \
  --role-name throttle-demo-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

aws iam delete-role \
  --role-name throttle-demo-role

✅ All resources deleted.


Pro Tips

• Backoff needs both delay and jitter — randomizing wait times reduces retry collisions.
• Use on-demand mode for unpredictable traffic patterns.
• Monitor the ThrottledRequests CloudWatch metric for early detection (see the query after this list).
• Keep retry logic lightweight to avoid cascading slowdowns.
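
A quick way to check the throttling metric from the CLI (the timestamps are placeholders; adjust them to your test window):

aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ThrottledRequests \
  --dimensions Name=TableName,Value=throttle-demo-table Name=Operation,Value=PutItem \
  --start-time 2025-01-01T00:00:00Z \
  --end-time 2025-01-01T01:00:00Z \
  --period 60 \
  --statistics Sum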


Conclusion

Throttling is DynamoDB’s way of saying “breathe.”

It’s not failure — it’s feedback.

Designing with backoff, jitter, and autoscaling turns transient rejection into graceful resilience.


Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.
