ThrottlingException During Rekognition Batch Processing
How to detect, monitor, and fix Rekognition rate limits before production fails.
#AWS #AmazonRekognition #CloudArchitecture #DevOps
Category: Service Quotas & Throttling
Problem
Your batch image job calls Amazon Rekognition repeatedly.
The first set of requests succeeds.
Then the workload begins failing with:
ThrottlingException: Rate exceeded
Permissions are correct. Credentials are valid. No configuration changes occurred.
The job slows down or collapses entirely.
Clarifying the Issue
ThrottlingException means you have exceeded a service quota or request rate limit.
Rekognition enforces limits on:
- Transactions per second (TPS)
- Concurrent video jobs
- Certain API categories
When your request rate exceeds the allowed threshold, Rekognition responds with throttling.
This is not a permissions failure.
It is capacity protection.
In AWS services:
📌 Throttling protects shared infrastructure from burst overload.
📌 If you send requests too quickly, they will be rejected.
Why It Matters
This failure commonly appears in:
- Large image ingestion pipelines
- Parallelized batch jobs
- Serverless fan-out architectures
- High-concurrency container workloads
Developers often test with small datasets and see no issue.
Production volume exposes the limit.
Without a proper retry strategy, throttling can:
- Cause cascading failures
- Trigger retry storms
- Inflate costs
- Extend processing times dramatically
This is a quota-boundary issue — not a Rekognition issue.
Key Terms
- ThrottlingException – AWS rejected the request due to exceeding allowed request rate.
- TPS (Transactions Per Second) – Number of API calls permitted per second.
- Service Quota – AWS-defined maximum usage limits per account per region.
- Exponential Backoff – Retry strategy that increases delay between attempts.
- Jitter – Randomized delay added to backoff to prevent synchronized retries.
Steps at a Glance
- Confirm the exact API being throttled.
- Measure current request rate (TPS).
- Check Rekognition service quotas and CloudWatch metrics.
- Implement exponential backoff with jitter.
- Reduce parallelism or batch size.
- Request a quota increase if necessary.
Detailed Steps
Step 1: Identify the Throttled API
Inspect logs to determine which Rekognition API call is failing:
- DetectLabels
- DetectFaces
- StartLabelDetection
- Other endpoints
Different APIs may have different rate limits.
Focus on the exact operation generating the error.
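One way to find the failing operation is to tally throttling errors by operation name as they are caught. The sketch below is a minimal illustration, not a definitive implementation. It assumes the exceptions look like botocore's ClientError, which exposes the error code at response["Error"]["Code"] and the operation at operation_name. The helper names throttled_operation and count_throttles are hypothetical, and the second error code in the set is an assumption about what else you may want to count.

```python
from collections import Counter

# Error codes treated as throttling here. "ThrottlingException" is the code
# quoted in this article; the second code is an assumption you can drop.
THROTTLE_CODES = {"ThrottlingException", "ProvisionedThroughputExceededException"}


def throttled_operation(err):
    """Return the operation name if err looks like a throttling error, else None."""
    response = getattr(err, "response", None) or {}
    code = response.get("Error", {}).get("Code")
    if code in THROTTLE_CODES:
        return getattr(err, "operation_name", "unknown")
    return None


def count_throttles(errors):
    """Tally throttling errors per operation from an iterable of exceptions."""
    counts = Counter()
    for err in errors:
        operation = throttled_operation(err)
        if operation is not None:
            counts[operation] += 1
    return counts
```

Feeding every caught exception from a batch run into count_throttles shows at a glance whether one operation accounts for most of the rejections.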
Step 2: Measure Your Request Rate
Instrument your application to calculate:
- Requests per second
- Number of concurrent workers
- Burst patterns
Many throttling issues occur not from sustained load, but from sudden spikes.
Example:
- 200 images dispatched simultaneously
- All workers call Rekognition at once
- Instant throttle
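The gap between average and peak rate can be made visible with a small measurement helper. This is a rough sketch under the assumption that your application records a timestamp (seconds since the epoch) for each Rekognition call; the function names per_second_counts and summarize_rate are hypothetical.

```python
from collections import Counter


def per_second_counts(timestamps):
    """Bucket request timestamps (seconds since epoch) into whole seconds."""
    return Counter(int(ts) for ts in timestamps)


def summarize_rate(timestamps):
    """Return (average TPS over the active span, peak TPS in any one second)."""
    if not timestamps:
        return 0.0, 0
    buckets = per_second_counts(timestamps)
    # Span covers every second from the first request to the last, inclusive.
    span = max(buckets) - min(buckets) + 1
    average = len(timestamps) / span
    peak = max(buckets.values())
    return average, peak
```

A job that fires 200 calls in its first second and then trickles along can report a modest average while its peak is far above the quota, which is the pattern described in the example above.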
Step 3: Review Service Quotas and Metrics
Navigate to:
AWS Console → Service Quotas → Rekognition
Identify:
- Default TPS limits
- Concurrent job limits
- Region-specific constraints
Also review CloudWatch metrics for Rekognition, specifically throttling-related metrics such as ThrottledCount, to confirm real-time rate limiting and identify spike patterns.
Confirm whether your measured request rate exceeds the documented quota.
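The CloudWatch check can also be scripted. The sketch below only builds the query parameters, so it runs without AWS access. It assumes the metric lives in the AWS/Rekognition namespace with an Operation dimension; verify both against your account before relying on them. The function name throttled_count_query is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def throttled_count_query(operation, hours=1, period_seconds=60, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Rekognition",      # assumed namespace
        "MetricName": "ThrottledCount",
        "Dimensions": [{"Name": "Operation", "Value": operation}],  # assumed dimension
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Sum"],
    }

# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   result = cloudwatch.get_metric_statistics(**throttled_count_query("DetectLabels"))
#   for point in sorted(result["Datapoints"], key=lambda p: p["Timestamp"]):
#       print(point["Timestamp"], point["Sum"])
```

Per-minute sums make spike patterns easy to spot: a few minutes with large values and many with none points to bursts rather than sustained overload.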
Step 4: Implement Exponential Backoff with Jitter
When throttled, do not retry immediately.
Use exponential backoff:
delay = base * (2 ^ retry_count)
Add jitter:
delay = random_between(0, delay)
This prevents synchronized retry storms.
Most AWS SDKs include built-in retry mechanisms, but the default retry count is often low (commonly 3 attempts). For high-volume Rekognition workloads, you may need to explicitly increase the maximum retry attempts in your client configuration.
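The backoff and jitter formulas can be sketched as a small retry wrapper. This is a minimal illustration, assuming you pass in your own Rekognition call as fn and your own is_throttled check; the names backoff_delay and call_with_backoff, the 0.5 second base, the 20 second cap, and the 8 attempt limit are all illustrative choices rather than recommended values.

```python
import random
import time


def backoff_delay(retry_count, base=0.5, cap=20.0, rng=random):
    """Full-jitter delay: a random value in [0, min(cap, base * 2**retry_count)]."""
    ceiling = min(cap, base * (2 ** retry_count))
    return rng.uniform(0, ceiling)


def call_with_backoff(fn, is_throttled, max_attempts=8, sleep=time.sleep):
    """Call fn(), retrying with jittered backoff while is_throttled(exc) is true."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Re-raise anything that is not throttling, or when retries run out.
            if last_attempt or not is_throttled(exc):
                raise
            sleep(backoff_delay(attempt))
```

If you use boto3, you may not need a hand-written loop at all: the client accepts botocore.config.Config(retries={"max_attempts": 10, "mode": "adaptive"}), which raises the attempt limit and enables the SDK's own backoff. The value 10 is only an example.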
Step 5: Reduce Parallelism
If you control concurrency:
- Lower worker count
- Throttle the internal job queue
- Process images in smaller batches
Controlled throughput is more efficient than uncontrolled retry loops.
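Capping the worker count is often the simplest of these controls. The sketch below assumes a thread-based batch job and a caller-supplied analyze function that wraps the Rekognition call; process_images and the default of 4 workers are hypothetical and should be sized against your measured quota.

```python
from concurrent.futures import ThreadPoolExecutor


def process_images(image_keys, analyze, max_workers=4):
    """Run analyze(key) for each key with at most max_workers calls in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and never runs more than
        # max_workers calls at the same time.
        return list(pool.map(analyze, image_keys))
```

With a fixed pool, 200 queued images are worked through a few at a time instead of all hitting the API in the same instant.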
Step 6: Request Quota Increase
If the workload legitimately requires higher throughput:
- Submit a quota increase request via Service Quotas
- Provide expected TPS requirements
- Confirm region alignment
AWS often approves increases for production use cases.
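Before filing the request, it helps to put a number on the TPS you actually need. The following is a back-of-the-envelope sketch, assuming a batch with a known image count and a target completion window; the function name required_tps and the 20 percent default headroom are illustrative assumptions.

```python
import math


def required_tps(image_count, window_seconds, calls_per_image=1, headroom=1.2):
    """Estimate the sustained TPS needed to finish a batch inside a time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    sustained = image_count * calls_per_image / window_seconds
    # Round up so the requested quota is never below the estimate.
    return math.ceil(sustained * headroom)
```

For example, 100,000 images with two API calls each, finished within one hour and with headroom set to 1.0, works out to 200,000 / 3,600 ≈ 55.6, so the function returns 56.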
Pro Tips
- Throttling is often burst-driven, not average-driven.
- Retry storms amplify throttling if backoff is not implemented.
- Serverless fan-out architectures can overwhelm limits instantly.
- Monitor CloudWatch ThrottledCount metrics and create alarms before production batches begin failing.
- Most SDK default retry counts are low; tune them intentionally.
- Always implement retry with jitter — not fixed delays.
If permissions are correct but you see ThrottlingException, look at request velocity.
Conclusion
ThrottlingException during Rekognition batch processing indicates you exceeded service limits.
Rekognition is protecting itself from overload.
The solution is not more permissions.
It is controlled throughput:
- Measure
- Back off
- Reduce concurrency
- Or increase quota
📌 This is a quota-boundary issue — not a Rekognition issue.
Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.

