Amazon DynamoDB: Fixing the ItemCollectionSizeLimitExceededException - The 10 GB Index Trap
Understand why item collections in DynamoDB tables with local secondary indexes hit a hard 10 GB limit — and how to redesign your access patterns to avoid hitting the wall.
Problem
You run a seemingly harmless write operation, only to be met with this intimidating error:
ItemCollectionSizeLimitExceededException: Item collection size limit exceeded for partition key.
It shows up only on tables that have Local Secondary Indexes (LSIs) and leaves developers scratching their heads. You didn’t exceed your table size, so why is DynamoDB rejecting writes?
The answer: you’ve hit the 10 GB item collection limit — one of DynamoDB’s most rigid architectural constraints.
Clarifying the Issue
Every partition key in a table defines a logical group of related items called an item collection. When a table has an LSI, DynamoDB stores the index entries in the same partition as the base table items. But there’s a catch: on a table with one or more LSIs, the total size of all items sharing a single partition key (base table items plus every LSI projection) cannot exceed 10 GB.
This limit applies only to tables that have Local Secondary Indexes. Global Secondary Indexes are exempt because a GSI maintains its own partition space, though a low-cardinality GSI partition key still concentrates traffic and invites throttling.
This means that if one partition key (for example, Customer#1234) grows too large, DynamoDB blocks further writes that would expand its item collection beyond this limit.
Common Scenarios:
- A popular customer, tenant, or account accumulates millions of related records.
- An index key (like status or type) is too broad, causing excessive co-location.
- Write-heavy applications rely on a small number of “hot” partitions.
The problem isn’t with your provisioned throughput — it’s with data distribution and index design.
Why It Matters
- Architectural inflexibility: Once you hit the 10 GB ceiling, no amount of capacity scaling can help.
- Data skew: Overloaded partitions lead to unbalanced workloads and throttling.
- Unexpected write failures: This exception usually appears in production, not testing, because test data rarely reaches the scale where the 10 GB limit is breached.
- Operational drag: Fixing the issue requires refactoring table design and data access patterns, not a simple configuration tweak.
The 10 GB limit is effectively DynamoDB’s way of saying, “You’ve modeled your data too narrowly — it’s time to shard your keys.”
Key Terms
- Item Collection — The group of all items sharing the same partition key, across the base table and any local secondary indexes.
- GSI (Global Secondary Index) — An index that enables queries on attributes other than the primary key.
- LSI (Local Secondary Index) — An index that shares the same partition key as the base table but with a different sort key.
- Partition Key Sharding — The practice of appending a random or deterministic suffix to distribute data across multiple partitions.
Steps at a Glance
- Identify which partition key triggered the error.
- Analyze item collection size with the AWS CLI.
- Evaluate index design for data skew.
- Redesign partition key schema to distribute load.
- Backfill or migrate data using the new sharding strategy.
Detailed Steps
Step 1: Identify which partition key triggered the error.
Examine your application logs or CloudWatch Logs for the failed request. The item you attempted to write identifies the offending partition key; for batch operations, inspect each failed request to pin down which item collection exceeded the limit.
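If your stack is Python, here’s a hedged sketch of catching the exception at write time; the table name, key attributes, and sample item are illustrative assumptions:

import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb').Table('YourTable')  # assumed table name
item = {'PK': 'Customer#1234', 'SK': 'Order#9876', 'total': 42}

try:
    # ReturnItemCollectionMetrics only returns data on tables with an LSI
    resp = table.put_item(Item=item, ReturnItemCollectionMetrics='SIZE')
    metrics = resp.get('ItemCollectionMetrics')
    if metrics:
        # SizeEstimateRangeGB is a [lower, upper] estimate for the collection
        print('Collection size estimate (GB):', metrics['SizeEstimateRangeGB'])
except ClientError as err:
    if err.response['Error']['Code'] == 'ItemCollectionSizeLimitExceededException':
        # The item we tried to write names the offending partition key
        print('Item collection full for PK:', item['PK'])
    else:
        raise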
Step 2: Analyze item collection size with the AWS CLI.
Use a Query to count items under the suspect partition key (a Query on the partition key is far cheaper than a filtered Scan, which reads the entire table):
aws dynamodb query \
  --table-name YourTable \
  --key-condition-expression "PK = :pk" \
  --expression-attribute-values '{":pk": {"S": "Customer#1234"}}' \
  --select COUNT
You can also check TableSizeBytes and ItemCount from aws dynamodb describe-table to gauge overall growth; note that these values refresh roughly every six hours.
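To approximate the collection’s size rather than just count it, you can paginate a Query and sum serialized item lengths. This is a rough client-side estimate only, and the attribute names are assumptions:

import json
import boto3

client = boto3.client('dynamodb')
paginator = client.get_paginator('query')

total_bytes = 0
item_count = 0
for page in paginator.paginate(
    TableName='YourTable',
    KeyConditionExpression='PK = :pk',
    ExpressionAttributeValues={':pk': {'S': 'Customer#1234'}},
):
    for item in page['Items']:
        # Serialized length is a crude proxy for DynamoDB's storage size
        total_bytes += len(json.dumps(item, default=str))
        item_count += 1

print(f'{item_count} items, roughly {total_bytes / 1024**3:.2f} GB')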
Step 3: Evaluate index design for data skew.
Review which attributes are indexed. For the 10 GB limit, the culprit is the base table’s partition key, since every LSI entry counts against the same collection. More broadly, if any index partition key maps to millions of items, you’re overloading a single logical partition. This often happens when indexing by a broad attribute like status = active or region = us-east-1.
Step 4: Redesign partition key schema to distribute load.
Consider introducing synthetic sharding or bucketing.
Example:
Old Key: Customer#1234
New Key: Customer#1234#A
New Key: Customer#1234#B
New Key: Customer#1234#C
This evenly spreads items across multiple partitions, keeping each below 10 GB.
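If your application is in Python, a minimal sketch of deterministic shard assignment might look like this; numeric suffixes stand in for the #A/#B/#C style above, and the sharded_pk helper, shard count, and order-ID hash input are all illustrative assumptions:

import hashlib

NUM_SHARDS = 8  # assumption: pick a count that keeps each shard well under 10 GB

def sharded_pk(base_key: str, shard_on: str) -> str:
    # Hash a stable per-item attribute so the same item always
    # maps to the same shard suffix
    digest = hashlib.md5(shard_on.encode('utf-8')).hexdigest()
    return f'{base_key}#{int(digest, 16) % NUM_SHARDS:02d}'

# sharded_pk('Customer#1234', 'Order#9876') -> e.g. 'Customer#1234#05'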
Note: Sharding increases write scalability but can complicate reads. Your application must now query across all shard suffixes (e.g., Customer#1234#A, #B, #C) or implement an aggregation layer. Avoid falling back to Scan operations unless absolutely necessary.
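Here’s a hedged sketch of that fan-out read in Python, reusing the shard scheme from the sketch above (pagination via LastEvaluatedKey is elided for brevity):

import boto3
from boto3.dynamodb.conditions import Key

NUM_SHARDS = 8  # must match the write-side shard count
table = boto3.resource('dynamodb').Table('YourTable')

def query_all_shards(base_key: str) -> list:
    items = []
    for shard in range(NUM_SHARDS):
        # One targeted Query per shard beats a full-table Scan
        resp = table.query(
            KeyConditionExpression=Key('PK').eq(f'{base_key}#{shard:02d}')
        )
        items.extend(resp['Items'])
    return items

orders = query_all_shards('Customer#1234')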
Step 5: Backfill or migrate data using the new sharding strategy.
Implement a one-time migration job to copy or rewrite items into their new partition keys. Test with small batches before full migration. Then, update your application logic to use the new key structure going forward.
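A hedged sketch of such a migration job, assuming the sharded_pk helper from Step 4 and the PK/SK attribute names; it copies rather than moves, so old items stay in place until you verify the cutover:

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource('dynamodb').Table('YourTable')
query_kwargs = {'KeyConditionExpression': Key('PK').eq('Customer#1234')}
start_key = None

with table.batch_writer() as batch:
    while True:
        if start_key:
            query_kwargs['ExclusiveStartKey'] = start_key
        page = table.query(**query_kwargs)
        for item in page['Items']:
            # Rewrite under the new sharded key; delete originals later
            item['PK'] = sharded_pk(item['PK'], item['SK'])
            batch.put_item(Item=item)
        start_key = page.get('LastEvaluatedKey')
        if start_key is None:
            break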
ProTip #1: Watch for Silent Growth
Monitor partition sizes proactively. DynamoDB doesn’t publish per-partition-key metrics out of the box, so emit your own (for example, the size estimate that ReturnItemCollectionMetrics returns on writes) and set CloudWatch alarms that trigger when item count or storage bytes grow abnormally for a single partition key.
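One hedged approach in Python: publish the write-time size estimate as a custom CloudWatch metric and alarm on it. The namespace, metric, and dimension names below are assumptions:

import boto3

cloudwatch = boto3.client('cloudwatch')

def publish_collection_size(pk: str, size_estimate_gb: float) -> None:
    # Emit the upper bound of SizeEstimateRangeGB after each write
    cloudwatch.put_metric_data(
        Namespace='Custom/DynamoDB',
        MetricData=[{
            'MetricName': 'ItemCollectionSizeGB',
            'Dimensions': [{'Name': 'PartitionKey', 'Value': pk}],
            'Value': size_estimate_gb,
        }],
    )

# An alarm on ItemCollectionSizeGB above, say, 8 GB leaves headroom
# before the hard 10 GB ceiling.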
ProTip #2: Design for Cardinality Early
When designing indexes, favor high-cardinality attributes (like order_id, session_id, or timestamp) to ensure wide distribution. Low-cardinality fields invite partition bloat.
Conclusion
The ItemCollectionSizeLimitExceededException isn’t just a technical error — it’s a design signal. DynamoDB is warning that your data model has become too centralized. By embracing partition sharding and high-cardinality design early, you prevent this constraint from ever becoming a blocker.
Even before the 10 GB ceiling is reached, adaptive capacity can shift throughput toward hot partitions to absorb temporary skew, but persistent hot keys will still cause throttling and inconsistent performance.
ProTip: Think of DynamoDB’s 10 GB limit as a design guardrail, not a bug. It’s teaching you to distribute your data intelligently and build for scale from the start.
Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.