AWS Lambda: "Exactly Once" — The Myth of Idempotency in Serverless
“Exactly once” is a comforting story that doesn’t survive contact with production.
Problem
You’ve done everything right.
Your Lambda logs look clean. Your retries are configured. You’ve even added idempotent writes.
And yet—every so often—one job runs twice, one record disappears, or one transaction lands out of order.
What happened?
Welcome to the myth of “Exactly Once.”
In distributed systems like AWS Lambda, the comforting idea that every message is processed exactly one time is a fantasy.
The system didn’t fail you—it’s simply doing what it’s designed to do: at-least-once delivery.
Your job is to make that reality safe.
Clarifying the Issue
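AWS Lambda functions can be invoked directly in two ways: synchronously (the caller waits for the result) or asynchronously (the event is queued). Event source mappings, which poll sources such as SQS and Kinesis, apply their own retry rules and are also at-least-once.
The trouble discussed here begins with asynchronous invocation.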
When you invoke a function asynchronously, Lambda queues the event and, by default, retries it up to two more times if the function returns an error or times out. That’s reliability, not error. But it means your function may run three times for one event.
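To see why three runs can mean three side effects, here is a small local simulation. It is a sketch only: deliver_async and flaky_handler are made-up names, and nothing here reflects how Lambda is implemented internally. It assumes the handler sends an email and then times out.

```python
# Local simulation only; not how Lambda works internally.
def deliver_async(handler, event, max_retries=2):
    """Call handler once, then retry up to max_retries times if it raises."""
    attempts = 0
    for _ in range(1 + max_retries):
        attempts += 1
        try:
            handler(event)
            return attempts  # success: stop retrying
        except Exception:
            continue  # failure: the same event is delivered again
    return attempts  # retries exhausted


emails_sent = []  # stands in for an external side effect


def flaky_handler(event):
    emails_sent.append(event["order_id"])  # the side effect happens first...
    raise TimeoutError("simulated timeout after the email went out")


attempts = deliver_async(flaky_handler, {"order_id": "123"})
print(attempts, len(emails_sent))  # prints: 3 3
```

One event, three attempts, three emails. The retry logic did its job; the handler was the part that wasn’t ready for it.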
That’s why “exactly once” is a myth.
True exactly-once delivery requires coordination between the producer and consumer at the transaction layer, typically through a consensus protocol (like Paxos or Raft). These systems trade off speed, scalability, and cost for certainty—something AWS Lambda intentionally avoids.
In practice, you’ll always get at-least-once delivery and occasional duplicate side effects.
The antidote isn’t prevention—it’s idempotency: designing your operations so that duplicate execution produces the same result.
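A toy comparison makes the difference concrete. The function names and the in-memory dict and set are illustrative assumptions; a real system would keep the “already applied” record in a durable store and check it atomically.

```python
def add_credit(balances, user_id, amount):
    """Not idempotent: every duplicate delivery adds the amount again."""
    balances[user_id] = balances.get(user_id, 0) + amount


def apply_credit_once(balances, applied, event_id, user_id, amount):
    """Idempotent: the event ID records that this credit was already applied."""
    if event_id in applied:
        return  # duplicate delivery, nothing to do
    balances[user_id] = balances.get(user_id, 0) + amount
    applied.add(event_id)


naive, safe, applied = {}, {}, set()
for _ in range(2):  # the same event delivered twice
    add_credit(naive, "u1", 10)
    apply_credit_once(safe, applied, "evt-1", "u1", 10)

print(naive["u1"], safe["u1"])  # prints: 20 10
```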
Why It Matters
Assuming “exactly once” leads to dangerous outcomes:
- Duplicate state changes (two inserts, two charges, two emails).
- Hidden inconsistencies when retries land minutes later.
- Silent data corruption that doesn’t trigger alarms.
When you build around “exactly-once,” you’re trusting an imaginary contract.
When you build for “at-least-once,” you’re designing for reality.
Reliability comes from tolerance, not trust.
Key Terms
- Exactly-once delivery: A theoretical guarantee that each message is processed only once—rare in real systems.
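- At-least-once delivery: AWS’s default for asynchronous invocations. An event may be delivered more than once; one that still fails after its final retry is discarded unless you configure a dead-letter queue or an on-failure destination.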
- Idempotency: The property of an operation that can be repeated without changing the result.
- Deduplication Window: The five-minute interval during which an SQS FIFO queue drops messages that repeat a deduplication ID. Standard queues do not deduplicate.
- Effectively-once: A practical state achieved through idempotency, deduplication, and strong observability.
Steps at a Glance
- Accept that “exactly once” is a myth.
- Identify side effects that create external state.
- Enforce idempotency at the operation level.
- Add deduplication and correlation tracking.
- Test with simulated retries.
- Monitor for duplicate signals in production.
Detailed Steps
Step 1: Accept the myth
You can’t eliminate retries. AWS guarantees at-least-once delivery for asynchronous invocation—that means “maybe twice.”
Accepting this truth isn’t defeat—it’s maturity.
Retries are how distributed systems achieve reliability. They only become dangerous when you assume they’ll never happen.
Step 2: Identify your side effects
Map every place your Lambda changes external state:
- DynamoDB writes
- S3 uploads
- Notifications (SNS, SES, Slack)
- Payment API calls
These are the pressure points where retries cause real damage.
Make each one either idempotent or compensating (safe to reverse).
Step 3: Enforce idempotency at the operation level
Your function must check whether it has already processed a given event before acting on it again.
Use unique business identifiers—OrderId, TransactionId, UserId+Timestamp—to anchor every action.
Example (DynamoDB conditional write):
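# Assumes OrderId is the table's partition key
table.put_item(
    Item=record,
    ConditionExpression="attribute_not_exists(OrderId)",
)
If Lambda retries, the condition fails and DynamoDB rejects the duplicate write. The rejection surfaces as a ConditionalCheckFailedException, which the function should catch rather than let it trigger another retry.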
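One way to handle a rejected write is sketched here, assuming table is a boto3 DynamoDB Table resource keyed on OrderId. boto3 raises ConditionalCheckFailedException when the condition fails. The helper name put_order_once is hypothetical, and the exception class is looked up on the table’s own client so the sketch needs no extra imports.

```python
def put_order_once(table, record):
    """Return True if the record was written, False if it already existed.

    Assumes `table` is a boto3 DynamoDB Table resource whose partition key
    is OrderId (an assumption of this sketch, not a requirement of boto3).
    """
    duplicate_error = table.meta.client.exceptions.ConditionalCheckFailedException
    try:
        table.put_item(
            Item=record,
            ConditionExpression="attribute_not_exists(OrderId)",
        )
        return True
    except duplicate_error:
        # The item is already there: an earlier delivery succeeded.
        return False
```

Returning False instead of raising lets the handler finish cleanly on a duplicate, so Lambda has no failure to retry.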
In SQL, you can use INSERT ... ON CONFLICT DO NOTHING (PostgreSQL and SQLite syntax; MySQL offers INSERT IGNORE).
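A runnable sketch of the SQL variant, using Python’s built-in sqlite3 module with an in-memory database. The table layout and the helper name insert_order_once are assumptions for illustration, and the ON CONFLICT clause needs SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, total_cents INTEGER)")


def insert_order_once(conn, order_id, total_cents):
    """Insert the order; return True if a row was added, False on a duplicate."""
    cur = conn.execute(
        "INSERT INTO orders (order_id, total_cents) VALUES (?, ?) "
        "ON CONFLICT (order_id) DO NOTHING",
        (order_id, total_cents),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows changed means the key already existed


first = insert_order_once(conn, "123", 4999)
second = insert_order_once(conn, "123", 4999)  # simulated retry
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first, second, row_count)
```

The primary key does the real work here: the database, not the function, decides whether the order already exists.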
In S3, use consistent object keys (orders/123.json) to overwrite safely.
Step 4: Add deduplication and correlation tracking
Give every event a Correlation ID—a unique marker that follows the request through logs and metrics.
Example:
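import uuid

def handler(event, context):
    # Prefer the ID carried in the event; a generated UUID changes on every retry
    correlation_id = event.get("id") or str(uuid.uuid4())
    print(f"[{correlation_id}] Processing order")
    process_order(event)
Then use that ID in downstream services to recognize repeat deliveries. Only an ID that arrives with the event can do this; the UUID fallback keeps logs traceable but is different on each delivery.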
If you’re using SQS FIFO queues, set a MessageDeduplicationId so SQS suppresses repeats sent within the five-minute deduplication window.
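One way to get a stable deduplication ID is to derive it from the message content, so a producer that retries sends the same ID again. This is a sketch under assumptions: fifo_message_kwargs is a hypothetical helper, the queue URL is made up, and using the order ID as the message group is just one reasonable choice.

```python
import hashlib
import json


def fifo_message_kwargs(queue_url, order):
    """Build send_message arguments with a deterministic deduplication ID."""
    body = json.dumps(order, sort_keys=True)  # stable serialization
    dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()  # 64 hex chars
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        "MessageGroupId": str(order["order_id"]),  # one group per order
        "MessageDeduplicationId": dedup_id,
    }


# Hypothetical queue URL, for illustration only.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"
a = fifo_message_kwargs(QUEUE_URL, {"order_id": 123, "total": 10})
b = fifo_message_kwargs(QUEUE_URL, {"total": 10, "order_id": 123})
print(a["MessageDeduplicationId"] == b["MessageDeduplicationId"])  # prints: True
```

With a boto3 SQS client, the result would be passed as sqs.send_message(**kwargs). If the payload already carries a unique business ID, using that ID directly is simpler than hashing.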
Step 5: Test through chaos
Create chaos intentionally.
Replay the same event multiple times in staging and verify that your Lambda and downstream systems handle it predictably.
If you can safely hit your API twice with the same payload and end up with only one result, you’ve achieved effective idempotency.
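A replay test can be as small as this. The sketch assumes an in-memory FakeOrderStore in place of a real table, and the replay helper and handler are illustrative names; in staging you would aim the same idea at the deployed function.

```python
def replay(handler, event, times=3):
    """Deliver the same event several times, as a burst of retries would."""
    return [handler(event) for _ in range(times)]


class FakeOrderStore:
    """In-memory stand-in for a real table, keyed by order ID."""

    def __init__(self):
        self.orders = {}

    def put_if_absent(self, order_id, order):
        if order_id in self.orders:
            return False
        self.orders[order_id] = order
        return True


store = FakeOrderStore()


def handler(event):
    created = store.put_if_absent(event["order_id"], event)
    return "created" if created else "duplicate"


results = replay(handler, {"order_id": "123", "total": 10})
assert results == ["created", "duplicate", "duplicate"]
assert len(store.orders) == 1  # three deliveries, one order
```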
Step 6: Monitor for duplicates
Use CloudWatch Logs Insights or X-Ray traces to detect repeat correlation IDs:
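# Extract the ID from lines such as "[abc-123] Processing order"
fields @timestamp, @message
| filter @message like "Processing order"
| parse @message "[*] *" as correlation_id, log_text
| stats count(*) as deliveries by correlation_id
| filter deliveries > 1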
This instantly surfaces duplicate invocations that standard metrics won’t show.
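Queries like this get easier if the function logs JSON, because Logs Insights discovers JSON fields on its own and correlation_id becomes directly queryable. A minimal sketch, where log_event is a hypothetical helper and not part of any AWS library:

```python
import json


def log_event(correlation_id, message, **fields):
    """Emit one JSON log line and return it."""
    line = json.dumps(
        {"correlation_id": correlation_id, "message": message, **fields},
        sort_keys=True,
    )
    print(line)  # stdout from a Lambda function ends up in CloudWatch Logs
    return line


line = log_event("order-123", "Processing order", attempt=1)
```

Each log line is then a single JSON object, so grouping by correlation_id needs no parsing step.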
Pro Tip #1: Exactly Once is a Marketing Term
If someone claims their system delivers exactly once, ask for their consensus algorithm and latency budget.
Then ask if they’ve ever run it at scale.
Pro Tip #2: Don’t Eliminate Retries — Embrace Them
Retries are how distributed systems breathe.
Designing for idempotency means you can retry fearlessly—and that’s real reliability.
Conclusion
“Exactly once” is a comforting story that doesn’t survive contact with production.
In AWS Lambda, the reality is at-least-once execution—and your safety comes from how you handle that truth.
Build idempotent operations, use correlation tracking, and verify your systems through chaos.
You’ll never get exactly once, but you can get something better:
Effectively once, every time.
Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.