The Secret Life of AWS: The Idempotency Key (Amazon DynamoDB)

 


How to protect your event-driven architecture from duplicate processing and retries

#AWS #DynamoDB #Idempotency #EventDriven




Margaret is a senior software engineer. Timothy is her junior colleague. They work in a grand Victorian library in London — the kind of place where code quality is the unspoken objective, and craftsmanship is the only thing that matters.

Episode 61

Timothy was staring at a severe customer support ticket. He looked up, his face pale, as Margaret walked into the studio.

"A customer clicked 'Buy' exactly once," Timothy said, pointing at the billing dashboard. "But their credit card was charged twice. I checked the EventBridge logs. The checkout service correctly emitted a single OrderPlaced event. But for some reason, the downstream payment Lambda function executed twice."

"Welcome to the reality of distributed systems," Margaret said, pulling up a chair. "Now that we have decoupled our architecture using an event bus, we must design for its side effects. Amazon EventBridge, SQS, and SNS are designed for extreme high availability. To guarantee that a message is never lost during a momentary network blip, these services use an at-least-once delivery model."

"So they intentionally send duplicates?" Timothy asked.

"Occasionally, yes," Margaret nodded. "While services like SQS FIFO queues offer exactly-once processing, they come with strict throughput limitations. For high-volume event buses, at-least-once is the standard. That means 99.9% of the time, your event is delivered exactly once. But during a network retry, a duplicate will occur. You cannot stop duplicate events from arriving. Instead, you must design your downstream services to be idempotent. An operation is idempotent if executing it multiple times produces the exact same result as executing it once."

The Idempotency Table

Margaret opened the AWS Console and navigated to Amazon DynamoDB.

"To make your payment function idempotent, it needs a memory," she explained. "Before it charges a credit card, it must check to see if it has already processed this specific order. We will create a dedicated DynamoDB table called PaymentIdempotency."

"So the Lambda function does a read query to check if the Order ID exists, and if it doesn't, it writes the Order ID to the table and charges the card?" Timothy asked.

"That is a logical guess, but it introduces a severe race condition," Margaret corrected. "If two duplicate events arrive at the exact same millisecond, both instances of the Lambda function will query the table, both will see the Order ID is missing, and both will simultaneously charge the card. We need an atomic operation. We are going to use a Conditional Write."

Conditional Writes and TTL

Margaret updated Timothy's Node.js payment service. She added a DynamoDB PutItem command right at the beginning of the handler.

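const { DynamoDBClient, PutItemCommand, DeleteItemCommand } = require("@aws-sdk/client-dynamodb");
const client = new DynamoDBClient({ region: "us-east-1" });

const TABLE_NAME = "PaymentIdempotency";
const TTL_SECONDS = 30 * 24 * 60 * 60; // 30 days

// chargeCreditCard is the service's existing payment call (defined elsewhere).
async function processPayment(event) {
    // If a natural unique ID doesn't exist, the publisher must generate a UUID
    const idempotencyKey = event.detail.orderId;

    const command = new PutItemCommand({
        TableName: TABLE_NAME,
        Item: {
            IdempotencyKey: { S: idempotencyKey },
            // Expiry as Unix epoch seconds. "ExpiresAt" is an assumed attribute
            // name; it must match the TTL attribute enabled on the table.
            ExpiresAt: { N: String(Math.floor(Date.now() / 1000) + TTL_SECONDS) }
        },
        ConditionExpression: "attribute_not_exists(IdempotencyKey)"
    });

    try {
        // Attempt to atomically claim the idempotency key
        await client.send(command);
    } catch (error) {
        if (error.name === "ConditionalCheckFailedException") {
            // The key already exists. This is a duplicate event.
            console.log(`Duplicate event detected for Order ${idempotencyKey}. Ignoring.`);
            return { status: "Already Processed" };
        }
        throw error; // Any other failure: let the event source retry
    }

    try {
        // We are the first execution. Charge the card.
        await chargeCreditCard(idempotencyKey);
        return { status: "Success" };
    } catch (error) {
        // Release the claim so that a retry is not mistaken for a duplicate.
        await client.send(new DeleteItemCommand({
            TableName: TABLE_NAME,
            Key: { IdempotencyKey: { S: idempotencyKey } }
        }));
        throw error;
    }
}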

"Notice the ConditionExpression," Margaret pointed out. "We are telling DynamoDB to insert the key into the table only if it does not already exist. DynamoDB handles this operation atomically at the database level. If two duplicate events hit the database at the exact same microsecond, DynamoDB guarantees that only one will succeed."

"And the one that fails throws a ConditionalCheckFailedException," Timothy observed, reading the catch block. "My code catches that specific error, knows it is a duplicate event, and safely ignores it."

"Exactly," Margaret smiled. "You return a success message back to EventBridge so it stops retrying the event, but you skip the credit card charge."

"Will this table eventually hold billions of records and cost a fortune?" Timothy asked.

"Not if we configure a Time to Live (TTL)," Margaret replied. "We don't need to remember every order forever. DynamoDB will automatically delete the idempotency records after 30 days, keeping the table small and the storage costs near zero."

Timothy updated his architecture diagram. His event-driven system was no longer just fast and decoupled; it was safe against duplicate deliveries.


Key Concepts Introduced:

At-Least-Once Delivery: A foundational concept in distributed computing. Message brokers and event buses prioritize ensuring a message is never lost. To achieve this, they guarantee a message will be delivered at least once, meaning duplicate deliveries will occasionally occur due to network timeouts or internal retries. (Exactly-once options like SQS FIFO queues exist, but they often introduce throughput constraints.)

Idempotency & Keys: A property of certain operations in computer science where applying the operation multiple times has the same effect as applying it exactly once. In cloud architecture, designing idempotent microservices is a strict requirement to prevent duplicate processing. Downstream consumers use a unique Idempotency Key (like an Order ID or a publisher-generated UUID) to track whether they have already successfully processed that specific payload.

DynamoDB Conditional Writes: An atomic database operation that allows you to specify a condition that must evaluate to true in order for a write operation to succeed. By using the attribute_not_exists() condition, architects can safely claim an idempotency key. If the key already exists, DynamoDB rejects the write with a ConditionalCheckFailedException, safely identifying a duplicate event without introducing race conditions.

Time to Live (TTL): A DynamoDB feature that lets you store a per-item expiry timestamp (Unix epoch time in seconds) in a designated attribute. DynamoDB deletes expired items in the background, typically within a few days of expiry, without consuming write capacity. This is essential for keeping temporary storage (like an idempotency table) cost-effective and performant.
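One detail from the Idempotency & Keys entry is worth a sketch: when there is no natural ID, the key has to be generated by the publisher, once, at the moment the event is created. A key generated inside the consumer would be different on every delivery and would never match. The helper below is hypothetical and uses Node's built-in `crypto.randomUUID()`.

```javascript
const { randomUUID } = require("crypto");

// Hypothetical publisher-side helper: the key is fixed when the event is
// created and travels inside the event detail, so every redelivery of the
// same event carries the same key.
function buildOrderPlacedDetail(order) {
    return {
        ...order,
        // Prefer the natural ID; fall back to a fresh UUID when there is none.
        idempotencyKey: order.orderId ?? randomUUID(),
    };
}
```

The consumer then reads `event.detail.idempotencyKey` instead of generating anything itself.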


Aaron Rose is a software engineer and technology writer at tech-reader.blog
