The Secret Life of AWS: The Memory Layer (Amazon ElastiCache)
How to slash database costs and achieve sub-millisecond latency with the Cache-Aside pattern
#AWS #ElastiCache #Redis #Caching
Margaret is a senior software engineer. Timothy is her junior colleague. They work in a grand Victorian library in London — the kind of place where code quality is the unspoken objective, and craftsmanship is the only thing that matters.
Episode 64
Timothy was reviewing his AWS Cost Explorer with a mix of relief and concern. Margaret’s API Gateway throttling from the previous week had successfully blocked the competitor’s botnet, but the marketing team’s new flash sale was driving unprecedented, legitimate traffic to the storefront.
"The good news is that the API is healthy, and we are not dropping any customer orders," Timothy reported to Margaret. "The bad news is our DynamoDB Read Capacity costs for the Product Catalog table are skyrocketing. Every time a customer loads the homepage, our Lambda function queries the database for the exact same top-selling mechanical keyboard. We have read that item from disk 400,000 times today."
"And has the description or price of that keyboard changed at all today?" Margaret asked.
"No," Timothy admitted. "It has been exactly the same all week."
"Then you are paying your database to repeat itself," Margaret said, pulling up a whiteboard. "Databases like DynamoDB or Amazon RDS are built for durability—they write data to physical storage. Reading from physical storage is computationally expensive. For read-heavy, unchanging data, we need to introduce a high-speed memory layer using Amazon ElastiCache."
The Cache-Aside Pattern
Margaret opened the AWS Console and navigated to ElastiCache.
"ElastiCache provisions a managed, in-memory data store. While Memcached is great for simple key-value needs, we will use the Redis engine because it is the enterprise standard, supporting advanced data structures," she explained. "We will also enable Redis Cluster Mode with replicas so the cache itself does not become a single point of failure. Because reading from RAM is orders of magnitude faster than reading from a solid-state drive, retrieving data from Redis takes less than a millisecond."
To use it, Margaret updated Timothy's Node.js catalog service to implement the Cache-Aside Pattern.
const redis = require('redis');
// Initialize the client outside the handler to reuse the TCP connection across Lambda invocations
const redisClient = redis.createClient({ url: process.env.REDIS_URL });
const redisReady = redisClient.connect(); // Start connecting during the cold start
exports.handler = async function getProductDetails(productId) {
  await redisReady; // Ensure the connection is established before first use
  const cacheKey = `product:${productId}`;
  // 1. The Cache Read (check memory first)
  const cachedData = await redisClient.get(cacheKey);
  if (cachedData) {
    console.log("Cache Hit!");
    return JSON.parse(cachedData);
  }
  // 2. The Cache Miss (query the primary database)
  console.log("Cache Miss. Querying DynamoDB...");
  const databaseData = await queryDynamoDB(productId); // Defined elsewhere in the catalog service
  // 3. The Cache Write (store in memory for future requests)
  // setEx applies an expiration (TTL) of 3600 seconds (1 hour)
  await redisClient.setEx(cacheKey, 3600, JSON.stringify(databaseData));
  return databaseData;
};
Hits, Misses, and Expiration
Timothy studied the code. "So when the first customer of the day loads the mechanical keyboard, the cache is empty. The function experiences a Cache Miss, reads the heavy data from DynamoDB, and then saves a copy into ElastiCache."
"Exactly," Margaret nodded. "But when the second, third, and four-hundred-thousandth customer loads that exact same keyboard..."
"The function checks ElastiCache, finds the data, and returns it instantly," Timothy finished, his eyes widening. "That is a Cache Hit. The Lambda function never even speaks to DynamoDB."
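The hit/miss arithmetic in this exchange can be checked without any AWS infrastructure at all. The sketch below is a minimal simulation of the Cache-Aside pattern: a plain Map stands in for ElastiCache, and `fetchFromDatabase` and `cacheAsideGet` are hypothetical names, not part of any AWS SDK.

```javascript
// A plain Map stands in for ElastiCache; the counter tracks "database" reads.
const cache = new Map();
let databaseReads = 0;

// Hypothetical stand-in for the real DynamoDB query.
function fetchFromDatabase(productId) {
  databaseReads += 1;
  return { id: productId, name: 'Mechanical Keyboard', price: 129 };
}

// Cache-Aside: check the cache, fall back to the database, then re-cache.
function cacheAsideGet(productId) {
  const cacheKey = `product:${productId}`;
  if (cache.has(cacheKey)) {
    return JSON.parse(cache.get(cacheKey)); // Cache Hit
  }
  const data = fetchFromDatabase(productId); // Cache Miss
  cache.set(cacheKey, JSON.stringify(data));
  return data;
}

// 400,000 lookups of the same product cost exactly one database read.
for (let i = 0; i < 400000; i++) {
  cacheAsideGet('kbd-42');
}
console.log(databaseReads); // 1
```

Only the very first lookup ever reaches the "database"; every subsequent call is served from memory.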
"From 400,000 database reads to zero," Margaret smiled. "That is the power of caching. To make it even faster, you can 'warm' the cache by pre-loading popular items during your deployment, so even the very first customer does not experience a miss."
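Margaret's warming idea amounts to writing the popular items into the cache before any customer asks for them. A minimal sketch, again using a Map as a stand-in for ElastiCache; the `warmCache` helper and the bestseller list are hypothetical:

```javascript
// Hypothetical bestseller list, e.g. loaded from a deploy-time config file.
const bestsellers = [
  { id: 'kbd-42', name: 'Mechanical Keyboard', price: 129 },
  { id: 'mouse-7', name: 'Trackball Mouse', price: 59 },
];

const cache = new Map(); // Stand-in for ElastiCache

// Warm the cache at deploy time: write each popular item before any request.
function warmCache(items) {
  for (const item of items) {
    cache.set(`product:${item.id}`, JSON.stringify(item));
  }
}

warmCache(bestsellers);

// The very first customer request is already a Cache Hit.
console.log(cache.has('product:kbd-42')); // true
```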
"But wait," Timothy paused. "What happens if marketing changes the price of the keyboard in the primary database? If we leave the old price in the cache, the customer will see the wrong data."
"There is a famous saying in computer science," Margaret replied. "There are only two hard things: cache invalidation and naming things. That is why we use a Time to Live (TTL)," she said, pointing at the setEx command. "We instruct ElastiCache to automatically delete the cached product after one hour. The next request will trigger a Cache Miss, forcing the application to fetch the fresh price from the database and re-cache it. You trade a small window of staleness for massive scalability."
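Both freshness strategies Margaret describes can be simulated in a few lines: a TTL stored alongside each entry (what setEx does for you in Redis), plus explicit invalidation when marketing changes a price (the equivalent of Redis's DEL command). The Map stand-in and the explicit clock parameter are assumptions to keep the sketch deterministic:

```javascript
const cache = new Map(); // Stand-in for ElastiCache; each value carries an expiry time

// Write with a TTL, mirroring redisClient.setEx(key, ttlSeconds, value).
function setWithTtl(key, ttlSeconds, value, nowMs) {
  cache.set(key, { value, expiresAt: nowMs + ttlSeconds * 1000 });
}

// Read, treating expired entries as misses (Redis does this automatically).
function getIfFresh(key, nowMs) {
  const entry = cache.get(key);
  if (!entry || nowMs >= entry.expiresAt) {
    cache.delete(key);
    return null; // Cache Miss: the caller must re-fetch from the database
  }
  return entry.value;
}

let now = 0;
setWithTtl('product:kbd-42', 3600, '{"price":129}', now);
console.log(getIfFresh('product:kbd-42', now)); // '{"price":129}'
now += 3601 * 1000; // one hour and one second later
console.log(getIfFresh('product:kbd-42', now)); // null (expired)

// Explicit invalidation: when the price changes, delete the key immediately,
// just as the application could call redisClient.del(cacheKey).
setWithTtl('product:kbd-42', 3600, '{"price":129}', now);
cache.delete('product:kbd-42');
console.log(getIfFresh('product:kbd-42', now)); // null (invalidated)
```

The TTL bounds how stale a price can ever be, while explicit deletion removes even that small window for changes you know about.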
Timothy updated his architecture diagram. His application was no longer endlessly hammering its database; it had a blazing-fast memory layer protecting its core storage.
Key Concepts Introduced:
Amazon ElastiCache: A fully managed, in-memory caching service provided by AWS. It supports popular open-source engines like Redis (the standard for advanced data structures and high availability via Cluster Mode) and Memcached. By storing frequently accessed data in RAM rather than on disk, ElastiCache delivers sub-millisecond response times, accelerating performance and reducing backend database load.
The Cache-Aside Pattern: A widely used strategy where the application itself manages the cache. The application checks the cache first; if the data is present, that is a Cache Hit. If it is missing (a Cache Miss), the application queries the primary database, writes a copy into the cache for future requests, and returns the data.
Connection Management in Serverless: When using caching inside AWS Lambda, it is critical to initialize the Redis client outside the main handler function. This allows the execution environment to reuse the same TCP connection across multiple invocations, preventing connection exhaustion on the cache cluster.
Time to Live (TTL) & Cache Invalidation: The primary challenge of caching is preventing permanent data staleness. A TTL is an expiration timer attached to a cached item. Once it expires, the cache automatically deletes the item, forcing the application to fetch fresh data on the next request. For highly predictable traffic, engineers will often engage in Cache Warming (pre-loading popular items) to prevent a flood of initial cache misses.
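The connection-management point above can be demonstrated with a counter at module scope: the "connection" is created once when the module loads (the cold start), and every subsequent handler invocation reuses it. `createConnection` is a hypothetical stand-in for `redis.createClient(...).connect()`:

```javascript
let connectionsOpened = 0;

// Hypothetical stand-in for redis.createClient(...).connect(): counts how
// many TCP connections this "execution environment" opens.
function createConnection() {
  connectionsOpened += 1;
  return { get: (key) => null };
}

// Module scope: runs once per cold start, NOT once per invocation.
const client = createConnection();

// Handler scope: runs on every invocation, reusing the same client.
function handler(event) {
  return client.get(`product:${event.productId}`);
}

// Three warm invocations still cost only one connection.
handler({ productId: 'kbd-42' });
handler({ productId: 'kbd-42' });
handler({ productId: 'kbd-42' });
console.log(connectionsOpened); // 1
```

Had `createConnection()` been called inside `handler`, the counter would climb with every request, which is exactly the connection-exhaustion failure mode on a real cache cluster.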
Aaron Rose is a software engineer and technology writer at tech-reader.blog.