AWS Lambda: “Unexpected Data Persisted Across Invocations” — When Reused Containers Keep Old State
Lambda’s warm-start optimization saves time but can cost trust
Note on Examples
The following examples use Python, but the concepts apply universally across AWS Lambda runtimes (Node.js, Java, Go, etc.). The patterns and fixes described here are runtime-agnostic.
Problem
You deploy a Lambda expecting a clean slate for every invocation.
But suddenly, the logs show yesterday’s values appearing in today’s run.
A cached variable.
A leftover list.
An old file still sitting in /tmp.
No errors, no exceptions — just stale state quietly bleeding into new executions.
Welcome to container reuse gone wrong, where Lambda’s warm-start optimization keeps your old data alive longer than you think.
Clarifying the Issue
AWS Lambda doesn’t start from zero every time.
When a Lambda container finishes, AWS freezes the execution environment and may reuse it for future invocations of the same function version.
That reuse speeds things up — fewer cold starts — but it also carries risk.
Global variables, in-memory caches, open file handles, and temporary files persist across invocations within that same container.
If you’re not resetting or reinitializing state, your new request may inherit leftovers from a previous run.
Typical causes include:
- Global Variables Not Reset – Objects or lists defined outside the handler retain their content.
- Uncleared /tmp Files – Files written to /tmp remain between invocations until the container is retired.
- Cached Credentials or Config – Tokens, secrets, or cached configs linger and become stale.
- Mutable Default Parameters – In Python, default lists or dicts can accumulate changes between runs.
- Unintended State Sharing via Singletons – Global SDK clients or connections store stale session data.
This behavior is by design — Lambda reuses environments for efficiency — but unless you code defensively, you’ll meet the ghost of your last invocation.
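The mutable-default pitfall is easy to demonstrate in isolation. A minimal sketch (the function name append_item is illustrative, not from any AWS API):

```python
def append_item(item, bucket=[]):  # the default list is built once, at definition time
    bucket.append(item)
    return bucket

first = append_item("a")
second = append_item("b")  # reuses the SAME default list, so "a" is still inside
```

Because the default list outlives each call, it behaves exactly like a module-level global in a warm container: state accumulates silently.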
Why It Matters
A reused container is a double-edged sword.
It improves performance through warm starts but introduces cross-invocation contamination.
When state persists, it can cause:
- Data Integrity Issues – A variable from a prior invocation leaks into a new user’s request.
- Security Risks – Stale credentials or tokens reused unintentionally.
- Intermittent Behavior – Errors that vanish on retries or new containers.
- Testing Confusion – Local testing resets everything, but production Lambda doesn’t.
This is one of the hardest AWS issues to reproduce because it only appears under warm-start conditions — not in your dev environment.
Key Terms
- Execution Context: The environment (code, libraries, global variables, /tmp) reused between invocations.
- Warm Start: A Lambda invocation that reuses an existing execution context.
- Cold Start: A new container created from scratch for a function version.
- Ephemeral Storage: The /tmp space allocated to each container — preserved across invocations but not guaranteed to persist indefinitely.
- Stateless Function: A design principle ensuring no state from one invocation affects the next.
Steps at a Glance
- Identify where state persists across invocations.
- Isolate handler logic from global variables.
- Clear or reset global state explicitly.
- Reinitialize temporary directories or files.
- Use defensive patterns for caches and connections.
- Validate that each invocation starts clean.
Detailed Steps
Step 1: Identify Persistent State
Start by printing the container ID or creating a unique marker in /tmp.
You’ll quickly see whether the same container is reused:
import os, uuid

def handler(event, context):
    marker_path = '/tmp/container-id.txt'
    if not os.path.exists(marker_path):
        with open(marker_path, 'w') as f:
            f.write(str(uuid.uuid4()))
    with open(marker_path, 'r') as f:
        container_id = f.read()
    print(f"Running in container: {container_id}")
If the container_id stays the same across invocations, you’re running warm.
Step 2: Isolate Handler Logic
Keep your business logic inside the handler function.
Avoid defining mutable global variables at the top level.
❌ Bad Example
cache = []  # persists across invocations

def handler(event, context):
    cache.append(event)
    print(len(cache))  # grows forever
✅ Good Example
def handler(event, context):
    cache = []
    cache.append(event)
    print(len(cache))  # resets each time
This pattern ensures that any object the function needs (such as a cache or config) is either created fresh on every run or passed in explicitly — in line with the dependency injection principle, which also keeps the handler testable.
Step 3: Clear or Reset Global State
If global state is unavoidable (for example, database connections or SDK clients), reset it deliberately:
global_cache = {}

def handler(event, context):
    global global_cache
    global_cache.clear()
    # Rebuild necessary state
Use this reset pattern to guarantee a clean start within reused containers.
Step 4: Reinitialize /tmp Between Invocations
The /tmp directory persists within the same container.
Always check or clean it before use:
import os, shutil

TMP_DIR = '/tmp/data'

def handler(event, context):
    if os.path.exists(TMP_DIR):
        shutil.rmtree(TMP_DIR)
    os.mkdir(TMP_DIR)
This prevents stale files or partial data from polluting new runs.
Step 5: Use Defensive Patterns for Caches and Connections
Reusable resources like DB clients, S3 connections, or HTTP sessions can improve performance — but only if they’re safe to reuse.
import boto3

# INITIALIZED OUTSIDE THE HANDLER — SAFE TO REUSE
session = boto3.session.Session()
s3 = session.client('s3')

def handler(event, context):
    # A lightweight call to confirm the client is still usable
    s3.list_buckets()
Cache connections, not state. Always verify validity before reuse.
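One defensive pattern is to cache the client but gate it behind an age check, so a long-lived container rebuilds it instead of trusting stale credentials. A minimal sketch (MAX_AGE_SECONDS and the _build_client stand-in are illustrative assumptions, not boto3 API):

```python
import time

_client = None
_created_at = 0.0
MAX_AGE_SECONDS = 300  # hypothetical refresh window; tune to your credential lifetime

def _build_client():
    # Stand-in for real construction, e.g. boto3.session.Session().client('s3')
    return object()

def get_client():
    """Return the cached client, rebuilding it once it ages out."""
    global _client, _created_at
    if _client is None or time.time() - _created_at > MAX_AGE_SECONDS:
        _client = _build_client()
        _created_at = time.time()
    return _client
```

Within the refresh window every call returns the same object, so the container still benefits from warm-start reuse; only the connection is cached, never request state.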
Step 6: Validate a Clean Start
Add diagnostic logging to confirm that containers start clean:
def handler(event, context):
    print(f"Invocation ID: {context.aws_request_id}")
    print(f"Cache length: {len(cache)}")  # 'cache' is whatever global you are watching
Track these over multiple runs. If they drift upward, state is leaking.
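One way to make drift visible is a deliberate module-level counter: in a truly stateless design it would always read 1, so any higher value proves the container was reused. A sketch (the counter is a diagnostic you add yourself, not part of the Lambda API):

```python
invocation_count = 0  # module level: survives warm starts within one container

def handler(event, context):
    global invocation_count
    invocation_count += 1
    print(f"Invocation #{invocation_count} in this container")
    return invocation_count
```

Watch the number in CloudWatch: a cold start resets it to 1, while a warm start carries it forward.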
Pro Tip #1: Treat /tmp as Shared, Not Safe
/tmp persists but isn’t guaranteed.
It’s a cache, not a vault — always rebuild, never rely.
Pro Tip #2: Log Container Reuse During Development
Warm start behavior is unpredictable in dev mode.
Log your container reuse locally (or simulate it with persistent processes) to expose cross-invocation issues before they reach production.
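You can approximate warm starts locally by calling the handler several times within a single Python process; any module-level leak shows up immediately. A sketch with a deliberately leaky cache (all names are illustrative):

```python
leaky_cache = []  # module level, so it persists across calls in one process

def handler(event, context):
    leaky_cache.append(event)
    return len(leaky_cache)

# Three calls in the same process stand in for three warm-start invocations
results = [handler({"n": i}, None) for i in range(3)]
print(results)  # a stateless handler would report 1 every time
```

If the returned lengths climb instead of staying at 1, you have found a cross-invocation leak before it ever reaches a real container.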
Conclusion
Lambda’s warm-start optimization saves time but can cost trust.
When yesterday’s state seeps into today’s execution, you’re debugging a ghost that only lives in reused containers.
By isolating logic, clearing globals, and validating clean starts,
you turn Lambda back into what it was meant to be: stateless, predictable, and safe.
In short — every invocation deserves a clean slate.
Make sure yours gets one.
Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.