Solve: Track Redshift Schema Drift with a Snapshot-Based Python Tool


Solve: Track Redshift Schema Drift with a Snapshot-Based Python Tool

It’s one of the subtler failure modes in Redshift.

A dashboard breaks. A query throws an error. A view stops working. But this time, it’s not a broken view. It’s a deeper issue: a column was renamed, or a table was dropped, or a datatype was silently changed somewhere downstream.

Redshift doesn’t track schema changes. There’s no built-in version history. So if a table changes shape on Monday, and things break on Thursday—you’re flying blind.

We saw this recently in a Redshift system with aging components and unclear ownership. After troubleshooting one broken query, the team realized the bigger problem: no one could say with certainty what their schema even looked like last week.

So we built a tool.


A Schema Drift Tracker, Built for Redshift

This Python tool connects to Redshift, grabs a complete snapshot of your schema, and saves it as a JSON file. Later, you can run it again and compare the new snapshot to the old one. It flags any differences—added, removed, or altered objects—so you can detect schema drift over time.

It’s lightweight. No installs, no agents, no extra infrastructure.

Just this:

Bash
python redshift_schema_tracker.py --baseline schema_snapshot.json
python redshift_schema_tracker.py --compare schema_snapshot.json    


And the output looks like this:

Bash
Schema Drift Report
Baseline: schema_snapshot_2025-04-01.json
Current: schema_snapshot_2025-04-04.json

Removed table: staging.transactions_2022
Added column: public.orders.discount (type: FLOAT4)
Changed type: public.customers.customer_id (VARCHAR(50) → VARCHAR(100))    


You’ll get a clean, readable diff of schema changes—perfect for audits, root cause analysis, or just bringing visibility into a complex system.


When to Use This Tool
  • You’re about to make schema changes and want a before/after view
  • You inherited a Redshift environment and need visibility
  • You’re troubleshooting an error and suspect a table or column changed
  • You want to monitor drift as part of system hygiene
It’s not magic. But it gives you memory—the one thing Redshift doesn’t have.


Grab the Tool

You can download and run the tool yourself here:

👉 View on GitHub Gist 


This is part of our Solve series: tactical tools for quiet problems that hurt.

Stay tuned for more.


Need AWS Expertise?

We're happy to help you with your AWS projects!  Feel free to contact us.



Image: Gemini

Comments

Popular posts from this blog

The New ChatGPT Reason Feature: What It Is and Why You Should Use It

Raspberry Pi Connect vs. RealVNC: A Comprehensive Comparison

The Reasoning Chain in DeepSeek R1: A Glimpse into AI’s Thought Process