The Secret Life of Python: How to Share Data Between Processes

 

A practical guide to Value and Array, and when to use queues instead

#Python #Multiprocessing #SharedMemory #ParallelProgramming




Margaret is a senior software engineer. Timothy is her junior colleague. They work in a grand Victorian library in London — the kind of place where code quality is the unspoken objective, and craftsmanship is the only thing that matters.

Episode 39

Timothy was looking at his "Grandmaster Analysis" engine. It was fast—blindingly fast—but it had a major flaw. Each of his four worker processes was living in its own Parallel Universe.

"Margaret," Timothy said, "I want to display a live 'Global High Score' on my screen. I want every worker to be able to add to the same total whenever they find a winning move. But when I use a normal global variable, each worker just updates their own private copy. At the end, my main program still thinks the total is zero!"

Margaret nodded. "That’s the cost of Multiprocessing, Timothy. Since each process has its own memory, they can't see each other's notes. To them, the other workers don't even exist."

"But," she added, eyes twinkling, "we can build a Wormhole."


The Value and the Array

"Python gives us a special way to carve out a tiny piece of memory that sits outside the parallel universes," Margaret explained. "We call it Shared Memory. In the multiprocessing module, we use a Value for a single piece of data, or an Array for a fixed-size list."

Timothy looked at the code. It looked familiar, but with a twist.

from multiprocessing import Process, Value, Lock
import time

def record_win(shared_total, lock):
    for _ in range(100):
        time.sleep(0.01)
        # We still need a Lock, even in parallel universes!
        with lock:
            shared_total.value += 1

if __name__ == "__main__":
    # 'i' stands for integer (Type Code), and we start at 0
    # This 'Value' lives in a shared space all processes can see
    global_wins = Value('i', 0)
    win_lock = Lock()

    processes = [Process(target=record_win, args=(global_wins, win_lock)) for _ in range(4)]

    for p in processes:
        p.start()
    for p in processes:
        p.join()

    print(f"Final Global Wins: {global_wins.value}")

The Price of the Wormhole

Timothy ran the script and the final count came out at exactly 400: four workers, 100 wins each. "It works! They're all touching the same scoreboard!"

"They are," Margaret cautioned. "But look closely at the code. Did you notice the Lock?"

"I thought we didn't need Locks in Multiprocessing because of the separate GILs?" Timothy asked.

"Each GIL only protects the internals of its own interpreter," Margaret said. "It doesn't protect Shared Memory. The second you build a bridge between universes, you bring back the Race Condition. The line shared_total.value += 1 is really three steps: read, add, write. If two workers interleave those steps, one of the updates is silently lost."
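As a variation on Timothy's script, a Value created with the default settings already carries its own lock, reachable through get_lock(), so a separate Lock object is optional. The sketch below assumes the same record_win idea with an added rounds parameter (an illustrative addition, not part of the original code).

```python
from multiprocessing import Process, Value


def record_win(shared_total, rounds):
    """Add one win per round to the shared counter."""
    for _ in range(rounds):
        # Hold the Value's built-in lock for the whole
        # read-add-write sequence so no update is lost.
        with shared_total.get_lock():
            shared_total.value += 1


def main():
    # lock=True is the default, so this Value comes with its own lock.
    total = Value('i', 0)
    workers = [Process(target=record_win, args=(total, 1000)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(f"Final Global Wins: {total.value}")  # 4 workers x 1000 rounds


if __name__ == "__main__":
    main()
```

Keeping the lock attached to the value it guards makes it harder to pass the wrong lock to a worker by accident.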


The Specialist’s Warning

Timothy realized the gravity of what he had built. "So, by sharing memory, I’ve brought back all the dangers of Threading (Locks, Deadlocks, Race Conditions) but with the high memory cost of Multiprocessing?"

"Exactly," Margaret said. "Shared Memory is the most powerful—and most dangerous—tool in the kit. Most of the time, you should stay on the Conveyor Belt (Queues) because it's safer. But when you absolutely need a 'Global Scoreboard,' now you know how to build the wormhole."
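For comparison, here is a rough sketch of the Conveyor Belt approach Margaret prefers. It assumes each worker reports a (worker_id, wins_found) tuple; the find_wins function and its parameters are hypothetical. Nothing is shared, so no lock is needed: the parent adds up the messages itself.

```python
from multiprocessing import Process, Queue


def find_wins(results, worker_id, wins_found):
    """Report this worker's tally as a message instead of touching shared state."""
    results.put((worker_id, wins_found))


def main():
    results = Queue()
    workers = [Process(target=find_wins, args=(results, i, 100)) for i in range(4)]
    for w in workers:
        w.start()
    # Collect one message per worker before joining, so no worker
    # is left waiting to flush its item onto the queue.
    total = sum(results.get()[1] for _ in workers)
    for w in workers:
        w.join()
    print(f"Final Global Wins: {total}")


if __name__ == "__main__":
    main()
```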


Margaret’s Cheat Sheet: Shared Memory

The Decision Guide

You Need To...                 Use...
Share a simple counter         Value + Lock
Share a fixed-size list        Array + Lock
Share a complex dict/list      Manager (slower, but flexible)
Send work/messages             Queue (safe and preferred)
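The "fixed-size list" row can be sketched as follows, assuming one tally slot per worker (the tallies name and the worker_id parameter are illustrative). Like Value, an Array created with default settings exposes its own lock through get_lock().

```python
from multiprocessing import Array, Process


def record_win(tallies, worker_id, rounds):
    """Add one win per round to this worker's slot in the shared array."""
    for _ in range(rounds):
        # Each worker writes only its own slot here, so the lock is a
        # precaution; it becomes essential once workers share a slot.
        with tallies.get_lock():
            tallies[worker_id] += 1


def main():
    tallies = Array('i', 4)  # four C ints, all initialised to zero
    workers = [Process(target=record_win, args=(tallies, i, 100)) for i in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    per_worker = tallies[:]  # slicing copies the contents into a plain list
    print(per_worker, sum(per_worker))


if __name__ == "__main__":
    main()
```

The length is fixed when the Array is created, which is why a Manager is the better fit for structures that need to grow.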

The Specialist's Wisdom

  • Type Codes: When using Value or Array, you must specify the type: 'i' for a signed C integer, 'f' for a single-precision float, or 'd' for a double (the same precision as a Python float).
  • The .value Attribute: Unlike normal variables, you must read or write to the .value attribute of a Value object.
  • The "Shared Memory" Tax: Every access to a Value or Array goes through a wrapper object and, by default, acquires its lock, so it is significantly slower than touching a local variable. Use it sparingly for "High Scores," not for your main data processing.
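A short sketch of how the type codes behave in practice, using illustrative names (counter, ratio_f, ratio_d). It runs in a single process, since the point is only what each code stores.

```python
from multiprocessing import Value

counter = Value('i', 0)    # signed C int
ratio_f = Value('f', 0.1)  # single-precision float: 0.1 gets rounded
ratio_d = Value('d', 0.1)  # double: same precision as a Python float

# Reads and writes always go through .value, never the object itself.
counter.value += 1
print(counter.value)

print(ratio_f.value == 0.1)  # False: precision was lost on the way in
print(ratio_d.value == 0.1)  # True

# An integer type code will not silently truncate a float.
try:
    Value('i', 3.7)
except TypeError as exc:
    print(f"rejected: {exc}")
```

If a score can be fractional, 'd' is usually the safer choice, because it round-trips ordinary Python floats unchanged.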

Aaron Rose is a software engineer and technology writer at tech-reader.blog
