The Secret Life of Python: deepcopy
The Secret Life of Python: deepcopy
Taking control of Python's __deepcopy__
#Python #Coding #Programming #SoftwareDevelopment
Margaret is a senior software engineer. Timothy is her junior colleague. They work in a grand Victorian library in London — the kind of place where code quality is the unspoken objective, and craftsmanship is the only thing that matters.
Episode 27
Timothy was looking at a new class he had built for the Chess Club—a TournamentSession object. It was a complex piece of data that tracked the start time of a match and a unique "Session ID."
"Margaret," Timothy said, "I'm using deepcopy to create a backup of a match in progress. It works perfectly now, but I have a weird problem. The 'copy' has the exact same start_time and session_id as the original. If I'm making a new version of the match, shouldn't it have its own unique ID and a new timestamp?"
Margaret smiled. "You've discovered the 'Perfect Clone' problem. Usually, a perfect clone is what we want. But sometimes, you need a Specialist to step in and tell Python: 'Copy everything, but change these specific things on the way out.'"
Taking the Controls
Margaret drew a picture of a factory assembly line on the whiteboard. In the middle of the line was a robot arm labeled __deepcopy__.
"When you call copy.deepcopy(), Python looks for a special method inside your class called __deepcopy__," Margaret explained. "If it doesn't find one, it just does its usual routine. But if you define it, you are taking over the controls of the clone machine."
"Let's look at how you can tell Python to refresh the ID during a copy," she said, writing on the board:
import copy
import time
import uuid
class TournamentSession:
def __init__(self, player_a, player_b):
self.player_a = player_a
self.player_b = player_b
self.session_id = uuid.uuid4()
self.start_time = time.time()
def __deepcopy__(self, memo):
# 1. Create a new instance without calling __init__ again
cls = self.__class__
new_session = cls.__new__(cls)
# 2. Add it to the 'memo' (to prevent infinite loops)
memo[id(self)] = new_session
# 3. Copy the players normally using the standard deepcopy
new_session.player_a = copy.deepcopy(self.player_a, memo)
new_session.player_b = copy.deepcopy(self.player_b, memo)
# 4. BUT! Give the copy a fresh ID and current time
new_session.session_id = uuid.uuid4()
new_session.start_time = time.time()
return new_session
The Memo Trick
Timothy pointed to the word memo in the code. "What is that?"
"That's Python's 'Don't Repeat Yourself' list," Margaret said kindly. "When you copy complex data, you might have objects that point to each other in a circle. The memo dictionary keeps track of everything we've already copied. If Python sees an object it recognizes in the memo, it just grabs the copy instead of starting over. It prevents your program from spinning in a circle forever."
The Power of the Specialist
"So," Timothy summarized, "by writing this one method, I can let the deepcopy library do 90% of the heavy lifting, while I just 'tweak' the specific parts that need to be unique?"
"Exactly," Margaret nodded. "You are no longer just a user of the library. You are its partner. You can do the same thing with __reduce__ for pickling, or __setstate__ for when an object is being 'rehydrated' from a file. You are giving your objects a memory and a set of instructions on how they should be born into the world."
Timothy looked at his new TournamentSession. When he made a copy, the players stayed the same, but the session_id blinked into a brand-new, unique string.
"It's not just a copy anymore," he said. "It's a descendant."
Margaret's Cheat Sheet: Customizing the Clone
Which Tool Should You Use?
Default
deepcopy— Use this when you want a 100% identical twin of the object. Python handles everything automatically.Custom
__deepcopy__— Use this when the copy needs unique fields like fresh IDs, new timestamps, or cleared logs. You let Python copy most of the object, but you step in to "tweak" the parts that shouldn't be identical.Custom
__copy__— Use this to control shallow copies made viacopy.copy(). It follows the same pattern as__deepcopy__, but only handles the top-level object without recursing into nested structures.__reduce__— Use this to control how an object is "serialized" (pickled) for storage or transmission. It's especially useful for handling objects that contain unpicklable resources—like file handles or database connections—by saving the instructions to recreate them instead of the objects themselves.
The Specialist's Pattern
Create the new object instance using
cls.__new__(cls). This creates an empty, uninitialized instance so you can control exactly what gets copied.Register it in the
memodictionary. This prevents infinite loops if your data contains circular references.Deepcopy the attributes you want to keep using
copy.deepcopy(attr, memo). This hands the work back to Python for the parts you want to preserve.Manually set the specific attributes you want to change—like generating a fresh
session_idor a newstart_time.
Engineering Judgment
If you only need to change one or two values occasionally, a simple .refresh() or .clone() method is often cleaner and easier to understand. Reach for "The Specialist" only when you want this custom behavior to be the global standard—so that anyone using Python's copy module with your class gets the customized behavior automatically.
Aaron Rose is a software engineer and technology writer at tech-reader.blog.
Catch up on the latest explainer videos, podcasts, and industry discussions below.
.jpeg)

Comments
Post a Comment