The Secret Life of Claude Code: Testing as Thinking
How Claude Code makes you a better tester by finding the cases you never thought to ask about
#ClaudeCode #CodingWithAI #Testing #Thinking
Margaret is a senior software engineer. Timothy is her junior colleague. They work in a grand Victorian library in London — the kind of place where code quality is the unspoken objective, and craftsmanship is the only thing that matters.
Episode 14
The rain had eased by evening but left everything wet and reflective, the cobblestones outside the library windows holding the lamplight in long orange streaks. Timothy arrived with his coat damp at the shoulders and a look that Margaret had not seen before — not the bewilderment of a problem unsolved, nor the particular deflation of a mistake made, but something quieter. Thoughtful.
He sat down, set his notebook on the table, and opened it to a page that was unusually full.
"I've been writing tests," he said.
Margaret looked up. "For how long?"
"Most of the day. Not because the feature required it — the feature was done by midmorning. I kept writing them because I kept finding things I hadn't understood."
Margaret set down her pen.
"Tell me what you found," she said.
"Three assumptions I had made about the input data that weren't in any specification. Two edge cases I had never considered. One interaction between this feature and an older part of the system that I hadn't known existed until the test failed." He looked at his notes. "The tests didn't find bugs, exactly. They found gaps in my understanding. The bugs were almost incidental."
Margaret was quiet for a moment, the rain dripping steadily from the eaves outside.
"You have discovered something that takes most developers years to see clearly," she said. "The test is not the destination. The thinking that produces the test is the destination."
What Tests Are Actually For
"I always thought of testing as confirmation," Timothy said. "You build the thing, then you verify it works. The test is the last step."
"That is one use of a test," Margaret said. "It is not the most valuable one." She pulled a fresh sheet toward her. "A test written after the fact confirms that the code does what you believe it does. Which is useful — but it is limited by the quality of your belief. If your understanding is incomplete, your after-the-fact test will be incomplete in exactly the same way. It will confirm the thing you built without questioning whether you built the right thing."
"The test inherits the blind spots of the builder," Timothy said.
"Yes. But a test written as an act of thinking — before or during the building, as a way of interrogating the problem — does something different. It forces you to specify, precisely, what the correct behavior actually is. Not what you intend. Not what seems reasonable. What is actually, unambiguously correct in every case you can imagine." She paused. "And in doing that, it reveals the cases you cannot imagine. The assumptions you have made without knowing you made them. The edges of your understanding."
Timothy looked at his notes. The three assumptions about input data had been invisible to him until he had tried to write a test that specified the expected behavior precisely. He had started writing the test and realized he didn't know what the right answer was. That not-knowing had been the most valuable moment of the day.
"The test exposed the question I hadn't asked," he said.
"The test forced the question," Margaret said. "That is its highest function."
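Timothy's moment of not-knowing can be made concrete. Here is a minimal sketch in Python; the function, its rules, and the naming are all illustrative, not taken from any real codebase. The point is the second test: writing the expected value forces a decision nobody had made yet.

```python
def normalize_username(raw: str) -> str:
    """Lowercase, strip surrounding whitespace, collapse inner runs of spaces."""
    return " ".join(raw.lower().split())


def test_normalize_basic():
    # The comfortable case: you already know the answer before you write it.
    assert normalize_username("  Ada Lovelace ") == "ada lovelace"


def test_normalize_forces_a_question():
    # Writing this expected value forces a question no spec answered:
    # should an all-whitespace name become an empty string, or raise an error?
    # The test cannot be finished until someone decides.
    assert normalize_username("   ") == ""
```

The value is not the assertion that ends up in the file; it is the decision you were forced to make in order to write it.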
How Claude Code Changes Testing
"This is where Claude Code enters," Margaret said. "Because the discipline of test-as-thinking has always been known. It is not new. And yet most developers do not practice it consistently."
"Because writing tests takes time," Timothy said. "And when you're moving fast —"
"The tests are the first thing cut," Margaret said. "Yes. The economics have always worked against the discipline. Writing a thorough test suite for a feature takes significant time. The feature itself is already done. The pressure is to move on." She paused. "Claude Code changes that calculation."
"Because it can write the tests quickly," Timothy said.
"It can write the tests. But more importantly — and this is what most developers miss — it can help you think about what tests to write. Which is the harder part." She looked at him. "Generating the test code for a known case is straightforward. Identifying the cases you haven't thought of — the edge cases, the interaction effects, the boundary conditions — that is the work that takes time and experience. And Claude Code, given the right prompt, can do a great deal of that work."
Timothy thought about this. He had been writing his tests by hand all day, each one emerging from his own interrogation of the problem. It had been valuable — genuinely, unexpectedly valuable. But slow.
"I could have asked Claude Code to generate the cases," he said. "Not the test code. The cases."
"Yes. Describe the function, describe its inputs and outputs, describe the system it operates in — and ask it what cases a thorough test suite should cover. Ask it what edge cases a senior engineer would insist on. Ask it what input combinations have historically caused problems in systems like this." She paused. "It will not find everything. But it will find things you would not have thought to look for. And the list it produces becomes the scaffold for your thinking."
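A prompt in the shape Margaret describes might look like the following. Every detail here (the function, its inputs, the legacy importer) is a placeholder to be replaced with your own context:

```
I have a function that parses uploaded CSV rows into customer records.
Input: one raw CSV line (string). Output: a CustomerRecord, or a validation error.
It runs inside a billing system that also receives data from a legacy importer.

Do not write any test code yet. Instead:
1. List the cases a thorough test suite for this function should cover.
2. List the edge cases a senior engineer would insist on before approving
   this for production.
3. List input patterns that have historically caused failures in systems
   like this one.
```

The output is a list of cases, not a test file — a scaffold for your own thinking, which you then prune, challenge, and extend before any code is written.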
Asking Claude Code to Write Your Tests
"There's a trap here though," Timothy said. "If I ask Claude Code to write the tests directly — not just the cases, but the actual test code — I've seen what comes back. It's thorough. It looks complete. And sometimes it is complete." He paused. "But sometimes it tests the implementation rather than the behavior. It writes tests that will pass as long as the code doesn't change, rather than tests that will catch it when the code does the wrong thing."
Margaret looked at him with attention. "Say more about that distinction."
"A test that checks the implementation asks: does the code do this specific thing in this specific way? A test that checks the behavior asks: given this input, does the system produce this output, regardless of how it gets there?" He paused. "The first kind breaks every time you refactor. The second kind survives refactoring and catches real regressions."
"And Claude Code tends toward which?"
"The first, if I'm not careful. If I give it the code and ask it to write tests, it writes tests for the code as it exists. Which is natural — the code is what it can see." He looked at her. "The better prompt is to give it the specification and ask it to write tests for the behavior. Without the code."
"Before the code, if possible," Margaret said. "Which is the discipline your day accidentally demonstrated. You wrote tests that interrogated the problem, not the solution. That is why they found things the code could not show you."
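Timothy's distinction can be sketched in a few lines of Python. Both tests below exercise the same hypothetical `apply_discount` function; the names and rates are invented for illustration.

```python
# Hypothetical module internals: a lookup table of discount codes.
RATES = {"SAVE10": 0.10, "SAVE25": 0.25}


def apply_discount(price: float, code: str) -> float:
    """Return the discounted price; unknown codes leave the price unchanged."""
    return round(price * (1 - RATES.get(code, 0.0)), 2)


def test_implementation():
    # Implementation-coupled: pins the internal data structure. It passes as
    # long as the code doesn't change, and breaks the moment RATES moves to a
    # config file or database, even though behavior is identical.
    assert RATES["SAVE10"] == 0.10


def test_behavior():
    # Behavioral: given this input, this output, however it is computed.
    # It survives refactoring and fails only when the system does the wrong thing.
    assert apply_discount(100.0, "SAVE10") == 90.0
    assert apply_discount(100.0, "BOGUS") == 100.0
```

Given only the code, an assistant will tend to produce tests like the first; given only the specification, it has nothing to couple to but the behavior.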
When Tests Fail
"One more thing happened today," Timothy said. "The test that found the interaction with the older system — when it failed, I didn't understand why immediately. I knew what the test expected and what it got. But I didn't know why the system was behaving differently than I thought."
"And Claude Code?" Margaret said.
"I showed it the failing test, the expected output, the actual output, and the relevant code from both systems. I didn't ask it to fix anything." He smiled slightly. "I asked it to explain what could cause this specific discrepancy."
"And?"
"It identified the interaction in about two minutes. Something about how the older system handled a particular data format that my feature hadn't accounted for." He paused. "The test found the gap. Claude Code explained it. I fixed it in ten minutes."
"Three distinct phases," Margaret said. "The test as detector. Claude Code as diagnostician. You as decision-maker." She paused. "Each doing the work it is best suited for."
Timothy wrote this down. It felt like a clean description of something he had stumbled into accidentally but wanted to be able to repeat deliberately.
"The test suite as a system," he said. "Not just a checklist."
"A living part of the work," Margaret said. "The part that keeps asking questions long after you have stopped asking them yourself. Every test you write today is a question that will be answered again every time the code changes. The discipline of writing them well — of writing them for behavior, not implementation, of writing them for the cases you haven't imagined as much as the cases you have — that discipline compounds." She looked at him. "A good test suite makes the codebase safer to change. And a codebase that is safe to change is a codebase that can grow without accumulating fear."
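The diagnostic prompt Timothy describes has a repeatable shape. A sketch, with bracketed placeholders standing in for the material you would paste:

```
A test in my feature is failing and I don't yet understand why.

Expected: [expected output]
Actual:   [actual output]

[the failing test]
[the relevant code from my feature]
[the relevant code from the older system it interacts with]

Do not fix anything. Explain what could cause this specific discrepancy,
listing the most likely causes first.
```

The explicit "do not fix anything" keeps the tool in the diagnostician's seat and the decision in yours.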
What to Ask Claude Code When Testing
"If you were to describe the prompts," Timothy said. "The ones that produce the most value in testing."
Margaret considered. "Ask it to identify the edge cases you haven't considered, given the function's purpose and inputs. Ask it what a senior engineer would insist on testing before approving this for production. Ask it what inputs have historically caused failures in similar systems." She paused. "Ask it to write tests for the specification, not the code. And when a test fails unexpectedly, ask it to explain the discrepancy — not fix it. The explanation is the value."
"Hypothesis generation again," Timothy said. "The same discipline as debugging."
"The same underlying principle," Margaret said. "You are always trying to understand before you act. The test is a tool for understanding. Claude Code is a tool for extending the range of what you can understand." She looked at the window, where the wet cobblestones still held their orange light. "The developer who uses both well builds things that last. The developer who uses neither builds things that work — until they don't, and nobody knows why."
Timothy closed his notebook — full, now, in a way it rarely was — and reached for his coat.
"I'm going to keep writing tests this way," he said. "Even when the feature is done. Especially then."
"Good," Margaret said. "And tomorrow, try asking Claude Code for the cases before you write a single line of test code. See what it finds that you wouldn't have."
"I'll bring the results Thursday," he said.
"I'll be curious to hear them," she said, already returning to her work.
He walked to the tall doors and out into the damp evening, the lamplight on the wet stones guiding him down the street and out of sight.
Behind him the library settled, warm and still, the only sound the quiet scratch of Margaret's pen and the last slow drip of rain from the eaves.
Next episode: Margaret and Timothy explore documentation — why the habit of writing things down is more than record-keeping, and how Claude Code changes what good documentation looks like and how it gets written.
Aaron Rose is a software engineer and technology writer at tech-reader.blog.