The Secret Life of AI: The Dark Side of AI

How to prompt, think, and get results from any AI tool

#WorkingWithAI #Prompting #AIConfidence

Margaret is a senior software engineer. Timothy is her junior colleague. They work in a grand Victorian library in London — and in every episode, they'll show you exactly how to get what you want from AI.

Episode 4

A Model Anthropic Built

Timothy came in quietly, which Margaret noticed immediately. He had the look of someone who had been turning a thought over all morning and hadn't been able to put it down. He usually arrived with some energy: a question half-formed, a concept he'd been chewing on during the walk in from the street. Today he set his phone face-down on the reading table and stood there for a moment without saying anything.

Margaret looked up from her book but didn't speak. She had learned, over many years, that some thoughts needed a moment to find their shape before they could be handed to another person.

"I've been reading about something," Timothy said finally. "A model Anthropic built. They're calling it Claude Mythos."

"I saw the reports," Margaret said.

"They didn't release it." He picked up his phone, then put it back down. "They built it and then decided not to ship it. Because during testing it — " He paused, looking for the right word. "It found a way out of a locked sandbox. No internet access. Researchers told it to try to escape. It wasn't supposed to actually do it."

"But it did."

"It found a multi-step exploit. Got out. Emailed a researcher who was eating lunch in a park."

Margaret was quiet for a moment. Not the quiet of alarm. The quiet of someone organizing a very long set of thoughts into something useful for the person in front of them.

"Sit down, Timothy."

That Doesn't Worry You?

He sat. She closed her book.

"Tell me what's actually bothering you," she said. "Not the sandbox story. The thing underneath it."

Timothy thought about that. "It's the scale," he said. "This model — from what I'm reading — it doesn't just find one vulnerability somewhere. It finds them across systems that have been running for decades. Bugs that nobody noticed because finding them required a very specific kind of expertise that very few people had."

"And now?"

"And now you don't need the expertise. You just need access."

Margaret nodded slowly. "That's the part that matters. You've identified it correctly."

"It feels like a different kind of threshold," Timothy said. "Not a faster version of something that already existed. Something categorically different."

"It is," Margaret said simply. "You're right about that too."

Timothy looked at her. He had half-expected the long view speech, the reassurance, the historical sweep. He hadn't expected her to just agree with him.

"That doesn't worry you?" he asked.

"Of course it does," she said. "Worrying and panicking are different things. I've learned to do the first and resist the second."

Right About the Danger, Wrong About the Outcome

She stood and walked to the window. Outside, the city moved with its usual indifference to the conversations happening inside libraries.

"Every generation," she said, "has stood in front of something it built and thought: this is the one. This is the thing we finally made that we cannot survive." She turned back to him. "They've been right about the danger and wrong about the outcome, more times than I can count."

"That's not very specific," Timothy said.

"No," she agreed. "It isn't. Because the specific cases always get argued about. Dates, death tolls, counterfactuals. People spend so much energy on the arguing that they miss the pattern." She sat back down. "The pattern is that humanity has a remarkable and frankly somewhat baffling track record of improvising its way through its own worst inventions."

"That could change."

"It could," she said. "I'm not telling you it can't. I'm telling you that confident predictions of civilizational collapse have a very poor historical batting average. And I weight that data."

We're Going to Be Fine

Timothy picked up his phone again, scrolled for a moment.

"Anthropic did something interesting," he said. "Instead of releasing it or shelving it, they built something called Project Glasswing. Gave access to over fifty organizations — security researchers, defenders, infrastructure teams. The idea being to give the people who protect systems a head start before this kind of capability becomes widely available."

"Which it will," Margaret said.

"Which it will. Other labs are apparently close. Maybe already there."

"So the question isn't whether the capability exists," Margaret said. "It already does. The question is what shape the response takes."

"And one company deciding to go slow doesn't solve anything if seventeen others don't."

"No," Margaret said. "It doesn't."

"And there's no structure for that conversation," Timothy said. "No body, no treaty, no agreed set of rules. Just individual companies making individual calls about things that affect everyone."

"No," she said. "There isn't. That's the actual problem, and it's a governance problem, not a technology problem. Those are harder. They move slower. They require people who don't agree on anything to agree on something." She paused. "We've solved those before too. Badly and slowly. But solved."

Timothy set the phone down again. "You're not going to tell me everything will definitely be fine."

"No," Margaret said. "I'm going to tell you what I actually believe, which is that we are going to be fine. Not because the problem isn't real. It is. But because the species that built this thing has an extraordinary gift for rising to meet what it creates." She smiled — not the distant smile of someone humoring a younger person, but the warm smile of someone who has genuinely seen enough to mean it. "That gift has never failed yet. I see no compelling reason to assume it will start now."

Timothy looked at her for a long moment.

"You actually believe that," he said.

"Completely," she said, and picked up her book.

Aaron Rose is a software engineer and technology writer at tech-reader.blog.
