ChatGPT Codex vs Claude Opus: Which AI Coding Model Is Better?

Peter Steinberger breaks down ChatGPT Codex 5.3 vs Claude Opus 4.6 on the Lex Fridman Podcast. Which AI coding model is better? It depends on how you work.

If you’re building software with AI in 2025, you’ve probably wondered: ChatGPT Codex or Claude Opus? In a recent episode of the Lex Fridman Podcast, Peter Steinberger, the creator of OpenClaw, broke down exactly how these two AI coding models differ. His insights are sharp, funny, and surprisingly practical for anyone trying to decide which tool fits their workflow.

The short answer on the ChatGPT Codex vs Claude Opus debate? There’s no clear winner. Each model has a distinct personality that makes it better for different developers. Here’s what Steinberger had to say.

ChatGPT Codex 5.3: The Silent Workhorse

According to Steinberger, Codex 5.3 is the kind of developer who puts on headphones, disappears into a corner, and comes back 45 minutes later with working code. It reads more of your codebase by default, giving it better context awareness right out of the gate. You hand it a task, and it grinds until the solution works.

The tradeoff? It’s not interactive. Where Claude Opus checks in with you and iterates in real time, Codex goes dark for 20 to 50 minutes (sometimes longer) while it works through the problem. Steinberger describes it as persistent and methodical, almost stubborn. “It’ll just read a lot of code by default,” he explained on the podcast. “It doesn’t matter if it takes 10, 20, 30, 40, 50 minutes or longer… the model will work very hard to really get there.”

His analogy? Codex is “the weirdo in the corner you don’t want to talk to, but is reliable and gets it done.” Not flattering, but hard to argue with.

Claude Opus 4.6: The Interactive Collaborator

Claude Opus 4.6 takes the opposite approach. It’s conversational, willing to try things, and iterates with you in real time. Steinberger calls it the best general-purpose model available, noting that it excels at roleplay, following commands, and adapting on the fly.

Opus is the coworker who talks through problems out loud and bounces ideas off you. That back-and-forth can produce more elegant solutions, especially when you’re exploring unfamiliar architectures. But it requires more steering. You need to use plan mode, push it harder to read deeply, and course-correct when it runs off with a quick, localized fix instead of thinking bigger.

Steinberger put it bluntly: “Opus is like the coworker that is a little silly sometimes, but he’s really funny and you keep him around.” The contrast between the two models, he joked, is almost cultural. Opus feels “a little bit too American” while Codex has more of a “German” engineering mindset. Dry, efficient, no small talk.

The $200 vs $20 Problem

One point Steinberger raised that doesn’t get enough attention: your pricing tier dramatically affects your experience with these AI coding assistants. Codex on the $20 plan is painfully slow. If you’re coming from Claude’s interactive $200 experience and switching to budget Codex, your first impression will be terrible.

“I think OpenAI shot themselves a little bit in the foot by making the cheap version also slow,” Steinberger said. He argued they should offer at least a taste of the fast experience before degrading the speed, because many developers are writing off Codex based on the wrong tier.

Why Your AI Coding Assistant Isn’t “Getting Dumber”

Steinberger also tackled one of the biggest misconceptions in the AI coding community: the belief that models degrade over time. You’ve seen the Reddit posts. “This model used to be so smart, now it’s terrible.”

His explanation is simple and convincing. The model isn’t getting worse. Your codebase is getting messier. As projects grow, you add complexity, skip refactors, and pile on dependencies. The AI has more to juggle, and the quality of its output drops accordingly. “You’re getting used to a good thing, and your project grows, and you’re adding slop,” he said. The fix isn’t switching models. It’s keeping your code clean.
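One rough way to test his point on your own project: track how much raw material the model has to wade through. The short Python sketch below is a hypothetical illustration, not a tool from the podcast; the file extensions and the "repo weight" metric are arbitrary assumptions for the example. It counts source files and lines under a repo root as a crude proxy for the context an AI coding assistant has to juggle.

```python
# Hypothetical "repo weight" check -- an illustration of Steinberger's point,
# not something from the podcast. Counts source files and lines so you can
# watch how much context an AI coding model must juggle as a project grows.
from pathlib import Path

# Arbitrary extension set for the example; adjust for your stack.
SOURCE_EXTENSIONS = {".py", ".js", ".ts", ".go", ".rs", ".java", ".swift"}
SKIP_DIRS = {".git", "node_modules", "venv", "build"}

def repo_weight(root: str = ".") -> tuple[int, int]:
    """Return (source file count, total line count) under root."""
    files = lines = 0
    for path in Path(root).rglob("*"):
        if SKIP_DIRS.intersection(path.parts):
            continue  # skip vendored and generated trees
        if path.is_file() and path.suffix in SOURCE_EXTENSIONS:
            files += 1
            with path.open(errors="ignore") as f:
                lines += sum(1 for _ in f)
    return files, lines

if __name__ == "__main__":
    files, lines = repo_weight()
    print(f"{files} source files, {lines} lines")
```

Run it every few weeks. If the line count keeps climbing while the feature set doesn't, the "dumber model" you're noticing is probably just a heavier codebase.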

Post-Training Matters More Than Raw Intelligence

Perhaps the most insightful takeaway from the conversation: the differences between Codex and Opus aren’t really about which model is “smarter.” Both have comparable raw intelligence. The divergence comes from post-training: how each company fine-tuned the model’s behavior and workflow.

Codex was optimized to be autonomous and persistent. Opus was optimized to be collaborative and adaptive. “I think the difference is in the post-training,” Steinberger noted. “It’s not like the raw model intelligence is so different, but they just give it different goals.”

So Which AI Coding Model Should You Choose?

Steinberger’s advice is practical: give any new model a full week before judging it. “I would give it a week until you actually develop a gut feeling for it,” he said. It’s like switching from an acoustic guitar to an electric. The skills transfer, but the feel is different.

If you prefer handing off tasks and letting your AI work independently, Codex 5.3 is your tool. If you like collaborating in real time and steering the solution interactively, Claude Opus 4.6 will feel more natural. A skilled developer can get great results with either one.

Watch the Full Breakdown

Peter Steinberger went deep on this topic with Lex Fridman. Check out the clip for the full comparison, including the hilarious analogies that had both of them laughing.

For more on how AI is shaping technology and intelligence, check out this episode of The NDS Show.

Key Takeaways

  • Codex 5.3 is autonomous and persistent. It reads more code by default and works independently for long stretches. Best for developers who want to delegate.
  • Claude Opus 4.6 is interactive and collaborative. It can produce more elegant solutions through real-time iteration, but requires more hands-on steering.
  • Pricing tier matters. The $20 Codex experience is drastically worse than the $200 tier. Don’t judge the model by its cheapest plan.
  • Models don’t degrade. Your codebase gets messier over time. Keep your code clean for consistent AI performance.
  • Post-training, not raw intelligence, drives the differences. Both models are smart; they’re just optimized for different workflows.

🎙️ Don’t Miss an Episode of The NDS Show

Stay informed on national defense, intelligence, and geospatial topics. Subscribe to The NDS Show on YouTube for in-depth interviews and analysis.
