Andrej Karpathy Joined Anthropic. What It Says About Where AI Is Heading.

Andrej Karpathy doesn’t make random career moves.

He co-founded OpenAI in 2015, left to build Tesla’s self-driving program, came back to OpenAI for a year, then left again in 2024 to start an AI education company. Every transition has been deliberate and every one of them has turned out to be worth paying attention to.

On Tuesday he posted on X that he’s joined Anthropic. “I think the next few years at the frontier of LLMs will be especially formative,” he wrote. “I am very excited to join the team here and get back to R&D.”

The “get back to R&D” part is the signal. Karpathy has spent the last several years teaching, building, and explaining. Now he’s going back to the frontier. And the specific place he’s going says something about where the most important work in AI actually is right now.

Why pre-training specifically matters

Karpathy isn’t joining Anthropic to work on products or safety or deployment. He’s working on pre-training under team lead Nick Joseph.

Pre-training is where frontier models actually get built. It’s the large-scale training process that gives Claude its core knowledge and capabilities, and it’s the most compute-intensive, expensive, and consequential phase of model development. Getting pre-training right is the difference between a model that reasons well and one that doesn’t. Everything that comes after (fine-tuning, safety work, deployment) is built on top of what pre-training produces.

This is also where the competition between Anthropic, OpenAI, and Google is most directly fought. Each lab’s pre-training approach is its most closely guarded advantage. Bringing in someone with Karpathy’s depth at exactly this layer isn’t a symbolic hire.

Using Claude to build Claude

Karpathy isn’t just joining the pre-training team. He’s been asked to build a team focused specifically on using Claude to accelerate pre-training research.

The idea is that Claude itself becomes a tool in the research process, helping design experiments, analyze results, surface patterns in training data, or generate hypotheses that human researchers then test. The bet is on AI-assisted research, rather than pure compute, as the path to staying competitive.

Karpathy is one of the few people who can actually make that work. He bridges LLM theory and large-scale training practice in a way almost nobody else in the field does. His Neural Networks: Zero to Hero course has taught a generation of researchers how these systems actually function at a fundamental level. His YouTube lectures are some of the clearest explanations of transformer internals that exist anywhere.

Putting someone with that combination of depth and communication ability in charge of figuring out how to use Claude to do better pre-training research is a specific bet. Anthropic is saying it believes AI-assisted research is how it stays in the race.

What happens to education

Karpathy’s move to Anthropic leaves one obvious question unanswered. Eureka Labs, the AI education startup he founded in 2024, hasn’t had much public activity since its launch. His Neural Networks: Zero to Hero course and YouTube channel, which have become genuine resources for anyone learning to build with LLMs from first principles, also go quiet when he’s heads down on something else.

He addressed it directly in his X post. “I remain deeply passionate about education and plan to resume my work on it in time.”

That’s not a shutdown announcement. But it’s also not a continuation plan. For the researchers and students who have followed his work closely, the honest read is that education goes on pause while the frontier pulls him back in. Whether Eureka Labs becomes something real eventually or quietly winds down alongside this chapter is a question he hasn’t answered yet.

The YouTube channel and the course will still be there. For anyone who hasn’t worked through Neural Networks: Zero to Hero, now is as good a time as any.

The other hire worth noticing

Buried under the Karpathy announcement is a second addition that’s easy to scroll past but shouldn’t be.

Chris Rohlf has joined Anthropic’s frontier red team, the group responsible for stress-testing advanced AI models against severe threats before they reach the public. Rohlf brings over twenty years of cybersecurity experience, including time at Yahoo’s well-regarded internal security team and six years at Meta. He also worked on the CyberAI project at Georgetown’s Center for Security and Emerging Technology.

His own post on X was direct: “We have a real opportunity in front of us to dramatically improve cybersecurity with AI. I can’t think of a better company or team to join at this critical moment.”

Two hires in the same week: one of the most respected researchers in LLM fundamentals going into pre-training, and a veteran cybersecurity operator going into red teaming. Anthropic isn’t signaling one priority. It’s signaling two simultaneously: build better models and make sure they don’t cause serious harm before you ship them.
