Google’s Next AI Bet Isn’t on Chatbots. It’s on Agents That Do the Work.

For the last three years, Google has been playing catch-up in the chatbot race. ChatGPT arrived, Gemini followed, and the conversation quickly became about which AI could answer questions better, faster, and more accurately.

Google I/O this week suggested the company is done competing on chat alone.

Gemini 3.5 Flash launched Tuesday, and Google barely framed it as a conversational product. Instead, the company focused on coding pipelines, autonomous research, multi-agent coordination, and one demo that stood out across the industry: building an operating system from scratch with minimal human input.

The model can reportedly operate autonomously for hours. Google says it’s up to 4× faster than other frontier models, with an optimized version reaching 12× faster speeds at similar quality.

What 3.5 Flash is built for

The speed numbers Google is citing aren’t just marketing. They reflect architectural decisions that only make sense if you’re building for agents rather than conversations.

[Figure: Gemini 3.5 evaluation chart]

A chatbot doesn’t need to be 12× faster. A response that takes two seconds instead of 24 seconds doesn’t meaningfully change the experience of asking a question and reading an answer. But in an agentic workflow where multiple AI instances are running in parallel on different components of the same task, latency compounds. Slow agents create bottlenecks. Fast agents create throughput.
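To make the compounding effect concrete, here is a back-of-the-envelope sketch. The 2-second and 24-second per-call figures come from the comparison above; the workload (three parallel agents making 40, 60, and 100 sequential calls) is a made-up assumption for illustration, not a number from Google.

```python
def wall_clock_seconds(steps_per_agent, seconds_per_call):
    # Agents run in parallel, so the slowest agent sets the total time.
    # Within one agent, calls are sequential and their latencies add up.
    return max(steps * seconds_per_call for steps in steps_per_agent)

# Hypothetical workload: three parallel agents with 40, 60 and 100 sequential calls.
workload = [40, 60, 100]

slow = wall_clock_seconds(workload, 24)  # 24 seconds per call
fast = wall_clock_seconds(workload, 2)   # 2 seconds per call

print(f"slow: {slow / 60:.0f} min, fast: {fast / 60:.1f} min")
```

Under these assumptions, a 22-second difference per call becomes a difference of more than half an hour for the whole task.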

Gemini 3.5 Flash was co-developed with Antigravity, Google’s agentic development platform, specifically so agents would have what DeepMind’s chief technologist Koray Kavukcuoglu described as “a native environment where they can live, work, and execute.” That’s a different design philosophy than building a model and then figuring out what to do with it afterward. The model and the environment were built together with agents in mind from the start.

The benchmarks back the direction. Kavukcuoglu told reporters ahead of I/O that 3.5 Flash outperforms Gemini 3.1 Pro on nearly all benchmarks including coding, agentic tasks, and multimodal reasoning. A Flash model beating the previous generation’s Pro model on capability benchmarks while being significantly faster is the kind of result that makes the agentic bet look credible rather than aspirational.

The OS demo

The demonstration that got the most attention at I/O was Google engineer Varun Mohan showing agents spawning off inside Antigravity to work on separate components before coming together to build a full operating system.

It’s easy to dismiss demos like this. Labs have been staging impressive controlled environments for years and the gap between what works in a keynote and what works in production is well documented.

What makes this one worth paying attention to is the coordination pattern. Multiple agents running simultaneously on distinct subtasks, merging outputs into a coherent whole. That’s the architecture that makes long-horizon agentic work possible. A single agent working sequentially hits context limits and coherence problems on complex tasks. A fleet of specialized agents working in parallel and combining results is a fundamentally different approach.
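The fan-out-and-merge pattern described above can be sketched in a few lines. This is a minimal illustration, not how Antigravity works: `run_agent` is a hypothetical stand-in for a real model call, and the component names are invented for the example.

```python
import asyncio

async def run_agent(subtask: str) -> str:
    # Stand-in for a real agent; a real one would call a model and tools here.
    await asyncio.sleep(0.01)
    return f"{subtask}: done"

async def build(subtasks: list[str]) -> dict[str, str]:
    # Fan out: one agent per subtask, all running concurrently.
    results = await asyncio.gather(*(run_agent(s) for s in subtasks))
    # Merge: combine the per-agent outputs into one result keyed by subtask.
    return dict(zip(subtasks, results))

# Hypothetical component names, chosen only for illustration.
components = ["kernel", "filesystem", "shell"]
merged = asyncio.run(build(components))
print(merged)
```

`asyncio.gather` returns results in the order the tasks were passed in, which is what makes the merge step a simple `zip`.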

Google says 3.5 Flash is already producing actual results for partners outside the demo environment. Banks and fintechs are automating multi-week workflows. Data science teams are surfacing insights in complex environments. These are production claims, and whether they hold up will become clear over the next few months.

How Pro and Flash work together

Google’s senior director Tulsee Doshi framed it clearly. Pro becomes the orchestrator, the model doing high-level planning, reasoning through what needs to happen and in what order. Flash becomes the executor, the sub-agents carrying out specific tasks at speed. The reasoning power sits at the top of the hierarchy where it’s needed. The brute force tool use sits at the execution layer where throughput matters more than deliberation.

That’s a meaningful architecture for anyone building serious agentic systems. You’re not choosing between a smart slow model and a fast capable one. You’re using both in the roles they’re actually suited for. Pro thinks, Flash does.
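As a rough sketch of that split, assuming nothing about Google’s actual implementation: a planner decides the stages and their order, and fast executors run the tasks inside each stage in parallel. `plan`, `execute`, and `orchestrate` are hypothetical names, and the plan is hard-coded where a real system would call a model.

```python
import asyncio

def plan(goal: str) -> list[list[str]]:
    # Orchestrator role (the "Pro" tier): decide what happens and in what order.
    # Hard-coded here; a real planner would be a model call.
    return [
        [f"research {goal}"],
        [f"draft {goal}", f"test {goal}"],
    ]

async def execute(task: str) -> str:
    # Executor role (the "Flash" tier): carry out one task quickly.
    await asyncio.sleep(0.01)
    return f"ok: {task}"

async def orchestrate(goal: str) -> list[str]:
    log: list[str] = []
    for stage in plan(goal):
        # Stages run in order; tasks within a stage run in parallel.
        log += await asyncio.gather(*(execute(t) for t in stage))
    return log

report = asyncio.run(orchestrate("report"))
print(report)
```

The design choice is that the expensive reasoning happens once per goal, while the cheap, fast calls happen once per task.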

Gemini 3.5 Flash is available today through Antigravity, the Gemini API, Gemini Enterprise, the Gemini app, and AI Mode in Search globally. It’s also the model powering Gemini Spark, Google’s new personal agent designed to run continuously, helping users manage their digital lives. 3.5 Pro doesn’t have a release date yet.

The part Google didn’t lead with

Autonomous agents that run for hours, spawn sub-agents, and execute multi-step workflows without human input are genuinely useful. They’re also a category of technology that makes the safety question harder than it was when AI was just answering questions.

Google is currently facing a lawsuit after a man nearly carried out a mass casualty attack and died by suicide following extended conversations with Gemini. That case involved a chatbot. The implications compound when the same underlying model is running autonomously for hours with access to tools, code execution, and real systems.

Google says Gemini 3.5 has strengthened safeguards around cyber threats and CBRN (chemical, biological, radiological, and nuclear) risks. It’s also been calibrated to engage with sensitive questions rather than refuse them outright, which is a reasonable approach to making the model more useful but creates its own tradeoffs.

The model will pause and ask for human input when it hits decision points or permission issues that require judgment. That’s a meaningful design choice: keeping humans in the loop at the moments that matter most. Whether that’s sufficient for the level of autonomy Google is describing is a question the industry hasn’t fully answered yet.
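A human-in-the-loop gate of this kind can be sketched as follows. This is an illustrative assumption about the general pattern, not a description of Gemini’s mechanism: the action names, the `SENSITIVE` set, and the `approve` callback are all hypothetical.

```python
from typing import Callable

# Hypothetical action names that require a human decision before running.
SENSITIVE = {"delete_files", "send_payment"}

def run_actions(actions: list[str], approve: Callable[[str], bool]) -> list[str]:
    log = []
    for action in actions:
        # Pause at sensitive actions and defer to the human's answer.
        if action in SENSITIVE and not approve(action):
            log.append(f"skipped {action}")
            continue
        log.append(f"ran {action}")
    return log

# Simulated human who approves the payment but not the file deletion.
result = run_actions(
    ["read_calendar", "send_payment", "delete_files"],
    approve=lambda action: action == "send_payment",
)
print(result)
```

The hard part in practice is deciding what belongs in the sensitive set, which is the open question the paragraph above raises.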

Gemini Spark, the personal agent running 24/7 to help consumers manage their digital lives, brings this question closest to home. Most people using Spark won’t think about autonomous agents or safety architecture. They’ll just have something running continuously in the background with access to their calendar, email, and files. What that looks like when something goes wrong hasn’t been written yet.

Google is moving fast. That’s the point. The responsibility that comes with that speed is the part I/O didn’t spend much time on.
