Google’s Gemini Omni Can Write Math on a Chalkboard. AI Video’s Hardest Problem May Be Getting Easier

Google hasn’t announced Gemini Omni. A Reddit user just found it anyway.

Someone opened their Gemini app, got a pop-up for a model they’d never heard of, and started generating video. What came out has been making the rounds on Reddit for the past few days, mostly because of one clip.

The chalkboard video is why.

It shows a professor writing a full mathematical proof in chalk, narrating as he goes: text legible, delivery natural, physics mostly convincing. AI video has never handled text well. This one does. And it’s not just the text. The audio, the movement, and the realism are all hitting at a level that has people genuinely uncomfortable in a way the usual AI video demos don’t.

It’s a leak. Google hasn’t said a word. I/O is next week. But whatever Omni is, the early results suggest something shifted.

What we know about Omni

Not a lot, officially. The user got a pop-up in the Gemini app prompting them to “create with Gemini Omni,” described as a new video generation model with the ability to remix videos, edit directly in chat, and use templates. That pop-up text is the only description available. Google hasn’t confirmed that the model exists.

Max Weinbach dug into the metadata and found that Omni appears to be an extension of Veo rather than something built from scratch. Which tracks. Google has been developing Veo for a while and the output here looks like a significant step forward from what Veo was producing, not a completely different direction.

I/O is next week. That’s almost certainly when this becomes official and we get actual details on what Omni is and how it fits into the broader Gemini lineup.

Where it still lacks

The chalkboard result is impressive. The spaghetti test is a different story.

The original Will Smith prompt got blocked by Omni’s guardrails, so the user rewrote it: two men at a seaside restaurant, white tablecloth, approaching the table and eating spaghetti while having a conversation.

Spaghetti appears from nowhere on plates that were empty seconds earlier. The eating doesn’t match the bites. The inconsistencies that the chalkboard video mostly avoids stack up quickly here. Another Reddit user ran the same prompt through ByteDance’s Seedance 2 and got a noticeably more consistent result.

So Omni isn’t uniformly ahead of everything. The text handling is genuinely new. Physical realism on complex interactions still has the rough edges you’d find elsewhere.

The usage question most ignore

Those two generations, the chalkboard and the spaghetti, consumed 86% of this user’s daily quota on a Google AI Pro plan. There was some Gemini Flash usage the same day, so it’s not a perfectly clean number, but the direction is clear. Video generation on Omni is expensive in terms of quota. Nobody is having that conversation right now, but everyone will be the moment Google makes this official and people hit their limit inside the first two prompts.

What comes next

Google said video is here to stay after OpenAI shut down Sora earlier this year. Omni looks like the proof of that commitment. The chalkboard result alone suggests something real has shifted on the text rendering problem, even if the model isn’t consistent across all prompts yet.

We’ll know the full picture next week. Until then, the chalkboard video is the thing worth watching twice.
