
I Thought ElevenLabs Was the Only Option Until I Found This Free Voice Cloning Tool

I was about to pay for another month of ElevenLabs when I stopped myself.

Not because the product is bad; it’s genuinely one of the best AI voice tools out there. But $22 a month adds up. And somewhere along the way, uploading my voice samples to someone else’s server started bothering me more than I expected. Where does that data actually go? Can they train on it?

I went looking for something local. Free and private.

Found one. And it surprised me more than I expected.

The problem nobody talks about with cloud voice tools

ElevenLabs isn’t alone here. Murf, Play.ht, Resemble AI — they all work the same way. You sign up, pick a plan, upload your voice, and generate speech on their servers.

That last part is the one most people gloss over.

Your voice is biometric data. It’s as personal as a fingerprint. And when you upload it to a cloud service, you’re trusting a company’s privacy policy — and whatever that policy quietly allows — with something you can never change.

Most people don’t think about this until they do. Then they can’t unthink it.

The subscription part is annoying. The privacy part is the real problem.

So What Exactly Is Voicebox?

[Image: Voicebox app screenshot]

Think of it as ElevenLabs, but running on your own computer. No account, and no server somewhere holding your voice samples.

You download it, install it like any normal app on Mac or Windows, and that’s pretty much it. There are no technical headaches. The whole thing has a proper interface: a timeline editor, voice profiles, multi-track mixing, the kind of stuff you’d expect from a paid tool.

I’ll be honest, I wasn’t expecting much when I first opened it. That changed pretty quickly.

The secret is the model it runs under the hood. And that part is worth talking about.

The AI behind it is kind of a big deal

Voicebox runs on Qwen3-TTS, a model built by Alibaba that most people outside the AI research world haven’t heard of yet. That’s honestly surprising given what it can do.

It was trained on over 5 million hours of speech across 10 languages. To put that in perspective, most open source voice models you’ve seen before (Tortoise, Piper, Bark) were trained on a fraction of that. The difference in output quality shows.

The part that genuinely impressed me is the cloning speed. 3 seconds of audio. That’s all it needs to build a voice profile. Not a full minute like most tools ask for. Just a short clip and it figures out the tone, the cadence, the little natural imperfections that make a voice sound like a real person.

It’s also fully open source under the Apache 2.0 license, meaning anyone can use it, build on it, or inspect exactly how it works.

That combination of quality, speed, and full transparency is pretty rare in this space.

Voicebox vs ElevenLabs, Murf and Play.ht

Look, you don’t need a 10-point breakdown to understand the difference. This table says most of it.

Feature                      ElevenLabs   Murf      Play.ht   Voicebox
Price                        $22/mo+      $29/mo+   $31/mo+   Free forever
Voice cloning                Yes          Yes       Yes       Yes
Runs locally                 No           No        No        Yes
Your data on their servers   Yes          Yes       Yes       No
No usage limits              No           No        No        Yes
Open source                  No           No        No        Yes
Works offline                No           No        No        Yes

The paid tools win on ready-made voice libraries and out-of-the-box simplicity. If you need a professional voice in five minutes with zero setup, ElevenLabs is still the fastest path there.

But if you’re generating a lot of content, care about where your voice data goes, or just don’t want another monthly subscription, the math stops making sense pretty fast.
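To make that math concrete, here is a rough sketch in Python that adds up what each tool costs over time. It assumes the entry-tier prices from the comparison table above stay flat, and it ignores taxes, plan upgrades, and the hardware and electricity cost of running Voicebox locally. The names `MONTHLY_PRICE_USD`, `cumulative_cost`, and `cost_table` are made up for this illustration.

```python
# Entry-tier monthly prices (USD) as listed in the comparison table above.
# Assumption: prices stay flat and you remain on the lowest tier.
MONTHLY_PRICE_USD = {
    "ElevenLabs": 22,
    "Murf": 29,
    "Play.ht": 31,
    "Voicebox": 0,
}


def cumulative_cost(monthly_price: int, months: int) -> int:
    """Total spend after `months` of a flat monthly subscription."""
    return monthly_price * months


def cost_table(months: int) -> dict[str, int]:
    """Cumulative cost per tool after `months` months."""
    return {
        name: cumulative_cost(price, months)
        for name, price in MONTHLY_PRICE_USD.items()
    }


if __name__ == "__main__":
    # Print the running total at one month, one year, and two years.
    for months in (1, 12, 24):
        print(months, cost_table(months))
```

At those listed prices, a year works out to $264 for ElevenLabs, $348 for Murf, and $372 for Play.ht, against $0 for Voicebox.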

What It’s Actually Like to Use

Setup is genuinely simple. Download the app, open it, pick a model on first launch, and wait for it to download. The interface walks you through everything — no terminal, no config files, nothing that assumes you’re a developer.

Once it’s running, cloning a voice is straightforward. Record a short sample or import an audio clip, and Voicebox builds a voice profile automatically. From there you type your text, hit generate, and it produces speech in that voice locally on your machine.

The quality surprised me. It doesn’t sound robotic. The natural pauses, the breathing, the slight variations in tone — it feels like a real person talking, not a machine reading words off a page.

A few things to know before you try it

It only supports Qwen3-TTS right now. More models like XTTS and Bark are on the roadmap but not there yet. Linux users are also waiting — builds are coming but not available at the time of writing. And if you’re on Windows without a dedicated GPU, generation will be slower than on a Mac with Apple Silicon, which gets a 4-5x speed boost from native Metal acceleration.

Depending on what you need, none of these may be dealbreakers. But they’re worth knowing before you try it.

Wrapping Up

ElevenLabs, Murf, Play.ht: they’re all good products. But there’s something worth sitting with. Every voice sample you upload to a cloud service lives somewhere you can’t see, under terms you probably didn’t fully read. For a lot of people that’s fine. For a growing number of people it isn’t.

Voicebox is still early. Model selection will expand, Linux support is coming, and the roadmap looks genuinely promising. Right now though, for anyone who wants real voice cloning without a monthly bill or a privacy tradeoff, it’s the most complete free option I’ve found.

I went looking for a way out of another monthly subscription. Didn’t expect to actually find one this good.
